One way to improve this situation is to let the nodes have a logical
clock on top of their hardware clocks. They could communicate their
logical clock values and adjust them based on the values of other
nodes in the system (according to some algorithm). The main practical
challenges here are that the hardware clock drift rate and the latency
of the communications between nodes is variable and unknown to these
nodes. At best, the nodes may be aware of some upper bounds on the
hardware clock drift rate and message latencies ($\tau$). Further, an
algorithm that adjusts logical clock values can be expected to have
certain reasonable properties:

- The largest skew between any two logical clocks in the network (global skew) is small.
- The largest skew between the logical clocks of any pair of adjacent nodes in the network (local skew) is small.
- The maximum possible skew between any pair of nodes degrades smoothly as a function of the distance between the nodes (gradient property).
- The rate of progress of logical clock is neither too small nor too large. This is to ensure that the spacing between logical times given to events at one node compares well with their real times.

Evidently, the skew depends on the size of network, and more
specifically, on the diameter $D$ of the network. It has been shown
that no algorithm can achieve a better bound on global skew than $D/2$
(Assume $\tau = 1$) [Biaz, Welch '01]. More surprisingly, [Fan, Lynch
'04] showed that no algorithm could achieve a local skew better than
$\Omega(\log D/ \log\log D)$. While algorithms that match the bound on
global skews were known, matching the local skew bound has been
elusive. The paper we will be discussing [LLW-PODC'09, J.ACM'10]
solves this problem, and also presents an improved lower bound for
local skew.

In the discussion, we will see a more formal statement of the model.
We will discuss why certain simple algorithms like setting a node's
logical clock to the average or maximum of its neighbors fail. We will
then see how ideas from these failed algorithms can be put together to
construct a simple algorithm (the proofs, however, are a bit long)
that has optimal global and local skews and a good gradient
property. We will also talk about lower bounds on global and (time
permitting) local skews.

Note: In case you want to read the paper, [LLW-PODC'09] contains only
the statements of the results. Also, I found the two associated tech
reports diffcult to follow. Their recent journal article in
J.ACM-Jan'10 is much more readable and contains proofs in good detail.
Locher and Wattenhofer's earlier paper in DISC'06 is a good background
reading and helps you appreciate the algorithm better.

## No comments:

## Post a Comment