- The largest skew between any two logical clocks in the network (global skew) is small.
- The largest skew between the logical clocks of any pair of adjacent nodes in the network (local skew) is small.
- The maximum possible skew between any pair of nodes degrades smoothly as a function of the distance between the nodes (gradient property).
- The rate of progress of logical clock is neither too small nor too large. This is to ensure that the spacing between logical times given to events at one node compares well with their real times.

## Monday, April 26, 2010

### Tight bounds for Clock Synchronization [Lenzen, Locher, Wattenhofer PODC '09]

## Thursday, April 22, 2010

### Forgotten hoodie

- Ryan

## Tuesday, April 20, 2010

### The intersection of two halfspaces has high threshold degree

Such questions concerning approximation of functions by polynomials have been studied by Math and CS researchers since the $1960$'s. Some of the earliest papers on these topics are by Newman and Minsky and Papert. Although interesting in its own right, this line of work has found numerous applications in various areas of CS theory such as communication complexity, quantum computing, complexity theory etc. For more details, see this excellent tutorial by Scott Aaronson. Sign representation by polynomials also has a very nice implication for machine learning. Roughly speaking, if a function has sign degree $d$, then it can be learned in time $n^{O(d)}$ under a standard model of learning. This connection has been exploited heavily in recent years to come up with learning algorithms for many non-trivial problems. Such algorithms are sometimes also called ``perceptron based algorithms''.

However, one notorious class of functions for which no efficient learning algorithm is known till date is the class of intersection of two halfspaces. A halfspace over $\{-1,1\}^n$ is a Boolean function defined as $f(x) = sgn(a_0 + \sum_i a_ix_i)$. The intersection of two halfspaces is a function $h(x,y) = f(x) \land g(y)$, where both $f$ and $g$ are halfspaces over $\{-1,1\}^n$. Clearly, by definition the sign degree of a halfspace is $1$. Minsky and Papert in 1969, showed that the sign degree of the intersection of two halfspace is $\omega(1)$. In their paper they also introduced a very important technique, called ``symmetrization'', which can be used to lower bound the sign degree of certain functions. The next non-trivial improvement came in 2003, when O'Donnell and Servedio showed that the sign degree of the intersection of two halfspaces is $\Omega(\frac {\log n} {\log \log n})$. Although a massive improvement, this still does not rule out sub-exponential time ``perceptron based'' algorithms for learning the intersection of two halfspaces.

Last year in an award winning paper presented at FOCS, Alexander Sherstov showed that this bound is $\Omega(\sqrt{n})$. Also, in a very recent follow-up work he has managed to show that the sign degree of the intersection of two halfspaces is in fact $\Theta(n)$. Hence, perceptron based approaches cannot be used to get any non-trivial algorithm for this class of functions. This in turn means that some new ideas will be required to make progress on the problem of learning the intersection of two halfspaces. On Thursday, we will prove Sherstov's result from FOCS'09 which shows that the sign degree of the intersection of two halfspaces is $\Omega(\sqrt{n})$. The talk will be self-contained.

## Tuesday, April 13, 2010

## Tuesday, April 6, 2010

### Poly-logarithmic independence fools $AC^0$ circuits

An $AC^0$ circuit is a Boolean circuit containing NOT, AND, OR gates. We count the number of AND, OR gates as the \emph{size} of the circuit. While the \emph{depth} of a circuit is defined as the maximum number of AND, OR gates on any path from an input to the output.

We say a distribution $\mu$ fools a circuit $f$, if $|\mathbf{E}_{x \sim \mu}[f(x)] - \mathbf{E}_{x}[f(x)]| < \epsilon$. I.e., the expected output of $f$ on $\mu$ is ``roughly'' the same as that on uniform distribution.

The problem considered in this paper is on the power of $r$-independence to fool $AC^0$ circuits. Given $\epsilon$ as a constant, how large has $r$ to be so that any $r$-wise independent distribution fools all the $AC^0$ circuit of size $m$ and a constant depth $d$? The main theorem in the paper shows that $r$ is enough to be poly-logarithmic with $m$. In fact, the author proves that $r = (\log (m / \epsilon))^{O(d^2)}$ suffices. And this resolves a conjecture by [Linial-Nisan'90], which conjectured $r = O((\log m))^{d - 1}$ is enough, up to a gap between $O(d)$ and $O(d^2)$ in the exponent. The only prior progress on this problem was by [Bazzi'07], which showed that $O(\log^2 m)$-wise independent distributions fool depth-2 circuits (DNF/CNF).

The (very rough) approach to prove is to find a low degree polynomial $g$ which approximates the expectation of $f$ under both uniform distribution and a given $r$-wise independent distribution. This $g$ would set up a bridge from $\mathbf{E}_{\mu}[f]$ to $\mathbf{E}[f]$ by $\mathbf{E}[f] \approx \mathbf{E}[g] \approx \mathbf{E}_{\mu}[g] \approx \mathbf{E}_{\mu}[f]$. But since this $g$ seems hard to find (maybe it's hard because it needs to be specific for a specific distribution), the proof in the paper actually approximates $f$ by low-degree functions $g$ in the following two senses.

(1) [Beigel-Reingold-Spielman'91, Tarui'93] Find low-degree (but unbounded) $g$ so that $\mathbf{Pr}[f \neq g] < \epsilon, \mathbf{Pr} _{\mu}[f \neq g] < \epsilon$.

(2) [Linial-Mansour-Nisan'93] Find low-degree $g$ so that $\| f - g \|_2^2 < \epsilon$.

The proof succeeds by the two results above as building blocks and some other clever observations, all of which we will go through in the class.

The proof of [Linial-Mansour-Nisan'93] theorem also uses Fourier analytic tools. But no background on these will be needed -- I will introduce them in the class.

## Monday, March 29, 2010

### Fully Homomorphic Encryption Scheme

Homomorphic encryption is a form of encryption where a specific algebraic operation performed on the ciphertext has a regular algebraic effect on the plain text. Let us take a look at the RSA system. Its public key consists a pair of integers $(n,e)$. The encryption function of a message $m$ is by calculating $m^e (mod n)$. Suppose that for message $m_1$ and $m_2$, we have their encryption $c_1,c_2$. We know that $(m_1*m_2)^e = (m_1)^e * (m_2)^e = c1*c2 (mod n)$. Therefore, the product of two ciphertexts is corresponding to the encryption of the product of the plaintexts. RSA is therefore known to be homomorphic with respect to the multiplication operation. However, the homomorphism is not preserved for addition: adding $c_1$ and $c_2$ is not the encryption of $m_1+m_2$ (or other algebraic operations on $m_1$ and $m_2$).

Since basically every computation on the integer ring can be done by a circuit with addition and multiplication gate, people would like to see an encryption scheme that is homomorphic for both addition and multiplication (on the plain text). This allows us to compute the encryption of arbitrary function on the plain texts by only looking at the ciphertexts. A encryption scheme that is homomorphic with respect to both addition and multiplication is called "fully homomorphic". Prior to this work, it is only known how to construct the encryption scheme that is homomorphic for a single operation (either addition or multiplication) or for "simple" circuits of both addition and multiplication gates. One of the famous early work is due to Boneh-Goh-Nissim which supports function of unlimited addition but only a single multiplication.

In his breakthrough paper in STOC 2010, Gentry constructed a encryption scheme that is fully homomorphic. Its security assumption is based on solving certain lattice problem under worst case. There are mostly two key ideas for constructing such a homomorphic encryption scheme. The first idea is that if the encryption scheme can ensure that it is homomorphic with respect to its own decryption circuit, then we can somehow bootstrap the encryption scheme to make it fully homomorphic. Given the first idea, it remains to construct such a encryption that is bootstrapable. This will require the decryption function to be simple enough. Gentry has some smart ideas of "squashing" the circuit complexity of the decryption function.

As the encryption scheme of the original paper is based on ideal lattice which is quite complex algebraic object, we will instead present a followup work (by Gentry along with Marten van Dijk, Shai Halevi, and Vinod Vaikuntanathan) which simplify the first paper. The link of the new paper is

http://eprint.iacr.org/2009/616.

In this new paper, all the encryption and decryption is implemented by addition and multiplication over integers (instead of using the ideal of the lattice). The two key ideas (bootstrapping and squashing) are still preserved in the new paper.

There are many applications for homomorphic encryption schemes. For example, we can use such an encryption scheme for voting without releasing the privacy of each voter's preference. Basically, every voter encrypts their preference and the voting system executes the voting rules (for e.g., evaluate the majority function) only on the ciphertext. Homomorphic encryption scheme is also believed to have wide applications in the context of cloud computing.

However, the current scheme is still quite impractical. To encrypt one bit, it requires $\eta^8$ bits (I think) where $\eta$ is the security parameter. If we take $\eta$ to be 1000, then it means that to encrypt one bit, we need $10^{24}$ bits. Also, there is a huge loss of efficiency for converting a general algorithm into a circuit of the multiplication and addition gates.

As for the class, I suggest you have a look at the paper at:

http://eprint.iacr.org/2009/616

No other crypto knowledge is needed for this talk.

## Monday, March 22, 2010

### Intrinstic Robustness of the Price of Anarchy

Everybody loves cookies. Even us, the students in 15-589U class, love cookies. Thus, for Thursday's class, Ryan, who wishes us all to attend the talk, will bring cookies. Alas, we are 12 students in the class, and Ryan only has 11 cookies. So Ryan is willing to give us the cookies if we all reach a consensus on who gets a cookie. Splitting a cookie is unheard of, and neither is sharing a cookie. (Yes, we all love cookies that much!)

So essentially, each of us has two options: "ask for a cookie" or "not to ask for a cookie". If everybody asks for a cookie, no one gets a cookie, as there as only 11 cookies and 12 students. If 11 or fewer people ask for a cookie, each gets 1 cookie. If you don't ask for a cookie, you always get no cookies. So, how will we reach a consensus?

Well, that depends on how collegial we are. If we care about getting as many cookies as possible as a class, someone steps down, and the rest get 11 cookies. But what if each of us cares only about how many cookies (s)he gets? It is clear that no matter what other students do, each of us is always better off by asking for a cookie! Therefore, we all ask for cookies, and no one gets any! So, even though as a collective, we could gain 11 cookies, we end up getting 0, and everybody's passion for cookies is left unsatisfied.

Let's contemplate a bit about this example. Here is a game, where acting selfishly is globally worse than playing for overall good. In fact, we get a ratio of 11/0, which is infinitely worse. What about other games? Can we measure how much worse will we do by acting selfishly in a given game G? In a set G of games?

This is precisely the question that Koutsoupias and Papadimitriu raised in STOC'99. Suppose we model the game by costs rather than payoffs, so everybody wishes to minimize their costs. How much more do we (altogether) pay for playing selfishly, in comparison to playing for the overall good? In the words of Koutsoupias and Papadimitriu, who considered routing games, how much faster would the internet be if it had a manager? This is the Price of Anarchy, the ratio between the worst cost of a Nash-Equilibrium in the game, to the cost of the social optimum.

Ever since Koutsoupias and Papadimitriu raised the question, there have been numerous attempts at bounding the POA of various classes of games. The question was even extended to other models of players (such as regret-minimizing, myopic best response players etc.). In STOC 2009, Tim Roughgarden introduced the notion of smoothness in games, and showed the following:

(i) Smoothness gives an immediate (and very elegant) upper bound on the POA of a game.

(ii) The same bound applies to other models of players.

(iii) This bound gives the exact POA for the class of congestion games.

In Thursday's talk, we will survey the definition of POA and the various models of players considered in the past. We will then go on to stating in exact and proving all the above 3 results.

As a side note: the first two results are pretty straight-forward, so even people who have no background in Game Theory are welcome to attend the talk.

And just to clarify: Ryan will NOT actually bring cookies to Thursday's class.

## Monday, March 15, 2010

This week I will present a paper from FOCS'09 by Dinur and Harsha on Probabilistically-Checkable Proofs (PCP), "Composition of low-error 2-query PCPs using decodable PCPs" (link). PCP Theorem [Arora-Safra'98] shows that any mathematical proof can be checked probabilistically by a verifier using only small number of queries. If the theorem is true, then there should exists a proof which the verifier accepts always (completeness). If the theorem is false, verifier should accept a proof with at most bounded probability (soundness).

The ground breaking result from [Arora-Safra'98, and Arora-Lund-Motwani+'98] states that any language in NP has a PCP with constant number of queries for which the soundness is bounded by a constant. To appreciate this result, notice that the proof length is essentially unbounded. In this regard, the proof can be seen as a code with very large distance -- one that can detect very small number of changes. A very good survey on the history of PCP theorem and a combinatorial proof of the original result, due to Dinur, can be found in [link].

Formally, PCP verifier V for a language L is a polynomial time algorithm that, given input string x \in {0,1}^n, and oracle access to a proof pi \in Sigma^m(n), reads x, and using r many random bits, computes q indices to query (local window) Ir = {I1, ..., Iq} and a local predicate fr:Sigma^q to {0,1}. Then it accepts iff fr(pi(Ir)) = 1.

- (Completeness) If x \in L, then there exists a proof pi which is accepted with probability 1.
- (Soundness) If x \notin L, then for any proof pi, Pr[fr(pi(Ir))=1] <= epsilon.

We denote the languages accepted by such PCP verifiers with PCP{1, epsilon} [r,q]_Sigma. Notice that (trivially) PCP{1,epsilon}[O(log n), q]_Sigma \subset NP. However the existence of proof systems with NP \subset PCP made it possible to get hardness of approximation results, which started with the inapproximability of clique/independent set. Even problems with unusual approximation ratios (Asymmetric k-Center problem with O(log*n)-approximation) were shown to be optimal, in other words any slight improvement in the approximation ratio will lead to NP=P.

A central problem in this area is the construction of PCPs with smaller parameters, most notably sub-constant soundness with 2-queries and small alphabet. Although interesting in its own right, the existence of such PCPs provide even stronger inapproximability results. For example, Hastad'01 showed that Max-3SAT problem is NP-hard to approximate beyond 7/8 + epsilon, for any constant epsilon > 0. However this result does not rule out, say, the possibility of having an approximation ratio of 7/8 + 1/loglog n.

It has been known for quite some time how to get sub-constant soundness [Raz-Safra'97, Arora-Sudan'03, Goldreich-Sudan'06] with exactly 2-queries. However these constructions tended to yield proofs with exponentially large alphabets in terms of soundness. Although this may not sound too bad (after all alphabet sizes are still at most polynomial in n), the hardness results take time exponential (or doubly exponential) in alphabet size. Thus these PCPs make it impossible to get hardness results based on P != NP (at best one gets DTIME(n^log n) != NP).

A recent paper by Moshkowitz and Raz in FOCS'08 demonstrated that indeed it is possible to get sub-constant soundness with small alphabet (and proof). A major open question in their paper was:

- Does there exist a shorter/easier proof for this result? (As the paper happened to be 100+ pages.)

The paper we will discuss answers this question affirmatively. To give an overview: The technique involves reducing the alphabet size considerably through a technique called proof composition using list-decodable proofs [otherwise alphabet reductions with unique decoding are not possible]. The decoding part is performed by another PCP called [no surpise there] decodable PCP, D. Most important part of this paper is a way to compose V and D without increasing number of queries. After composition step, the resulting proof is made nicer by applying local transformations like degree reduction using expanders on the corresponding Label-Cover problem, which is defined as:

An instance of Label-Cover problem is specified with: a bipartite graph G=(L \cup R, E), alphabets SigmaL, SigmaR, set of functions (projections) for each edge F{uv} : SigmaL to SigmaR with u\in L, v\in R. A labelling function over nodes, G:L\cup R \to SigmaL \cup SigmaR is said to satisfy an edge (u,v)\in E if F{uv}(G(u)) = G(v).

----

Although none of the proofs I will present are complicated or long, the presentation will be a little bit dense as we will constantly switch back and forth between different definitions. Hence I strongly suggest looking at (assuming non-familiarity):

- Expanders (definition and expander mixing lemma)
- Basic definitions from coding theory (relative distance, rate)
- Label Cover, PCPs (basically becoming familiar with the definitions)

## Thursday, March 4, 2010

## Monday, February 22, 2010

### An O(log n / loglog n)-approximation algorithm for the Asymmetric Travelling Salesperson problem

## Monday, February 15, 2010

## Monday, February 8, 2010

### Twice Ramanujan Sparsifiers

Given a weighted graph G, a natural question to ask is whether there exist another graph H on the same vertex set such that any cut has similar weights in G and H. As the weight of a cut can be described by $\sum_{uv \in E(G)}w_{uv}(x_u-x_v)^2$ where x is a 0-1 vector, one could ask the same question where x can take any real values. A spectral sparsifier for a graph G is H such that the above functions on the two graphs are close for any assignment of x and H is sparse.

Algorithmically, graph sparsifiers were first studied as a subroutine for near-linear time Laplacian Solvers by Spielman and Teng. Their algorithm partitions the graph into expanders by finding approximately sparse cuts, then randomly sampling each expander.

Follow up work has been done on other methods for finding sparsifiers that give better guarantees. One approach is to sample the edges by effective resistance, but it relies on the near-linear time solver to achieve good performance. This work uses a similar approach, but with a different goal in mind. It aims to find the best approximation rate possible with a d-regular subgraph, and achieves twice the ratio given by Ramanujan graphs , which are regular graphs whose spectral gap are as large as possible.

I will use the first half of the talk to cover background topics such as condition number and spectral decompositions, then show the approaches used in recent work on sparsifiers. If time permit, I will try to give a sketch of why one couldn't do better than the ratios given by Ramanujan graphs. The second half of the talk will be used to go through the proof of the key theorem of this paper, which basically states the following:

Given a set of rank one matrices that sum to identity, one could pick a small subset of them such that the ratio between the largest and smallest eigenvalues of their sum is small.

## Tuesday, February 2, 2010

### The Role of Randomization in Mechanism Design

*social welfare*: the sum utilities of all of the bidders. In addition to being able to distribute the goods, the mechanism is given the power to charge bidders some price, up to their reported valuation for the set of goods they ultimately receive.

*optimal*allocation to maximize social welfare, and so when the welfare maximization problem is itself computationally difficult, it is not possible to implement the VCG mechanism. Unfortunately, the incentive properties of the VCG mechanism depend on the output allocation actually being optimal: this means that approximation algorithms for combinatorial optimization problems don't easily translate into truthful mechanisms which approximately maximize social welfare.

- A
*universally truthful*randomized mechanism is simply a probability distribution over deterministic truthful mechanisms. The mechanism gets to flip some coins, but no matter what the outcome of the coins, no agent could have benefited by lying,*even in hindsight*. - A
*truthful in expectation*mechanism only guarantees that no agent can benefit*in expectation*by lying, before the mechanism's coins are flipped. For some outcomes of the coins in hindsight, agents could possibly have benefited by lying.

*risk neutral*and only wish to maximize their expected welfare, independent of risk. But when we do have risk neutral bidders, can we provide better approximations for some problems using truthful in expectation mechanisms?

- For any epsilon > 0, there exists a truthful in expectation mechanism which achieves a (1+epsilon) approximation to social welfare and runs in time poly(1/epsilon, log m).
- For any constant epsilon > 0 any universally truthful mechanism that achieves a (2-epsilon) approximation to social welfare must run in time Omega(m).

## Sunday, January 24, 2010

### Recent constructive results for the Lovász Local Lemma

The Lovász Local Lemma is a workhorse in probabilistic combinatorics which has found applications to packet routing, perfect hashing, graph coloring, scheduling, statistical mechanics, etc. It is typically used in conjunction with the probabilistic method to prove the existence of solutions. Efficiently finding these solutions has been the pursuit of a good deal of research, which chipped away at the problem through special cases.

At STOC 2009, Robin Moser resolved an important special case - bounded occurrence k-SAT - with a ZPP algorithm and a weak derandomization. This was an important result in of itself, earning the Best Paper award, but what really tickled the theory community was the marvelous proof strategy he came up with for his talk; "one of the best STOC talks ever", says Uncle Lance. A few months later, he and Gábor Tardos generalized the result to cover (essentially) the entire LLL. Karthekeyan Chandrasekaran, Navin Goyal, and Bernhard Haeupler strengthened the derandomization under a mild "ϵ-slack" condition, thus placing more non-constructive solutions squarely in P. A couple weeks ago, Bernhard Haeupler, Barna Saha, and Aravind Srinivasan showed that the Moser-Tardos algorithm retains a beneficial property of the naive rejection sampling algorithm, which can lead to polynomial running time even with superpolynomial events.

In my talk, we'll touch on each of these results, though much more emphasis will be placed on those by Moser and Tardos. Here's an outline.

- (5-10 min) Introduction.

- (10-15 min) Proof of symmetric k-SAT LLL from this survey article.

- (10 min) A motivational application to hypergraph coloring from Rajmohan Rajaraman's notes.

- (5 min) A brief history of results inching towards a constructive proof.

- (25-35 min) The Moser-Tardos algorithm for symmetric, bounded occurrence k-SAT and the slick STOC proof as explicated by Fortnow and Tao. (The latter post is
*excellent*.)

- (25-30 min) The Moser-Tardos algorithm for symmetric LLL and its proof from Joel Spencer's notes. This proof is more pedestrian but also more general, and it sets up the rest of the talk.

- (10 min) Break.

- (15-20 min) Derandomization results (briefly).

- (10-15 min) Conditional LLL distributions (briefly).

You may wish to review Shannon entropy and derandomization by conditional probability.

## Tuesday, January 12, 2010

### Read this blog

This is a seminar course. The instructor is Ryan O'Donnell. The course homepage is http://www.cs.cmu.edu/~odonnell/hits09/, but that page will be mostly static. Please read this blog regularly, or add it to your RSS feed.

We will use MathML in these posts; if it is working correctly for you, the following statements should look basically the same:

*L*is "laminar" if for all

*A*,

*B*∈

*L*, either

*A*∩

*B*= Ø,

*A*⊆

*B*, or

*B*⊆

*A*.

http://www.dessci.com/en/products/mathplayer/download.htm

If you use Firefox, try going here:

http://www.mozilla.org/projects/mathml/fonts

For Chrome and Safari, I'm not sure; if someone wants to comment on whether or not it works for them, please do.