## Monday, March 15, 2010

This week I will present a paper from FOCS'09 by Dinur and Harsha on Probabilistically-Checkable Proofs (PCP), "Composition of low-error 2-query PCPs using decodable PCPs" (link). PCP Theorem [Arora-Safra'98] shows that any mathematical proof can be checked probabilistically by a verifier using only small number of queries. If the theorem is true, then there should exists a proof which the verifier accepts always (completeness). If the theorem is false, verifier should accept a proof with at most bounded probability (soundness).

The ground breaking result from [Arora-Safra'98, and Arora-Lund-Motwani+'98] states that any language in NP has a PCP with constant number of queries for which the soundness is bounded by a constant. To appreciate this result, notice that the proof length is essentially unbounded. In this regard, the proof can be seen as a code with very large distance -- one that can detect very small number of changes. A very good survey on the history of PCP theorem and a combinatorial proof of the original result, due to Dinur, can be found in [link].

Formally, PCP verifier V for a language L is a polynomial time algorithm that, given input string x \in {0,1}^n, and oracle access to a proof pi \in Sigma^m(n), reads x, and using r many random bits, computes q indices to query (local window) Ir = {I1, ..., Iq} and a local predicate fr:Sigma^q to {0,1}. Then it accepts iff fr(pi(Ir)) = 1.

• (Completeness) If x \in L, then there exists a proof pi which is accepted with probability 1.
• (Soundness) If x \notin L, then for any proof pi, Pr[fr(pi(Ir))=1] <= epsilon.

We denote the languages accepted by such PCP verifiers with PCP{1, epsilon} [r,q]_Sigma. Notice that (trivially) PCP{1,epsilon}[O(log n), q]_Sigma \subset NP. However the existence of proof systems with NP \subset PCP made it possible to get hardness of approximation results, which started with the inapproximability of clique/independent set. Even problems with unusual approximation ratios (Asymmetric k-Center problem with O(log*n)-approximation) were shown to be optimal, in other words any slight improvement in the approximation ratio will lead to NP=P.

A central problem in this area is the construction of PCPs with smaller parameters, most notably sub-constant soundness with 2-queries and small alphabet. Although interesting in its own right, the existence of such PCPs provide even stronger inapproximability results. For example, Hastad'01 showed that Max-3SAT problem is NP-hard to approximate beyond 7/8 + epsilon, for any constant epsilon > 0. However this result does not rule out, say, the possibility of having an approximation ratio of 7/8 + 1/loglog n.

It has been known for quite some time how to get sub-constant soundness [Raz-Safra'97, Arora-Sudan'03, Goldreich-Sudan'06] with exactly 2-queries. However these constructions tended to yield proofs with exponentially large alphabets in terms of soundness. Although this may not sound too bad (after all alphabet sizes are still at most polynomial in n), the hardness results take time exponential (or doubly exponential) in alphabet size. Thus these PCPs make it impossible to get hardness results based on P != NP (at best one gets DTIME(n^log n) != NP).

A recent paper by Moshkowitz and Raz in FOCS'08 demonstrated that indeed it is possible to get sub-constant soundness with small alphabet (and proof). A major open question in their paper was:

• Does there exist a shorter/easier proof for this result? (As the paper happened to be 100+ pages.)

The paper we will discuss answers this question affirmatively. To give an overview: The technique involves reducing the alphabet size considerably through a technique called proof composition using list-decodable proofs [otherwise alphabet reductions with unique decoding are not possible]. The decoding part is performed by another PCP called [no surpise there] decodable PCP, D. Most important part of this paper is a way to compose V and D without increasing number of queries. After composition step, the resulting proof is made nicer by applying local transformations like degree reduction using expanders on the corresponding Label-Cover problem, which is defined as:

An instance of Label-Cover problem is specified with: a bipartite graph G=(L \cup R, E), alphabets SigmaL, SigmaR, set of functions (projections) for each edge F{uv} : SigmaL to SigmaR with u\in L, v\in R. A labelling function over nodes, G:L\cup R \to SigmaL \cup SigmaR is said to satisfy an edge (u,v)\in E if F{uv}(G(u)) = G(v).

----

Although none of the proofs I will present are complicated or long, the presentation will be a little bit dense as we will constantly switch back and forth between different definitions. Hence I strongly suggest looking at (assuming non-familiarity):

• Expanders (definition and expander mixing lemma)
• Basic definitions from coding theory (relative distance, rate)
• Label Cover, PCPs (basically becoming familiar with the definitions)