Mathematics of the impossible, Chapter 12, Data structures

And as promised, here’s a chapter on data structures… in a complexity theory course!

Data structures aim to maintain data in memory so as to be able to support various operations, such as answering queries about the data, and updating the data. The study of data structures is fundamental and extensive. We distinguish and study in turn two types of data-structure problems: static and dynamic. In the former the input is given once and cannot be modified by the queries. In the latter queries can modify the input; this includes classical problems such as supporting insert, search, and delete of keys.

12.1 Static data structures

Definition 12.1. A static data-structure problem is simply a function $f:\{0,1\}^n \to [q]^{m}$. A data structure for $f$ with space $s$, word size $w$ and time $t$ is a way to write $f$ as

\begin{aligned} f(x)=h(g(x)) \end{aligned}

where $g:\{0,1\} ^{n}\to [2^{w}]^{s}$, $h:[2^{w}]^{s}\to [q]^{m}$, and each output bit of $h$ depends on $\le t$ input words (we think of $s$ as divided into words of length $w$).

Here we have $n$ bits of input data about which we would like to answer $m$ queries. Often the queries and/or the word size are boolean, i.e., $2^{w}=q=2$. Another typical setting is $q=2^{w}$. The data structure aims to accomplish this by storing the input into $s$ words of memory. This map is arbitrary, with no bound on resources. But after that, each query can be answered very fast, by reading only $t$ words. In general, these words can be read adaptively. But for simplicity we focus on the case in which the locations are fixed by the data structure and the same for every input $x\in \{0,1\} ^{n}$.
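To make Definition 12.1 concrete, here is a minimal Python sketch of the two-stage framework (the particular choice of $g$ and of the query maps is an illustrative toy of mine, not from the text): an arbitrary preprocessing map $g$, followed by queries that each probe at most $t$ words.

```python
# Toy instance of the static framework f(x) = h(g(x)):
# n = 4 input bits, w = 1 (boolean words), and each query reads t = 2 words.
# Hypothetical example: g stores x itself (s = n), and query (i, j)
# returns x[i] XOR x[j] by probing only words i and j.

def g(x):                      # arbitrary preprocessing map: here, the identity
    return list(x)             # s = n words of w = 1 bit each

def answer(memory, i, j):      # h for query (i, j): probes exactly t = 2 words
    return memory[i] ^ memory[j]

x = [1, 0, 1, 1]
mem = g(x)
assert answer(mem, 0, 2) == 0  # x[0] XOR x[2] = 1 XOR 1
assert answer(mem, 0, 1) == 1  # x[0] XOR x[1] = 1 XOR 0
```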

Exercise 12.1. Consider the data structure problem $f:\{0,1\} ^{n}\to \{0,1\} ^{m}$ where $m=n^{2}$ and query $(i,j)\in \{1,2,\ldots ,n\}^{2}$ is the parity of the input bits from $i$ to $j$.

Give a data structure for this problem with $s=n$, $w=1$, and $t=2$.

Exercise 12.2. Show that any data-structure problem $f:\{0,1\} ^{n}\to \{0,1\} ^{m}$ has a data structure with $w=1$ and the following parameters:

(1) $s=m$ and $t=1$, and

(2) $s=n$ and $t=n$.

Exercise 12.3. Prove that for every $n$ and $m\le 2^{n/2}$ there exist functions $f:\{0,1\} ^{n}\to \{0,1\} ^{m}$ s.t. any data structure with space $s=m/2$ and $w=1$ requires time $t\ge n-c$.

By contrast, next we present the best known impossibility result.

Definition 12.2. A function $f:[2]^{n}\to [q]^{m}$ is $d$-wise uniform if any $d$ output coordinates are uniform when the input to $f$ is uniform.

Theorem 12.1. [65] Let $f:[q]^{d}\to [q]^{q}$ be $d$-wise uniform. Let $q$ be a power of $2$ and $c\log q\le d\le q^{c}$. Then any data structure with $w=\log q$ using space $s$ (which recall is measured in words of $w$ bits) and time $t$ has:

\begin{aligned} t\geq c\frac {\log q}{\log (s/d)}. \end{aligned}

Interpreting the input as coefficients of a degree $d-1$ univariate polynomial over $\mathbb {F}_{q}$ and outputting its evaluations shows that such functions exist, and are in P. Below we give a surprising data structure that nearly matches the theorem.
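For tiny parameters one can check the $d$-wise uniformity of the polynomial construction exhaustively. The sketch below works over a small prime field rather than the power-of-2 field in the theorem (an assumption made only to keep the arithmetic short); it enumerates all $q^{d}$ coefficient vectors and verifies, for $d=2$, that every pair of distinct evaluation points takes each value pair equally often.

```python
from itertools import product

q, d = 5, 2  # small prime field F_5, pairwise (2-wise) uniformity

def evaluations(coeffs):
    # evaluate the polynomial sum_k coeffs[k] * a^k at every point a in F_q
    return tuple(sum(c * pow(a, k, q) for k, c in enumerate(coeffs)) % q
                 for a in range(q))

# For every pair of distinct evaluation points, count how often each
# value pair occurs as the coefficient vector ranges over all q^d inputs.
for i in range(q):
    for j in range(i + 1, q):
        counts = {}
        for coeffs in product(range(q), repeat=d):
            ev = evaluations(coeffs)
            key = (ev[i], ev[j])
            counts[key] = counts.get(key, 0) + 1
        # d-wise uniform: each of the q^2 value pairs occurs q^d / q^2 times
        assert all(v == q ** d // q ** 2 for v in counts.values())
        assert len(counts) == q * q
```

The checks pass because a polynomial of degree less than $2$ is determined by its values at any $2$ distinct points (interpolation), so the map from coefficients to any pair of evaluations is a bijection.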

To match previous parameters note that $n=d\log q=dw$, and $m=q$. Hence the bound is $t\ge c(\log m)/\log (sw/n)$. Note that $sw$ is the space of the data structure measured in bits. It follows that if $sw$ is linear in $n$ then $t\ge c\log m$. This result remains non-trivial for $s$ slightly super-linear. But remarkably, if $sw=n^{1+c}$ then nothing is known (for $m$ polynomial in $n$ one only gets $t\ge c$).

Proof. The idea in the proof is to find a subset $S$ of fewer than $d$ memory cells that still allows us to answer $\ge d$ queries. This is impossible, since we can’t generate $d$ uniform outputs from fewer than $d$ memory cells.

Let $p:=1/q^{1/4t}$. Include each memory cell in $S$ with probability $p$, independently. By Theorem 2.7, $\mathbb {P}[|S|\ge cps]\le 2^{-cps}$.

We are still able to answer a query if all of its memory cells fall in $S$. The probability that this happens is $p^{t}=1/q^{1/4}$. We now claim that with probability $\ge 1/q^{c}$, we can still answer $\sqrt {q}$ queries. Indeed, let $B$ be the number of queries we cannot answer. We have $\mathbb {E}[B]\le q(1-q^{-1/4})$. And so by Markov’s inequality

\begin{aligned} \mathbb {P}[B\ge q(1-1/\sqrt {q})]\le \frac {1-q^{-1/4}}{1-q^{-1/2}}\le 1-q^{-c}. \end{aligned}

Thus, if the inequality $2^{-cps}\leq 1/q^{c}$ holds then there exists a set $S$ of $\le cps$ cells with which we can answer $\ge \sqrt {q}>d$ queries. Hence we reach a contradiction if

\begin{aligned} c\log q\le cps<d. \end{aligned}

Because $d>c\log q$ by assumption, and increasing $s$ only makes the problem easier, we reach a contradiction if $cps<d$, i.e. if $cs/q^{1/4t}<d$, which holds whenever $t<c\frac {\log q}{\log (s/d)}$, and the result follows. QED

Next we show a conceptually simple data structure which nearly matches the lower bound. For simplicity we focus on data structures which use space $q^{\epsilon }$ – recall in this case the previous result does not give anything. We will show this is for a good reason: there are such data structures where the time is constant. We will only show $c_{\epsilon }d$-wise independence, as opposed to $d$-wise, but the proof techniques next and above generalize to other settings of parameters.

Theorem 12.2. There is a map $f:[q]^{d}\to [q]^{q}$ which is $c_{\epsilon }d$-wise uniform and has a data structure with $w=\log q$, space $s=dq^{\epsilon }$, and time $c_{\epsilon }$, for any $\epsilon$ and $q$ which is a power of $2$.

To give a sense of the parameters, let for example $q=d^{10}$. Then the space is $s=d^{1+10\epsilon }$, much less than the $q=d^{10}$ words needed to store every answer, and yet the time is constant.

Proof. We fill the memory with $s$ evaluations of the input polynomial. Then we pick a random bipartite graph with $s$ nodes on the left and $q$ nodes on the right, where every node on the right side has degree $g$. We answer each query by summing the memory cells corresponding to its neighbors. Let $d':=d/g$. To show $d'$-wise uniformity it suffices to show that for any subset $R\subseteq [q]$ on the right-hand side of size $d'$, the sum of the corresponding memory cells is uniform in $\mathbb {F}_{q}$. For this in turn it suffices that $R$ has a unique neighbor, i.e., a left node adjacent to exactly one element of $R$. And for that, finally, it suffices that $R$ has a neighborhood of size greater than $\frac {g|R|}{2}$ (because if every element in the neighborhood of $R$ had at least two neighbors in $R$, then the neighborhood would have size at most $\frac {g|R|}{2}$).

Note here we are using that the neighborhood has size $\le gd'=d$, and so the memory is $d$-wise uniform.

We pick the graph at random and show that it has the latter property with non-zero probability.
[Standard calculations follow that wordpress has trouble displaying… wait for the book draft, I guess.]
QED
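For tiny parameters one can bypass the omitted calculation and simply search for a graph with the required expansion, verifying it by brute force. All parameter choices below are hypothetical toys of mine; the check confirms that every right-set $R$ of size at most $d'$ has neighborhood larger than $g|R|/2$.

```python
import random
from itertools import combinations

def has_expansion(right_nbrs, d_prime, g):
    # check every right-set R with |R| <= d' has > g|R|/2 distinct neighbors
    q = len(right_nbrs)
    for size in range(1, d_prime + 1):
        for R in combinations(range(q), size):
            nbhd = set().union(*(right_nbrs[i] for i in R))
            if len(nbhd) <= g * size / 2:
                return False
    return True

random.seed(0)
s, q, g, d_prime = 12, 16, 3, 3   # toy parameters, not from the text
while True:
    # each right node picks g distinct left neighbors at random
    graph = [set(random.sample(range(s), g)) for _ in range(q)]
    if has_expansion(graph, d_prime, g):
        break
# `graph` now witnesses the expansion needed for d'-wise uniformity
```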

12.1.1 Succinct data structures

Succinct data structures are those where the space is close to the minimum, $n$. Specifically, we let $s=n+r$ for some $r=o(n)$ called redundancy. Unsurprisingly, stronger bounds can be proved in this setting. But, surprisingly, these stronger bounds were again shown to be tight. Moreover, it was shown that improving the bounds would imply stronger circuit lower bounds.

To illustrate, consider the ECC problem $f:\{0,1\} ^{n}\to \{0,1\} ^{m}$ where $f$ is an error-correcting code (with constant relative distance) and $m$ is linear in $n$.

Theorem 12.3. [26] Any data-structure for the ECC problem with $w=1$ using space $n+r$ requires time $\ge cn/r$.

This is nearly matched by the following result.

Theorem 12.4. [81] There is an ECC problem s.t. for any $r$ it has a data structure with $w=1$, space $n+r$, and time $c(n/r)\log ^{3}n$.

Moreover, it was shown that proving a time lower bound of $(n/r)\log ^{c}n$ would imply new circuit lower bounds. The latter result refers to bounds on the number of wires in circuits with arbitrary gates. But the following connection with the standard circuit model is also known.

Theorem 12.5. [81] Let $f:\{0,1\} ^{n}\to \{0,1\} ^{am}$ be a function computable by bounded fan-in circuits with $bm$ wires and depth $b\log m$, for constants $a,b$. Then $f$ has a data structure with space $n+o(n)$ and time $n^{o(1)}$.

Hence, proving $n^{\epsilon }$ time lower bounds for succinct data structures would give functions that cannot be computed by linear-size log-depth circuits, cf. 9.3.

12.1.2 Succincter: The trits problem

In this section we present a cute and fundamental data-structure problem with a shocking and counterintuitive solution. The trits problem is to compute $f:[3]^{n}\to (\{0,1\} ^{2})^{n}$ where, on input $n$ “trits” (i.e., ternary elements) $(t_{1},t_{2},\ldots ,t_{n})\in [3]^{n}$, $f$ outputs their representations using two bits per trit.

Example 12.1. For $n=1$, we have $f(0)=00,f(1)=01,f(2)=10$.

Note that the input ranges over $3^{n}$ elements, and so the minimum space of the data structure is $s=\lceil \log _{2}3^{n}\rceil =\lceil n\log _{2}3\rceil \approx n\cdot 1.584\ldots$ This will be our benchmark for space. One can encode the input to $f$ as before using bits without loss of generality, but the current choice simplifies the exposition.
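A quick numeric check of the benchmark (with a hypothetical $n$), comparing the naive two-bits-per-trit space against $\lceil n\log _{2}3\rceil$:

```python
import math

n = 1000                                # hypothetical input length
naive = 2 * n                           # two bits per trit
optimal = math.ceil(n * math.log2(3))   # ceil(log2(3^n)) bits
assert optimal == 1585                  # n * 1.5849... rounded up
assert naive - optimal == 415           # the naive scheme wastes a linear number of bits
```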

Simple solutions:
• The simplest solution (cf. 12.2) to this problem is to use $2$ bits per $t_{i}$. With such an encoding we can retrieve each $t_{i}\in [3]$ by reading just $2$ bits (which is optimal). The space used is $s=2n$ and we have linear redundancy.
• Another solution (cf. again 12.2) to this problem is what is called arithmetic coding: we think of the concatenated elements as forming a ternary number between $0$ and $3^{n}-1$, and we write down its binary representation. To retrieve $t_{i}$ it seems we need to read all the stored bits, but the space needed is optimal.
• For this and other problems, we can trade between these two extremes as follows. Group the $t_{i}$’s into blocks of $t$. Encode each block with arithmetic coding. The retrieval time will be $ct$ bit probes and the needed space will be $(n/t)\lceil t\log _{2}3\rceil \leq n\log _{2}3+n/t$ (assuming $t$ divides $n$). This is block-wise arithmetic coding. It provides a power trade-off between retrieval time and redundancy. (Using number-theoretic results on logarithmic forms, one can show [80] that this last inequality is tight up to changing $n/t$ into $n/t^{c}$.)
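Block-wise arithmetic coding is simple enough to sketch directly. In the Python sketch below (helper names are mine; it assumes $t$ divides $n$), each block of $t$ trits is packed into $\lceil t\log _{2}3\rceil$ bits, and a trit is retrieved by decoding only its own block.

```python
import math

def encode(trits, t):
    # pack each block of t trits into ceil(t * log2(3)) bits
    # (assumes t divides len(trits))
    bits_per_block = math.ceil(t * math.log2(3))
    blocks = []
    for b in range(0, len(trits), t):
        val = 0
        for trit in trits[b:b + t]:      # read the block as a base-3 number
            val = 3 * val + trit
        blocks.append((val, bits_per_block))
    return blocks

def retrieve(blocks, t, i):
    # read only the ct bits of block i // t, then decode trit i % t
    val, _ = blocks[i // t]
    digits = []
    for _ in range(t):
        digits.append(val % 3)
        val //= 3
    return digits[::-1][i % t]

trits = [2, 0, 1, 1, 2, 2, 0, 1]
blocks = encode(trits, t=4)
assert [retrieve(blocks, 4, i) for i in range(8)] == trits
```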

The shocking solution: An exponential (!) trade-off

We now present an exponential trade-off: retrieval time $ct$ bit probes and redundancy $n/2^{t}+c$. In particular, if we set $t=c\log n$, we get retrieval time $O(\log n)$ and redundancy $O(1)$. Moreover, the bits read are all consecutive, so with word size $w=\log n$ this can be implemented in constant time. To repeat, we can encode the trits with constant redundancy and retrieve each in constant time. This solution can also be made dynamic.

Theorem 12.6. [53, 21] The trits problem has a data structure with space $n\log _{2}3+n/2^{t}+c$ (i.e., redundancy $n/2^{t}+c$) and time $ct$, for any $t$ and with word size $w=1$. For word size $w=\log n$ the time is constant.

Next we present the proof.

Definition 12.3 (Encoding and redundancy). An encoding of a set $A$ into a set $B$ is a one-to-one (a.k.a. injective) map $f:A\to B$. The redundancy of the encoding $f$ is $\log _{2}|B|-\log _{2}|A|$.

The following lemma gives the building-block encoding we will use.

Lemma 12.1. For all sets $X$ and $Y$, there is an integer $b$, a set $K$ and an encoding

\begin{aligned} f:\left (X\times Y\right )\rightarrow \left (\{0,1\} ^{b}\times K\right ) \end{aligned}

such that (1) $f$ has redundancy $\le c/\sqrt {|Y|}$, and (2) $x\in X$ can be recovered just by reading the $b$ bits in $f(x,y)$.

Note that (1) says that $b+\log |K|-\log |X|-\log |Y|\le c/\sqrt {|Y|}$. For (2) to hold we must have $b\ge \log |X|$. Combining this with the previous expression we obtain $\log |K|-\log |Y|\le c/\sqrt {|Y|}$. In particular we get that $|K|\le 2^{c}\cdot |Y|$ (in fact it will be the case that $|K|\le c\cdot \sqrt {|Y|}$, but the looser bound is sufficient).

The basic idea for proving the lemma is to break $Y$ into $C\times K$ and then encode $X\times C$ by using $b$ bits:

\begin{aligned} X\times Y\rightarrow X\times C\times K\rightarrow \{0,1\} ^{b}\times K. \end{aligned}

There is however a subtle point. If we insist on always having $|C|$ equal to, say, $\sqrt {|Y|}$ or some other quantity, then one can cook up sets that make us waste a lot (i.e., almost one bit) of space. The same of course happens in the more basic approach that just sets $Y=K$ and encodes all of $X$ with $b$ bits. The main idea will be to “reason backwards,” i.e., we will first pick $b$ and then try to stuff as much as possible inside $\{0,1\} ^{b}$. Still, our choice of $b$ will make $|C|$ about $\sqrt {|Y|}$.

Proof. Pick any two sets $X$ and $Y$, where $|Y|>1$ without loss of generality. Define $b:=\left \lceil \log _{2}\left (|X|\cdot \sqrt {|Y|}\right )\right \rceil$, and let $B:=\{0,1\} ^{b}$. To simplify notation, define $d:=2^{b}/|X|$. Note $\sqrt {|Y|}\le d\le c\sqrt {|Y|}$.

How much can we stuff into $B$? For a set $C$ of size $|C|=\left \lfloor |B|/|X|\right \rfloor =\left \lfloor d\right \rfloor$, we can encode elements from $X\times C$ in $B$. The redundancy of such an encoding can be bounded as follows:

\begin{aligned} b-\log _{2}(|X|\cdot |C|)=\log _{2}\frac {2^{b}}{|X|\left \lfloor d\right \rfloor }=\log _{2}\frac {d}{\left \lfloor d\right \rfloor }\le \log _{2}\left (1+\frac {1}{\left \lfloor d\right \rfloor }\right )\le \frac {c}{\sqrt {|Y|}}, \end{aligned}

using that $\left \lfloor d\right \rfloor \ge d/2\ge \sqrt {|Y|}/2$.

To calculate the total redundancy, we still need to examine the encoding from $Y$ to $C\times K$. Choose $K$ of size $|K|=\left \lceil |Y|/|C|\right \rceil$, so that this encoding is possible. With a calculation similar to the previous one, we see that the redundancy is:

\begin{aligned} \log _{2}(|C|\cdot |K|)-\log _{2}|Y|=\log _{2}\frac {|C|\left \lceil |Y|/|C|\right \rceil }{|Y|}\le \log _{2}\left (1+\frac {|C|}{|Y|}\right )\le \frac {c}{\sqrt {|Y|}}. \end{aligned}

The total redundancy is then $c/\sqrt {|Y|}$, which gives (1).

For (2), it is clear from the construction that any $x\in X$ can be recovered from the element of $B$ only. QED
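The proof of Lemma 12.1 is constructive, and for small sets one can implement it directly. In the sketch below (function names are mine) $X$ and $Y$ are identified with ranges of integers; the checks confirm injectivity and that $x$ is recovered from the $b$ bits alone.

```python
import math

def build(X, Y):
    # b := ceil(log2(|X| * sqrt(|Y|))); |C| and |K| as in the proof of Lemma 12.1
    b = math.ceil(math.log2(X * math.isqrt(Y)))  # isqrt: integer sqrt, fine for a demo
    C = (2 ** b) // X                    # |C| = floor(|B| / |X|)
    K = -(-Y // C)                       # |K| = ceil(|Y| / |C|)
    return b, C, K

def encode(x, y, X, Y):
    b, C, K = build(X, Y)
    c, k = y % C, y // C                 # split y into (c, k) in C x K
    return x * C + c, k                  # pack (x, c) into b bits, keep k

def decode_x(bits, X, Y):
    b, C, K = build(X, Y)
    return bits // C                     # x is recovered from the b bits only

X, Y = 10, 100
b, C, K = build(X, Y)
seen = set()
for x in range(X):
    for y in range(Y):
        bits, k = encode(x, y, X, Y)
        assert bits < 2 ** b and k < K
        assert decode_x(bits, X, Y) == x
        seen.add((bits, k))
assert len(seen) == X * Y                # the encoding is injective
# redundancy b + log2|K| - log2(|X|*|Y|) is small, on the order of 1/sqrt(|Y|)
```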

Proof of Theorem 12.6 Break the ternary elements into blocks of size $t$: $(t'_{1},t'_{2},\ldots ,t'_{n/t})\in T_{1}\times T_{2}\times \ldots \times T_{n/t}$, where $T_{i}=[3]^{t}$ for all $i$. The encoding, illustrated in Figure 1, is constructed as follows, where we use $f_{L}$ to refer to the encoding guaranteed by Lemma 12.1.

Compute $f_{L}(t'_{1},t'_{2})=(b_{1},k_{1})\in B_{1}\times K_{1}$.

For $i=2,\ldots ,n/t-1$ compute $f_{L}(k_{i-1},t'_{i+1}):=(b_{i},k_{i})\in B_{i}\times K_{i}$.

Encode $k_{n/t-1}$ in binary as $b_{n/t}$ using arithmetic coding.

The final encoding is $(b_{1},b_{2},\ldots ,b_{n/t})$. We now compute the redundancy and retrieval time.

Redundancy: From (1) in Lemma 12.1, the first $n/t-1$ encodings each have redundancy $c3^{-t/2}\le 1/2^{ct}$. For the last (arithmetic) encoding, the redundancy is at most $1$. So the total redundancy is at most $\displaystyle \left (\frac {n}{t}-1\right )\cdot \frac {1}{2^{ct}}+1\le \frac {n}{2^{ct}}+c$. One can visualize this as a “hybrid argument” transforming a product of blocks of ternary elements into a product of blocks of binary elements, one block at a time.

Retrieval Time: Say that we want to recover some $t_{j}$ which is in block $t'_{i}$. To recover block $t'_{i}$, Lemma 12.1 guarantees that we only need to look at $b_{i-1}$ and $b_{i}$. This is because $k_{i-1}$ can be recovered by reading only $b_{i}$, and $t'_{i}$ can be recovered by reading $k_{i-1}$ and $b_{i-1}$. Thus to complete the proof it suffices to show that each $b_{i}$ has length $ct$.

This is not completely obvious because one might have thought that the $K_{i}$ become larger and larger, and so we apply the lemma to larger and larger inputs and the $B_{i}$ get large too. However, recall that each $|K_{i}|\le c|T_{i}|=c3^{t}$ from the comment after the statement of Lemma 12.1. Hence, every time we apply the lemma its input domain has size at most $3^{ct}$. Since the lemma wastes little entropy (by (1) in Lemma 12.1), none of its outputs can be much larger than its input, and so $|B_{i}|\le 2^{ct}$, i.e., each $b_{i}$ has length $\le ct$.

QED

12.2 Dynamic data structures

We now study dynamic data structures. As we mentioned, here the input is not fixed but can be modified by the queries.

Definition 12.4. Fix an error-correcting code $\text {ECC}:\{0,1\} ^{n}\to \{0,1\} ^{m}$ where $m\le cn$ and $\Delta (\text {ECC}(x),\text {ECC}(y))\ge c$ for any $x\ne y$ in $\{0,1\} ^{n}$. Here $\Delta (u,v)$ is the relative distance, the fraction of bit positions where $u$ and $v$ differ.

The $\text {ECC}$ problem asks to support operations, starting with the all-zero message:

$M(i,b)$ for $i\in \{1,2,\ldots ,n\}$ and $b\in \{0,1\}$ which sets bit $i$ of the message to $b$, and

$C(i)$ for $i\in \{1,2,\ldots ,m\}$ which returns bit $i$ of the codeword corresponding to the current message.

The time of a dynamic data structure is the maximum number of read/write operations in memory cells required to support an operation.

Theorem 12.7. The $\text {ECC}$ problem requires time $t\ge c\log _{w}n\ge (c\log n)/\log \log n$ for cell size $w:=\log n$.

One might wonder if stronger bounds can be shown for this problem. But in fact there exist codes for which the bounds are nearly tight.

Theorem 12.8. [81] There exist codes for which the ECC problem can be solved in time $c\log ^{2}n$ with cell size $w=1$.

The technique in the proof of Theorem 12.7 is from [24] and can be applied to many other natural problems, leading to tight results in several cases, see Exercise ??. It is not far from the state of the art in this area, which is $\log ^{1+c}n$ [42].
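Before the proof, it may help to see the trade-off that Theorem 12.7 limits. For a linear code $C(x)=Gx$ over $\mathbb {F}_{2}$ there are two trivial strategies: store the message (cheap updates, expensive queries) or store the codeword (cheap queries, expensive updates). The sketch below is mine, and its generator matrix is a hypothetical random one rather than a good code; it only illustrates the two extremes supporting the same operations.

```python
import random

random.seed(1)
n, m = 8, 16
# hypothetical generator matrix of a linear map F_2^n -> F_2^m
G = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]

# Strategy 1: store the message. M(i,b) writes 1 cell; C(i) reads n cells.
msg = [0] * n
def M1(i, b): msg[i] = b
def C1(i): return sum(G[i][j] * msg[j] for j in range(n)) % 2

# Strategy 2: store the codeword. C(i) reads 1 cell; M(i,b) may write m cells.
cw = [0] * m
x = [0] * n
def M2(i, b):
    if x[i] != b:
        x[i] = b
        for r in range(m):          # flip every codeword bit that depends on x[i]
            cw[r] ^= G[r][i]
def C2(i): return cw[i]

for (i, b) in [(0, 1), (3, 1), (0, 0), (5, 1)]:
    M1(i, b); M2(i, b)
assert [C1(i) for i in range(m)] == [C2(i) for i in range(m)]
```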

Proof of Theorem 12.7 Pick $x\in \{0,1\} ^{n}$ uniformly and $i\in \{1,2,\ldots ,m\}$ uniformly, and consider the sequence of operations

\begin{aligned} M(1,x_{1}),M(2,x_{2}),\ldots ,M(n,x_{n}),C(i). \end{aligned}

That is, we set the message to a uniform $x$ one bit at a time, and then ask for a uniformly selected bit of the codeword $\text {ECC}(x)$, which we also denote by $C_{x}=C_{x}(1),C_{x}(2),\ldots ,C_{x}(m)$.

We divide the $n$ operations $M(i,x_{i})$ into consecutive blocks, called epochs. Epoch $e$ consists of $n/w^{3e}$ operations. Hence we can have at least $E:=c\log _{w}n$ epochs, and we can assume that we have exactly this many epochs (by discarding some bits of $n$ if necessary).

The geometrically decaying size of epochs is chosen so that the number of message bits set during an epoch $e$ is much more than all the cells written by the data structure in future epochs.

A key idea of the proof is to see what happens when the cells written during a certain epoch are ignored, or reverted to their contents right before the epoch. Specifically, after the execution of the $M$ operations, we can associate to each memory cell the last epoch during which this cell was written. Let $D^{e}(x)$ denote the memory cells of the data structure after the first $n$ operations $M$, but with the change that the cells that were written last during epoch $e$ are replaced with their contents right before epoch $e$. Define $C_{x}^{e}(i)$ to be the result of the data structure algorithm for $C(i)$ on $D^{e}(x)$, and $C_{x}^{e}=C_{x}^{e}(1),C_{x}^{e}(2),\ldots ,C_{x}^{e}(m)$.

Let $t(x,i,e)$ equal $1$ if $C(i)$, executed after the first $n$ operations $M$, reads a cell that was last written in epoch $e$, and $0$ otherwise. We have

\begin{aligned} t\ge \max _{x,i}\sum _{e}t(x,i,e)\ge \mathbb {E}_{x,i}\sum _{e}t(x,i,e)=\sum _{e}\mathbb {E}_{x,i}t(x,i,e)\ge \sum _{e}\mathbb {E}_{x}\Delta (C_{x},C_{x}^{e}),~~~~(12.1) \end{aligned}

where the last inequality holds because $C_{x}^{e}(i)\ne C_{x}(i)$ implies $t(x,i,e)\ge 1$.

We now claim that if $t\le w$ then $\mathbb {E}_{x}\Delta (C_{x},C_{x}^{e})\ge c$ for every $e$. This concludes the proof.

In the remainder we justify the claim. Fix arbitrarily the bits of $x$ set before Epoch $e$. For a uniform setting of the remaining bits of $x$, note that the message ranges over at least

\begin{aligned} 2^{n/w^{3e}} \end{aligned}

codewords. On the other hand, we claim that $C_{x}^{e}$ ranges over much fewer strings. Indeed, the total number of cells written in all epochs after $e$ is at most

\begin{aligned} t\sum _{i\ge e+1}n/w^{3i}\le ctn/w^{3(e+1)}. \end{aligned}

We can describe all these cells by writing down their indices and contents using $B:=ctn/w^{3e+2}$ bits. Note that this information can depend on the operations performed during Epoch $e$, but the point is that it takes few possible values overall. Since the cells last changed during Epoch $e$ are reverted to their contents before Epoch $e$, this information suffices to describe $D^{e}(x)$, and hence $C_{x}^{e}$. Therefore, $C_{x}^{e}$ ranges over $\le 2^{B}$ strings.

For each string in the range of $C_{x}^{e}$ at most two codewords can have relative distance $\le c$, for else you’d have two codewords at distance $\le 2c$, violating the distance of the code.

Hence except with probability $2\cdot 2^{B}/2^{n/w^{3e}}$ over $x$, we have $\Delta (C_{x},C_{x}^{e})\ge c$. If $t\le w$ then this probability is $\le 0.1$, and so $\mathbb {E}_{x}\Delta (C_{x},C_{x}^{e})\ge c$, proving the claim. QED

Exercise 12.4. Explain how to conclude the proof given the claim.

12.3 Notes

The exposition of the trits problem is from [76].

References

[1]   Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In Venkatesan Guruswami, editor, IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 59–78. IEEE Computer Society, 2015.

[2]   Leonard Adleman. Two theorems on random polynomial time. In 19th IEEE Symp. on Foundations of Computer Science (FOCS), pages 75–83. 1978.

[3]   Miklós Ajtai. $\Sigma \sp {1}\sb {1}$-formulae on finite structures. Annals of Pure and Applied Logic, 24(1):1–48, 1983.

[4]   Miklós Ajtai. A non-linear time lower bound for boolean branching programs. Theory of Computing, 1(1):149–176, 2005.

[5]   Eric Allender. A note on the power of threshold circuits. In 30th Symposium on Foundations of Computer Science, pages 580–584, Research Triangle Park, North Carolina, 30 October–1 November 1989. IEEE.

[6]   Eric Allender. The division breakthroughs. Bulletin of the EATCS, 74:61–77, 2001.

[7]   Eric Allender and Michal Koucký. Amplifying lower bounds by means of self-reducibility. J. of the ACM, 57(3), 2010.

[8]   Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost $k$-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992.

[9]   Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for hamiltonian circuits and matchings. J. Comput. Syst. Sci., 18(2):155–193, 1979.

[10]   Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. of the ACM, 45(3):501–555, May 1998.

[11]   Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM J. Comput., 47(3):1087–1097, 2018.

[12]   Kenneth E. Batcher. Sorting networks and their applications. In AFIPS Spring Joint Computing Conference, volume 32, pages 307–314, 1968.

[13]   Paul Beame, Stephen A. Cook, and H. James Hoover. Log depth circuits for division and related problems. SIAM J. Comput., 15(4):994–1003, 1986.

[14]   Paul Beame, Michael Saks, Xiaodong Sun, and Erik Vee. Time-space trade-off lower bounds for randomized computation of decision problems. J. of the ACM, 50(2):154–195, 2003.

[15]   Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. on Computing, 21(1):54–58, 1992.

[16]   Samuel R. Buss and Ryan Williams. Limits on alternation trading proofs for time-space lower bounds. Comput. Complex., 24(3):533–600, 2015.

[17]   Lijie Chen and Roei Tell. Bootstrapping results for threshold circuits “just beyond” known lower bounds. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 34–41. ACM, 2019.

[18]   Richard Cleve. Towards optimal simulations of formulas by bounded-width programs. Computational Complexity, 1:91–105, 1991.

[19]   Stephen A. Cook. A hierarchy for nondeterministic time complexity. J. of Computer and System Sciences, 7(4):343–353, 1973.

[20]   L. Csanky. Fast parallel matrix inversion algorithms. SIAM J. Comput., 5(4):618–623, 1976.

[21]   Yevgeniy Dodis, Mihai Pǎtraşcu, and Mikkel Thorup. Changing base without losing space. In 42nd ACM Symp. on the Theory of Computing (STOC), pages 593–602. ACM, 2010.

[22]   Lance Fortnow. Time-space tradeoffs for satisfiability. J. Comput. Syst. Sci., 60(2):337–353, 2000.

[23]   Aviezri S. Fraenkel and David Lichtenstein. Computing a perfect strategy for n x n chess requires time exponential in n. J. Comb. Theory, Ser. A, 31(2):199–214, 1981.

[24]   Michael L. Fredman and Michael E. Saks. The cell probe complexity of dynamic data structures. In ACM Symp. on the Theory of Computing (STOC), pages 345–354, 1989.

[25]   Anka Gajentaan and Mark H. Overmars. On a class of ${O}(n^2)$ problems in computational geometry. Comput. Geom., 5:165–185, 1995.

[26]   Anna Gál and Peter Bro Miltersen. The cell probe complexity of succinct data structures. Theoretical Computer Science, 379(3):405–417, 2007.

[27]   M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.

[28]   K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatsh. Math. Phys., 38, 1931.

[29]   Oded Goldreich. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.

[30]   Raymond Greenlaw, H. James Hoover, and Walter Ruzzo. Limits to Parallel Computation: P-Completeness Theory. 02 2001.

[31]   David Harvey and Joris van der Hoeven. Integer multiplication in time $O(n\mathrm {log}\, n)$. Annals of Mathematics, 193(2):563 – 617, 2021.

[32]   F. C. Hennie. One-tape, off-line turing machine computations. Information and Control, 8(6):553–578, 1965.

[33]   Fred Hennie and Richard Stearns. Two-tape simulation of multitape turing machines. J. of the ACM, 13:533–546, October 1966.

[34]   John E. Hopcroft, Wolfgang J. Paul, and Leslie G. Valiant. On time versus space. J. ACM, 24(2):332–337, 1977.

[35]   Russell Impagliazzo and Ramamohan Paturi. The complexity of $k$-sat. In IEEE Conf. on Computational Complexity (CCC), pages 237–, 1999.

[36]   Russell Impagliazzo, Ramamohan Paturi, and Michael E. Saks. Size-depth tradeoffs for threshold circuits. SIAM J. Comput., 26(3):693–707, 1997.

[37]   Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? J. Computer & Systems Sciences, 63(4):512–530, Dec 2001.

[38]   Russell Impagliazzo and Avi Wigderson. $\mathit {P} = \mathit {BPP}$ if $E$ requires exponential circuits: Derandomizing the XOR lemma. In 29th ACM Symp. on the Theory of Computing (STOC), pages 220–229. ACM, 1997.

[39]   Richard M. Karp and Richard J. Lipton. Turing machines that take advice. L’Enseignement Mathématique. Revue Internationale. IIe Série, 28(3-4):191–209, 1982.

[40]   Kojiro Kobayashi. On the structure of one-tape nondeterministic turing machine time hierarchy. Theor. Comput. Sci., 40:175–193, 1985.

[41]   Swastik Kopparty and Srikanth Srinivasan. Certifying polynomials for AC$${}^{\mbox {0}}$$[$$\oplus$$] circuits, with applications to lower bounds and circuit compression. Theory of Computing, 14(1):1–24, 2018.

[42]   Kasper Green Larsen, Omri Weinstein, and Huacheng Yu. Crossing the logarithmic barrier for dynamic boolean data structure lower bounds. SIAM J. Comput., 49(5), 2020.

[43]   Leonid A. Levin. Universal sequential search problems. Problemy Peredachi Informatsii, 9(3):115–116, 1973.

[44]   Carsten Lund, Lance Fortnow, Howard Karloff, and Noam Nisan. Algebraic methods for interactive proof systems. J. of the ACM, 39(4):859–868, October 1992.

[45]   O. B. Lupanov. A method of circuit synthesis. Izv. VUZ Radiofiz., 1:120–140, 1958.

[46]   Wolfgang Maass and Amir Schorr. Speed-up of Turing machines with one work tape and a two-way input tape. SIAM J. on Computing, 16(1):195–202, 1987.

[47]   David A. Mix Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC1. J. of Computer and System Sciences, 38(1):150–164, 1989.

[48]   Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993.

[49]   E. I. Nechiporuk. A boolean function. Soviet Mathematics-Doklady, 169(4):765–766, 1966.

[50]   Valery A. Nepomnjaščiĭ. Rudimentary predicates and Turing calculations. Soviet Mathematics-Doklady, 11(6):1462–1465, 1970.

[51]   NEU. From RAM to SAT. Available at http://www.ccs.neu.edu/home/viola/, 2012.

[52]   Christos H. Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci., 43(3):425–440, 1991.

[53]   Mihai Pǎtraşcu. Succincter. In 49th IEEE Symp. on Foundations of Computer Science (FOCS). IEEE, 2008.

[54]   Wolfgang J. Paul, Nicholas Pippenger, Endre Szemerédi, and William T. Trotter. On determinism versus non-determinism and related problems (preliminary version). In IEEE Symp. on Foundations of Computer Science (FOCS), pages 429–438, 1983.

[55]   Nicholas Pippenger and Michael J. Fischer. Relations among complexity measures. J. of the ACM, 26(2):361–381, 1979.

[56]   Alexander Razborov. Lower bounds on the dimension of schemes of bounded depth in a complete basis containing the logical addition function. Akademiya Nauk SSSR. Matematicheskie Zametki, 41(4):598–607, 1987. English translation in Mathematical Notes of the Academy of Sci. of the USSR, 41(4):333-338, 1987.

[57]   Omer Reingold. Undirected connectivity in log-space. J. of the ACM, 55(4), 2008.

[58]   J. M. Robson. N by N checkers is exptime complete. SIAM J. Comput., 13(2):252–267, 1984.

[59]   Rahul Santhanam. On separators, segregators and time versus space. In IEEE Conf. on Computational Complexity (CCC), pages 286–294, 2001.

[60]   Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4(2):177–192, 1970.

[61]   Arnold Schönhage. Storage modification machines. SIAM J. Comput., 9(3):490–508, 1980.

[62]   Adi Shamir. IP = PSPACE. J. of the ACM, 39(4):869–877, October 1992.

[63]   Claude E. Shannon. The synthesis of two-terminal switching circuits. Bell System Tech. J., 28:59–98, 1949.

[64]   Victor Shoup. New algorithms for finding irreducible polynomials over finite fields. Mathematics of Computation, 54(189):435–447, 1990.

[65]   Alan Siegel. On universal classes of extremely random constant-time hash functions. SIAM J. on Computing, 33(3):505–543, 2004.

[66]   Michael Sipser. A complexity theoretic approach to randomness. In ACM Symp. on the Theory of Computing (STOC), pages 330–335, 1983.

[67]   Roman Smolensky. Algebraic methods in the theory of lower bounds for Boolean circuit complexity. In 19th ACM Symp. on the Theory of Computing (STOC), pages 77–82. ACM, 1987.

[68]   Larry Stockmeyer and Albert R. Meyer. Cosmological lower bound on the circuit complexity of a small problem in logic. J. ACM, 49(6):753–784, 2002.

[69]   Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. on Computing, 20(5):865–877, 1991.

[70]   Alan M. Turing. On computable numbers, with an application to the entscheidungsproblem. Proc. London Math. Soc., s2-42(1):230–265, 1937.

[71]   Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In 6th Symposium on Mathematical Foundations of Computer Science, volume 53 of Lecture Notes in Computer Science, pages 162–176. Springer, 1977.

[72]   Leslie G. Valiant and Vijay V. Vazirani. NP is as easy as detecting unique solutions. Theor. Comput. Sci., 47(3):85–93, 1986.

[73]   Dieter van Melkebeek. A survey of lower bounds for satisfiability and related problems. Foundations and Trends in Theoretical Computer Science, 2(3):197–303, 2006.

[74]   Dieter van Melkebeek and Ran Raz. A time lower bound for satisfiability. Theor. Comput. Sci., 348(2-3):311–320, 2005.

[75]   N. V. Vinodchandran. A note on the circuit complexity of PP. Theor. Comput. Sci., 347(1-2):415–418, 2005.

[76]   Emanuele Viola. Gems of theoretical computer science. Lecture notes of the class taught at Northeastern University. Available at http://www.ccs.neu.edu/home/viola/classes/gems-08/index.html, 2009.

[77]   Emanuele Viola. On approximate majority and probabilistic time. Computational Complexity, 18(3):337–375, 2009.

[78]   Emanuele Viola. On the power of small-depth computation. Foundations and Trends in Theoretical Computer Science, 5(1):1–72, 2009.

[79]   Emanuele Viola. Reducing 3XOR to listing triangles, an exposition. Available at http://www.ccs.neu.edu/home/viola/, 2011.

[80]   Emanuele Viola. Bit-probe lower bounds for succinct data structures. SIAM J. on Computing, 41(6):1593–1604, 2012.

[81]   Emanuele Viola. Lower bounds for data structures with space close to maximum imply circuit lower bounds. Theory of Computing, 15:1–9, 2019. Available at http://www.ccs.neu.edu/home/viola/.

[82]   Emanuele Viola. Pseudorandom bits and lower bounds for randomized turing machines. Theory of Computing, 18(10):1–12, 2022.

Mathematics of the impossible, Chapter 11, Proofs

The notion of proof is pervasive. We have seen many proofs in this book until now. But the notion extends to other realms of knowledge, including empirical science, law, and more. Complexity theory has contributed a great deal to the notion of proof, with important applications in several areas such as cryptography.

11.1 Static proofs

As remarked in section 5.1.0.1, we can think of problems in $\text {NP}$ as those admitting a solution that can be verified efficiently, namely in $\text {P}$. Let us repeat the definition of $\text {NP}$ using the suggestive letter $V$ for verifier.

Definition 11.1. A function $f:X\subseteq \{0,1\}^* \to \{0,1\}$ is in $\text {NP}$ iff there is $V\in \text {P}$ (called “verifier”) and $d\in \mathbb {N}$ s.t.:

\begin{aligned} f(x)=1\Leftrightarrow \exists y\in \{0,1\} ^{|x|^{d}}:V(x,y)=1. \end{aligned}

We are naturally interested in fast proof verification, and especially in the complexity of $V$. It turns out that proofs can be encoded in a format that allows for very efficient verification. This message is already conveyed by the following result.

Theorem 11.1. For any input length $n$, $V$ in Definition 11.1 can be taken to be a 3CNF of size $n^{d}$.

That is, whereas when defining $\text {NP}$ as a proof system we considered arbitrary verifiers $V$ in P, in fact the definition is unchanged if one selects a very restricted class of verifiers: small 3CNFs.

Proof. This is just a restatement of Theorem 5.1. QED

This extreme reduction in the verifier’s complexity is possible because we are allowing proofs to be long, longer than the original verifier’s running time. If we don’t allow for that, such a reduction is not known. Such “bounded proofs” are very interesting to study, but we shall not do so now.

Instead, we ask for more. The 3CNF in the above theorem still depends on the entire proof. We can ask for a verifier that only depends on a few bits of the proof. Taking this to the extreme, we can ask whether $V$ can read only a constant number of bits from $y$. Without randomness, this is impossible.

Exercise 11.1. Suppose $V$ in Definition 11.1 only reads $\le d$ bits of $y$, for a constant $d$. Show that the corresponding class would be the same as P.

Surprisingly, if we allow randomness this is possible. Moreover, the use of randomness is fairly limited – only logarithmically many bits – yielding the following central characterization.

Theorem 11.2. A function $f:X\subseteq \{0,1\}^* \to \{0,1\}$ is in $\text {NP}$ iff there is $V\in \text {P}$ and $d\in \mathbb {N}$ s.t.:

$f(x)=1\Rightarrow \exists y\in \{0,1\} ^{|x|^{d}}:\mathbb {P}_{r\in \{0,1\} ^{d\log |x|}}[V(x,y,r)=1]=1$,

$f(x)=0\Rightarrow \forall y\in \{0,1\} ^{|x|^{d}}:\mathbb {P}_{r\in \{0,1\} ^{d\log |x|}}[V(x,y,r)=1]<0.01$,

and moreover $V$ reads $\le d$ bits of $y$.

Exercise 11.2. Prove the “only if” in Theorem 11.2 in the specific case $f=0.01\text {-Gap-3Sat}$.

Given this exercise, the “only if” direction for any problem in NP follows from the advanced result that any problem in NP can be map reduced to $0.01$-Gap-3Sat (which is essentially Theorem 4.7, except we did not claim map reductions or a specific constant there).

Exercise 11.3. Prove the “if” in Theorem 11.2.

11.2 Zero-knowledge

In Theorem 11.2 the verifier gains “constant confidence” about the validity of the proof, just by inspecting a constant number of bits. Hence the verifier “learns” at most a constant number of bits of the proof. This is remarkable, but we can further ask if we can modify the proof so that the verifier “learns nothing” about the proof. Such proofs are called zero-knowledge and are extensively studied and applied.

We sketch how this is done for Gap-3Color, which is also NP-complete. Rather than a single proof $y$, now the verifier will receive a random proof $Y$. This $Y$ is obtained from a 3-coloring $y$ by randomly permuting the colors (so for any $y$ the corresponding $Y$ is uniform over the 6 resulting colorings). The verifier picks a random edge, inspects the colors of its two endpoints, and accepts iff they are different.

The verifier learns nothing because all they see is two random distinct colors. One can formalize “learning nothing” by noting that the verifier can produce this distribution by themselves, without looking at the proof. (So why does the verifier gain anything from $y$? The fact that a proof $y$ has been written down means that colors have been picked so that the endpoints of every edge receive uniform distinct colors, something that the verifier is not easily able to reproduce.)

This gives a zero-knowledge proof for verifiers that follow the protocol of just inspecting an edge. In a cryptographic setting one has to worry about verifiers which don’t follow the protocol. Using cryptographic assumptions, one can force the verifiers to follow the protocol by considering an interactive proof: First a proof $y$ is committed to but not revealed, then the verifier selects an edge to inspect, and only then the corresponding colors are revealed, and only those. This protocol lends itself to a physical implementation.
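The round just described can be simulated in a few lines. This is a toy sketch: the triangle graph, the dictionary "commitment", and all names below are stand-ins of my choosing, and no cryptographic commitment is modeled.

```python
import random

def zk_round(edges, coloring):
    # one round: the prover commits to a randomly color-permuted copy of the
    # coloring; the verifier opens one random edge and compares the two colors
    perm = random.sample([0, 1, 2], 3)
    committed = {v: perm[c] for v, c in coloring.items()}
    u, v = random.choice(edges)
    return committed[u] != committed[v]

# a 3-colored triangle: the verifier always accepts, yet in each round it sees
# only a uniform pair of distinct colors, which it could sample by itself
edges = [(0, 1), (1, 2), (0, 2)]
coloring = {0: 0, 1: 1, 2: 2}
assert all(zk_round(edges, coloring) for _ in range(100))
```

A cheating prover must leave some edge monochromatic, and is caught in each round with probability at least one over the number of edges, so repetition drives the error down.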

11.3 Interactive proofs

We now consider interactive proofs. Here the verifier $V$ engages in a protocol with a prover $P$. Given an input $x$ to both $V$ and $P$, the verifier asks questions, the prover replies, the verifier asks more questions, and so on. The case of NP corresponds to the prover simply sending $y$ to $V$.

It turns out that it suffices for the verifier to send uniformly random strings $Q_{1},Q_{2},\ldots$ to $P$. This leads to a simple definition.

Definition 11.2. A function $f:X\subseteq \{0,1\}^* \to \{0,1\}$ admits an efficient interactive proof, abbreviated IP, if there is $V\in \text {P}$ and $d\in \mathbb {N}$ such that for every $x\in \{0,1\} ^{n}$, letting $b:=n^{d}$:

• If $f(x)=1$ then $\exists P:\{0,1\}^* \to \{0,1\} ^{b}$ such that
\begin{aligned} V\left (P(Q_{1}),P(Q_{1},Q_{2}),\ldots ,P(Q_{1},Q_{2},\ldots ,Q_{b})\right )=1 \end{aligned}

for every $Q_{1},Q_{2},\ldots ,Q_{b}\in \{0,1\} ^{b}$.

• If $f(x)=0$ then $\forall P:\{0,1\}^* \to \{0,1\} ^{b}$ we have
\begin{aligned} \mathbb {P}_{Q_{1},Q_{2},\ldots ,Q_{b}\in \{0,1\} ^{b}}\left [V\left (P(Q_{1}),P(Q_{1},Q_{2}),\ldots ,P(Q_{1},Q_{2},\ldots ,Q_{b})\right )=1\right ]\le 1/3. \end{aligned}

The following amazing result shows the power of interactive proofs, compared to non-interactive proofs. We can think of NP as “reading a book” and IP as “going to class and asking questions.” We don’t yet know how to replace teachers with books.

Theorem 11.3. [41, 58] $\text {IP}=\text {PSpace}$.

In the rest of this section we present the main ideas in the proof of Theorem 11.3, establishing a weaker result. In particular we show that IP contains problems not known to be in NP.

Theorem 11.4. Given a field $\mathbb {F}$, an arithmetic circuit $C(x_{1},x_{2},\ldots ,x_{v})$ over $\mathbb {F}$ computing a polynomial of degree $d$, and an element $s\in \mathbb {F},$ deciding if

\begin{aligned} \sum _{x_{1},x_{2},\ldots ,x_{v}\in \{0,1\} }C(x_{1},x_{2},\ldots ,x_{v})=s~~~~(11.1) \end{aligned}

is in IP, whenever $(1-d/q)^{v}\ge 2/3$, where $q:=|\mathbb {F}|$.

Proof. If $v=1$ then $V$ can decide this question by itself, by evaluating the circuit. For larger $v$ we give a way to reduce $v$ by $1$.

As the prover's first answer, $V$ expects a polynomial $p$ of degree $d$ in the variable $x$, which is meant to be

\begin{aligned} s'(x):=\sum _{x_{2},x_{3},\ldots ,x_{v}\in \{0,1\} }C(x,x_{2},x_{3},\ldots ,x_{v}). \end{aligned}

$V$ checks if $p(0)+p(1)=s$, and if not rejects. Otherwise, it recursively runs the protocol to verify that

\begin{aligned} \sum _{x_{2},x_{3},\ldots ,x_{v}\in \{0,1\} }C(Q_{1},x_{2},x_{3},\ldots ,x_{v})=p(Q_{1}).~~~~(11.2) \end{aligned}

This concludes the description of the protocol. We now verify its correctness.

In case equation (11.1) is true, $P$ can send polynomials that cause $V$ to accept.

In case equation (11.1) is false, $s'(0)+s'(1)\ne s$. Hence, unless $V$ rejects right away because $p(0)+p(1)\ne s$, we have $p\ne s'$. The polynomials $p$ and $s'$ have degree $\le d$. Hence by Lemma 2.3

\begin{aligned} \mathbb {P}_{Q_{1}}[p(Q_{1})\ne s'(Q_{1})]\ge 1-d/q. \end{aligned}

When this event occurs, equation (11.2) is again false, and we can repeat the argument. Overall, the probability that a false statement is maintained throughout the protocol, causing $V$ to reject at the base case, is $\ge (1-d/q)^{v}\ge 2/3$. QED
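The protocol above can be simulated end to end with an honest prover. This is a toy sketch: the field size, the point-value representation of $p$, and all names are my choices, not from the text; a dishonest prover is only caught with the probability analyzed above.

```python
import random

Q = 101  # a toy prime field F_q

def interp_eval(pts, x):
    # Lagrange-evaluate at x the unique polynomial through pts, mod Q
    total = 0
    for i, (xi, yi) in enumerate(pts):
        num, den = yi, 1
        for j, (xj, _) in enumerate(pts):
            if i != j:
                num = num * (x - xj) % Q
                den = den * (xi - xj) % Q
        total = (total + num * pow(den, Q - 2, Q)) % Q
    return total

def partial_sum(C, prefix, v):
    # honest prover: sum C over boolean settings of the remaining variables
    if len(prefix) == v:
        return C(*prefix) % Q
    return (partial_sum(C, prefix + [0], v) + partial_sum(C, prefix + [1], v)) % Q

def sumcheck(C, v, d, s):
    # interactively verify that the sum of C over {0,1}^v equals s
    prefix = []
    for _ in range(v):
        # prover sends p, represented by its values at d+1 points
        pts = [(a, partial_sum(C, prefix + [a], v)) for a in range(d + 1)]
        if (interp_eval(pts, 0) + interp_eval(pts, 1)) % Q != s:
            return False               # V rejects: p(0) + p(1) != s
        q1 = random.randrange(Q)       # verifier's random challenge Q_1
        s = interp_eval(pts, q1)       # new claim: remaining sum equals p(Q_1)
        prefix.append(q1)
    return C(*prefix) % Q == s         # base case: V evaluates C itself

C = lambda x, y, z: x * y + z          # degree-2 example circuit
assert sumcheck(C, 3, 2, 6)            # the true sum over {0,1}^3 is 6
assert not sumcheck(C, 3, 2, 7)        # a false claim is rejected
```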

Corollary 11.1. Given a 3CNF formula $\phi$ and $k\in \mathbb {N}$, deciding if $\phi$ has exactly $k$ satisfying assignments is in $\text {IP}$.

The proof uses a far-reaching technique: arithmetization. We construct an arithmetic circuit $C_{\phi }$ over a field $\mathbb {F}$ which agrees with $\phi$ on boolean inputs, but that can then be evaluated over other elements of the field.
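Concretely, a minimal sketch of this arithmetization (the clause encoding and the tiny example formula below are my choices):

```python
from math import prod

def clause_poly(lits):
    # arithmetize a clause: (l1 or l2 or l3) -> 1 - prod_i (1 - li), where a
    # literal is a 1-based variable index, negative if negated
    def lit(l, x):
        return x[l - 1] if l > 0 else 1 - x[-l - 1]
    return lambda x: 1 - prod(1 - lit(l, x) for l in lits)

def cnf_poly(clauses):
    # the product of the clause polynomials agrees with phi on boolean inputs,
    # yet can be evaluated over any field; degree <= 3 * (number of clauses)
    ps = [clause_poly(c) for c in clauses]
    return lambda x: prod(p(x) for p in ps)

# phi = (x1 or not x2) and (x2 or x3); summing the polynomial over {0,1}^3
# counts the satisfying assignments
phi = cnf_poly([(1, -2), (2, 3)])
count = sum(phi((a, b, c)) for a in (0, 1) for b in (0, 1) for c in (0, 1))
assert count == 4
```

Combined with the sumcheck protocol over a sufficiently large field, this yields Corollary 11.1.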

Exercise 11.4. Prove Corollary 11.1.

The study of interactive proofs is rich. Many aspects are of interest, including:

• The efficiency of the prover (does it have to be unbounded, randomized, etc.), and of the verifier.
• The number of rounds.
• The tradeoff between the error and the other parameters.

References

[1]   Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In Venkatesan Guruswami, editor, IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 59–78. IEEE Computer Society, 2015.

[2]   Leonard Adleman. Two theorems on random polynomial time. In 19th IEEE Symp. on Foundations of Computer Science (FOCS), pages 75–83. 1978.

[3]   Miklós Ajtai. $\Sigma ^{1}_{1}$-formulae on finite structures. Annals of Pure and Applied Logic, 24(1):1–48, 1983.

[4]   Miklós Ajtai. A non-linear time lower bound for boolean branching programs. Theory of Computing, 1(1):149–176, 2005.

[5]   Eric Allender. A note on the power of threshold circuits. In 30th Symposium on Foundations of Computer Science, pages 580–584, Research Triangle Park, North Carolina, 30 October–1 November 1989. IEEE.

[6]   Eric Allender. The division breakthroughs. Bulletin of the EATCS, 74:61–77, 2001.

[7]   Eric Allender and Michal Koucký. Amplifying lower bounds by means of self-reducibility. J. of the ACM, 57(3), 2010.

[8]   Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost $k$-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992.

[9]   Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for hamiltonian circuits and matchings. J. Comput. Syst. Sci., 18(2):155–193, 1979.

[10]   Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. of the ACM, 45(3):501–555, May 1998.

[11]   Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM J. Comput., 47(3):1087–1097, 2018.

[12]   Kenneth E. Batcher. Sorting networks and their applications. In AFIPS Spring Joint Computing Conference, volume 32, pages 307–314, 1968.

[13]   Paul Beame, Stephen A. Cook, and H. James Hoover. Log depth circuits for division and related problems. SIAM J. Comput., 15(4):994–1003, 1986.

[14]   Paul Beame, Michael Saks, Xiaodong Sun, and Erik Vee. Time-space trade-off lower bounds for randomized computation of decision problems. J. of the ACM, 50(2):154–195, 2003.

[15]   Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. on Computing, 21(1):54–58, 1992.

[16]   Samuel R. Buss and Ryan Williams. Limits on alternation trading proofs for time-space lower bounds. Comput. Complex., 24(3):533–600, 2015.

[17]   Lijie Chen and Roei Tell. Bootstrapping results for threshold circuits “just beyond” known lower bounds. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 34–41. ACM, 2019.

[18]   Richard Cleve. Towards optimal simulations of formulas by bounded-width programs. Computational Complexity, 1:91–105, 1991.

[19]   Stephen A. Cook. A hierarchy for nondeterministic time complexity. J. of Computer and System Sciences, 7(4):343–353, 1973.

[20]   L. Csanky. Fast parallel matrix inversion algorithms. SIAM J. Comput., 5(4):618–623, 1976.

[21]   Lance Fortnow. Time-space tradeoffs for satisfiability. J. Comput. Syst. Sci., 60(2):337–353, 2000.

[22]   Aviezri S. Fraenkel and David Lichtenstein. Computing a perfect strategy for n x n chess requires time exponential in n. J. Comb. Theory, Ser. A, 31(2):199–214, 1981.

[23]   Michael L. Fredman and Michael E. Saks. The cell probe complexity of dynamic data structures. In ACM Symp. on the Theory of Computing (STOC), pages 345–354, 1989.

[24]   Anka Gajentaan and Mark H. Overmars. On a class of ${O}(n^2)$ problems in computational geometry. Comput. Geom., 5:165–185, 1995.

[25]   M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.

[26]   K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatsh. Math. Phys., 38, 1931.

[27]   Oded Goldreich. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.

[28]   Raymond Greenlaw, H. James Hoover, and Walter Ruzzo. Limits to Parallel Computation: P-Completeness Theory. 02 2001.

[29]   David Harvey and Joris van der Hoeven. Integer multiplication in time $O(n\log n)$. Annals of Mathematics, 193(2):563–617, 2021.

[30]   F. C. Hennie. One-tape, off-line turing machine computations. Information and Control, 8(6):553–578, 1965.

[31]   Fred Hennie and Richard Stearns. Two-tape simulation of multitape turing machines. J. of the ACM, 13:533–546, October 1966.

[32]   John E. Hopcroft, Wolfgang J. Paul, and Leslie G. Valiant. On time versus space. J. ACM, 24(2):332–337, 1977.

[33]   Russell Impagliazzo and Ramamohan Paturi. The complexity of $k$-sat. In IEEE Conf. on Computational Complexity (CCC), pages 237–, 1999.

[34]   Russell Impagliazzo, Ramamohan Paturi, and Michael E. Saks. Size-depth tradeoffs for threshold circuits. SIAM J. Comput., 26(3):693–707, 1997.

[35]   Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? J. Computer & Systems Sciences, 63(4):512–530, Dec 2001.

[36]   Russell Impagliazzo and Avi Wigderson. $\mathit {P} = \mathit {BPP}$ if $E$ requires exponential circuits: Derandomizing the XOR lemma. In 29th ACM Symp. on the Theory of Computing (STOC), pages 220–229. ACM, 1997.

[37]   Richard M. Karp and Richard J. Lipton. Turing machines that take advice. L’Enseignement Mathématique. Revue Internationale. IIe Série, 28(3-4):191–209, 1982.

[38]   Kojiro Kobayashi. On the structure of one-tape nondeterministic turing machine time hierarchy. Theor. Comput. Sci., 40:175–193, 1985.

[39]   Kasper Green Larsen, Omri Weinstein, and Huacheng Yu. Crossing the logarithmic barrier for dynamic boolean data structure lower bounds. SIAM J. Comput., 49(5), 2020.

[40]   Leonid A. Levin. Universal sequential search problems. Problemy Peredachi Informatsii, 9(3):115–116, 1973.

[41]   Carsten Lund, Lance Fortnow, Howard Karloff, and Noam Nisan. Algebraic methods for interactive proof systems. J. of the ACM, 39(4):859–868, October 1992.

[42]   O. B. Lupanov. A method of circuit synthesis. Izv. VUZ Radiofiz., 1:120–140, 1958.

[43]   Wolfgang Maass and Amir Schorr. Speed-up of Turing machines with one work tape and a two-way input tape. SIAM J. on Computing, 16(1):195–202, 1987.

[44]   David A. Mix Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC$^1$. J. of Computer and System Sciences, 38(1):150–164, 1989.

[45]   Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993.

[46]   E. I. Nechiporuk. A boolean function. Soviet Mathematics-Doklady, 169(4):765–766, 1966.

[47]   Valery A. Nepomnjaščiĭ. Rudimentary predicates and Turing calculations. Soviet Mathematics-Doklady, 11(6):1462–1465, 1970.

[48]   NEU. From RAM to SAT. Available at http://www.ccs.neu.edu/home/viola/, 2012.

[49]   Christos H. Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci., 43(3):425–440, 1991.

[50]   Wolfgang J. Paul, Nicholas Pippenger, Endre Szemerédi, and William T. Trotter. On determinism versus non-determinism and related problems (preliminary version). In IEEE Symp. on Foundations of Computer Science (FOCS), pages 429–438, 1983.

[51]   Nicholas Pippenger and Michael J. Fischer. Relations among complexity measures. J. of the ACM, 26(2):361–381, 1979.

[52]   Alexander Razborov. Lower bounds on the dimension of schemes of bounded depth in a complete basis containing the logical addition function. Akademiya Nauk SSSR. Matematicheskie Zametki, 41(4):598–607, 1987. English translation in Mathematical Notes of the Academy of Sci. of the USSR, 41(4):333-338, 1987.

[53]   Omer Reingold. Undirected connectivity in log-space. J. of the ACM, 55(4), 2008.

[54]   J. M. Robson. N by N checkers is exptime complete. SIAM J. Comput., 13(2):252–267, 1984.

[55]   Rahul Santhanam. On separators, segregators and time versus space. In IEEE Conf. on Computational Complexity (CCC), pages 286–294, 2001.

[56]   Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4(2):177–192, 1970.

[57]   Arnold Schönhage. Storage modification machines. SIAM J. Comput., 9(3):490–508, 1980.

[58]   Adi Shamir. IP = PSPACE. J. of the ACM, 39(4):869–877, October 1992.

[59]   Claude E. Shannon. The synthesis of two-terminal switching circuits. Bell System Tech. J., 28:59–98, 1949.

[60]   Victor Shoup. New algorithms for finding irreducible polynomials over finite fields. Mathematics of Computation, 54(189):435–447, 1990.

[61]   Alan Siegel. On universal classes of extremely random constant-time hash functions. SIAM J. on Computing, 33(3):505–543, 2004.

[62]   Michael Sipser. A complexity theoretic approach to randomness. In ACM Symp. on the Theory of Computing (STOC), pages 330–335, 1983.

[63]   Roman Smolensky. Algebraic methods in the theory of lower bounds for Boolean circuit complexity. In 19th ACM Symp. on the Theory of Computing (STOC), pages 77–82. ACM, 1987.

[64]   Larry Stockmeyer and Albert R. Meyer. Cosmological lower bound on the circuit complexity of a small problem in logic. J. ACM, 49(6):753–784, 2002.

[65]   Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. on Computing, 20(5):865–877, 1991.

[66]   Alan M. Turing. On computable numbers, with an application to the entscheidungsproblem. Proc. London Math. Soc., s2-42(1):230–265, 1937.

[67]   Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In 6th Symposium on Mathematical Foundations of Computer Science, volume 53 of Lecture Notes in Computer Science, pages 162–176. Springer, 1977.

[68]   Leslie G. Valiant and Vijay V. Vazirani. NP is as easy as detecting unique solutions. Theor. Comput. Sci., 47(3):85–93, 1986.

[69]   Dieter van Melkebeek. A survey of lower bounds for satisfiability and related problems. Foundations and Trends in Theoretical Computer Science, 2(3):197–303, 2006.

[70]   Dieter van Melkebeek and Ran Raz. A time lower bound for satisfiability. Theor. Comput. Sci., 348(2-3):311–320, 2005.

[71]   N. V. Vinodchandran. A note on the circuit complexity of PP. Theor. Comput. Sci., 347(1-2):415–418, 2005.

[72]   Emanuele Viola. On approximate majority and probabilistic time. Computational Complexity, 18(3):337–375, 2009.

[73]   Emanuele Viola. On the power of small-depth computation. Foundations and Trends in Theoretical Computer Science, 5(1):1–72, 2009.

[74]   Emanuele Viola. Reducing 3XOR to listing triangles, an exposition. Available at http://www.ccs.neu.edu/home/viola/, 2011.

[75]   Emanuele Viola. Lower bounds for data structures with space close to maximum imply circuit lower bounds. Theory of Computing, 15:1–9, 2019. Available at http://www.ccs.neu.edu/home/viola/.

[76]   Emanuele Viola. Pseudorandom bits and lower bounds for randomized turing machines. Theory of Computing, 18(10):1–12, 2022.

Mathematics of the impossible, Chapter 10, Constant-depth circuits

In this chapter we further investigate circuits of constant depth, focusing on two pervasive classes, which indeed we have already encountered under different names.

Note on notation: here I use “AC” for AltCkt, a.k.a. alternating circuits. I also use AC for the class of functions $f$ computable by an AC of depth $d$ and size $n^{d}$, for some constant $d$.

10.1 Threshold circuits

Definition 10.1. A threshold circuit, abbreviated TC, is a circuit made of Majority gates (of unbounded fan-in). We also denote by TC the class of functions $f$ computable by a TC of depth $d$ and size $n^{d}$ for some constant $d$.

TCs are one of the frontiers of our knowledge. It isn’t known how to prove impossibility results even for TCs of depth $3$ and size, say, $n^{2}$.

Exercise 10.1. Prove that $\text {AC}\subseteq \text {TC}\subseteq \text {NC}^{1}$.

Exercise 10.2. A function $f:\{0,1\} ^{*}\to \{0,1\}$ is symmetric if it only depends on the weight of the input. Prove that any symmetric function is in $\text {TC}$.

The result $\text {PH}\subseteq \text {Maj}\cdot \text {Maj}\cdot \text {P}$ obtained in 6.13 in particular yields the following.

Theorem 10.1. [5] Any function $f$ in AC has TCs of depth $3$ and size $2^{\log ^{c_{f}}n}$.

Theorem 10.2. The following problems are in TC:

1. Addition of two input integers.
2. Iterated addition: Addition of any number of input integers.
3. Multiplication of two input integers.
4. Iterated multiplication: Multiplication of any number of input integers.
5. Division of two integers.

The proof follows closely that for NC$^{1}$ in section 9.1 (which in turn was based on that for L). Only iterated addition requires a new idea.

Exercise 10.3. Prove the claim about iterated addition. (Hint: Write input as $n\times n$ matrix, one number per row. Divide columns into blocks of $t=c\log n$.)
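For intuition, the arithmetic behind the hint can be checked numerically. The sketch below (plain Python, not a circuit, and not a solution to the exercise) reduces iterated addition to per-column bit counts; each count is a symmetric function of its column, hence in TC by Exercise 10.2, and combining the shifted counts is a much smaller addition problem:

```python
def iterated_add(xs, w):
    # column sums: c_j = number of inputs with bit j set; each c_j <= n, so it
    # fits in O(log n) bits and is a symmetric function of column j
    cols = [sum((x >> j) & 1 for x in xs) for j in range(w)]
    # combining the shifted column counts is a far smaller addition problem
    return sum(c << j for j, c in enumerate(cols))

assert iterated_add([3, 5, 7], 3) == 15
assert iterated_add([255, 1], 8) == 256
```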

10.2 TC vs. NC$^{1}$

Another great question is whether $\text {TC}=\text {NC}^{1}.$ For any $d$, we can show that functions in $\text {NC}^{1}$, such as Parity, require depth-$d$ TCs of size $\ge n^{1+c^{-d}}$, and this is tight up to constants.[34] A natural question is whether we can prove stronger bounds for harder functions in $\text {NC}^{1}$. A natural candidate is iterated multiplication of 3×3 matrices. The following result shows that, in fact, stronger bounds would already prove “the whole thing,” that is, $\text {TC}\ne \text {NC}^{1}.$

Theorem 10.3. [7, 17] Let $G$ be the set of 3×3 matrices over $\mathbb {F}_{2}$. Suppose that the product of $n$ elements in $G$ can be computed by TCs of size $n^{k}$ and depth $d$. Then for any $\epsilon$ the product can also be computed by TCs of size $d'n^{1+\epsilon }$ and depth $d':=cdk\log 1/\epsilon$.

The same result applies to any constant-size group $G$ – we state it for matrices for concreteness.

Proof. Exploiting the associativity of the problem, we compute the product recursively according to a regular tree. The root is defined to have level $0$. At level $i$ we compute $n_{i}$ products of $(n^{1+\epsilon }/n_{i})^{1/k}$ matrices. At the root ($i=0$) we have $n_{0}=1$.

By the assumption, each product at Level $i$ has TCs of size $n^{1+\epsilon }/n_{i}$ and depth $d$. Hence Level $i$ can be computed by TCs of size $n^{1+\epsilon }$ and depth $d$.

We have the recursion

\begin{aligned} n_{i+1}=n_{i}\cdot (n^{1+\epsilon }/n_{i})^{1/k}. \end{aligned}

The solution to this recursion is $n_{i}=n^{(1+\epsilon )(1-(1-1/k)^{i})}$, see below.

For $i=ck\log (1/\epsilon )$ we have $n_{i}=n^{(1+\epsilon )(1-\epsilon ^{2})}>n$; this means that we can compute a product of $\ge n$ matrices, as required.

Hence the total depth of the circuit is $d\cdot ck\log (1/\epsilon )$, and the total size is the depth times $n^{1+\epsilon }$.

It remains to solve the recurrence. Letting $a_{i}:=\log _{n}n_{i}$ we have the following recurrence for the exponents of $n_{i}$.

\begin{aligned} a_{0} & =0\\ a_{i+1} & =a_{i}(1-1/k)+(1+\epsilon )/k=a_{i}\alpha +\gamma \end{aligned}

where $\alpha :=(1-1/k)$ and $\gamma :=(1+\epsilon )/k$.

This gives

\begin{aligned} a_{i}=\gamma \sum _{j<i}\alpha ^{j}=\gamma \frac {1-\alpha ^{i}}{1-\alpha }=(1+\epsilon )(1-\alpha ^{i}). \end{aligned}

QED

Had the recursion been of the form $a'_{i+1}=a'_{i}+(1+\epsilon )/k$, then already $a'_{k}$ would be $\ge 1+\epsilon$. Instead, for $a_{i}$ we need to go up to $i=ck\log (1/\epsilon )$.
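The closed form $n_{i}=n^{(1+\epsilon )(1-(1-1/k)^{i})}$ can be sanity-checked numerically against the recursion (toy parameters of my choosing):

```python
import math

def check_recurrence(k, eps, steps):
    # iterate a_{i+1} = a_i (1 - 1/k) + (1 + eps)/k starting from a_0 = 0
    # and compare with the closed form a_i = (1 + eps)(1 - (1 - 1/k)^i)
    a = 0.0
    for i in range(steps + 1):
        closed = (1 + eps) * (1 - (1 - 1 / k) ** i)
        assert abs(a - closed) < 1e-9
        a = a * (1 - 1 / k) + (1 + eps) / k
    return True

assert check_recurrence(k=5, eps=0.1, steps=200)

# after i = ck log(1/eps) steps the exponent exceeds 1 (here c = 2 suffices)
k, eps = 5, 0.1
i = math.ceil(2 * k * math.log(1 / eps))
assert (1 + eps) * (1 - (1 - 1 / k) ** i) > 1
```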

10.3 Impossibility results for AC

In this section we prove impossibility results for ACs, matching several settings of parameters mentioned earlier (cf. section 7.3).

To set the stage, let’s prove strong results for depth 2, that is, DNFs.

Exercise 10.4. Prove that Majority requires DNFs of size $\ge 2^{cn}$. Hint: What if you have a term with fewer than $n$ variables?

As discussed, $2^{cn}$ bounds even for depth 3 ACs are unknown, and would imply major separations. The following is close to the state-of-the-art for depth $d$.

Theorem 10.4. [52, 63] Let $C$ be an AC of depth $d$ and size $s$ computing Majority on $n$ bits. Then $\log ^{d}s\ge c\sqrt {n}$.

Recall from section 7.3 that a stronger bound for an explicit function would have major consequences; in particular, it would show that the function is not in L.

The proof uses the simulation of circuits by low-degree polynomials which we saw in Theorem 6.5. Specifically, we use the following corollary:

Corollary 10.1. Let $C:\{0,1\} ^{n}\to \{0,1\}$ be an alternating circuit of depth $d$ and size $s$. Then there is a polynomial $p$ over $\mathbb {F}_{2}$ of degree $\log ^{d}(s/\epsilon )$ such that $\mathbb {P}_{x}[C(x)\ne p(x)]\le \epsilon$.

Proof. Theorem 6.5 gave a distribution $P$ on polynomials s.t. for every $x$ we have

\begin{aligned} \mathbb {P}_{P}[C(x)\ne P(x)]\le \epsilon . \end{aligned}

Averaging over $x$ we also have

\begin{aligned} \mathbb {P}_{x,P}[C(x)\ne P(x)]\le \epsilon . \end{aligned}

Hence we can fix a particular polynomial $p$ s.t. the probability over $x$ is $\le \epsilon$, yielding the result. QED
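The single building block behind Theorem 6.5, approximating an Or gate by a product of random parities over $\mathbb {F}_{2}$, can be simulated directly. This is a sketch with parameters and names of my choosing; a full AC is handled gate by gate.

```python
import random

def or_approx(x, subsets):
    # over F2, approximate Or(x) by 1 - prod_t (1 - <r_t, x>), where <r_t, x>
    # is the parity of the bits of x indexed by the random subset r_t
    p = 1
    for r in subsets:
        parity = sum(xi for xi, ri in zip(x, r) if ri) % 2
        p *= 1 - parity
    return 1 - p

def error_rate(n, k, trials=2000):
    # empirical error of a random degree-k polynomial against Or on random
    # inputs: for any fixed x the error probability is at most 2^{-k}
    errs = 0
    for _ in range(trials):
        subsets = [[random.randint(0, 1) for _ in range(n)] for _ in range(k)]
        x = [random.randint(0, 1) for _ in range(n)]
        errs += or_approx(x, subsets) != int(any(x))
    return errs / trials

assert error_rate(8, 7) < 0.05
```

If $x=0$ every parity is $0$ and the approximation is exact; if $x\ne 0$ each parity is $1$ with probability $1/2$ independently, so all $k$ parities vanish with probability $2^{-k}$.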

We then show that Majority cannot be approximated by such low-degree polynomials.

The key result is the following:

Lemma 10.1. Every function $f:\{0,1\}^n \to \{0,1\}$ can be written as $f(x)=p_{0}(x)+p_{1}(x)\cdot \text {Maj}(x)$, for some polynomials $p_{0}$ and $p_{1}$ of degree $\le n/2$. This holds for every odd $n$.

Proof. Let $M_{0}$ be the set of strings with weight $\le n/2$. We claim that for every function $f:M_{0}\to \{0,1\}$ there is a polynomial $p_{0}$ of degree $\le n/2$ s.t. $p_{0}$ and $f$ agree on $M_{0}$.

To verify this, consider the monomials of degree $\le n/2$. We claim that (the vectors corresponding to) their truth tables over $M_{0}$ are linearly independent. This means that any polynomial gives a different function over $M_{0}$, and because the number of polynomials is the same as the number of functions, the claim follows. By symmetry, the same holds for the set $M_{1}$ of strings with weight $>n/2$. Now pick $p_{0}$ agreeing with $f$ on $M_{0}$; since $\text {Maj}$ is $0$ on $M_{0}$ and $1$ on $M_{1}$, it remains to pick $p_{1}$ agreeing with $f+p_{0}$ on $M_{1}$. QED

Exercise 10.5. Prove the claim in the proof.
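For small odd $n$, the claim can be verified by brute force (a numeric check, not a proof; the parameters below are my choices):

```python
from itertools import combinations, product

def rank_f2(rows):
    # Gaussian elimination over F2; each vector is an integer bitmask
    pivots = {}
    for row in rows:
        while row:
            hb = row.bit_length() - 1
            if hb not in pivots:
                pivots[hb] = row
                break
            row ^= pivots[hb]
    return len(pivots)

def truth_vec(monomial, points):
    # bitmask of the monomial prod_{i in monomial} x_i evaluated at each point
    v = 0
    for idx, x in enumerate(points):
        if all(x[i] for i in monomial):
            v |= 1 << idx
    return v

n = 5  # a small odd n
M0 = [x for x in product((0, 1), repeat=n) if sum(x) <= n // 2]
monos = [m for d in range(n // 2 + 1) for m in combinations(range(n), d)]
vecs = [truth_vec(m, M0) for m in monos]
# as many monomials as points in M0, and their truth tables are independent
assert len(monos) == len(M0) == rank_f2(vecs)
```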

Proof of Theorem 10.4. Apply Corollary 10.1 with $\epsilon =1/10$ to obtain $p$. Let $S$ be the set of inputs on which $p(x)=C(x)$. By Lemma 10.1, any function $f:S\to \{0,1\}$ can be written as

\begin{aligned} f(x)=p_{0}(x)+p_{1}(x)\cdot p(x). \end{aligned}

The right-hand side is a polynomial of degree $\le d':=n/2+\log ^{d}(cs)$. The number of such polynomials is the number of possible choices for each monomial of degree $i$, for any $i$ up to the degree. This number is

\begin{aligned} \prod _{i=0}^{d'}2^{{n \choose i}}=2^{\sum _{i=0}^{d'}{n \choose i}}. \end{aligned}

On the other hand, the number of possible functions $f:S\to \{0,1\}$ is

\begin{aligned} 2^{|S|}. \end{aligned}

Since a polynomial computes at most one function, taking logs we have

\begin{aligned} |S|\le \sum _{i=0}^{d'}{n \choose i}. \end{aligned}

The right-hand side is at most $2^{n}(1/2+c\log ^{d}(s)/\sqrt {n})$: the terms with $i\le n/2$ sum to $2^{n-1}$, and each of the remaining $\log ^{d}(cs)$ binomial coefficients is $\le c2^{n}/\sqrt {n}$.

On the other hand, $|S|\ge 0.9\cdot 2^{n}.$

Combining this we get

\begin{aligned} 0.9\cdot 2^{n}\le 2^{n}(1/2+c\log ^{d}(s)/\sqrt {n}). \end{aligned}

This implies

\begin{aligned} 0.4\le c\log ^{d}(s)/\sqrt {n}, \end{aligned}

proving the theorem. QED

Exercise 10.6. Explain why Theorem 10.4 holds even if the circuits have Parity gates (in addition to Or and And gates).

[1]   Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In Venkatesan Guruswami, editor, IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 59–78. IEEE Computer Society, 2015.

[2]   Leonard Adleman. Two theorems on random polynomial time. In 19th IEEE Symp. on Foundations of Computer Science (FOCS), pages 75–83. 1978.

[3]   Miklós Ajtai. $\Sigma ^{1}_{1}$-formulae on finite structures. Annals of Pure and Applied Logic, 24(1):1–48, 1983.

[4]   Miklós Ajtai. A non-linear time lower bound for boolean branching programs. Theory of Computing, 1(1):149–176, 2005.

[5]   Eric Allender. A note on the power of threshold circuits. In 30th Symposium on Foundations of Computer Science, pages 580–584, Research Triangle Park, North Carolina, 30 October–1 November 1989. IEEE.

[6]   Eric Allender. The division breakthroughs. Bulletin of the EATCS, 74:61–77, 2001.

[7]   Eric Allender and Michal Koucký. Amplifying lower bounds by means of self-reducibility. J. of the ACM, 57(3), 2010.

[8]   Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost $k$-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992.

[9]   Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for hamiltonian circuits and matchings. J. Comput. Syst. Sci., 18(2):155–193, 1979.

[10]   Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. of the ACM, 45(3):501–555, May 1998.

[11]   Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM J. Comput., 47(3):1087–1097, 2018.

[12]   Kenneth E. Batcher. Sorting networks and their applications. In AFIPS Spring Joint Computing Conference, volume 32, pages 307–314, 1968.

[13]   Paul Beame, Stephen A. Cook, and H. James Hoover. Log depth circuits for division and related problems. SIAM J. Comput., 15(4):994–1003, 1986.

[14]   Paul Beame, Michael Saks, Xiaodong Sun, and Erik Vee. Time-space trade-off lower bounds for randomized computation of decision problems. J. of the ACM, 50(2):154–195, 2003.

[15]   Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. on Computing, 21(1):54–58, 1992.

[16]   Samuel R. Buss and Ryan Williams. Limits on alternation trading proofs for time-space lower bounds. Comput. Complex., 24(3):533–600, 2015.

[17]   Lijie Chen and Roei Tell. Bootstrapping results for threshold circuits “just beyond” known lower bounds. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 34–41. ACM, 2019.

[18]   Richard Cleve. Towards optimal simulations of formulas by bounded-width programs. Computational Complexity, 1:91–105, 1991.

[19]   Stephen A. Cook. A hierarchy for nondeterministic time complexity. J. of Computer and System Sciences, 7(4):343–353, 1973.

[20]   L. Csanky. Fast parallel matrix inversion algorithms. SIAM J. Comput., 5(4):618–623, 1976.

[21]   Lance Fortnow. Time-space tradeoffs for satisfiability. J. Comput. Syst. Sci., 60(2):337–353, 2000.

[22]   Aviezri S. Fraenkel and David Lichtenstein. Computing a perfect strategy for n x n chess requires time exponential in n. J. Comb. Theory, Ser. A, 31(2):199–214, 1981.

[23]   Michael L. Fredman and Michael E. Saks. The cell probe complexity of dynamic data structures. In ACM Symp. on the Theory of Computing (STOC), pages 345–354, 1989.

[24]   Anka Gajentaan and Mark H. Overmars. On a class of ${O}(n^2)$ problems in computational geometry. Comput. Geom., 5:165–185, 1995.

[25]   M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.

[26]   K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatsh. Math. Phys., 38, 1931.

[27]   Oded Goldreich. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.

[28]   Raymond Greenlaw, H. James Hoover, and Walter Ruzzo. Limits to Parallel Computation: P-Completeness Theory. 02 2001.

[29]   David Harvey and Joris van der Hoeven. Integer multiplication in time $O(n\log n)$. Annals of Mathematics, 193(2):563–617, 2021.

[30]   F. C. Hennie. One-tape, off-line turing machine computations. Information and Control, 8(6):553–578, 1965.

[31]   Fred Hennie and Richard Stearns. Two-tape simulation of multitape turing machines. J. of the ACM, 13:533–546, October 1966.

[32]   John E. Hopcroft, Wolfgang J. Paul, and Leslie G. Valiant. On time versus space. J. ACM, 24(2):332–337, 1977.

[33]   Russell Impagliazzo and Ramamohan Paturi. The complexity of $k$-sat. In IEEE Conf. on Computational Complexity (CCC), pages 237–, 1999.

[34]   Russell Impagliazzo, Ramamohan Paturi, and Michael E. Saks. Size-depth tradeoffs for threshold circuits. SIAM J. Comput., 26(3):693–707, 1997.

[35]   Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? J. Computer & Systems Sciences, 63(4):512–530, Dec 2001.

[36]   Russell Impagliazzo and Avi Wigderson. $\mathit {P} = \mathit {BPP}$ if $E$ requires exponential circuits: Derandomizing the XOR lemma. In 29th ACM Symp. on the Theory of Computing (STOC), pages 220–229. ACM, 1997.

[37]   Richard M. Karp and Richard J. Lipton. Turing machines that take advice. L’Enseignement Mathématique. Revue Internationale. IIe Série, 28(3-4):191–209, 1982.

[38]   Kojiro Kobayashi. On the structure of one-tape nondeterministic turing machine time hierarchy. Theor. Comput. Sci., 40:175–193, 1985.

[39]   Kasper Green Larsen, Omri Weinstein, and Huacheng Yu. Crossing the logarithmic barrier for dynamic boolean data structure lower bounds. SIAM J. Comput., 49(5), 2020.

[40]   Leonid A. Levin. Universal sequential search problems. Problemy Peredachi Informatsii, 9(3):115–116, 1973.

[41]   Carsten Lund, Lance Fortnow, Howard Karloff, and Noam Nisan. Algebraic methods for interactive proof systems. J. of the ACM, 39(4):859–868, October 1992.

[42]   O. B. Lupanov. A method of circuit synthesis. Izv. VUZ Radiofiz., 1:120–140, 1958.

[43]   Wolfgang Maass and Amir Schorr. Speed-up of Turing machines with one work tape and a two-way input tape. SIAM J. on Computing, 16(1):195–202, 1987.

[44]   David A. Mix Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC$^1$. J. of Computer and System Sciences, 38(1):150–164, 1989.

[45]   Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993.

[46]   E. I. Nechiporuk. A boolean function. Soviet Mathematics-Doklady, 169(4):765–766, 1966.

[47]   Valery A. Nepomnjaščiĭ. Rudimentary predicates and Turing calculations. Soviet Mathematics-Doklady, 11(6):1462–1465, 1970.

[48]   NEU. From RAM to SAT. Available at http://www.ccs.neu.edu/home/viola/, 2012.

[49]   Christos H. Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci., 43(3):425–440, 1991.

[50]   Wolfgang J. Paul, Nicholas Pippenger, Endre Szemerédi, and William T. Trotter. On determinism versus non-determinism and related problems (preliminary version). In IEEE Symp. on Foundations of Computer Science (FOCS), pages 429–438, 1983.

[51]   Nicholas Pippenger and Michael J. Fischer. Relations among complexity measures. J. of the ACM, 26(2):361–381, 1979.

[52]   Alexander Razborov. Lower bounds on the dimension of schemes of bounded depth in a complete basis containing the logical addition function. Akademiya Nauk SSSR. Matematicheskie Zametki, 41(4):598–607, 1987. English translation in Mathematical Notes of the Academy of Sci. of the USSR, 41(4):333-338, 1987.

[53]   Omer Reingold. Undirected connectivity in log-space. J. of the ACM, 55(4), 2008.

[54]   J. M. Robson. N by N checkers is exptime complete. SIAM J. Comput., 13(2):252–267, 1984.

[55]   Rahul Santhanam. On separators, segregators and time versus space. In IEEE Conf. on Computational Complexity (CCC), pages 286–294, 2001.

[56]   Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4(2):177–192, 1970.

[57]   Arnold Schönhage. Storage modification machines. SIAM J. Comput., 9(3):490–508, 1980.

[58]   Adi Shamir. IP = PSPACE. J. of the ACM, 39(4):869–877, October 1992.

[59]   Claude E. Shannon. The synthesis of two-terminal switching circuits. Bell System Tech. J., 28:59–98, 1949.

[60]   Victor Shoup. New algorithms for finding irreducible polynomials over finite fields. Mathematics of Computation, 54(189):435–447, 1990.

[61]   Alan Siegel. On universal classes of extremely random constant-time hash functions. SIAM J. on Computing, 33(3):505–543, 2004.

[62]   Michael Sipser. A complexity theoretic approach to randomness. In ACM Symp. on the Theory of Computing (STOC), pages 330–335, 1983.

[63]   Roman Smolensky. Algebraic methods in the theory of lower bounds for Boolean circuit complexity. In 19th ACM Symp. on the Theory of Computing (STOC), pages 77–82. ACM, 1987.

[64]   Larry Stockmeyer and Albert R. Meyer. Cosmological lower bound on the circuit complexity of a small problem in logic. J. ACM, 49(6):753–784, 2002.

[65]   Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. on Computing, 20(5):865–877, 1991.

[66]   Alan M. Turing. On computable numbers, with an application to the entscheidungsproblem. Proc. London Math. Soc., s2-42(1):230–265, 1937.

[67]   Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In 6th Symposium on Mathematical Foundations of Computer Science, volume 53 of Lecture Notes in Computer Science, pages 162–176. Springer, 1977.

[68]   Leslie G. Valiant and Vijay V. Vazirani. NP is as easy as detecting unique solutions. Theor. Comput. Sci., 47(3):85–93, 1986.

[69]   Dieter van Melkebeek. A survey of lower bounds for satisfiability and related problems. Foundations and Trends in Theoretical Computer Science, 2(3):197–303, 2006.

[70]   Dieter van Melkebeek and Ran Raz. A time lower bound for satisfiability. Theor. Comput. Sci., 348(2-3):311–320, 2005.

[71]   N. V. Vinodchandran. A note on the circuit complexity of PP. Theor. Comput. Sci., 347(1-2):415–418, 2005.

[72]   Emanuele Viola. On approximate majority and probabilistic time. Computational Complexity, 18(3):337–375, 2009.

[73]   Emanuele Viola. On the power of small-depth computation. Foundations and Trends in Theoretical Computer Science, 5(1):1–72, 2009.

[74]   Emanuele Viola. Reducing 3XOR to listing triangles, an exposition. Available at http://www.ccs.neu.edu/home/viola/, 2011.

[75]   Emanuele Viola. Lower bounds for data structures with space close to maximum imply circuit lower bounds. Theory of Computing, 15:1–9, 2019. Available at http://www.ccs.neu.edu/home/viola/.

[76]   Emanuele Viola. Pseudorandom bits and lower bounds for randomized turing machines. Theory of Computing, 18(10):1–12, 2022.

Mathematics of the impossible, Chapter 9, Log-depth circuits

In this chapter we investigate circuits of logarithmic depth. Again, several surprises lie ahead, including a solution to the teaser in Chapter 1!

Let us begin slowly with some basic properties of these circuits so as to get familiar with them. The next exercise shows that circuits of depth $d\log n$ for a constant $d$ also have power size, so we don’t need to bound the size separately.

Exercise 9.1. A circuit of depth $d$ has size $\le c^{d}$ without loss of generality.

The next exercise shows how to compute several simple functions by log-depth circuits.

Exercise 9.2. Prove that the Or, And, and Parity functions on $n$ bits have circuits of depth $c\log n$.

Prove that any $f: \{0,1\}^n \to \{0,1\}$ computable by an AltCkt of depth $d$ and size $s\ge n$ is also computable by a circuit of depth $cd\log s$ and size $s^{c}$.

Next, let us relate these circuits to branching programs. The upshot is that circuits of logarithmic depth are a special case of power-size branching programs, and the latter are a special case of circuits of log-square depth.

Theorem 9.1. Directed reachability has circuits of depth $c\log ^{2}n$ and size $n^{c}$. In particular, the same holds for any function in NL, and any function with power-size branching programs.

Proof. On input a graph $G$ on $u$ nodes and two nodes $s$ and $t$, let $M$ be the $u\times u$ transition matrix corresponding to $G$, where $M_{i,j}=1$ iff edge $j\to i$ is in $G$.

Transition matrices are multiplied as normal matrices, except that “$+$” is replaced with “$\vee$,” which suffices to know connectivity. To answer directed reachability we compute entry $t$ of $M^{u}v$, where $v$ has a $1$ corresponding to $s$ and $0$ everywhere else. (We can modify the graph to add a self-loop on node $t$ so that we can reach $t$ in exactly $u$ steps iff we reach $t$ in any number of steps.)

Computing $M^{u}$ can be done by repeatedly squaring $M$, $c\log u$ times. Each squaring can be done in depth $c\log u$, by Exercise 9.2. This establishes the first claim, since $u\le n$.

The “in particular” follows because those functions can be reduced to directed reachability efficiently. QED
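The repeated-squaring structure of this proof can be sketched in Python. This is a sequential sketch of what the circuit computes in parallel; the function names are mine, not from the text.

```python
# Boolean matrix squaring for directed reachability: a sequential sketch of
# the structure of the proof of Theorem 9.1 (not a circuit).

def bool_mat_mult(M, N):
    """Entry (i,j) is the OR over k of M[i][k] AND N[k][j]."""
    u = len(M)
    return [[any(M[i][k] and N[k][j] for k in range(u)) for j in range(u)]
            for i in range(u)]

def reachable(adj, s, t):
    """Is there a directed path from s to t? adj[i][j] = 1 iff edge j -> i."""
    u = len(adj)
    M = [row[:] for row in adj]
    M[t][t] = True                   # self-loop on t: reach t in exactly
                                     # 2^k steps iff reachable at all
    for _ in range(u.bit_length()):  # ~log u squarings cover walks of length >= u
        M = bool_mat_mult(M, M)
    # with v = indicator of s, entry t of (M^u) v is just M[t][s]
    return bool(M[t][s])
```

Each call to `bool_mat_mult` corresponds to one layer of depth $c\log u$ in the circuit, and there are $c\log u$ such layers, giving total depth $c\log ^{2}u$.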

Conversely, we have the following.

Theorem 9.2. Any function $f: \{0,1\}^n \to \{0,1\}$ computed by a circuit of depth $d$ can be computed by a branching program of size $2^{d}$.

In particular, functions computed by circuits of logarithmic depth can be computed by branching programs of power size.

Later in this chapter we will prove a stronger and much less obvious result.

Proof. We proceed by induction on the depth of the circuit $C$. If the depth is $1$ then $C$ is either a constant or an input bit, and a branching program of size $1$ is available by definition.

Suppose the circuit $C$ has the form $C_{1}\wedge C_{2}$. By induction, $C_{1}$ and $C_{2}$ have branching programs $B_{1}$ and $B_{2}$, each of size $2^{d-1}$. A branching program $B$ for $C$ of size $2^{d-1}+2^{d-1}=2^{d}$ is obtained by rewiring the edges leading to states labelled $1$ in $B_{1}$ to the start state of $B_{2}$. The start state of $B$ is the start state of $B_{1}$. QED

Exercise 9.3. Finish the proof by analyzing the case $C=C_{1}\vee C_{2}$.

Definition 9.1. $\text {NC}^{i}$ is the class of functions $f:\{0,1\}^* \to \{0,1\}^*$ computable by circuits that have depth $a\log ^{i}n$ and size $n^{a}$, for some constant $a$. The circuits are uniform if they can be computed in $\text {L}$.

The class $\text {NC}^{0}$ is also of great interest. It can be more simply defined as the class of functions where each output bit depends on a constant number of input bits. We will see many surprisingly useful things that can be computed in this class.

The previous results give, for uniform circuits:

\begin{aligned} \text {NC}^{0}\subseteq \text {NC}^{1}\subseteq \text {L}\subseteq \text {NL}\subseteq \text {NC}^{2}\subseteq \text {NC}^{3}\subseteq \cdots \subseteq \text {P}. \end{aligned}

The only inclusion known to be strict is the first one:

Exercise 9.4. Prove that $\text {NC}^{0}\ne \text {NC}^{1}$. (Mostly to practice definitions.)

9.1 The power of $\text {NC}^{1}$: Arithmetic

In this section we illustrate the power of $\text {NC}^{1}$ by showing that the same basic arithmetic which we saw is doable in $\text {L}$ (Theorem 7.5) can in fact be done in $\text {NC}^{1}$ as well.

Theorem 9.3. The following problems are in NC$^{1}$:

1. Addition of two input integers.
2. Iterated addition: Addition of any number of input integers.
3. Multiplication of two input integers.
4. Iterated multiplication: Multiplication of any number of input integers.
5. Division of two integers.

Exercise 9.5. Prove Item 1. in Theorem 9.3.

Iterated addition is surprisingly non-trivial. We can’t use the methods from the proof of Theorem 7.5. Instead, we rely on a new and very clever technique.

Proof of Item 2. in Theorem 9.3. We use the “2-out-of-3” trick: Given 3 integers $X,Y,Z$, we compute 2 integers $A,B$ such that

\begin{aligned} X+Y+Z=A+B, \end{aligned}

where each bit of $A$ and $B$ only depends on three bits, one from $X$, one from $Y$, and one from $Z$. Thus $A$ and $B$ can be computed in NC$^{0}$.

If we can do this, then to compute iterated addition we construct a tree of logarithmic depth to reduce the original sum to a sum of $2$ terms, which we add as in Item 1.

Here’s how it works. Note $X_{i}+Y_{i}+Z_{i}\leq 3$, so this sum is described by two bits. We let $A_{i}$ be the least significant bit of this sum, and $B_{i+1}$ the most significant one. Note that $A_{i}$ is the XOR of $X_{i},Y_{i},Z_{i}$, while $B_{i+1}$ is the majority of $X_{i},Y_{i},Z_{i}$. QED
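The 2-out-of-3 step is easy to sanity-check in Python, viewing integers as bit vectors; the function name is mine.

```python
# Check the 2-out-of-3 identity X + Y + Z = A + B, where bit i of A is the
# XOR of X_i, Y_i, Z_i and bit i+1 of B is the majority of X_i, Y_i, Z_i.
# Python integers serve as bit vectors.

def two_out_of_three(X, Y, Z):
    A = X ^ Y ^ Z                            # positionwise XOR
    B = ((X & Y) | (X & Z) | (Y & Z)) << 1   # positionwise majority, shifted

    return A, B

# Each output bit depends on only 3 input bits, so A and B are computable
# in NC^0.
for (x, y, z) in [(13, 7, 22), (0, 0, 0), (255, 255, 255)]:
    a, b = two_out_of_three(x, y, z)
    assert a + b == x + y + z
```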

The following corollary will also be used to solve the teaser in Chapter 1.

Corollary 9.1. Majority is in $\text {NC}^{1}$.

Exercise 9.6. Prove it.

Exercise 9.7. Prove Item 3. in Theorem 9.3.

Next we turn to iterated multiplication. The idea is to follow the proof for L in section 7.2.1. We shall use CRR again. The problem is that there we still had to perform iterated multiplication, albeit only in $\mathbb {Z}_{p}$ for $p\le n^{c}$. One more mathematical result is useful now:

Theorem 9.4. If $p$ is a prime then $(\mathbb {Z}_{p}-\{0\})$ is a cyclic group, meaning that there exists a generator $g\in (\mathbb {Z}_{p}-\{0\}):\forall x\in (\mathbb {Z}_{p}-\{0\}),x=g^{i}$, for some $i\in \mathbb {Z}$.

Example 9.1. For $p=5$ we can take $g=2$: $2^{0}=1,2^{1}=2,2^{2}=4,2^{3}=8=3$.

Proof of Item 4. in Theorem 9.3. We follow the proof for L in section 7.2.1. To compute the iterated product of integers $r_{1},r_{2},\ldots ,r_{t}$ modulo $p$, use Theorem 9.4 to compute exponents $e_{1},e_{2},\ldots ,e_{t}$ s.t.

\begin{aligned} r_{i}=g^{e_{i}}. \end{aligned}

Then $\prod _{i}r_{i}\mod p=g^{\sum _{i}e_{i}}$. We can use Item 2. to compute the iterated addition of the exponents. Note that computing the exponent of a number mod $p$, and vice versa, can be done in log-depth since the numbers have $c\log n$ bits (as follows for example by combining Theorem 2.3 and Exercise 9.2). QED
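This proof can be mirrored in Python for a small prime $p$. The generator and the discrete-log table are brute-forced, which is fine since $p\le n^{c}$ (in the circuit they are hardwired). Zero factors are excluded, since $0$ has no discrete log; a zero factor makes the product $0$ modulo $p$ and can be handled separately. The function names are mine.

```python
# Iterated product mod a small prime p via discrete logs, as in the proof of
# Item 4: convert each factor to its exponent, add exponents, convert back.

def find_generator(p):
    for g in range(2, p):
        if len({pow(g, i, p) for i in range(p - 1)}) == p - 1:
            return g                  # g generates all of Z_p - {0}
    raise ValueError("p is not prime")

def iterated_product_mod(rs, p):
    g = find_generator(p)
    dlog = {pow(g, i, p): i for i in range(p - 1)}   # discrete-log table
    assert all(r % p != 0 for r in rs)               # nonzero factors only
    e = sum(dlog[r % p] for r in rs)                 # iterated addition (Item 2)
    return pow(g, e % (p - 1), p)

# e.g. 3 * 4 * 2 = 24, and 24 mod 5 = 4
assert iterated_product_mod([3, 4, 2], 5) == 4
```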

One can also compute division, and make all these circuits uniform, but we won’t prove this now.

9.2 Computing with 3 bits of memory

We now move to a surprising result that in particular strengthens Theorem 9.2. For a moment, let’s forget about circuits, branching programs, etc. and instead consider a new, minimalistic type of programs. We will have 3 one-bit registers: $R_{0},R_{1},R_{2}$, operating modulo $2$. We allow the following operations

\begin{aligned} R_{i} & +=R_{j},\\ R_{i} & +=R_{j}x_{k} \end{aligned}

where $x_{k}$ is an input bit, for any $i,j\in \{0,1,2\}$, with $i\ne j$. (Talk about RISC!) Here $R_{i}+=R_{j}$ means to add the content of $R_{j}$ to $R_{i}$, while $R_{i}+=R_{j}x_{k}$ means to add $R_{j}x_{k}$ to $R_{i}$, where $R_{j}x_{k}$ is the product (a.k.a. And) of $R_{j}$ and $x_{k}$.

Definition 9.2. For $i,j$ and $f: \{0,1\}^n \to \{0,1\}$ we say that a program is for (or equivalent to)

\begin{aligned} R_{i}+=R_{j}f \end{aligned}

if for every input $x$ and initial values of the registers, executing the program is equivalent to the instruction $R_{i}+=R_{j}f(x)$, where note that $R_{j}$ and the third register $R_{k}$ ($k\ne i,j$) are unchanged.

Also note that if we run a program for $R_{i}+=R_{j}f$ twice then no register changes (recall the sum is modulo $2$, so $1+1=0$). This feature is critically exploited later to “clean up” computation.

We now state and prove the surprising result. It is convenient to state it for circuits with Xor instead of Or gates. This is without loss of generality since $x\vee y=x+y+x\wedge y$.

Theorem 9.5. [44, 15] Suppose $f: \{0,1\}^n \to \{0,1\}$ is computable by circuits of depth $d$ with Xor and And gates. For every $i\ne j$ there is a program of length $\le c4^{d}$ for

\begin{aligned} R_{i} & +=R_{j}f. \end{aligned}

Once such a program is available, we can start with register values $(0,1,0)$ and $i=0,j=1$ to obtain $f(x)$ in $R_{0}$.

Proof. We proceed by induction on $d$. When $d=1$ the circuit is simply outputting a constant or one of the input bits, which we can compute with the corresponding instructions. (If the circuit is the constant zero then the empty program would do.)

For the induction step, write the function computed by the circuit as $f_{1}+f_{2}$ or $f_{1}\wedge f_{2}$, where $f_{1}$ and $f_{2}$ are computable by circuits of depth $d-1$ and hence, by induction, have programs of length $\le c4^{d-1}$. We use at most $4$ such programs, for a total length $\le c4^{d}$, as required.

A program for $R_{i}+=R_{j}(f_{1}+f_{2})$ is simply given by the concatenation of (the programs for)

\begin{aligned} R_{i} & +=R_{j}f_{1}\\ R_{i} & +=R_{j}f_{2}. \end{aligned}

Less obviously, a program for $R_{i}+=R_{j}(f_{1}\wedge f_{2})$ is given by

\begin{aligned} R_{i} & +=R_{k}f_{1}\\ R_{k} & +=R_{j}f_{2}\\ R_{i} & +=R_{k}f_{1}\\ R_{k} & +=R_{j}f_{2}. \end{aligned}

QED

Exercise 9.8. Prove that the program for $f_{1}\wedge f_{2}$ in the proof works. Write down the contents of the registers after each instruction in the program.

A similar proof works over other fields as well.
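The And program can also be checked mechanically, treating $f_{1}$ and $f_{2}$ as fixed bits (so each sub-program is a single instruction). This complements, but does not replace, Exercise 9.8; the function name is mine.

```python
# Brute-force check of the 4-instruction program for R_i += R_j (f1 AND f2),
# with registers working modulo 2 (here i = 0, j = 1, k = 2).
from itertools import product

def run_and_program(regs, f1, f2):
    R = list(regs)
    R[0] = (R[0] + R[2] * f1) % 2   # R_i += R_k f1
    R[2] = (R[2] + R[1] * f2) % 2   # R_k += R_j f2
    R[0] = (R[0] + R[2] * f1) % 2   # R_i += R_k f1
    R[2] = (R[2] + R[1] * f2) % 2   # R_k += R_j f2
    return R

for a, b, c, f1, f2 in product((0, 1), repeat=5):
    out = run_and_program((a, b, c), f1, f2)
    # net effect: R_i += R_j (f1 AND f2), with R_j and R_k unchanged
    assert out == [(a + b * (f1 & f2)) % 2, b, c]
```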

We can now address the teaser Theorem 1.1 from Chapter 1.

Proof of Theorem 1.1. Combine Corollary 9.1 with Theorem 9.5. QED

Corollary 9.2. Iterated product of 3×3 matrices over $\mathbb {F}_{2}$ is complete for NC$^{1}$ under projections.

That is, the problem is in NC$^{1}$ and for any $f\in \text {NC}^{1}$ and $n$ one can write a sequence of $t=n^{c}$ 3×3 matrices $M_{1},M_{2},\ldots ,M_{t}$ where each entry is either a constant or an input variable $x_{i}$ s.t. for every $x\in \{0,1\} ^{n}$:

\begin{aligned} \prod _{i=1}^{t}M_{i}\cdot \begin {bmatrix}0\\ 1\\ 0 \end {bmatrix}=\begin {bmatrix}f(x)\\ 1\\ 0 \end {bmatrix}. \end{aligned}

Exercise 9.9. Prove this.
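As a hint toward Exercise 9.9, note that the instruction $R_{i}+=R_{j}v$ (for $v$ a constant or an input bit) acts on the register vector as the $3\times 3$ matrix $I+v\cdot E_{i,j}$ over $\mathbb {F}_{2}$. A small Python check, with function names of my choosing:

```python
# Each instruction R_i += R_j v acts on the register vector as the elementary
# matrix I + v*E_{i,j} over F_2, so a whole program becomes a product of
# projection matrices -- the content of Corollary 9.2.

def instr_matrix(i, j, v):
    M = [[1 if r == c else 0 for c in range(3)] for r in range(3)]
    M[i][j] = v          # i != j in all allowed instructions
    return M

def mat_vec(M, x):
    return [sum(M[r][c] * x[c] for c in range(3)) % 2 for r in range(3)]

# The And program from Theorem 9.5 with f1 = f2 = 1, instruction by
# instruction (later instructions multiply on the left).
v = [0, 1, 0]            # registers (R_0, R_1, R_2) = (0, 1, 0)
for (i, j, b) in [(0, 2, 1), (2, 1, 1), (0, 2, 1), (2, 1, 1)]:
    v = mat_vec(instr_matrix(i, j, b), v)
assert v == [1, 1, 0]    # f(x) = 1 AND 1 ends up in R_0
```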

9.3 Linear-size log-depth

It is unknown whether NP has linear-size circuits of logarithmic depth! But there is a non-trivial simulation of such circuits by AltCkts of depth 3 of sub-exponential size.

Theorem 9.6. [67] Any circuit $C: \{0,1\}^n \to \{0,1\}$ of size $an$ and depth $a\log n$ has an equivalent alternating circuit of depth 3 and size $2^{c_{a}n/\log \log n}$.

The idea is… yes! Once again, we are going to guess computation. It is possible to guess the values of about $n/\log \log n$ wires to reduce the depth to say $0.1\log n$, and the rest is brute-forced. For an exposition, see [73].

Notes

Our presentation of Theorem 9.5 follows [18].


Mathematics of the impossible, Chapter 8, Three impossibility results for 3Sat

We should turn back to a traditional separation technique – diagonalization.[21]

In this chapter we put together many of the techniques we have seen to obtain several impossibility results for 3Sat. The template of all these results (and others, like those mentioned in section 5.1) is similar. All these results prove time bounds of the form $t\ge n^{1+\epsilon }$ where $\epsilon \in (0,1)$. One can optimize the methods to push $\epsilon$ close to $1$, but even establishing $\epsilon =1$ seems out of reach, and there are known barriers for current techniques [16].

8.1 Impossibility I

We begin with the following remarkable result.

Theorem 8.1. [21] Either $\text {3Sat}\not \in \text {L}$ or $\text {3Sat}\not \in \text {Time}(n^{1+\epsilon })$ for some constant $\epsilon$.

Note that we don’t know whether $\text {3Sat}\in \text {L}$ or whether $\text {3Sat}\in \text {Time}(n\log ^{10}n)$. In particular, Theorem 8.1 implies that any algorithm for 3Sat must either use super-logarithmic space or run in time $\ge n^{1+\epsilon }$.

Proof. We assume that what we want to prove is not true and derive the following striking contradiction with the hierarchy Theorem 3.4:

\begin{aligned} \text {Time}(n^{2}) & \subseteq \text {L}\\ & \subseteq \bigcup _{d}\Sigma _{d}\text {Time}(n)\\ & \subseteq \text {Time}(n^{1.9}). \end{aligned}

The first inclusion holds by the assumption that $\text {3Sat}\in \text {L}$ and the fact that any function in $\text {Time}(n^{2})$ can be reduced to 3Sat in log-space, by Theorem 5.1 and the discussion after that.

The second inclusion is Theorem 7.11.

For the third inclusion, the assumption that $\text {3Sat}\in \text {Time}(n^{1+\epsilon })$ for every $\epsilon$ implies that $\text {NTime}(dn)\subseteq \text {Time}(n^{1+\epsilon })$ for every $d$ and $\epsilon$, by the quasi-linear-time completeness of 3Sat, Theorem 5.4. Now apply Exercise 6.2. QED

8.2 Impossibility II

We now state and prove a closely related result for TiSp. We seek to rule out algorithms for 3Sat that simultaneously use little space and time, whereas in Theorem 8.1 we even ruled out the possibility that there are two distinct algorithms, one optimizing space and the other time. The main gain is that we will be able to handle much larger space: power rather than log.

Theorem 8.2. $\text {3Sat}\not \in \text {TiSp}(n^{1+c_{\epsilon }},n^{1-\epsilon })$, for any $\epsilon >0$.

The important aspect of Theorem 8.2 is that it applies to the RAM model; stronger results can be shown for space-bounded TMs.

Exercise 8.1. Prove that $\text {Palindromes}\not \in \text {TM-TiSp}(n^{1+c_{\epsilon }},n^{1-\epsilon })$, for any $\epsilon >0$. (TM-TiSp$(t,s)$ is defined as Space$(s)$, cf. Definition 7.1, but moreover the machine runs in at most $t$ steps.) Hint: This problem has a simple solution. Give a suitable simulation of TM-TiSp by 1TM, then apply Theorem 3.1.

Proof. We assume that what we want to prove is not true and derive the following contradiction with the hierarchy Theorem 3.4:

\begin{aligned} \text {Time}(n^{1+\epsilon }) & \subseteq \text {TiSp}(cn^{(1+\epsilon )(1+c_{\epsilon })},cn^{(1+\epsilon )(1-\epsilon )})\\ & \subseteq \text {TiSp}(n^{1+c_{\epsilon }},cn^{1-\epsilon ^{2}})\\ & \subseteq \Sigma _{c_{\epsilon }}\text {Time}(n)\\ & \subseteq \text {Time}(n^{1+\epsilon /2}). \end{aligned}

The first inclusion holds by the assumption, padding, and the fact that 3Sat is complete under reductions s.t. each bit is computable in time (and hence space) $n^{o(1)}$, a fact we do not prove here. QED

Exercise 8.2. Finish the proof by justifying the remaining inclusions.

8.3 Impossibility III

So far our impossibility results required bounds on space. We now state and prove a result that applies to time. Of course, as discussed in Chapter 3, we don’t know how to prove that, say, 3Sat cannot be computed in linear time on a 2TM. For single-tape machines, we can prove quadratic bounds, for palindromes (Theorem 3.1) and 3Sat (Problem 4.2). Next we consider an interesting model which is between 1TM and 2TM and is a good indication of the state of our knowledge.

Definition 8.1. A $1.5\text {TM}$ is like a 2TM except that the input tape is read-only.

Theorem 8.3. [43, 70] 3Sat requires time $n^{1+c}$ on a 1.5TM.

Exercise 8.3. Prove Theorem 8.3 following this guideline:

1. Let $M$ be a 1.5TM running in time $t(n)$. Divide the read-write tape of $M$ into consecutive blocks of $b$ cells, shifted by an offset $i \leq b$. Prove that for every input there is an offset s.t. the sum of the lengths of all the crossing sequences between adjacent blocks is small.
2. Prove that $1.5\text {TM-Time}(n^{1.1})\subseteq \exists y\in \{0,1\} ^{n^{1-c}}\text {TiSp}(n^{c},n^{1-c})$. (The right-hand side is the class of functions $f:\{0,1\}^* \to \{0,1\}$ for which there is a RAM $M$ that on input $(x,y)$, where $|x|=n$, runs in time $n^{c}$ and uses memory cells $0..n^{1-c}$ and s.t. $f(x)=1\Leftrightarrow \exists y\in \{0,1\} ^{n^{1-c}}M(x,y)=1$.)
3. Conclude the proof.

Notes

For a survey (not up to date) of this type of impossibility results see [69].

[1]   Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In Venkatesan Guruswami, editor, IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 59–78. IEEE Computer Society, 2015.

[2]   Leonard Adleman. Two theorems on random polynomial time. In 19th IEEE Symp. on Foundations of Computer Science (FOCS), pages 75–83. 1978.

[3]   Miklós Ajtai. $\Sigma \sp {1}\sb {1}$-formulae on finite structures. Annals of Pure and Applied Logic, 24(1):1–48, 1983.

[4]   Miklós Ajtai. A non-linear time lower bound for boolean branching programs. Theory of Computing, 1(1):149–176, 2005.

[5]   Eric Allender. A note on the power of threshold circuits. In 30th Symposium on Foundations of Computer Science, pages 580–584, Research Triangle Park, North Carolina, 30 October–1 November 1989. IEEE.

[6]   Eric Allender. The division breakthroughs. Bulletin of the EATCS, 74:61–77, 2001.

[7]   Eric Allender and Michal Koucký. Amplifying lower bounds by means of self-reducibility. J. of the ACM, 57(3), 2010.

[8]   Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost $k$-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992.

[9]   Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for Hamiltonian circuits and matchings. J. Comput. Syst. Sci., 18(2):155–193, 1979.

[10]   Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. of the ACM, 45(3):501–555, May 1998.

[11]   Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM J. Comput., 47(3):1087–1097, 2018.

[12]   Kenneth E. Batcher. Sorting networks and their applications. In AFIPS Spring Joint Computing Conference, volume 32, pages 307–314, 1968.

[13]   Paul Beame, Stephen A. Cook, and H. James Hoover. Log depth circuits for division and related problems. SIAM J. Comput., 15(4):994–1003, 1986.

[14]   Paul Beame, Michael Saks, Xiaodong Sun, and Erik Vee. Time-space trade-off lower bounds for randomized computation of decision problems. J. of the ACM, 50(2):154–195, 2003.

[15]   Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM J. on Computing, 21(1):54–58, 1992.

[16]   Samuel R. Buss and Ryan Williams. Limits on alternation trading proofs for time-space lower bounds. Comput. Complex., 24(3):533–600, 2015.

[17]   Lijie Chen and Roei Tell. Bootstrapping results for threshold circuits “just beyond” known lower bounds. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 34–41. ACM, 2019.

[18]   Richard Cleve. Towards optimal simulations of formulas by bounded-width programs. Computational Complexity, 1:91–105, 1991.

[19]   Stephen A. Cook. A hierarchy for nondeterministic time complexity. J. of Computer and System Sciences, 7(4):343–353, 1973.

[20]   L. Csanky. Fast parallel matrix inversion algorithms. SIAM J. Comput., 5(4):618–623, 1976.

[21]   Lance Fortnow. Time-space tradeoffs for satisfiability. J. Comput. Syst. Sci., 60(2):337–353, 2000.

[22]   Aviezri S. Fraenkel and David Lichtenstein. Computing a perfect strategy for n x n chess requires time exponential in n. J. Comb. Theory, Ser. A, 31(2):199–214, 1981.

[23]   Michael L. Fredman and Michael E. Saks. The cell probe complexity of dynamic data structures. In ACM Symp. on the Theory of Computing (STOC), pages 345–354, 1989.

[24]   Anka Gajentaan and Mark H. Overmars. On a class of ${O}(n^2)$ problems in computational geometry. Comput. Geom., 5:165–185, 1995.

[25]   M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.

[26]   K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatsh. Math. Phys., 38, 1931.

[27]   Oded Goldreich. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.

[28]   Raymond Greenlaw, H. James Hoover, and Walter Ruzzo. Limits to Parallel Computation: P-Completeness Theory. Oxford University Press, 1995.

[29]   David Harvey and Joris van der Hoeven. Integer multiplication in time $O(n\log n)$. Annals of Mathematics, 193(2):563–617, 2021.

[30]   F. C. Hennie. One-tape, off-line turing machine computations. Information and Control, 8(6):553–578, 1965.

[31]   Fred Hennie and Richard Stearns. Two-tape simulation of multitape turing machines. J. of the ACM, 13:533–546, October 1966.

[32]   John E. Hopcroft, Wolfgang J. Paul, and Leslie G. Valiant. On time versus space. J. ACM, 24(2):332–337, 1977.

[33]   Russell Impagliazzo and Ramamohan Paturi. The complexity of $k$-sat. In IEEE Conf. on Computational Complexity (CCC), pages 237–, 1999.

[34]   Russell Impagliazzo, Ramamohan Paturi, and Michael E. Saks. Size-depth tradeoffs for threshold circuits. SIAM J. Comput., 26(3):693–707, 1997.

[35]   Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? J. Computer & Systems Sciences, 63(4):512–530, Dec 2001.

[36]   Russell Impagliazzo and Avi Wigderson. $\mathit {P} = \mathit {BPP}$ if $E$ requires exponential circuits: Derandomizing the XOR lemma. In 29th ACM Symp. on the Theory of Computing (STOC), pages 220–229. ACM, 1997.

[37]   Richard M. Karp and Richard J. Lipton. Turing machines that take advice. L'Enseignement Mathématique. Revue Internationale. IIe Série, 28(3-4):191–209, 1982.

[38]   Kojiro Kobayashi. On the structure of one-tape nondeterministic turing machine time hierarchy. Theor. Comput. Sci., 40:175–193, 1985.

[39]   Kasper Green Larsen, Omri Weinstein, and Huacheng Yu. Crossing the logarithmic barrier for dynamic boolean data structure lower bounds. SIAM J. Comput., 49(5), 2020.

[40]   Leonid A. Levin. Universal sequential search problems. Problemy Peredachi Informatsii, 9(3):115–116, 1973.

[41]   Carsten Lund, Lance Fortnow, Howard Karloff, and Noam Nisan. Algebraic methods for interactive proof systems. J. of the ACM, 39(4):859–868, October 1992.

[42]   O. B. Lupanov. A method of circuit synthesis. Izv. VUZ Radiofiz., 1:120–140, 1958.

[43]   Wolfgang Maass and Amir Schorr. Speed-up of Turing machines with one work tape and a two-way input tape. SIAM J. on Computing, 16(1):195–202, 1987.

[44]   David A. Mix Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC$^1$. J. of Computer and System Sciences, 38(1):150–164, 1989.

[45]   Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993.

[46]   E. I. Nechiporuk. A boolean function. Soviet Mathematics-Doklady, 169(4):765–766, 1966.

[47]   Valery A. Nepomnjaščiĭ. Rudimentary predicates and Turing calculations. Soviet Mathematics-Doklady, 11(6):1462–1465, 1970.

[48]   NEU. From RAM to SAT. Available at http://www.ccs.neu.edu/home/viola/, 2012.

[49]   Christos H. Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci., 43(3):425–440, 1991.

[50]   Wolfgang J. Paul, Nicholas Pippenger, Endre Szemerédi, and William T. Trotter. On determinism versus non-determinism and related problems (preliminary version). In IEEE Symp. on Foundations of Computer Science (FOCS), pages 429–438, 1983.

[51]   Nicholas Pippenger and Michael J. Fischer. Relations among complexity measures. J. of the ACM, 26(2):361–381, 1979.

[52]   Alexander Razborov. Lower bounds on the dimension of schemes of bounded depth in a complete basis containing the logical addition function. Akademiya Nauk SSSR. Matematicheskie Zametki, 41(4):598–607, 1987. English translation in Mathematical Notes of the Academy of Sci. of the USSR, 41(4):333-338, 1987.

[53]   Omer Reingold. Undirected connectivity in log-space. J. of the ACM, 55(4), 2008.

[54]   J. M. Robson. N by N checkers is exptime complete. SIAM J. Comput., 13(2):252–267, 1984.

[55]   Rahul Santhanam. On separators, segregators and time versus space. In IEEE Conf. on Computational Complexity (CCC), pages 286–294, 2001.

[56]   Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4(2):177–192, 1970.

[57]   Arnold Schönhage. Storage modification machines. SIAM J. Comput., 9(3):490–508, 1980.

[58]   Adi Shamir. IP = PSPACE. J. of the ACM, 39(4):869–877, October 1992.

[59]   Claude E. Shannon. The synthesis of two-terminal switching circuits. Bell System Tech. J., 28:59–98, 1949.

[60]   Victor Shoup. New algorithms for finding irreducible polynomials over finite fields. Mathematics of Computation, 54(189):435–447, 1990.

[61]   Alan Siegel. On universal classes of extremely random constant-time hash functions. SIAM J. on Computing, 33(3):505–543, 2004.

[62]   Michael Sipser. A complexity theoretic approach to randomness. In ACM Symp. on the Theory of Computing (STOC), pages 330–335, 1983.

[63]   Roman Smolensky. Algebraic methods in the theory of lower bounds for Boolean circuit complexity. In 19th ACM Symp. on the Theory of Computing (STOC), pages 77–82. ACM, 1987.

[64]   Larry Stockmeyer and Albert R. Meyer. Cosmological lower bound on the circuit complexity of a small problem in logic. J. ACM, 49(6):753–784, 2002.

[65]   Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. on Computing, 20(5):865–877, 1991.

[66]   Alan M. Turing. On computable numbers, with an application to the entscheidungsproblem. Proc. London Math. Soc., s2-42(1):230–265, 1937.

[67]   Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In 6th Symposium on Mathematical Foundations of Computer Science, volume 53 of Lecture Notes in Computer Science, pages 162–176. Springer, 1977.

[68]   Leslie G. Valiant and Vijay V. Vazirani. NP is as easy as detecting unique solutions. Theor. Comput. Sci., 47(3):85–93, 1986.

[69]   Dieter van Melkebeek. A survey of lower bounds for satisfiability and related problems. Foundations and Trends in Theoretical Computer Science, 2(3):197–303, 2006.

[70]   Dieter van Melkebeek and Ran Raz. A time lower bound for satisfiability. Theor. Comput. Sci., 348(2-3):311–320, 2005.

[71]   N. V. Vinodchandran. A note on the circuit complexity of PP. Theor. Comput. Sci., 347(1-2):415–418, 2005.

[72]   Emanuele Viola. On approximate majority and probabilistic time. Computational Complexity, 18(3):337–375, 2009.

[73]   Emanuele Viola. On the power of small-depth computation. Foundations and Trends in Theoretical Computer Science, 5(1):1–72, 2009.

[74]   Emanuele Viola. Reducing 3XOR to listing triangles, an exposition. Available at http://www.ccs.neu.edu/home/viola/, 2011.

[75]   Emanuele Viola. Lower bounds for data structures with space close to maximum imply circuit lower bounds. Theory of Computing, 15:1–9, 2019. Available at http://www.ccs.neu.edu/home/viola/.

[76]   Emanuele Viola. Pseudorandom bits and lower bounds for randomized turing machines. Theory of Computing, 18(10):1–12, 2022.

Mathematics of the impossible, Chapter 7, Space

As mentioned in Chapter 2, Time is only one of several resources we wish to study. Another important one is space, which is another name for the memory of the machine. Computing under space constraints may be less familiar to us than computing under time constraints, and many surprises lie ahead in this chapter that challenge our intuition of efficient computation.

If only space is under consideration, and one is OK with a constant-factor slackness, then TMs and RAMs are equivalent; much like $\text {P}$ is invariant under power changes in time. In a sense, changing the space by a constant factor is like changing the time by a power; from this point of view the equivalence is not surprising.

We shall consider both space bounds bigger than the input length and smaller. For the latter, we have to consider the input separately. The machine should be able to read the input, but not write on its cells. One way to formalize this is to consider 2TMs, where one tape holds the input and is read-only. The other is a standard read-write tape.

We also want to compute functions $f$ whose output is more than $1$ bit. One option is to equip the machine with yet another tape, which is write-only. We prefer to stick to two tapes and instead require that given $x,i$ the $i$ output bit of $f(x)$ is computable efficiently.

Definition 7.1. A function $f:X\subseteq \{0,1\}^* \to \{0,1\}^*$ is computable in $\text {Space}(s(n))$ if there is a 2TM which on input $(x,i)$ on the first tape, where $x\in X$ and $i\le |f(x)|$, outputs the $i$ bit of $f(x)$, and the machine never writes on the first tape and never uses more than $s(|x|)$ cells on the second.

We define:

\begin{aligned} \text {L}:= & \bigcup _{d}\text {Space}(d\log n),\\ \text {PSpace}:= & \bigcup _{d}\text {Space}(n^{d}). \end{aligned}

We investigate next some basic relationships between space and time.

Theorem 7.1. For every functions $t$ and $s$:

1. $\text {\ensuremath {k}TM-Time}(t)\subseteq \text {Space}(c_{k}t)$,
2. $\text {Time}(t)\subseteq \text {Space}(ct\log (nt))$,
3. $\text {Space}(s)\subseteq 2\text {TM-Time}(c^{s(n)}\cdot n^{c})$.

Proof. 1. A TM running in time $t$ can only move its heads at most $t$ tape cells. We can write all these contents in one tape. To simulate one step of the $k\text {TM}$ we do a pass on all the contents and update them accordingly.

2. A RAM running in time $t$ can only access $\le t$ memory cells, each containing at most $c\log nt$ bits; the factor $n$ is to take into account that the machine starts with word length $\ge \log n$. We simulate this machine and for each Write operation we add a record on the tape with the memory cell index and the content, similar to Theorem ??. When the RAM reads from memory, we scan the records and read off the memory values from one of them. If the record isn’t found, then the simulation returns the value corresponding to the initial configuration.

We also allocate $c\log nt$ bits for the registers of the RAM. It can be shown that the operations among them can be performed in the desired space, since they only involve a logarithmic number of bits. A stronger result is proved later in Theorem 7.5.

3. On an input $x$ with $|x|=n$, a $\text {Space}(s)$ machine can be in at most $n^{c}\cdot c^{s(n)}$ configurations. The first $n^{c}$ factor is for the head position on the input. The second factor is for the contents of the second tape. Since the machine ultimately stops, configurations cannot repeat, hence the same machine (with no modification) will stop in the desired time. QED

A non-trivial saving is given by the following theorem.

Theorem 7.2. [32] $k\text {TM-Time}(t)\subseteq \text {Space}(c_{k}t/\log t)$, for every function $t$.

This result is not specific to MTMs but can be proved for other models such as RAMs.

From Theorem 7.1 and the next exercise we have

\begin{aligned} \text {L}\subseteq \text {P}\subseteq \text {NP}\subseteq \text {PH}\subseteq \text {PSpace}\subseteq \text {Exp}. \end{aligned}

Exercise 7.1. Prove $\text {PH}\subseteq \text {PSpace}$.

Just like for Time, for space one has universal machines and a hierarchy theorem. The latter implies $\text {L}\ne \text {PSpace}$. Hence, analogously to the situation for Time and NTime (section º5.1), we know that at least one of the inclusions above between L and PSpace is strict. Most people seem to think that all are, but nobody can prove that any specific one is.

7.1 Branching programs

Branching programs are the non-uniform counterpart of $\text {Space}$, just like circuits are the non-uniform counterpart of $\text {Time}$.

Definition 7.2. A (branching) program is a directed acyclic graph. A node can be labeled with an input variable, in which case it has two outgoing edges labeled $0$ and $1$. Alternatively a node can be labeled with $0$ or $1$, in which case it has no outgoing edges. One special node is the start node.

The space of the program with $S$ nodes is $\log S$. A program computes a function $f:\{0,1\} ^{n}\to \{0,1\}$ by following the path from the starting node, following edge labels corresponding to the input, and outputting $b\in \{0,1\}$ as soon as it reaches a node labeled $b$.
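To make the definition concrete, here is a minimal Python sketch of a branching-program evaluator (the node encoding and all names are ours, not part of the definition). The evaluator keeps track only of the current node, matching the space bound $\log S$.

```python
# Minimal branching-program evaluator (illustrative sketch).
# A node is either ("var", i, node0, node1) -- branch on input bit x[i] --
# or ("out", b) -- a sink labeled with the constant bit b.

def eval_bp(nodes, start, x):
    """Follow the path from `start`, branching on input bits,
    until an output node is reached."""
    node = nodes[start]
    while node[0] == "var":
        _, i, n0, n1 = node
        node = nodes[n1] if x[i] else nodes[n0]
    return node[1]

# A small program computing x[0] XOR x[1].
nodes = {
    "s":    ("var", 0, "a", "b"),
    "a":    ("var", 1, "zero", "one"),   # x[0] = 0: output x[1]
    "b":    ("var", 1, "one", "zero"),   # x[0] = 1: output 1 - x[1]
    "zero": ("out", 0),
    "one":  ("out", 1),
}
assert [eval_bp(nodes, "s", x) for x in [(0,0),(0,1),(1,0),(1,1)]] == [0,1,1,0]
```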

Theorem 7.3. Suppose a 2TM $M$ computes $f:\{0,1\} ^{n}\to \{0,1\}$ in space $s$. Then $f$ has branching programs of space $c_{M}(s+\log n)$. In particular, any $f\in \text {L}$ has branching programs of size $n^{c_{f}}$.

Proof. Each node in the program corresponds to a configuration. QED

Definition 7.3. The branching program given by Theorem 7.3 is called the configuration graph of $M$.

The strongest available impossibility result for branching programs is the following.

Theorem 7.4. [46] There are explicit functions (in $\text {P}$) that require branching programs of size $cn^{2}/\log n$ on inputs of length $n$.

7.2 The power of L

Computing with severe space bounds, as in L, seems quite difficult. Also, it might be somewhat less familiar than, say, computing within a time bound. It turns out that L is a powerful class capable of amazing computational feats that challenge our intuition of efficient computation. Moreover, these computational feats hinge on deep mathematical techniques of wide applicability. We hinted at this in Chapter 1. We now give further examples. At the same time we develop our intuition of what is computable with little space.

To set the stage, we begin with a composition result. In the previous sections we used several times the simple result that the composition of two maps in P is also in P. This is useful as it allows us to break a complicated algorithm into small steps to be analyzed separately – which is a version of the divide et impera paradigm. A similar composition result holds and is useful for space, but the argument is somewhat less obvious.

Lemma 7.1. Let $f_{1}:\{0,1\}^* \to \{0,1\}^*$ be in $\text {Space}(s_{1})$ and satisfy $|f_{1}(x)|\le m(|x|)$ for a function $m$. Suppose $f_{2}:\{0,1\}^* \to \{0,1\}$ is in $\text {Space}(s_{2})$.

Then the composed function $g$ defined as $g(x)=f_{2}(f_{1}(x))$ is computable in space $c(s_{2}(m(n))+s_{1}(n)+\log nm(n))$.

In particular, if $f_{1}$ and $f_{2}$ are in L then $g$ is in L, as long as $m\le n^{d}$ for a constant $d$.

Exercise 7.2. Prove this.

7.2.1 Arithmetic

A first example of the power of L is given by its ability to perform basic arithmetic. Grade-school algorithms use a lot of space; for example, they employ space $\ge n$ to multiply two $n$-bit integers.

Theorem 7.5. The following problems are in L:

1. Addition of two input integers.
2. Iterated addition: Addition of any number of input integers.
3. Multiplication of two input integers.
4. Iterated multiplication: Multiplication of any number of input integers.
5. Division of two integers.

Iterated multiplication is of particular interest because it can be used to compute “pseudorandom functions.” Such objects shed light on our ability to prove impossibility results via the “Natural Proofs” connection which we will see in Chapter ??.

Proof of 1. in Theorem 7.5. We are given as input $x,y\in \mathbb {N}$ and an index $i$ and need to compute bit $i$ of $x+y$. Starting from the least significant bits, we add the bits of $x$ and $y$, storing the $1$-bit carry in memory. Output bits are discarded until we reach bit $i$, which is output. QED
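As a sanity check, the scan in the proof can be sketched in Python (the function name is ours); note that the only state carried across bit positions is the 1-bit carry.

```python
def add_bit(x_bits, y_bits, i):
    """Bit i of x+y, with x and y given as bit lists, least significant
    first. Mirrors the proof: scan from the low end, keep a 1-bit carry,
    discard output bits until bit i is produced."""
    carry = 0
    for j in range(i + 1):
        xj = x_bits[j] if j < len(x_bits) else 0
        yj = y_bits[j] if j < len(y_bits) else 0
        s = xj + yj + carry
        carry = s // 2
        if j == i:
            return s % 2

x_bits, y_bits = [1, 0, 1, 1], [1, 1, 0, 1]   # x = 13, y = 11
assert [add_bit(x_bits, y_bits, i) for i in range(5)] == [0, 0, 0, 1, 1]  # 24
```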

Exercise 7.3. Prove 2. and 3. in Theorem 7.5.

Proving 4. and 5. is more involved and requires some of those deep mathematical tools of wide applicability we alluded to before. Division can be computed once we can compute iterated multiplication. In a nutshell, the idea is to use the expansion

\begin{aligned} \frac {1}{x}=\sum _{i\ge 0}(-1)^{i}(x-1)^{i}. \end{aligned}

We omit details about bounding the error. Instead, we point out that this requires computing powers $(x-1)^{i}$ which is an example of iterated multiplication (and in fact is no easier).
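A quick numerical sketch of the expansion, for $x$ normalized so that $|x-1|<1$ (the normalization and error bounds are exactly the omitted details):

```python
# Truncated power series for 1/x; valid when |x - 1| < 1,
# with error shrinking geometrically in the number of terms.
def inv_approx(x, terms):
    return sum((-1)**i * (x - 1)**i for i in range(terms))

x = 1.3
assert abs(inv_approx(x, 50) - 1 / x) < 1e-9
```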

So for the rest of this section we focus on iterated multiplication. Our main tool for this is the Chinese-remaindering representation of integers, abbreviated CRR.

Definition 7.4. We denote by $\mathbb {Z}_{m}$ the integers modulo $m$ equipped with addition and multiplication (modulo $m$).

Theorem 7.6. Let $p_{1},...,p_{\ell }$ be distinct primes and $m:=\prod _{i}p_{i}$. Then $\mathbb {Z}_{m}$ is isomorphic to $\mathbb {Z}_{p_{1}}\times \ldots \times \mathbb {Z}_{p_{\ell }}$.

The forward direction of the isomorphism is given by the map

$x \in \mathbb {Z}_{m} \rightarrow (x \mod p_{1},x \mod p_{2},...,x \mod p_{\ell }).$

For the converse direction, there exist integers $e_{1},\ldots ,e_{\ell }\le m$, depending only on the $p_{i}$, such that the inverse is given by the map

$(x\mod p_{1},x\mod p_{2},...,x\mod p_{\ell }) \to x:=\sum _{i=1}^{\ell }e_{i}\cdot (x\mod p_{i}).$

Each integer $e_{i}$ is $0$ mod $p_{j}$ for $j\not =i$ and is $1$ mod $p_{i}$.

Example 7.1. $\mathbb {Z}_{6}$ is isomorphic to $\mathbb {Z}_{2}\times \mathbb {Z}_{3}$. The equation $2+3=5$ corresponds to $(0,2)+(1,0)=(1,2)$. The equation $2\cdot 3=6$ corresponds to $(0,2)\cdot (1,0)=(0,0)$. Note how addition and multiplication in CRR are performed in each coordinate separately; how convenient.

To compute iterated multiplication the idea is to move to CRR, perform the multiplications there, and then move back to standard representation. A critical point is that each coordinate in the CRR has a representation of only $c\log n$ bits, which makes it easy to perform iterated multiplication one multiplication at a time, since we can afford to write down intermediate products.

The algorithm is as follows:

Computing the product of input integers $x_{1},\ldots ,x_{t}$. [6]

1. Let $\ell :=n^{c}$ and compute the first $\ell$ prime numbers $p_{1},p_{2},\ldots ,p_{\ell }$.
2. Convert the input into CRR: Compute $(x_{1}\mod p_{1},\ldots ,x_{1}\mod p_{\ell }),\ldots ,(x_{t}\mod p_{1},\ldots ,x_{t}\mod p_{\ell })$.
3. Compute the multiplications in CRR: $(\Pi _{i=1}^{t}x_{i}\mod p_{1}),\ldots ,(\Pi _{i=1}^{t}x_{i}\mod p_{\ell })$.
4. Convert back to standard representation.
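The steps above can be sketched in Python (an illustrative sketch, not a space-bounded implementation: the function name is ours, the primes of step 1 are supplied rather than generated, and Python's three-argument `pow` computes the modular inverses behind the coefficients $e_{i}$):

```python
def crr_product(xs, primes):
    """Multiply the integers in xs by working in Chinese-remainder form
    (steps 2-4 of the algorithm). Requires prod(xs) < prod(primes)."""
    m = 1
    for p in primes:
        m *= p
    # Steps 2-3: convert each input to residues and multiply
    # coordinate-wise, reducing mod p after every multiplication.
    residues = []
    for p in primes:
        r = 1
        for x in xs:
            r = (r * (x % p)) % p
        residues.append(r)
    # Step 4: reconstruct with the CRT coefficients e_i of Theorem 7.6
    # (e_i = 1 mod p_i and e_i = 0 mod p_j for j != i).
    result = 0
    for p, r in zip(primes, residues):
        mi = m // p
        e = mi * pow(mi, -1, p)   # pow(mi, -1, p): inverse of mi mod p
        result = (result + e * r) % m
    return result

primes = [2, 3, 5, 7, 11, 13]    # product 30030 bounds the answer
assert crr_product([4, 9, 35], primes) == 4 * 9 * 35
```

The modulus $m$ must exceed the product being computed, which is why step 1 takes $\ell =n^{c}$ primes.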

Exercise 7.4. Prove the correctness of this algorithm.

Now we explain how steps 1, 2, and 3 can be implemented in L. Step 4 can be implemented in L too, but showing this is somewhat technical due to the computation of the numbers $e_{i}$ in Theorem 7.6. However these numbers only depend on the input length, and so we will be able to give a self-contained proof that iterated multiplication has branching programs of size $n^{c}$.

Step 1

By Theorem ??, the primes $p_{i}$ have magnitude $\le n^{c}$ and so can be represented with $c\log n$ bits. We can enumerate over integers with $\le c\log n$ bits in L. For each integer $x$ we can test if it’s prime by again enumerating over all integers $y$ and $z$ with $\le c\log n$ bits and checking if $x=yz$, say using the space-efficient algorithm for multiplication in Theorem 7.5. (The space required for this step would in fact be $c\log \log n$.)

Step 2

We explain how given $y\in \{0,1\} ^{n}$ we can compute $(y\mod p_{1},\ldots ,y\mod p_{\ell })$ in L. If $y_{j}$ is bit $j$ of $y$ we have that

\begin{aligned} y\mod p_{i}=\left (\sum _{j<n}y_{j}\cdot (2^{j}\mod p_{i})\right )\mod p_{i}. \end{aligned}

Each power $2^{j}\mod p_{i}$ can be computed in L by repeated doubling modulo $p_{i}$, and the outer sum is an iterated addition of numbers of $c\log n$ bits each, which is in L by Theorem 7.5.
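In code, the conversion is a streaming computation whose only state is the running residue (a sketch with our own names; here the bits are scanned most significant first, Horner-style):

```python
def mod_stream(y_bits, p):
    """y mod p from the bits of y (most significant first).
    The state is just the running residue: c*log(p) bits."""
    r = 0
    for b in y_bits:
        r = (2 * r + b) % p
    return r

y = 91
assert mod_stream([int(b) for b in bin(y)[2:]], 11) == y % 11
```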

Step 3

This is a smaller version of the original problem: for each $j\leq \ell$, we want to compute $(\Pi _{i=1}^{t}x_{i}\mod p_{j})$ from $x_{1}\mod p_{j},\ldots ,x_{t}\mod p_{j}$. However, as mentioned earlier, each $(x_{i}\mod p_{j})$ is at most $n^{c}$ in magnitude and thus has a representation of $c\log n$ bits. Hence we can just perform one multiplication at a time in L.

Step 4

By Theorem 7.6, to convert back to standard representation from CRR we have to compute the map

\begin{aligned} (y\mod p_{1},\ldots ,y\mod p_{\ell })\to \sum _{i=1}^{\ell }e_{i}\cdot (y\mod p_{i}). \end{aligned}

Assuming we can compute the $e_{i}$, this is just multiplication and iterated addition, which are in L by Theorem 7.5.

Putting the steps together

Combining the steps together we can compute iterated multiplication in L as long as we are given the integers $e_{i}$ in Theorem 7.6.

Theorem 7.7. Given integers $x_{1},x_{2},\ldots ,x_{t}$, and given the integers $e_{1},e_{2},\ldots ,e_{\ell }$ as in Theorem 7.6, where $\ell =n^{c}$, we can compute $\prod _{i}x_{i}$ in L.

In particular, because the $e_{i}$ only depend on the input length but not on the $x_{i}$, they can be hardwired in a branching program.

Corollary 7.1. Iterated multiplication has branching programs of size $n^{c}$.

Exercise 7.5. Show that given integers $x_{1},x_{2},\ldots ,x_{t}$ and $y_{1},y_{2},\ldots ,y_{t}$ one can decide if

\begin{aligned} \prod _{i=1}^{t}x_{i}=\prod _{i=1}^{t}y_{i} \end{aligned}

in L. You cannot use the fact that iterated multiplication is in L, a result which we stated but did not completely prove.

Exercise 7.6. Show that the iterated multiplication of $d\times d$ matrices over the integers has branching programs of size $n^{c_{d}}$.

7.2.2 Graphs

We now give another example of the power of L.

Definition 7.5. The undirected reachability problem: Given an undirected graph $G$ and two nodes $s$ and $t$ in $G$ determine if there is a path from $s$ to $t$.

Standard time-efficient algorithms to solve this problem mark nodes in the graph. In logarithmic space we can keep track of a constant number of nodes, but it is not clear how we can avoid repeating old steps forever.

Theorem 7.8. [53] Undirected reachability is in L.

The idea behind this result is that a random walk on the graph will visit every node, and can be computed using small space, since we just need to keep track of the current node. Then, one can derandomize the random walk and obtain a deterministic walk, again computable in L.
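Here is a sketch of the random-walk idea only, without the derandomization that is the hard part of the theorem; the walk stores nothing but the current node and a step counter (graph, names, and step budget are ours):

```python
import random

def walk_reaches(adj, s, t, steps):
    """Random-walk connectivity test with one-sided error: on the
    connected component of s, a long enough walk visits t with high
    probability. State: current node + counter, i.e. O(log n) bits."""
    v = s
    for _ in range(steps):
        if v == t:
            return True
        v = random.choice(adj[v])
    return v == t

# Path graph 0-1-2-3, plus an isolated vertex 4 (with a self-loop).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [4]}
random.seed(0)
assert walk_reaches(adj, 0, 3, 10000) is True    # same component
assert walk_reaches(adj, 0, 4, 10000) is False   # different component
```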

7.2.3 Linear algebra

Our final example comes from linear algebra. Familiar methods for solving a linear system

\begin{aligned} Ax=b \end{aligned}

require a lot of space. For example, using elimination we need to rewrite the matrix $A$. Similarly, we cannot easily compute determinants using small space. However, a different method exists.

Theorem 7.9. [20] Solving a linear system is computable in $\text {Space}(c\log ^{2}n)$.

7.3 Checkpoints

The checkpoint technique is a fundamental tool in the study of space-bounded computation. Let us illustrate it at a high level. Let us consider a graph $G$, and let us write $u\leadsto ^{t}v$ if there is a path of length $\le t$ from $u$ to $v$. The technique allows us to trade the length of the path with quantifiers. Specifically, for any parameter $b$, we can break down paths from $u$ to $v$ into $b$ smaller paths that go through $b-1$ checkpoints. The length of the smaller paths needs to be only $t/b$ (assuming that $b$ divides $t$). We can guess the checkpoints and verify each smaller path separately, at the price of introducing quantifiers but with the gain that the path length got reduced from $t$ to $t/b$. The checkpoint technique is thus an instantiation of the general paradigm of guessing computation and verifying it locally, introduced in Chapter 5. One difference is that now we are only going to guess parts of the computation.

The checkpoint technique   $u\leadsto ^{t}v\Leftrightarrow \exists p_{1},p_{2},\ldots ,p_{b-1}:\forall i\in \{0,1,\ldots ,b-1\}:p_{i}\leadsto ^{t/b}p_{i+1},$   where we denote $p_{0}:=u$ and $p_{b}:=v$.

An important aspect of this technique is that it can be applied recursively: We can apply it again to the problems $p_{i}\leadsto ^{t/b}p_{i+1}$. We need to introduce more quantifiers, but we can reduce the path length to $t/b^{2}$, and so on. We will see several instantiations of this technique, for various settings of parameters, ranging from $b=2$ to $b=n^{c}$.
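To make the recursion concrete, here is a toy Python sketch (an illustration, not part of the text): the loop over tuples of checkpoints plays the role of the $\exists$ quantifier, and the conjunction over the $b$ smaller paths plays the role of the $\forall$ quantifier. For simplicity the sketch assumes $t$ is a power of $b$.

```python
import itertools

def reach(adj, u, v, t, b):
    # u ~>^t v iff there are checkpoints p_1, ..., p_{b-1} such that
    # p_i ~>^{t/b} p_{i+1} for all i, where p_0 := u and p_b := v.
    if t <= 1:
        return u == v or v in adj[u]
    for ps in itertools.product(adj, repeat=b - 1):  # the "exists" quantifier
        path = [u, *ps, v]
        if all(reach(adj, path[i], path[i + 1], t // b, b)  # the "forall"
               for i in range(b)):
            return True
    return False
```

Of course, the point of the technique is not the efficiency of this simulation, but that each level trades a factor $b$ in path length for one pair of quantifiers.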

We now utilize the checkpoint technique to show a simulation of small-space computation by small-depth alternating circuits.

Theorem 7.10. A function computable by a branching program with $S$ nodes is also computable by an alternating circuit of depth $c\log _{b}S$ and size $S^{b\log _{b}S+c}$, for any $b\le S$.

To illustrate the parameters, suppose $S=n^{a}$, and let us pick $b:=n^{\epsilon }$ where $n\ge c_{a,\epsilon }$. Then we have circuits of depth $d:=ca/\epsilon$ and size $\le S^{n^{\epsilon }d+c}=2^{n^{c\epsilon }}$. In other words, we can have depth $d$ and size $2^{n^{c_{a}/d}}$, for every $d$. This in particular holds for every function in $\text {L}$. We will later give explicit functions (also in $\text {P}$) which cannot be computed by circuits of depth $d$ and size $2^{n^{c/d}}$, “just short” of ruling out $\text {L}$. This state of affairs is worth emphasis:

(1) Every $f$ in $\text {L}$ has alternating circuits of depth $d$ and size $2^{n^{c_{f}/d}}$. (2) We can prove that there are explicit functions (also in L) which cannot be computed by circuits of depth $d$ and size $2^{n^{c/d}}$. (3) Improving the constant in the double exponent for a function in P would yield $\text {L}\ne \text {P}$. In this sense, the result in (2) is the best possible short of proving a major separation.

Proof. We apply the checkpoint technique to the branching program, recursively, with parameter $b$. For simplicity we first assume that $S$ is a power of $b$. Each application of the technique reduces the path length by a factor $b$. Hence with $\log _{b}S$ applications we can reduce the path length to $1$.

In one application, we have an $\exists$ quantifier over $b-1$ nodes, corresponding to an Or gate with fan-in $S^{b-1}$, and then a $\forall$ quantifier over $b$ smaller paths, corresponding to an And gate with fan-in $b$. This gives a tree with $S^{b-1}\cdot b\le S^{b}$ leaves. Iterating, the number of leaves will be

\begin{aligned} (S^{b})^{\log _{b}S}. \end{aligned}

Each leaf can be connected to the input bit on which it depends. The size of the circuit is at most $c$ times the number of leaves.

If $S$ is not a power of $b$ we can view the branching program as having $S'\le bS$ nodes where $S'$ is a power of $b$. QED

The following is a uniform version of Theorem 7.10, and the proof is similar. It shows that we can speed up space-bounded computation with alternations.

Theorem 7.11. [17] Any $f\in L$ is also in $\Sigma _{c_{f}/\epsilon }\text {Time}(n^{\epsilon })$, for any $\epsilon >0$.

Proof. Let $G$ be the configuration graph of $f$. Note this graph has $n^{c_{f}}$ nodes. We need to decide if the start configuration reaches the accept configuration in this graph within $t:=n^{c_{f}}$ steps.

We apply to this graph the checkpoint technique recursively, with parameter $b:=n^{\epsilon /2}$. Each application of the technique reduces the path length by a factor $b$. Hence with $2c_{f}/\epsilon$ applications we can reduce the path length to

\begin{aligned} \frac {t}{b^{2c_{f}/\epsilon }}=\frac {n^{c_{f}}}{n^{c_{f}}}\le 1. \end{aligned}

Each quantifier ranges over $b\log n^{c_{f}}=c_{f}n^{\epsilon /2}\log n\le n^{\epsilon }$ bits for large enough $n$.

It remains to check a path of length $1$, i.e., an edge. The endpoints of this edge are two configurations $u$ and $v$ which depend on the quantified bits. The machine can compute the two endpoints in time $\log ^{c}m$ where $m$ is the total number of quantified bits, using rapid access. Once it has $u$ and $v$ it can check if $u$ leads to $v$ in one step by reading one bit from the input. Note $m\le n^{\epsilon }\cdot c_{f}/\epsilon$, so $\log ^{c}m\le c_{f}n^{\epsilon }$. QED

We note that by the generic simulation of alternating time by small-depth circuits in Lemma 6.2, the above theorem also gives a result similar to Theorem 7.10.

7.4 Reductions: L vs. P

Again, we can use reductions to relate the space complexity of problems. In particular we can identify the problems in P which have space-efficient algorithms iff every problem in P does.

Definition 7.6. A problem $f$ is P-complete if $f\in \text {P}$ and $f\in \text {L}\Rightarrow \text {L}=\text {P}$.

Definition 7.7. The circuit value problem: Given a circuit $C$ and an input $x$, compute $C(x)$.

Theorem 7.12. Circuit value is P-complete.

Proof. Circuit value is in P since we can evaluate one gate at a time. Now let $g\in \text {P}$. We can reduce computing $g$ on input $x$ to a circuit value instance, as in the simulation of TMs by circuits in Theorem ??. The important point is that this reduction is computable in L. Indeed, given an index to a gate in the circuit, we can compute the type of the gate and an index to its children via simple arithmetic, which is in L by Theorem 7.5, and some computation which only depends on the description of the P-time machine for $g$. QED

Definition 7.8. The monotone circuit value problem: Given a circuit $C$ with no negations and an input $x$, compute $C(x)$.

Exercise 7.7. Prove that monotone circuit value is P-complete.

Recall from section 7.2.3 that finding solutions to linear systems

\begin{aligned} Ax=b \end{aligned}

has space-efficient algorithms. Interestingly, if we generalize equalities to inequalities the problem becomes P-complete. This set of results illustrates a sense in which “linear algebra” is easier than “optimization.”

Definition 7.9. The linear inequalities problem: Given a $d\times d$ matrix $A$ of integers and a $d$-dimensional integer vector $b$, determine if the system $Ax\le b$ has a solution over the reals.

Theorem 7.13. Linear inequalities is P-complete.

Proof. The ellipsoid algorithm shows that Linear inequalities is in P, but we will not discuss this classic result.

Instead, we focus on showing how given a circuit $C$ and an input $x$ we can construct a set of inequalities that are satisfiable iff $C(x)=1$.

We shall have as many variables $v_{i}$ as gates in the circuit, counting input gates as well.

For an input gate $g_{i}=x_{i}$ add equation $v_{i}=x_{i}$.

For a Not gate $g_{i}=\text {Not}(g_{j})$ add equation $v_{i}=1-v_{j}$.

For an And gate $g_{i}=\text {And}(g_{j},g_{k})$ add equations $0\le v_{i}\le 1,v_{i}\le v_{j},v_{i}\le v_{k},v_{j}+v_{k}-1\le v_{i}$.

The case of Or is similar, or can be dispensed with by writing an Or using Not and And.

Finally, if $g_{i}$ is the output gate add equation $v_{i}=1$.

We claim that in every solution to $Av\le b$ the value of $v_{i}$ is the value in $\{0,1\}$ of gate $g_{i}$ on input $x$. This can be proved by induction. For input and Not gates this is immediate. For an And gate, note that if $v_{j}=0$ then $v_{i}=0$ as well because of the equations $v_{i}\ge 0$ and $v_{i}\le v_{j}$. The same holds if $v_{k}=0$. If both $v_{j}$ and $v_{k}$ are 1 then $v_{i}$ is $1$ as well because of the equations $v_{i}\le 1$ and $v_{j}+v_{k}-1\le v_{i}$. QED
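As a quick sanity check, the gadget can be exercised in a few lines of Python (a toy example, not from the text): we write the constraints for the circuit $C(x)=\text {And}(x_{0},\text {Not}(x_{1}))$ on input $x=(1,0)$ and enumerate Boolean assignments. The theorem of course concerns solutions over the reals; the enumeration only illustrates how the constraints force the gate values.

```python
from itertools import product

# Toy circuit C(x) = And(x0, Not(x1)) on input x = (1, 0).
# One variable per gate: v0 = x0, v1 = x1, v2 = Not(v1), v3 = And(v0, v2).
x = (1, 0)
constraints = [
    lambda v: v[0] == x[0],             # input gate v0 = x0
    lambda v: v[1] == x[1],             # input gate v1 = x1
    lambda v: v[2] == 1 - v[1],         # Not gate
    lambda v: 0 <= v[3] <= 1,           # And gate: 0 <= v3 <= 1
    lambda v: v[3] <= v[0],             #           v3 <= v0
    lambda v: v[3] <= v[2],             #           v3 <= v2
    lambda v: v[0] + v[2] - 1 <= v[3],  #           v0 + v2 - 1 <= v3
    lambda v: v[3] == 1,                # output gate forced to 1
]

# Among 0/1 assignments, the unique solution is the vector of gate values.
solutions = [v for v in product((0, 1), repeat=4)
             if all(c(v) for c in constraints)]
print(solutions)  # [(1, 0, 1, 1)]
```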

7.4.1 Nondeterministic space

Because of the insight we gained from considering non-deterministic time-bounded computation in section 5.1, we are naturally interested in non-deterministic space-bounded computation. In fact, perhaps we will gain even more insight, because this notion will really challenge our understanding of computation.

For starters, let us define non-deterministic space-bounded computation. A naive approach is to define it using the quantifiers from section 6.4, leading to the class $\exists \cdot \text {L}$. This is an ill-fated choice:

Exercise 7.8. Prove $\exists \cdot \text {L}=\exists \cdot \text {P}$.

Instead, non-deterministic space is defined in terms of non-deterministic TMs.

Definition 7.10. A function $f:\{0,1\}^* \to \{0,1\}$ is computable in $\text {NSpace}(s(n))$ if there is a two-tape TM which on input $x$ never writes on the first tape and never uses more than $s(n)$ cells on the second, and moreover:

1. The machine is equipped with a special “Guess” state. Upon entering this state, a guess bit is written on the work tape under the head.

2. $f(x)=1$ iff there exists a choice for the guess bits that causes the machine to output $1$.

We define

\begin{aligned} \text {NL}:=\bigcup _{d}\text {NSpace}(d\log n). \end{aligned}

How can we exploit this non-determinism? Recall from section 7.2.2 that reachability in undirected graphs is in L. It is unknown if the same holds for directed graphs. However, we can solve it in NL.

Definition 7.11. The directed reachability problem: Given a directed graph $G$ and two nodes $s$ and $t$, decide if there is a path from $s$ to $t$.

Theorem 7.14. Directed reachability is in NL.

Proof. The proof simply amounts to guessing a path in the graph. The algorithm is as follows:

“On input $G,s,t$, let $v:=s$.

For $i=0$ to $|G|$:

If $v=t$, accept.

Guess a neighbor $w$ of $v$. Let $v:=w$.

If you haven’t accepted, reject.”

The space needed is $|v|+|i|=c\log |G|$. QED
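The algorithm can be rendered in Python by making the guesses explicit (a toy sketch, not from the text): the machine accepts iff some sequence of guesses returns True.

```python
def accepts(adj, s, t, guesses):
    # The algorithm above, with the nondeterministic choices supplied
    # explicitly: guesses[i] selects which neighbor to move to at step i.
    v = s
    for i in range(len(adj) + 1):            # "For i = 0 to |G|"
        if v == t:
            return True                      # accept
        nbrs = sorted(adj[v])
        if not nbrs:
            return False                     # dead end: this guess sequence fails
        v = nbrs[guesses[i] % len(nbrs)]     # "guess a neighbor w of v; v := w"
    return False                             # haven't accepted: reject
```

Note the simulation stores only the current node and the step counter, mirroring the $|v|+|i|=c\log |G|$ space bound.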

We can define NL completeness similarly to NP and P completeness, and have the following result.

Theorem 7.15. Directed reachability is NL-complete. That is, it is in NL and it is in L iff $\text {L}=\text {NL}$.

Exercise 7.9. Prove this.

7.4.2 Nspace vs. time

Recall by definition $\text {Space}(s(n))\subseteq \text {NSpace}(s(n))$. We showed $\text {Space}(s(n))\subseteq \text {Time}(n^{c}c^{s(n)})$ in Theorem 7.1. We can strengthen the inclusion to show that it holds even for non-deterministic space.

Theorem 7.16. $\text {NSpace}(s(n))\subseteq \text {Time}(n^{c}c^{s(n)})$.

Proof. On input $x$, we compute the configuration graph $G$ of $M$ on input $x$. The nodes are the configurations, and there is an edge from $u$ to $v$ if the machine can go from $u$ to $v$ in one step. Then we solve reachability on this graph in power time, using say breadth-first search. QED

The next theorem shows that non-deterministic space is not much more powerful than deterministic space: it buys at most a square. Contrast this with the P vs. NP question! The best deterministic simulation of NP that we know is the trivial $\text {NP}\subseteq \text {Exp}$. Thus the situation for space is entirely different.

How can this be possible? The high-level idea, which was used already in Lemma 7.1, can be cast as follows:

Unlike time, space can be reused.

Theorem 7.17. [23] $\text {NSpace}(s(n))\subseteq \text {Space}(cs^{2}(n))$, for every $s(n)\ge \log n$. In particular, $\text {NPSpace}=\text {PSpace}$.

Proof. We use the checkpoint technique with parameter $b=2$, and re-use the space to verify the smaller paths. Let $N$ be a non-deterministic TM computing a function in $\text {NSpace}(s(n))$. We aim to construct a deterministic TM $M$ that on input $x$ returns

\begin{aligned} \text {Reach}(C_{\text {start}},C_{\text {accept}},c^{s(n)}), \end{aligned}

where $\text {Reach}(u,v,t)$ decides if $v$ is reachable from $u$ in $\le t$ steps in the configuration graph of $N$ on input $x$, and $C_{\text {start}}$ is the start configuration, $C_{\text {accept}}$ is the accept configuration, and $c^{s(n)}$ is the number of configurations of $N$.

The key point is how to implement Reach.

Computing $\text {Reach}(u,v,t)$:

For all “middle” configurations $m$:

If both $\text {Reach}(u,m,t/2)=1$ and $\text {Reach}(m,v,t/2)=1$ then Accept.

Reject.

Let $S(t)$ denote the space needed for computing $\text {Reach}(u,v,t)$. We have

\begin{aligned} S(t)\le cs(n)+S(t/2). \end{aligned}

This is because we can re-use the space for two calls to Reach. Therefore, the space for $\text {Reach}(C_{\text {start}},C_{\text {accept}},c^{s(n)})$ is

\begin{aligned} \le cs(n)+cs(n)+\ldots +cs(n)\le cs^{2}(n). \end{aligned}

QED
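In Python, the recursion can be sketched as follows (a toy illustration, not from the text). Each recursive call reuses the space of the previous one; the recursion depth is logarithmic in $t$, and each level stores only $(u,v,t,m)$, giving the $cs^{2}(n)$ space bound.

```python
def reach(adj, u, v, t):
    # Reach(u, v, t): can u reach v in <= t steps?
    if t <= 1:
        return u == v or v in adj[u]
    half = (t + 1) // 2          # "t/2", rounded up
    for m in adj:                # try every "middle" configuration m
        if reach(adj, u, m, half) and reach(adj, m, v, half):
            return True
    return False
```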

Recall that we do not know if $\text {NTime}(t)$ is closed under complement. It is generally believed not to be the case, and we showed that if it is then the PH collapses (Exercise 6.3).

What about space? Is $\text {NSpace}(s)$ closed under complement? Theorem 7.17 shows $\text {NSpace}(s(n))\subseteq \text {Space}(cs^{2}(n))$. So if $f\in \text {NSpace}(s(n))$ then the complement function $1-f$ is in $\text {Space}(cs^{2}(n))$ (and in particular in $\text {NSpace}(cs^{2}(n))$). Hence, up to a quadratic loss in space, non-deterministic space is closed under complement.

Can we avoid squaring the space?

Yes! This is weird!

Theorem 7.18. The complement of Path is in NL.

Proof. We want a non-deterministic 2TM that given $G$, $s$, and $t$ accepts if there is no path from $s$ to $t$ in $G$.

For starters, suppose the machine has computed the number $C$ of nodes reachable from $s$. The key idea is that there is no path from $s$ to $t$ iff there are $C$ nodes different from $t$ reachable from $s$. Thus, knowing $C$ we can solve the problem as follows

Algorithm for deciding if there is no path from $s$ to $t$, given $C$:

Initialize $\text {Count}:=0$. Enumerate over all nodes $v\ne t$:

Guess a path from $s$ of length $|G|$. If the path reaches $v$, increase $\text {Count}$ by $1$.

If $\text {Count}=C$ Accept, else Reject.

There remains to compute $C$.

Let $A_{i}$ be the nodes at distance $\le i$ from $s$, and let $C_{i}:=|A_{i}|$. Note $A_{0}=\{s\}$ and $C_{0}=1$. We seek to compute $C=C_{n}$, where $n$ is the number of nodes.

To compute $C_{i+1}$ from $C_{i}$, enumerate nodes $v$ (candidates for membership in $A_{i+1}$). For each $v$, enumerate over all nodes $w$ in $A_{i}$, and check if $w=v$ or $w\to v$ is an edge. If so, increase $C_{i+1}$ by $1$.

The enumeration over $A_{i}$ is done by guessing $C_{i}$ nodes together with paths from $s$ certifying their membership. If we don't find $C_{i}$ nodes, we reject. QED
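The layering can be illustrated by the following deterministic Python sketch (not from the text). It stores $A_{i}$ explicitly, and hence uses far more than logarithmic space; in the NL algorithm only the count $C_{i}$ is kept, and membership in $A_{i}$ is certified by guessing paths from $s$.

```python
def count_reachable(adj, s):
    # Inductive counting: A_i = nodes at distance <= i from s, C_i = |A_i|.
    A = {s}                                 # A_0 = {s}, so C_0 = 1
    for _ in range(len(adj)):               # compute A_{i+1} from A_i
        A = A | {v for w in A for v in adj[w]}
    return len(A)                           # C = C_n
```

Given $C=\texttt {count\_reachable}(G,s)$, there is no path from $s$ to $t$ iff $C$ nodes different from $t$ are reachable from $s$, exactly as in the proof.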

7.5 TiSp

So far in this chapter we have focused on bounding the space usage. For this, the TM model was sufficient, as remarked at the beginning. It is natural to consider algorithms that operate in little time and space. For this, of course, whether we use TMs or RAMs makes a difference.

Definition 7.12. Let $\text {TiSp}(t,s)$ be the functions computable on a RAM that on every input $x\in \{0,1\} ^{n}$ runs in time $t(n)$ and only uses memory cells $0$ to $s(n)-1$.

Exercise 7.10. Prove $\text {L}=\bigcup _{d}\text {TiSp}(n^{d},d)$.

An alternative definition of TiSp would allow the RAM to access $s(n)$ cells anywhere in memory. One can maintain a data structure to show that this alternative definition is equivalent to Definition 7.12.

To illustrate the relationship between TiSp, Time, and Space, consider undirected reachability. It is solvable in Time$(n\log ^{c}n)$ by breadth-first search, and in logarithmic space by Theorem 7.8. But it isn't known if it is in $\text {TiSp}(n\log ^{a}n,a\log n)$ for some constant $a$.

Exercise 7.11. Prove that Theorem 7.11 holds even for any function $f$ in $\text {TiSp}(n^{a},n^{1-\epsilon })$ for any $a$ and $\epsilon >0$.

The following is a non-uniform version of TiSp.

Definition 7.13. A branching program of length $t$ and width $W$ is a branching program where the nodes are partitioned in $t$ layers $L_{1},L_{2},\ldots ,L_{t}$ where nodes in $L_{i}$ only lead to nodes in $L_{i+1}$, and $|L_{i}|\le W$ for every $i$.

Thus $t$ represents the time of the computation, and $\log W$ the space.

Recall that Theorem 7.10 gives bounds of the form $\ge cn^{2}/\log n$ on the size of branching programs (without distinguishing between length and width). For branching programs of length $t$ and width $W$ this bound gives $t\ge cn^{2}/W\log n$. But it gives nothing for power width like $W=n^{2}$. The state-of-the-art for power width is $t\ge \Omega (n\sqrt {\log n/\log \log n})$ [27] (in fact the bound holds even for subexponential width).

With these definitions in hand we can refine the connection between branching programs and small-depth circuits in Theorem 7.10 for circuits of depth 3.

Theorem 7.19. Let $f:\{0,1\} ^{n}\to \{0,1\}$ be computable by a branching program with width $W$ and time $t$. Then $f$ is computable by an alternating depth-3 circuit with $\le 2^{c\sqrt {t\log W}}$ wires.

We will later show explicit functions that require depth-3 circuits of size $2^{c\sqrt {n}}$. Theorem 7.19 shows that improving this would also improve results for small-width branching programs, a refinement of the message emphasized after Theorem 7.10.

Exercise 7.12. Prove this.

7.6 Notes

For a discussion of the complexity of division, see [3]. For a compendium of P-complete problems see [11].

References

[1]   Miklós Ajtai. $\Sigma ^{1}_{1}$-formulae on finite structures. Annals of Pure and Applied Logic, 24(1):1–48, 1983.

[2]   Miklós Ajtai. A non-linear time lower bound for boolean branching programs. Theory of Computing, 1(1):149–176, 2005.

[3]   Eric Allender. The division breakthroughs. Bulletin of the EATCS, 74:61–77, 2001.

[4]   Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost $k$-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992.

[5]   Kenneth E. Batcher. Sorting networks and their applications. In AFIPS Spring Joint Computing Conference, volume 32, pages 307–314, 1968.

[6]   Paul Beame, Stephen A. Cook, and H. James Hoover. Log depth circuits for division and related problems. SIAM J. Comput., 15(4):994–1003, 1986.

[7]   Paul Beame, Michael Saks, Xiaodong Sun, and Erik Vee. Time-space trade-off lower bounds for randomized computation of decision problems. J. of the ACM, 50(2):154–195, 2003.

[8]   Stephen A. Cook. A hierarchy for nondeterministic time complexity. J. of Computer and System Sciences, 7(4):343–353, 1973.

[9]   L. Csanky. Fast parallel matrix inversion algorithms. SIAM J. Comput., 5(4):618–623, 1976.

[10]   Aviezri S. Fraenkel and David Lichtenstein. Computing a perfect strategy for n x n chess requires time exponential in n. J. Comb. Theory, Ser. A, 31(2):199–214, 1981.

[11]   Raymond Greenlaw, H. James Hoover, and Walter Ruzzo. Limits to Parallel Computation: P-Completeness Theory. 2001.

[12]   John E. Hopcroft, Wolfgang J. Paul, and Leslie G. Valiant. On time versus space. J. ACM, 24(2):332–337, 1977.

[13]   Richard M. Karp and Richard J. Lipton. Turing machines that take advice. L'Enseignement Mathématique. Revue Internationale. IIe Série, 28(3-4):191–209, 1982.

[14]   Leonid A. Levin. Universal sequential search problems. Problemy Peredachi Informatsii, 9(3):115–116, 1973.

[15]   Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993.

[16]   E. I. Nechiporuk. A boolean function. Soviet Mathematics-Doklady, 169(4):765–766, 1966.

[17]   Valery A. Nepomnjaščiĭ. Rudimentary predicates and Turing calculations. Soviet Mathematics-Doklady, 11(6):1462–1465, 1970.

[18]   NEU. From RAM to SAT. Available at http://www.ccs.neu.edu/home/viola/, 2012.

[19]   Wolfgang J. Paul, Nicholas Pippenger, Endre Szemerédi, and William T. Trotter. On determinism versus non-determinism and related problems (preliminary version). In IEEE Symp. on Foundations of Computer Science (FOCS), pages 429–438, 1983.

[20]   Omer Reingold. Undirected connectivity in log-space. J. of the ACM, 55(4), 2008.

[21]   J. M. Robson. N by N checkers is exptime complete. SIAM J. Comput., 13(2):252–267, 1984.

[22]   Rahul Santhanam. On separators, segregators and time versus space. In IEEE Conf. on Computational Complexity (CCC), pages 286–294, 2001.

[23]   Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4(2):177–192, 1970.

[24]   Michael Sipser. A complexity theoretic approach to randomness. In ACM Symp. on the Theory of Computing (STOC), pages 330–335, 1983.

[25]   Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. on Computing, 20(5):865–877, 1991.

[26]   Leslie G. Valiant and Vijay V. Vazirani. NP is as easy as detecting unique solutions. Theor. Comput. Sci., 47(3):85–93, 1986.

[27]   N. V. Vinodchandran. A note on the circuit complexity of PP. Theor. Comput. Sci., 347(1-2):415–418, 2005.

[28]   Emanuele Viola. On approximate majority and probabilistic time. Computational Complexity, 18(3):337–375, 2009.

Mathematics of the impossible: Computational Complexity, Chapter 6, Alternation

We placed one quantifier “in front” of computation and got something interesting: NP. So let’s push the envelope.

Definition 6.1. $\Sigma _{i}$Time$(t(n))$ is the set of functions $f:X\subseteq \{0,1\}^* \to \{0,1\}$ for which there is a RAM $M$ such that on input $(x,y_{1},y_{2},\ldots ,y_{i})$ stops within $t(|x|)$ steps and

\begin{aligned} f(x)=1\Leftrightarrow \exists y_{1}\forall y_{2}\exists y_{3}\forall y_{4}\ldots Q_{i}y_{i}\in \{0,1\} ^{t(|x|)}:M(x,y_{1},y_{2},\ldots ,y_{i})=1. \end{aligned}

$\Pi _{i}$Time$(t(n))$ is defined similarly except that we start with a $\forall$ quantifier. We also define

\begin{aligned} \text {\ensuremath {\Sigma _{i}\text {P}}}:= & \bigcup _{d}\Sigma _{i}\text {Time}(n^{d}),\\ \text {\ensuremath {\Pi _{i}\text {P}}}:= & \bigcup _{d}\Pi _{i}\text {Time}(n^{d}),\text { and}\\ \text {the power hierarchy PH}:= & \bigcup _{i}\Sigma _{i}\text {P}=\bigcup _{i}\Pi _{i}\text {P}. \end{aligned}

We refer to such computation and the corresponding machines as alternating, since they involve alternation of quantifiers and we will soon see a connection with alternating circuits.

As for NP, Definition 5.1, note that the running time of $M$ is a function of $|x|$ only. Again, this difference is inconsequential for $\Sigma _{i}\text {P}$, since the composition of two powers is another power. But it is important for a more fine-grained analysis.

Exercise 6.1. Min-Ckt is the problem of deciding if an input circuit has an equivalent circuit which is smaller. It is not known to be in NP. In which of the above classes can you place it?

6.1 Does the PH collapse?

We refer to the event that $\exists i:\Sigma _{i}\text {P}=\text {PH}$ as “the PH collapses.” It is unknown if the PH collapses. Most people appear to believe that it does not, and to consider statements of the type

\begin{aligned} X\Rightarrow \text {PH collapses} \end{aligned}

as evidence that $X$ is false. Examples of such statements are discussed next.

Theorem 6.1. $\text {P}=\text {NP}\Rightarrow \text {P}=\text {PH}$.

The idea in the proof is simply that if you can remove a quantifier then you can remove more.

Proof. We prove by induction on $i$ that $\Sigma _{i}\text {P}\cup \Pi _{i}\text {P}=\text {P}$.

The base case $i=1$ follows by assumption and the fact that P is closed under complement.

Next we do the induction step. We assume the conclusion is true for $i$ and prove it for $i+1$. We will show $\Sigma _{i+1}\text {P}=\text {P}$. The result about $\Pi _{i+1}$P follows again by complementing.

Let $L\in \Sigma _{i+1}\text {P}$, so $\exists a$ and a power-time TM $M$ such that for any $x\in \{0,1\} ^{n}$,

\begin{aligned} x\in L\Leftrightarrow \exists y_{1}\forall y_{2}\exists y_{3}\forall y_{4}\ldots Q_{i+1}y_{i+1}\in \{0,1\} ^{n^{a}}:M(x,y_{1},y_{2},\ldots ,y_{i+1})=1. \end{aligned}

(As discussed after Definition 6.1 we don’t need to distinguish between time as a function of $|x|$ or of $|(x,y_{1},y_{2},\text {\dots },y_{i+1})|$ when considering power times as we are doing now.)

Now the creative step of the proof is to consider

\begin{aligned} L':=\left \{ (x,y_{1}):\forall y_{2}\in \{0,1\} ^{n^{a}}\ldots Q_{i+1}y_{i+1}\in \{0,1\} ^{n^{a}}:M(x,y_{1},y_{2},\ldots ,y_{i+1})=1\right \} . \end{aligned}

Note $L'\in \Pi _{i}\text {P}$. By induction hypothesis $L'\in \text {P}$. So let TM $M'$ solve $L'$ in power time. So $x\in L\Leftrightarrow \exists y_{1}\in \{0,1\} ^{n^{a}}:M'(x,y_{1})=1$. And so $L\in \text {NP}=\text {P}$, again using the hypothesis. QED

Exercise 6.2. Prove the following strengthening of Theorem 6.1:

\begin{aligned} \bigcup _{d}\text {NTime}(dn)\subseteq \text {Time}(n^{1+\epsilon })\Rightarrow \bigcup _{d}\text {\ensuremath {\Sigma _{i}\text {Time}(dn)\subseteq \text {Time}(n^{1+\epsilon ^{c_{i}}})}.} \end{aligned}

Exercise 6.3. Show that if $\Sigma _{i}\text {P}=\Pi _{i}\text {P}$ for some $i$ then the PH collapses to $\Sigma _{i}\text {P}$, that is, $\text {PH}=\Sigma _{i}\text {P}$.

Theorem 6.2. [21] $\text {NP}\subseteq \text {PCkt}\Rightarrow \text {PH}=\Sigma _{2}\text {P}$.

Proof. We’ll show $\Pi _{2}\text {P}\subseteq \Sigma _{2}\text {P}$ and then appeal to Exercise 6.3. Let $f\in \Pi _{2}\text {Time}(n^{d})$ and $M$ be a corresponding machine s.t.

\begin{aligned} f(x)=1\Leftrightarrow \forall y_{1}\in \{0,1\} ^{n^{d}}\exists y_{2}\in \{0,1\} ^{n^{d}}:M(x,y_{1},y_{2})=1. \end{aligned}

We claim the following equivalent expression for the right-hand side above:

\begin{aligned} \forall y_{1}\in \{0,1\} ^{n^{d}}\exists y_{2}\in \{0,1\} ^{n^{d}}:M(x,y_{1},y_{2})=1\Leftrightarrow \exists C\forall y_{1}\in \{0,1\} ^{n^{d}}:M(x,y_{1},C(x,y_{1}))=1, \end{aligned}

where $C$ ranges over circuits of size $|x|^{d'}$ for some $d'$. If the equivalence is established the result follows, since evaluating a circuit can be done in power time.

To prove the equivalence, first note that the $\Leftarrow$ direction is obvious, by setting $y_{2}:=C(x,y_{1})$. The interesting direction is the $\Rightarrow$. We claim that under the assumption, there is a circuit that given $x,y_{1}$ outputs a string $y_{2}$ that makes $M(x,y_{1},y_{2})$ accept, if there is such a string.

To verify this, consider the problems CktSat and Search-CktSat which are analogous to the 3Sat and Search-3Sat problems but for general circuits rather than 3CNF. CktSat is in NP, and so by assumption has power-size circuits. By the reduction in Theorem 4.5, Search-CktSat has power-size circuits $S$ as well. Hence, the desired circuit $C$ may, on input $x$ and $y_{1}$ produce a new circuit $W$ mapping an input $y_{2}$ to $M(x,y_{1},y_{2})$, and run $S$ on $W$. QED

Exercise 6.4. Prove that $\text {PH}\not \subseteq \text {CktGates}(n^{k})$, for any $k\in \mathbb {N}$. (Hint: Existentially guess the truth table of a hard function.)

Improve this to $\text {\ensuremath {\Sigma _{2}\text {P}}}\not \subseteq \text {CktGates}(n^{k})$.

Exercise 6.5. Prove $\text {Exp}\subseteq \text {PCkt}\Rightarrow \text {Exp}=\Sigma _{2}\text {P}$.

6.2 PH vs. alternating circuits

As suggested by the word alternation in Definition 6.1, the power hierarchy PH and its subclasses can be seen as alternating circuits. Before presenting this connection it is useful to write problems in PH in a specific format. The next lemma shows that we can restrict the machine $M$ in Definition 6.1 to read only one bit of the input $x$. The price for this is that we are going to have to introduce an extra quantifier, however this new quantifier will only range over $\log t$ bits.

Lemma 6.1. Let $L\in \Sigma _{i}\text {P}$. Then there exists a RAM $M$ s.t.:

\begin{aligned} x\in L\Leftrightarrow & \exists y_{1}\in \{0,1\} ^{t(|x|)}\forall y_{2}\in \{0,1\} ^{t(|x|)}\ldots Q_{i-1}y_{i-1}\in \{0,1\} ^{t(|x|)}\\ & Q_{i}(y_{i},z)\in \{0,1\} ^{2t(|x|)}Q_{i+1}y_{i+1}\in \{0,1\} ^{\log t(|x|)}:M(x,y_{1},y_{2},\ldots ,y_{i+1})=1, \end{aligned}

and $M$ on input $(x,y_{1},y_{2},\ldots ,y_{i+1})$ stops within $ct(|x|)$ steps and only reads one bit of $x$.

Note the first $i-1$ quantifiers are over $t$ bits and unchanged from Definition 6.1, the next one is over $2t$ bits, written as a pair $(y_{i},z)$, and the last is over $\log t$. The idea is… you guessed it! We are going to guess the bits read from the input, and verify that each of them is correct.

Proof. Let $M'$ be the machine corresponding to $L$ in Definition 6.1. We assume that $Q_{i}=\exists$, i.e., $i$ is odd. The case $Q_{i}=\forall$ is left as an exercise.

We enlarge $Q_{i}$ to quantify over additional $t$ bits $z$ which correspond to the bits of $x$ read by $M'$. The desired machine $M$ simulates $M'$ with the following change. At any time step $s$, if $M'$ reads bit $j$ of $x$:

(1) If $s\ne y_{i+1}$ then $M$ does not read $x$ and instead uses bit $z_{s}$,

(2) If $s=y_{i+1}$ then $M$ reads the corresponding bit of $x$, and it also checks that this bit equals $z_{s}$. If it doesn’t $M$ rejects.

By inspection, $M$ reads at most one bit of $x$.

Correctness is argued as follows. For any $y_{1},y_{2},\ldots ,y_{i}$, if $M'$ accepts $x$ then there are values $z$ for the bits read from the input $x$ that cause $M$ to accept. Conversely, if $M$ accepts for some $z$ then this $z$ matches the bits read from $x$ by $M'$, for else $M$ would reject in (2). Hence $M'$ accepts $x$ as well. QED

We now show how to simulate computation in PH by small-depth circuits. The size of the circuit is exponential in the time; such a simulation would not be interesting if the circuit is not explicit and the time is more than the input length, since every function on $n$ bits has circuits of size $2^{n}$ (Theorem 2.3). By contrast, the simulation will give explicit circuits and also apply to running times less than the input length. The setting of running time less than the input length makes the simulation interesting even for non-explicit circuits, and is soon to play a critical role.

Lemma 6.2. Any function in $\Sigma _{d}\text {Time}(t)\bigcup \Pi _{d}\text {Time}(t)$ has on inputs of length $n$ alternating circuits of depth $d+1$ and $2^{cdt(n)}$ gates. The fan-in of each gate is $\le 2^{ct(n)}$ and the fan-in of the gates closest to the input is $\le t(n)$. Moreover, the circuit is explicit in the following sense: Given an index to a gate $g$ of fan-in $h$ and a number $i\le h$ we can compute the index of child $i$ of $g$ in $\text {Time}(cn)$.

In fact, most of this circuit only depends on $t$. The only thing that depends on the actual function being computed are the connections between the gates closest to the input and the input.

Proof. We apply Lemma 6.1. Then an $\exists$ (resp. $\forall$) quantifier on $b$ bits corresponds to an Or (resp. And) gate with fan-in $2^{b}$. A gate $g$ closest to the input corresponds to the computation of the RAM for fixed quantified variables. This computation depends on at most one bit of $x$. If this computation is a constant independent of $x$, we simply replace this gate with the appropriate constant. Otherwise, if it depends on a bit $x_{j}$ of $x$ then the computation of the RAM is either $x_{j}$ or $\neg x_{j}$. Thus we can connect gate $g$ to either input gate $x_{j}$ or input gate $\neg x_{j}$.

An index to a gate $g$ next to the input is just an assignment to the quantified variables $y_{i}$. Given such an index and $i\le t$ we can compute in linear time which input bit it depends on. This is done by simulating the machine until the $i$-th time it reads an input bit. Note this simulation runs in time $ct$ which is linear in the length of an index to the gate. QED

6.3 BPP in PH

It is not known if BPP is contained in NP. However, we can show that BPP is in PH. More precisely, the following two simulations are known. The first optimizes the number of quantifiers, the second the time. This should be contrasted with various conditional results suggesting that in fact a quasilinear deterministic simulation (with no quantifiers) is possible.

Theorem 6.3. For every function $t$ we have:

(1) [35] $\text {BPTime}(t)\subseteq \Sigma _{2}\text {Time}(t^{2}\log ^{c}t)$, and

(2) [41] $\text {BPTime}(t)\subseteq \Sigma _{3}\text {Time}(t\log ^{c}t)$.

A good way to think of these results is as follows. Fix a $\text {BPTime}(t)$ machine $M$ and an input $x\in \{0,1\} ^{n}$. Now the alternating machine is trying to decide if for most choices of the random bits $y$ we have $M(x,y)=1$, or if for most choices we have $M(x,y)=0$. This can be seen as a version of the Majority problem, with two critical features. The first is that it is succinct in the sense of section 5.4.1, that is, the alternating machine does not have access to the exponentially-long majority instance, but rather has access to a small circuit $M(x,\cdot )$ s.t. $M(x,y)$ is bit $y$ of the majority instance. The second is that instances have a gap. We define this gap-majority problem next. We define the non-succinct version which is of independent interest, and so use the letter $n$ to indicate input length. But recall that for the application later in this section the input is given succinctly, as just remarked, so the gap majority instance will actually be on a number of bits that is exponential in the input length to the alternating machine.

Definition 6.2. $\text {Gap-Maj}_{\alpha ,\beta }$ is the problem of deciding if an input $x\in \{0,1\} ^{n}$ has weight $|x|\le \alpha n$ or $|x|\ge \beta n$.

As discussed in section 6.2, it is useful to think of alternating computation as alternating circuits. Indeed, the circuit result that is the starting point of all these simulations is the following somewhat surprising construction of small-depth alternating circuits for Gap-Maj. By contrast, (non-gap) Maj does not have small constant-depth alternating circuits, as we will prove in Chapter ??.

Lemma 6.3. [3]$\text { Gap-Maj}_{1/3,2/3}(x)$ has alternating circuits of depth $3$ and size $n^{c}$. Moreover, the gates at distance $1$ from the input have fan-in $\le c\log n$.

Proof. This is a striking application of the probabilistic method. For a fixed pair of inputs $(x,y)$ we say that a distribution $C$ on circuits gives $(\le p,\ge q)$ if $\mathbb {P}_{C}[C(x)=1]\le p$ and $\mathbb {P}_{C}[C(y)=1]\ge q$; and we similarly define gives with reverse inequalities. Our goal is to have a distribution that gives

\begin{aligned} (\le 2^{-n},\ge 1-2^{-n})~~~~(6.1) \end{aligned}

for every pair $(x,y)\in \{0,1\} ^{n}\times \{0,1\} ^{n}$ where $|x|\le n/3$ and $|y|\ge 2n/3$. Indeed, if we have that we can apply a union bound over the $<2^{n}$ inputs to obtain a fixed circuit that solves Gap-Maj.

We construct the distribution $C$ incrementally. Fix any pair $(x,y)$ as above. Begin with the distribution $C_{\wedge }$ obtained by picking $2\log n$ bits uniformly from the input and computing their And. This gives

\begin{aligned} \left ((1/3)^{2\log n},(2/3)^{2\log n}\right ). \end{aligned}

Let $p:=(1/3)^{2\log n}$ and note $(2/3)^{2\log n}=p\cdot n^{2}$. So we can say that $C_{\wedge }$ gives

\begin{aligned} \left (\le p,\ge p\cdot n^{2}\right ). \end{aligned}

Now consider the distribution $C_{\vee }$ obtained by complementing the circuits in $C_{\wedge }$. This gives

\begin{aligned} \left (\ge 1-p,\le 1-p\cdot n^{2}\right ). \end{aligned}

Next consider the distribution $C_{\wedge \vee }$ obtained by taking the And of $m:=p^{-1}/n$ independent samples of $C_{\vee }$. This gives

\begin{aligned} \left (\ge (1-p)^{m},\le (1-p\cdot n^{2})^{m}\right ). \end{aligned}

To make sense of these quantities we use the basic approximations

\begin{aligned} 1+z \le e^{z }\le 1+2z \end{aligned}

valid for all $z \in [0,1]$. These imply $(1-p)^{m}\ge e^{-2pm}=e^{-2/n}\ge 0.9$ for large enough $n$, and $(1-p\cdot n^{2})^{m}\le e^{-pn^{2}m}=e^{-n}.$ Summarizing, this gives

\begin{aligned} (\ge 0.9,\le e^{-n}). \end{aligned}

Next consider the distribution $C_{\vee \wedge }$ obtained by complementing the circuits in $C_{\wedge \vee }$. This gives

\begin{aligned} (\le 0.1,\ge 1-e^{-n}). \end{aligned}

Finally, consider the distribution $C_{\wedge \vee \wedge }$ obtained by taking the And of $n$ independent samples of $C_{\vee \wedge }$. This gives

\begin{aligned} \left (\le 0.1^{n},\ge \left (1-e^{-n}\right )^{n}\right ). \end{aligned}

To make sense of the rightmost quantity we can use the approximation

\begin{aligned} (1+\alpha )^{r}\ge 1+r\alpha \end{aligned}

valid for all $\alpha \ge -1$ and $r\ge 1$. Thus this gives

\begin{aligned} \left (\le 0.1^{n},\ge 1-ne^{-n}\right ). \end{aligned}

We have $ne^{-n}<2^{-n}$. Thus this distribution in particular gives equation (6.1). The bounds on the number of gates and the fan-in hold by inspection. QED
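For concreteness, the sampled circuit $C_{\wedge \vee \wedge }$ can be checked numerically. The following Python sketch (an illustration, not part of the argument; parameter choices follow the proof) samples the distribution for $n=16$ and evaluates it on one heavy and one light input. By De Morgan, $C_{\wedge \vee \wedge }$ is simply an And of Ors of Ands with the stated fan-ins.

```python
import math
import random

def sample_circuit(n, rng):
    # Bottom fan-in k = 2 log n: a bottom And accepts a light input
    # w.p. <= p := (1/3)^k and a heavy input w.p. >= p * n^2.
    k = 2 * int(math.log2(n))
    p = (1 / 3) ** k
    m = int(1 / (p * n))          # middle fan-in m = p^{-1} / n
    # By De Morgan, C_{and-or-and} = And_n of Or_m of And_k; we store
    # only the random input positions read by each bottom And.
    return [[[rng.randrange(n) for _ in range(k)] for _ in range(m)]
            for _ in range(n)]

def evaluate(circuit, x):
    return all(any(all(x[i] for i in clause) for clause in block)
               for block in circuit)

rng = random.Random(0)            # fixed seed for reproducibility
C = sample_circuit(16, rng)
heavy = [1] * 11 + [0] * 5        # weight 11 >= 2n/3
light = [1] * 5 + [0] * 11        # weight 5 <= n/3
```

With these parameters a sampled circuit errs on a fixed input with probability far below $2^{-n}$, so a fixed seed reliably separates the heavy input (output $1$) from the light one (output $0$).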

Exercise 6.6. Prove $\text { Gap-Maj}_{1/2-1/\sqrt {\log n},1/2+1/\sqrt {\log n}}$ has alternating circuits of depth $c$ and size $n^{c}$.

Exercise 6.7. Assume the circuit in Lemma 6.3 is explicit in the sense of Lemma 6.2. Prove Theorem 6.3.

It remains to construct explicit circuits for Gap-Maj. We give a construction which has worse parameters than Lemma 6.3 but is simple and suffices for (1) in Theorem 6.3. The idea is that if the input weight of $x$ is large, then we can find a few shifts of the ones in $x$ that cover each of the $n$ bits. But if the weight of $x$ is small we can’t. By “shift” by $s$ we mean the string $x_{i\oplus s}$, obtained from $x$ by permuting the indices by xoring them with $s$. (Other permutations would work just as well.)

Lemma 6.4. Let $r:=\log n$. The following circuit solves $\text {Gap-Maj}_{1/r^{2},1-1/r^{2}}$ on every $x\in \{0,1\} ^{n}$:

\begin{aligned} \bigvee s_{1},s_{2},\ldots ,s_{r}\in \{0,1\} ^{r}:\bigwedge i\in \{0,1\} ^{r}:\bigvee j\in \{1,2,\ldots ,r\}:x_{i\oplus s_{j}}. \end{aligned}

Note that

\begin{aligned} \bigwedge i\in \{0,1\} ^{r}:\bigvee j\in \{1,2,\ldots ,r\}:x_{i\oplus s_{j}} \end{aligned}

means that every bit $i$ is covered by some shift $s_{j}$ of the input $x$.

Proof. Assume $|x|\le n/r^{2}$. Each shift $s_{i}$ contributes at most $n/r^{2}$ ones. Hence all the $r$ shifts contribute $\le n/r$ ones, and we do not cover every bit $i$.

Now assume $|x|\ge n(1-1/r^{2})$. We show the existence of shifts $s_{i}$ that cover every bit by the probabilistic method. Specifically, for a fixed $x$ we pick the shifts uniformly at random and aim to show that the probability that we do not cover every bit is $<1$. Indeed:

\begin{aligned} \begin{aligned} & \mathbb {P}_{s_{1},s_{2},\ldots ,s_{r}}[\exists i\in \{0,1\} ^{r}:\forall j\in \{1,2,\ldots ,r\}:x_{i\oplus s_{j}}=0]\\ \le & \sum _{i\in \{0,1\} ^{r}}\mathbb {P}_{s_{1},s_{2},\ldots ,s_{r}}[\forall j\in \{1,2,\ldots ,r\}:x_{i\oplus s_{j}}=0] & \text {(union bound)}\\ = & \sum _{i\in \{0,1\} ^{r}}\mathbb {P}_{s}[x_{i\oplus s}=0]^{r} & \text {(independence of the }s_{i}\text {)}\\ \le & \sum _{i\in \{0,1\} ^{r}}(1/r^{2})^{r} & \text {(by assumption on }|x|\text {)}\\ \le & (2/r^{2})^{r}\\ < & 1, \end{aligned} \end{aligned}

as desired. QED
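For small $n$ the circuit of Lemma 6.4 can be evaluated exactly by brute force over all shift tuples. An illustrative Python sketch, identifying indices in $\{0,1\} ^{r}$ with the integers $0,\ldots ,2^{r}-1$ and the xor with `^`:

```python
from itertools import product

def shift_circuit(x, r):
    # Or over all tuples of r shifts, And over all indices i,
    # Or over the r shifts in the tuple -- exactly the displayed circuit
    n = len(x)
    assert n == 2 ** r
    return any(
        all(any(x[i ^ s[j]] for j in range(r)) for i in range(n))
        for s in product(range(n), repeat=r)
    )

r = 4                                  # n = 16, so the gap is 1/r^2 = 1/16
light = [0] * 16
light[3] = 1                           # weight 1  <= n / r^2
heavy = [1] * 16
heavy[11] = 0                          # weight 15 >= n (1 - 1/r^2)
```

In this toy case the two directions of the proof are visible directly: for the heavy input any tuple with two distinct shifts already covers every index, while for the light input the $r$ shifts can cover at most $r\cdot |x|=4$ of the $16$ indices.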

Exercise 6.8. Prove (1) in Theorem 6.3.

Lemma 6.4 is not sufficient for (2) in Theorem 6.3. One can prove (2) by derandomizing the shifts in Lemma 6.4. This means generating their $r^{2}$ bits using a seed of only $r\log ^{c}r$ bits (instead of the trivial $r^{2}$ in Lemma 6.4).

Exercise 6.9. Prove:

(1) $\text {P}=\text {NP}\Rightarrow \text {P}=\text {BPP}$.

(2) $\Sigma _{2}\text {P}\subseteq \text {BPP}\Rightarrow \text {PH collapses}$.

6.4 The quantifier calculus

We have extended $\text {P}$ with $\exists$ and $\forall$ quantifiers. We have also extended it with randomness to obtain $\text {BPP}$. As alluded to before, we can also think of BPP as a quantifier $\text {BP}$ applied to $\text {P}$. The Unique-3Sat problem (Theorem 5.8) also points to a new quantifier, “exists unique.” We now develop a general calculus of quantifiers, and examine fundamental relationships between them. For simplicity, we only consider power-time computation.

Definition 6.3. Let $C$ be a class of functions mapping $\{0,1\}^* \to \{0,1\}$. We define $L\in \text {Op}\cdot C$ if there is $L'\in C$ and $d\in \mathbb {N}$ such that

• $\text {Op}=\text {Maj}$
\begin{aligned} x\in L & \Leftrightarrow \mathbb {P}_{y\in \{0,1\} ^{|x|^{d}}}[(x,y)\in L']\ge 1/2. \end{aligned}
• $\text {Op}=\text {BP}$
\begin{aligned} x\in L & \Rightarrow \mathbb {P}_{y\in \{0,1\} ^{|x|^{d}}}[(x,y)\in L']\ge 2/3,\\ x\not \in L & \Rightarrow \mathbb {P}_{y\in \{0,1\} ^{|x|^{d}}}[(x,y)\in L']\le 1/3. \end{aligned}
• $\text {Op}=\oplus$ (read: parity)
\begin{aligned} x\in L\Leftrightarrow \text {there is an odd number of }y\in \{0,1\} ^{|x|^{d}}:(x,y)\in L'. \end{aligned}
• $\text {Op}=\exists$
\begin{aligned} x\in L\Leftrightarrow \exists y\in \{0,1\} ^{|x|^{d}}:(x,y)\in L'. \end{aligned}
• $\text {Op}=\forall$
\begin{aligned} x\in L\Leftrightarrow \forall y\in \{0,1\} ^{|x|^{d}}:(x,y)\in L'. \end{aligned}

With this notation we have: $\text {NP}=\exists \cdot \text {P}$, $\text {BPP}=\text {BP}\cdot \text {P},$ $\Sigma _{2}\text {P}=\exists \cdot \forall \cdot \text {P}$.

6.5 PH is a random low-degree polynomial

In this section we prove the following result.

Theorem 6.4. [37] $\text {PH}\subseteq \text {BP}\cdot \oplus \cdot \text {P}.$

This is saying that any constant number of $\exists$ and $\forall$ quantifiers can be replaced by a BP quantifier followed by a $\oplus$ quantifier. Let’s see what this has to do with the title of this section. Where is the polynomial? Consider polynomials over the field $\mathbb {F}_{2}=\{0,1\}$. Recall that such a polynomial over $n$ bits is an object like

\begin{aligned} p(x_{1},x_{2},\ldots ,x_{n})=x_{1}\cdot x_{2}+x_{3}+x_{7}\cdot x_{2}\cdot x_{1}+x_{2}+1. \end{aligned}

Because we are only interested in inputs in $\{0,1\}$ we have $x^{i}=x$ for any $i\ge 1$ and any variable $x$, so we don’t need to raise variables to powers bigger than one.

Example 6.1. The And function on $n$ bits can be written as the polynomial

\begin{aligned} \text {And}(x_{1},x_{2},\ldots ,x_{n})=\prod _{i=1}^{n}x_{i}. \end{aligned}

The Or function on $n$ bits can be written as the polynomial

\begin{aligned} \text {Or}(x_{1},x_{2},\ldots ,x_{n})=1+\text {And}(1+x_{1},1+x_{2},\ldots ,1+x_{n})=1+\prod _{i=1}^{n}(1+x_{i}). \end{aligned}

For $n=2$ we have

\begin{aligned} \text {Or}(x_{1},x_{2})=x_{1}+x_{2}+x_{1}\cdot x_{2}. \end{aligned}
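Since these identities are over $\mathbb {F}_{2}$, they can be checked exhaustively on small inputs. A quick illustrative Python check:

```python
from itertools import product

def or_poly(x):
    # Or(x) = 1 + prod_i (1 + x_i), computed over F_2
    acc = 1
    for xi in x:
        acc = acc * (1 + xi) % 2
    return (1 + acc) % 2

def check():
    for n in (1, 2, 3):
        for x in product((0, 1), repeat=n):
            assert or_poly(x) == (1 if any(x) else 0)
    # the expanded degree-2 form for n = 2
    for x1, x2 in product((0, 1), repeat=2):
        assert (x1 + x2 + x1 * x2) % 2 == (1 if (x1 or x2) else 0)
    return True
```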

The polynomial corresponding to a PH computation will have an exponential number of terms, so we can’t write it down. The big sum over all its monomials corresponds to the $\oplus$ in Theorem 6.4. The polynomial will be sufficiently explicit: we will be able to compute each of its monomials in P. Finally, there won’t be just one polynomial, but we will have a distribution on polynomials, and that’s the BP part.

Confusing? Like before, a good way to look at this result is in terms of alternating circuits. We state the basic circuit result behind Theorem 6.4 after a definition. The result is of independent interest and will be useful later.

Definition 6.4. A distribution $P$ on polynomials computes a function $f:\{0,1\} ^{n}\to \{0,1\}$ with error $\epsilon$ if for every $x$ we have

\begin{aligned} \mathbb {P}_{P}[P(x)=f(x)]\ge 1-\epsilon . \end{aligned}

Theorem 6.5. Let $C:\{0,1\} ^{n}\to \{0,1\}$ be an alternating circuit of depth $d$ and size $s$. Then there is a distribution $P$ on polynomials over $\mathbb {F}_{2}$ of degree $\log ^{d}(s/\epsilon )$ that computes $C$ with error $\epsilon$.

Ultimately we only need constant error, but the construction requires small error. Jumping ahead, this is because we construct distributions for each gate separately, and we need the error to be small enough for a union bound over all gates in the circuit.

The important point in Theorem 6.5 is that if the depth $d$ is small (e.g., constant) (and the size is not enormous and the error is not too small) then the degree is small as well. For example, for power-size alternating circuits of constant depth the degree is power logarithmic for constant error.

Let us slowly illustrate the ideas behind Theorem 6.5 starting with the simplest case: $C$ is just a single Or gate on $n$ bits.

Lemma 6.5. For every $\epsilon$ and $n$ there is a distribution $P$ on polynomials of degree $\log 1/\epsilon$ in $n$ variables over $\mathbb {F}_{2}$ that computes Or with error $\epsilon$.

Proof. For starters, pick the following distribution on linear polynomials: For a uniform $A=(A_{1},A_{2},\ldots ,A_{n})\in \{0,1\} ^{n}$ output the polynomial

\begin{aligned} p_{A}(x_{1},x_{2},\ldots ,x_{n}):=\sum _{i}A_{i}\cdot x_{i}. \end{aligned}

Let us analyze how $p_{A}$ behaves on a fixed input $x\in \{0,1\} ^{n}$:

• If $\text {Or}(x)=0$ then $p_{A}(x)=0$;
• If $\text {Or}(x)=1$ then $\mathbb {P}_{A}[p_{A}(x)=1]\ge 1/2$.

While the error is large in some cases, a useful feature of $p_{A}$ is that it never makes mistakes if $\text {Or}(x)=0$. This allows us to easily reduce the error by taking $t:=\log 1/\epsilon$ polynomials $p_{A}$ and combining them with an Or:

\begin{aligned} p_{A_{1},A_{2},\ldots ,A_{t}}(x):=p_{A_{1}}(x)\vee p_{A_{2}}(x)\vee \cdots \vee p_{A_{t}}(x). \end{aligned}

The analysis is like before:

• If $\text {Or}(x)=0$ then $p_{A_{1},A_{2},\ldots ,A_{t}}(x)=0$;
• If $\text {Or}(x)=1$ then $\mathbb {P}_{A_{1},A_{2},\ldots ,A_{t}}[p_{A_{1},A_{2},\ldots ,A_{t}}(x)=1]\ge 1-(1/2)^{t}\ge 1-\epsilon .$

It remains to bound the degree. Each $p_{A_{i}}$ has degree $1$. The Or on $t$ bits has degree $t$ by Example 6.1. Hence the final degree is $t=\log 1/\epsilon$. QED
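Both bullet points can be verified exhaustively for small $n$: for any fixed $x\ne 0$ the linear form $\sum _{i}A_{i}x_{i}$ is balanced over a uniform $A$, and the Or of $t$ independent copies errs with probability exactly $(1/2)^{t}$. An illustrative Python check:

```python
from itertools import product

def p(A, x):
    # linear polynomial over F_2 with coefficients A and no constant term
    return sum(a * b for a, b in zip(A, x)) % 2

def check(n=4, t=2):
    coeffs = list(product((0, 1), repeat=n))   # all choices of A
    for x in product((0, 1), repeat=n):
        ones = sum(p(A, x) for A in coeffs)
        if not any(x):
            assert ones == 0                   # never errs when Or(x) = 0
        else:
            assert ones == len(coeffs) // 2    # P_A[p_A(x) = 1] is exactly 1/2
            # the Or of t independent copies is wrong iff all copies are 0
            wrong = sum(all(p(A, x) == 0 for A in As)
                        for As in product(coeffs, repeat=t))
            assert wrong * 2 ** t == len(coeffs) ** t   # error exactly (1/2)^t
    return True
```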

Exercise 6.10. Obtain the same result for $C=\text {And}$.

Now we would like to handle general circuits which have any number of And and Or gates. As mentioned earlier, we apply the construction above to every gate, and compose the polynomials. We pick the error at each gate small enough so that we can do a union bound over all gates.

Proof of Theorem 6.5. We apply Lemma 6.5 to every gate in the circuit with error $\epsilon /s$. By a union bound, the probability that some gate makes a mistake is at most $\epsilon$, as desired.

The final polynomial is obtained by composing the polynomials of each gate. The composition of a polynomial of degree $d_{1}$ with another of degree $d_{2}$ results in a polynomial of degree $d_{1}\cdot d_{2}$. Since each gate’s polynomial has degree $\log (s/\epsilon )$, the final degree is $\log ^{d}(s/\epsilon )$. QED
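To see the composition concretely, here is an illustrative Python check on the tiny depth-2 circuit $\text {Or}(\text {And}(x_{1},x_{2}),\text {And}(x_{3},x_{4}))$: the And gates are computed by exact monomials, the Or gate by the randomized polynomial of Lemma 6.5 with $t=2$, so the composed object errs with probability at most $1/4$ on every input.

```python
from itertools import product

def circuit(x):
    return int((x[0] and x[1]) or (x[2] and x[3]))

def composed(x, A1, A2):
    # exact degree-2 monomials for the two And gates...
    g = (x[0] * x[1], x[2] * x[3])
    # ...fed into the Or of t = 2 random linear forms (Lemma 6.5);
    # on 0/1 values, that Or is just a max
    return max(sum(a * gi for a, gi in zip(A, g)) % 2 for A in (A1, A2))

def check_error():
    coeffs = list(product((0, 1), repeat=2))
    for x in product((0, 1), repeat=4):
        errs = sum(composed(x, A1, A2) != circuit(x)
                   for A1 in coeffs for A2 in coeffs)
        assert errs <= len(coeffs) ** 2 // 4    # error at most (1/2)^t = 1/4
    return True
```

As a polynomial, the composed object is $1+(1+p_{A_{1}})(1+p_{A_{2}})$ with the gate monomials substituted in, of degree $2\cdot 2=4$: the degrees multiply, exactly as in the proof.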

6.5.1 Back to PH

We have proved Theorem 6.5 which is a circuit analogue of Theorem 6.4. We now go back to the PH. First, we have to come up with a more explicit description of the polynomials, instead of picking them at random. This is similar to the way we proceeded in section 6.3: After a non-explicit construction (Lemma 6.3) we then obtained an explicit construction (Lemma 6.4).

Let us go back to the simplest case of Or. Recall that the basic building block in the proof of Lemma 6.5 was the construction of a distribution $p_{A}$ on linear polynomials which are zero on the all-zero input (which just means that they do not have constant terms), and are often non-zero on any non-zero input. We introduce a definition, since now we will have several constructions with different parameters.

Definition 6.5. A distribution $p_{A}$ on linear polynomials with no constant term has the Or property if $\mathbb {P}_{A}[p_{A}(x)=1]\ge 1/3$ for any $x\ne 0$. We identify $p_{A}$ with the $n$ bits $A$ corresponding to its coefficients.

The next lemma shows that we can compute distributions on linear polynomials with the Or property from a seed of just $\log n+c$ bits, as opposed to the $n$ bits that were used for $A$ in the proof of Lemma 6.5. Recall that for our application to Lemma 6.5 the polynomials have an exponential number of monomials and so we cannot afford to write them down. Instead we shall guarantee that given a seed $r$ and an index to a monomial we can compute the monomial via a function $f$ in P. In this linear case, for a polynomial in