In these lectures we conclude the proof that bounded independence fools the AND function, and look at the more recent result that bounded independence fools the circuit class .
Theorem 1. Let be a distribution over such that any are independent (but not necessarily uniform). Then, we have that
Proof. Let be the distribution of . Let be the -wise independent distribution such that for all and the are independent. The theorem is equivalent to the following statement.
We will prove the above statement by the following version of the Inclusion-Exclusion principle.
Let be any distribution over . Note that by De Morgan’s laws, we have
Let be the event that . We want to bound the quantity . By looking at the Venn diagram of the events , we can see that
and so on. In general, we have the following. Define
Then, we have the bounds for odd , and for even . This fact holds for any distribution.
Let us return to the proof. Note that the are the same for and up to because they only involve sums of ANDs of at most events. Hence, we have that
where the last equality comes from the definition of . Therefore, we are done if . We have that
where . To bound the expectation we recall a useful inequality.
Let be non-negative real numbers. Then, by the AM-GM inequality, we have that
Consider the following more general statement,
and note that the left most term is equal to , while the right most term is equal to
Applying the above inequality to and a common approximation for the binomial coefficient, we have that
Therefore, we are done if . Recall that . So if is small then is close to .
It remains to handle the case that . Pick such that
By the previous argument, the AND of the first is the same up to for and . Also, for every distribution the probability of that the And of bits is 1 is at most the probability that the And of bits is . And also, for the -wise independent distribution we have
The combination of these facts concludes this case. To summarize, in this case we showed that
as well as
By the choice of and the previous argument, we also know that and so we are done, as all quantities above are at most (and at least ).
Remark 2. The bound is tight up to
Proof. Let be the distribution over as follows: and . Then, is -wise independent. However, if is even, then
Yet, we have that
Acknowledgement. This section is based on Amnon Ta-Shma’s notes for the class 0368.4159 Expanders, Pseudorandomness and Derandomization CS dept, Tel-Aviv University, Fall 2016.
Note that a DNF on bits can be modeled as a depth two circuit where the top layer is an OR-gate whose inputs are AND-gates, which take inputs and their negations. The circuit class can be viewed as a generalization of this to higher (but constant) depth circuits. That is, consists of circuits using AND-gates, OR-gates, NOT-gates, and input registers. Each of the gates have unbounded fan-in (i.e. the number of input wires). The size of the circuit is defined to be the number of gates.
Conjecture 3.[Linial-Nisan[LN90]] -wise independence fools circuits of depth and size .
The conjecture was open for a long time, even for in the special case . In 2007 a breakthrough work by Bazzi [Baz09] proved it for . Shortly afterwards, Razborov presented a simpler proof of Bazzi’s result [Raz09], and Braverman proved the conjecture for any with -wise independence [Bra10]. Tal improved the result to [Tal17].
Interestingly, the progress on the conjecture does not use ideas that were not around since the time of its formulation. Bottom line: if a problem is open for a long time, you should immediately attack it with existing tools.
The high-level intuition why such a result should be true is the following:
- is approximated by polylog degree polynomials.
- -wise independence fools degree- polynomials.
Proof of (2). Let . Let be a degree polynomial over . Write as
If is a -wise independent distribution on , then by linearity of expectation
There are several notions of approximating AC by low-degree polynomials. We now review two of them, explaining why neither of them is sufficient. Braverman showed how to cleverly combine the two methods to prove a version of (1) that’s strong enough.
Theorem 4. For all circuits of size and depth , for all distributions over , for all , there exists a polynomial of degree such that
The important features of this approximation are that it works under any distribution, and when the polynomial is correct it outputs a boolean value.
Similar approximations appear in many papers, going back to Razborov’s paper [Raz87] (who considers polynomials modulo 2) which uses ideas from earlier still work.
Note that the polynomial depends on the circuit chosen, and on the distribution. This theorem is not a good enough approximation because on the fraction of inputs where the polynomial and circuit are unequal, the value of the polynomial can (and does) explode to be much greater than . This prevents us from bounding the average of the polynomial.
Nevertheless, let us prove the above theorem.
Proof. Consider one OR-gate of fan-in . We construct a distribution of polynomials that compute any input with high probability. This implies that there is a fixed polynomial that computes the circuit on a large fraction of the inputs by an averaging argument.
For . let be a random subset of where every element is included with probability , independently.
Suppose has Hamming weight . Then, . And the sum can be shown to equal with constant probability.
Define the approximation polynomial to be
Note that if has weight , then with constant probability. If , then with probability . We can adjust the error probability to by repeating each term in the product times.
Thus, we can approximate one gate with the above polynomial of degree . Construct polynomials as above for each gate, with error parameter . The probability that any of them is wrong is at most by a union bound. To obtain the approximating polynomial for the whole circuit compose all the polynomials together. Since the circuit is of depth , the final degree of the approximating polynomial is , as desired.
As mentioned at the beginning, this is a distribution on polynomials that computes correctly any input with probability at least . By averaging, there exists a fixed polynomial that computes correctly a fraction of inputs.
It can be verified that the value of the polynomial can be larger than . The polynomial for the gates closest to the input can be as large as . Then at the next level it can be as large as , which is already much larger than .
Theorem 5. For all circuits of size and depth , for all error values , there exists a polynomial of degree such that
The important feature of this approximation is that it bounds the average, but only under the uniform distribution. Because it does not provide any guarantee on other distributions, including -wise independent distributions, it cannot be used directly for our aims.
Remark 6. Approximation 2 is proved via the switching lemma, an influential lemma first proved in the early 80’s by Ajtai [Ajt83] and by Furst, Saxe, and Sipser [FSS84]. The idea is to randomly set a subset of the variables to simplify the circuit. You can do this repeatedly to simplify the circuit even further, but it only works on the uniform distribution. Hastad [Hås87] gave a much tighter analysis of the switching lemma, and the paper [LMN93] used it to prove a version of Approximation 2 with a slightly worse dependence on the error. Recently, a refinement of the switching lemma was proved in [Hås14, IMP12]. Based on that, Tal [Tal17] obtained the corresponding refinement of Approximation 2 where the parameters are as stated above. (The polynomial is simply obtained from the Fourier expansion of the function computed by the circuit by removing all Fourier coefficients larger than a certain threshold. The bound on the Fourier decay in [Tal17] implies the desired approximation.)
Theorem 7. For all circuits with unbounded fan-in of size and depth , for all error values , for all -wise independent distributions on , we have that
Corollary 8. In particular, if , , , then suffices.
Claim 9. For all circuits with unbounded fan-in of size and depth , for all error values , for all -wise independent distributions on , there is a set of inputs, and a degree- polynomial such that:
- is ’rare’ under both and :
, and . Here we write for the indicator function of the event .
- For all , . Here is the logical Or.
We only need (1) under , but (1) under is used to prove (3).
We can construct the polynomial approximation from Claim 9 by using a combination of Approximation 1 and 2. First we need a little more information about Approximation 1.
Claim 10. Two properties of approximation 1:
Note that the term is incorrect exactly when the input has weight but all the sets intersect or ones. This can be checked in AC, in parallel for all gates in the circuit.
Proof of Claim 9. Run approximation 1 for the distribution , yielding the polynomial and the set . This already proves the first part of the claim for both and , because if has probability under it has probability under , and the same for . Use Claim 10 part 2, and run approximation 2 on . Call the resulting polynomial , which has degree with error bound .
The idea in the ultimate approximating polynomial is to “check if there is a mistake, and if so, output . Otherwise, output ”. Formally:
Claim 9 part 2 can be shown as follows. by definition. So, if , then we are done. Otherwise, . So there is no mistake, and . Hence, by the properties of Approximation 1, . This implies .
It only remains to show Claim 9 part 3:
By part 1 of Claim 9,
We can show that this equals
by the following argument: If then by definition. If , then there is no mistake, and . This implies that .
Let us rewrite the above expression in terms of the expectation norm.
Recall the triangle inequality, which states: . Therefore, letting we have that the above quantity is
To conclude, we will show that each of the above terms are . Note that
For this quantity to be at most we set . Here we critically set the error in Approximation 2 much lower, to cancel the large values arising from Approximation 1. By Theorem 5, the polynomial arising from approximation 2 has degree .
Finally, let us bound the other term, . If , then the distance is . If , then the distance is . Therefore, this term is at most , which we know to be at most .
[Raz87] Alexander Razborov. Lower bounds on the dimension of schemes of bounded depth in a complete basis containing the logical addition function. Akademiya Nauk SSSR. Matematicheskie Zametki, 41(4):598–607, 1987. English translation in Mathematical Notes of the Academy of Sci. of the USSR, 41(4):333-338, 1987.