Special Topics in Complexity Theory, Fall 2017. Instructor: Emanuele Viola
1 Lectures 4-5, Scribe: Matthew Dippel
These lectures cover some basics of small-bias distributions, and then a more recent pseudorandom generator for read-once CNF [GMR12].
2 Small bias distributions
Definition 1.[Small bias distributions] A distribution over has bias if no parity function can distinguish it from uniformly random strings with probability greater than . More formally, we have:
In this definition, the is simply the probability of a parity test being or over the uniform distribution. We also note that whether we change the definition to have the probability of the parity test being or doesn’t matter. If a test has probability of being equal to , then it has probability of being , so the bias is independent of this choice.
This can be viewed as a distribution which fools tests that are restricted to computing parity functions on a subset of bits.
Before we answer the important question of how to construct and efficiently sample from such a distribution, we will provide one interesting application of small bias sets to expander graphs.
Theorem 2.[Expander construction from a small bias set] Let be a distribution over with bias . Define as the following graph:
Then, when we take the eigenvalues of the random walk matrix of in descending order , we have that:
Thus, small-bias sets yields expander graphs. Small-bias sets also turn out to be equivalent to constructing good linear codes. Although all these questions have been studied much before the definition of small-bias sets [NN90], the computational perspective has been quite useful, even in answering old questions. For example Ta-Shma used this perspective to construct better codes [Ta-17].
3 Constructions of small bias distributions
Just like our construction of bounded-wise independent distributions from the previous lecture, we will construct small-bias distributions using polynomials over finite fields.
Theorem 1.[Small bias construction] Let be a finite field of size , with elements represented as bit strings of length . We define the generator as the following:
In this notation, a subscript of indicates taking the th bit of the representation. Then the output of over uniform and has bias .
Proof. Consider some parity test induced by a subset . Then when applied to the output of , it simplifies as:
Note that is the evaluation of the polynomial at the point . We note that if , then the value of is equally likely to be or over the probability of a uniformly random . This follows from the fact that the inner product of any non-zero bit string with a uniformly random bit string is equally likely to be or . Hence in this case, our generator has no bias.
In the case where , then the inner product will always be , independent of the value of . In these situations, the bias is , but this is conditioned on the event that .
We claim that this event has probability . Indeed, for non empty , is a polynomial of degree . Hence it has at most roots. But we are selecting from a field of size . Hence the probability of picking one root is .
Hence overall the bias is at most .
To make use of the generator, we need to pick a specific . Note that the seed length will be . If we want to achieve bias , then we must have . Al the logarithms in this lecture are in base . This gives us a seed length of .
Small-bias are so important that a lot of attention has been devote to optimizing the constant “2” above. A lower bound of on the seed length was known. Ta-Shma recently [Ta-17] gave a nearly matching construction with seed length .
We next give a sense of how to obtain different tradeoffs between and in the seed length. We specifically focus on getting a nearly optimal dependence on , because the construction is a simple, interesting “derandomization” of the above one.
3.1 An improved small bias distribution via bootstrapping
We will show another construction of small bias distributions that achieves seed length . It will make use of the previous construction and proof.
The intuition is the following: the only time we used that was uniform was in asserting that if , then is uniform. But we don’t need to be uniform for that. What do we need from ? We need that it has small-bias!
Our new generator is where and are as before but with different parameters. For , we pick of length , whereas just needs to be an -biased generator on bits, which can be done as we just saw with bits. This gives a seed length of , as promised.
We can of course repeat the argument but the returns diminish.
4 Connecting small bias to k-wise independence
We will show that using our small bias generators, we can create distributions which are almost -wise independent. That is, they are very close to a -wise independent distribution in statistical distance, while having a substantially shorter seed length than what is required for -wise independence. In particular, we will show two results:
- Small bias distributions are themselves close to -wise independent.
- We can improve the parameters of the above by feeding a small bias distribution to the generator for -wise independence from the previous lectures. This will improve the seed length of simply using a small bias distribution.
Before we can show these, we’ll have to take a quick aside into some fundamental theorems of Fourier analysis of boolean functions.
4.1 Fourier analysis of boolean functions 101
Let . Here the switch between and is common, but you can think of them as being isomorphic. One way to think of is as being a vector in . The th entry of indicates the value of . If we let be the indicator function returning iff , but once again written as a vector like is, then any function can be written over the basis of the vectors, as:
This is the “standard” basis.
Fourier analysis simply is a different basis in which to write functions, which is sometimes more useful. The basis functions are . Then any boolean function can be expressed as:
where the , called the “Fourier coefficients,” can be derived as:
where the expectation is over uniformly random .
Claim 1. For any function with range , its Fourier coefficients satisfy:
Proof. We know that , as squaring the function makes it . We can re-express this expectation as:
We make use of the following fact: if , then . If they equal each other, then their difference is the empty set and this function is .
Overall, this implies that the above expectation can be simply rewritten as:
Since we already decided that the expectation is , the claim follows.
5 Small bias distributions are close to -wise independent
Before we can prove our claim, we formally introduce what we mean for two distributions to be close. We use the most common definition of statistical difference, which we repeat here:
Definition 1. Let be two distributions over the same domain . Then we denote their statistical distance , and sometimes written as , as
Note that the probabilities are with respect to the individual distributions and . We may also say that is -close to if .
We can now show our result, which is known as Vazirani’s XOR Lemma:
Theorem 2. If a distribution over has bias , then is close to the uniform distribution.
Proof. Let be a test. To fit the above notation, we can think of as being defined as the set of inputs for which . Then we want to bound:
Expanding in Fourier basis we rewrite this as
We know that for all non empty , and when is the empty set. We also know that for all non empty , and is when is the empty set. So the above can be bounded as:
Proof. By Cauchy Schwartz:
Where the last simplification follows from Claim 1.
Using the above lemma completes the upper bound and the proof of the theorem.
Corollary 4. Any bits of an biased distribution are close to uniform.
Using the corollary above, we see that we can get close to a -wise independent distribution (in the sense of the corollary) by taking a small bias distribution with . This requires seed length Recall that for exact -wise we required seed length .
5.1 An improved construction
Theorem 5. Let be the generator previously described that samples a -wise independent distribution (or any linear ). If we replace the input to with a small bias distribution of , then the output of is -close to being -wise independent.
Proof. Consider any parity test on bits on the output of . It can be shown that is a linear map, that is, simply takes its seed and it multiplies it by a matrix over the field GF(2) with two elements. Hence, corresponds to a test on the input of , on possibly many bits. The test is not empty because is -wise independent. Since we fool with error , we also fool with error , and the theorem follows by Vazirani’s XOR lemma.
Using the seed lengths we saw we get the following.
Corollary 6. There is a generator for almost -wise independent distributions with seed length .
6 Tribes Functions and the GMRTV Generator
We now move to a more recent result. Consider the Tribes function, which is a read-once CNF on bits, given by the And of terms, each on bits. You should think of where and .
We’d like a generator for this class with seed length . This is still open! (This is just a single function, for which a generator is trivial, but one can make this challenge precise for example by asking to fool the Tribes function for any possible negation of the input variables. These are tests and a generator with seed length is unknown.)
The result we saw earlier about fooling And gives a generator with seed length , however the dependence on is poor. Achieving a good dependence on has proved to be a challenge. We now describe a recent generator [GMR12] which gives seed length . This is incomparable with the previous , and in particular the dependence on is always suboptimal. However, when the generator [GMR12] gives seed length which is better than previously available constructions.
The high-level technique for doing this is based on iteratively restricting variables, and goes back about 30 years [AW89]. This technique seems to have been abandoned for a while, possibly due to the spectacular successes of Nisan [Nis91, Nis92]. It was revived in [GMR12] (see also [GLS12]) with an emphasis on a good dependence on .
A main tool is this claim, showing that small-bias distributions fool products of functions with small variance. Critically, we work with non-boolean functions (which later will be certain averages of boolean functions).
Claim 1. Let be a series of boolean functions. Further, let be an -biased distribution over bits, where each is bits long. Then
where is variance of with respect to the uniform distribution.
This claim has emerged from a series of works, and this statement is from a work in progress with Chin Ho Lee. For intuition, note that constant functions have variance , in which case the claim gives good bounds (and indeed any distribution fools constant functions). By contrast, for balanced functions the variance is constant, and the sum of the variances is about , and the claim gives nothing. Indeed, you can write Inner Product as a product of nearly balanced functions, and it is known that small-bias does not fool it. For this claim to kick in, we need each variance to be at most .
In the tribes function, the And fucntions have variance , and the sum of the variances is about and the claim gives nothing. However, if you perturb the Ands with a little noise, the variance drops polynomially, and the claim is useful.
Claim 2. Let be the AND function on bits. Rewrite it as , where . That is, we partition the input into two sets. Define as:
where is uniform. Then
We know that is simply the expected value of , and since is the AND function, this is , so the right term is .
We reexpress the left term as . But we note that this product is iff . The probability of this happening is .
Thus the final difference is .
We’ll actually apply this claim to the Or function, which has the same variance as And by De Morgan’s laws.
We now present the main inductive step to fool tribes.
Claim 3. Let be the tribes function, where the first bits of each of the terms are fixed. Let be the free bits per term, and the number of terms that are non-constant (some term may have become after fixing the bits).
Reexpress as , where each term’s input bits are split in half, so .
Let be a small bias distribution with bias (for a big enough to be set later). Then
That is, if we replace half of the free bits with a small bias distribution, then the resulting expectation of the function only changes by a small amount.
To get the generator from this claim, we repeatedly apply Claim 3, replacing half of the bits of the input with another small bias distribution. We repeat this until we have a small enough remaining amount of free bits that replacing all of them with a small bias distribution causes an insignificant change in the expectation of the output.
At each step, is cut in half, so the required number of repetitions to reduce to constant is . Actually, as explained below, we’ll stop when for a suitable constant (this arises from the error bound in the claim above, and we).
After each replacement, we incur an error of , and then we incur the final error from replacing all bits with a small bias distribution. This final error is negligible by a result which we haven’t seen, but which is close in spirit to the proof we saw that bounded independence fools AND.
The total accumulated error is then . If we wish to achieve a specific error , we can run each small bias generator with .
At each iteration, our small bias distribution requires bits, so our final seed length is .
Proof of Claim 3. Define , and rewrite our target expression as:
This is in the form of Claim 1. We also note that from Claim 2 that .
We further assume that . For if this is not true, then the expectation over the first terms is , because of the calculation
Then we can reason as in the proof that bounded independence fools AND (i.e., we can run the argument just on the first terms to show that the products are close, and then use the fact that it is small under uniform, and the fact that adding terms only decreases the probability under any distribution).
Under the assumption, we can bound the sum of the variances of as:
If we assume that then this sum is .
We can then plug this into the bound from Claim 1 to get
Now we set so that , and the bound becomes:
By making large enough the claim is proved.
In the original paper, they apply these ideas to read-once CNF formulas. Interestingly, this extension is more complicated and uses additional ideas. Roughly, the progress measure is going to be number of terms in the CNF (as opposed to the width). A CNF is broken up into a small number of Tribes functions, the above argument is applied to each Tribe, and then they are put together using a general fact that they prove, that if and are fooled by small-bias then also on disjoint inputs is fooled by small-bias.
[AW89] Miklos Ajtai and Avi Wigderson. Deterministic simulation of probabilistic constant-depth circuits. Advances in Computing Research – Randomness and Computation, 5:199–223, 1989.
[GLS12] Dmitry Gavinsky, Shachar Lovett, and Srikanth Srinivasan. Pseudorandom generators for read-once accˆ0. In Proceedings of the 27th Conference on Computational Complexity, CCC 2012, Porto, Portugal, June 26-29, 2012, pages 287–297, 2012.
[GMR12] Parikshit Gopalan, Raghu Meka, Omer Reingold, Luca Trevisan, and Salil Vadhan. Better pseudorandom generators from milder pseudorandom restrictions. In IEEE Symp. on Foundations of Computer Science (FOCS), 2012.
[Nis91] Noam Nisan. Pseudorandom bits for constant depth circuits. Combinatorica. An Journal on Combinatorics and the Theory of Computing, 11(1):63–70, 1991.
[Nis92] Noam Nisan. Pseudorandom generators for space-bounded computation. Combinatorica, 12(4):449–461, 1992.
[NN90] J. Naor and M. Naor. Small-bias probability spaces: efficient constructions and applications. In 22nd ACM Symp. on the Theory of Computing (STOC), pages 213–223. ACM, 1990.
[Ta-17] Amnon Ta-Shma. Explicit, almost optimal, epsilon-balanced codes. In ACM Symp. on the Theory of Computing (STOC), pages 238–251, 2017.