One of the earliest (and coolest) results in pseudorandomness is the following theorem, which shows that $k$-wise independence “fools” the And function.

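**Theorem 1.** [1] Let $X_1, X_2, \ldots, X_n$ be $k$-wise independent distributions over $\{0,1\}$. Then

$$\left| \mathbb{E}\Big[\prod_{i=1}^{n} X_i\Big] - \prod_{i=1}^{n} \mathbb{E}[X_i] \right| \le 2^{-\Omega(k)}.$$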

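Proof. First recall that the inclusion-exclusion principle shows that for any random variables $V_1, V_2, \ldots, V_n$ jointly distributed over $\{0,1\}^n$ according to a distribution $D$ we have (write Or for the Or function on $n$ bits)

$$\mathbb{E}_D[\mathrm{Or}(V_1, \ldots, V_n)] = \sum_{j=1}^{n} (-1)^{j+1} T_j, \qquad (1)$$

where

$$T_j := \sum_{S \subseteq [n],\ |S| = j} \mathbb{E}_D\Big[\prod_{i \in S} V_i\Big].$$

Moreover, if we truncate the sum in Equation (1) to the first $t$ terms, it gives either a lower bound or an upper bound depending on whether $t$ is odd or even: an upper bound for odd $t$ and a lower bound for even $t$.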
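The truncation property is easy to check numerically. Below is a minimal Python sketch, assuming an arbitrary random joint distribution over $\{0,1\}^5$; the function name, the choice $n = 5$, and the seed are illustrative and not from [1]. It computes the probability that the Or equals 1 and the partial sums of the inclusion-exclusion series, which should alternate between upper and lower bounds.

```python
import itertools
import random


def or_probability_and_partial_sums(n, seed=0):
    """Return Pr[Or = 1] and the partial sums T_1 - T_2 + ... for a random joint distribution."""
    rng = random.Random(seed)
    points = list(itertools.product([0, 1], repeat=n))
    weights = [rng.random() for _ in points]
    total = sum(weights)
    # An arbitrary (hypothetical) joint distribution over {0,1}^n.
    dist = {x: w / total for x, w in zip(points, weights)}

    def t(j):
        # T_j: sum over subsets S of size j of Pr[all bits indexed by S equal 1].
        return sum(
            sum(p for x, p in dist.items() if all(x[i] for i in subset))
            for subset in itertools.combinations(range(n), j)
        )

    pr_or = sum(p for x, p in dist.items() if any(x))
    partial_sums, running = [], 0.0
    for j in range(1, n + 1):
        running += (-1) ** (j + 1) * t(j)
        partial_sums.append(running)
    return pr_or, partial_sums


pr_or, partial_sums = or_probability_and_partial_sums(n=5)
for terms, value in enumerate(partial_sums, start=1):
    kind = "upper bound" if terms % 2 == 1 else "lower bound"
    print(f"first {terms} terms: {value:.6f} ({kind} on {pr_or:.6f})")
```

With all $n$ terms the partial sum matches the Or probability exactly (up to floating-point error), as inclusion-exclusion says it should.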

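Because this holds under any distribution, if we show that the right-hand side of Equation (1) is approximately the same if the $V_i$ are $n$-wise independent and $k$-wise independent, then the left-hand sides of that equation will also be approximately the same in the two scenarios. Applying this with $V_i := 1 - X_i$, which are $k$-wise independent whenever the $X_i$ are and satisfy $\prod_i X_i = 1 - \mathrm{Or}(V_1, \ldots, V_n)$, gives the result. (Note that $\prod_i \mathbb{E}[X_i]$ equals the expectation of the product of the $X_i$ if the latter are $n$-wise independent.)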

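Now, since the terms $T_j$ only involve expectations of the product of at most $j$ variables, we have that they are the same under $n$-wise and $k$-wise independence up to $j = k$. This would conclude the argument if we can show that $T_k$ is $2^{-\Omega(k)}$ (since this is the quantity that you need to add to go from a lower/upper bound to an upper/lower bound). This is indeed the case if $\sum_i \mathbb{E}[V_i] \le k/(2e)$ because then by Maclaurin’s inequality we have

$$T_k = \sum_{S \subseteq [n],\ |S| = k}\ \prod_{i \in S} \mathbb{E}[V_i] \le \binom{n}{k} \Big(\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[V_i]\Big)^{k} \le \Big(\frac{e}{k} \sum_{i=1}^{n} \mathbb{E}[V_i]\Big)^{k} \le 2^{-k},$$

where the first equality holds because the $V_i$ are $k$-wise independent.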
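As a sanity check on the Maclaurin step, here is a small Python sketch. It assumes the sum of the expectations is exactly $k/(2e)$; that threshold is an illustrative constant and may differ from the one used in [1]. The values `p[i]` stand in for the expectations of the individual variables, and the helper names are hypothetical. The sketch compares the $k$-th elementary symmetric polynomial of the `p[i]` (which is $T_k$ under $k$-wise independence) with the Maclaurin bound and with $2^{-k}$.

```python
import math
import random


def elementary_symmetric(values, k):
    """k-th elementary symmetric polynomial via the standard O(n*k) dynamic program."""
    e = [1.0] + [0.0] * k
    for v in values:
        for j in range(k, 0, -1):
            e[j] += e[j - 1] * v
    return e[k]


def maclaurin_check(n, k, seed=0):
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(n)]
    budget = k / (2 * math.e)  # assumed threshold on the sum of the expectations
    scale = budget / sum(raw)
    p = [r * scale for r in raw]  # stand-ins for the expectations; they sum to the budget
    t_k = elementary_symmetric(p, k)  # T_k when the variables are k-wise independent
    maclaurin_bound = math.comb(n, k) * (sum(p) / n) ** k
    return t_k, maclaurin_bound, 2.0 ** (-k)


t_k, maclaurin_bound, target = maclaurin_check(n=30, k=6)
print(t_k, maclaurin_bound, target)
```

The three printed numbers should be non-decreasing from left to right, mirroring the chain of inequalities in the proof.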

There remains to handle the case where $\sum_i \mathbb{E}[V_i] > k/(2e)$. In this case, the expectations of the $X_i = 1 - V_i$ are so small that even running the above argument over a subset of the random variables is enough. Specifically, let $n' \le n$ be such that $\sum_{i \le n'} \mathbb{E}[V_i] = k/(2e)$ (more formally you can find an $n'$ such that this holds up to an additive $1$, which is enough for the argument; but I will ignore this for simplicity). Then the above argument still applies to the first $n'$ variables. Moreover, the expectation of the product of just these variables is already fairly small. Specifically, because the geometric mean is always at most the arithmetic mean (a fact which is closely related to Maclaurin’s inequality), we have:

$$\prod_{i \le n'} \mathbb{E}[X_i] = \prod_{i \le n'} \big(1 - \mathbb{E}[V_i]\big) \le \Big(1 - \frac{1}{n'} \sum_{i \le n'} \mathbb{E}[V_i]\Big)^{n'} \le \exp\Big(-\sum_{i \le n'} \mathbb{E}[V_i]\Big) = 2^{-\Omega(k)}.$$

Since under any distribution the expectation of the product of the first $n$ variables is at most the expectation of the product of the first $n'$, the result follows.

An interesting fact is that this theorem is completely false if the variables are over $\{-1,1\}$ instead of $\{0,1\}$, even if the independence is $n-1$. The counterexample is well-known: parity. Specifically, take uniform variables $X_1, \ldots, X_{n-1}$ and let $X_n := \prod_{i<n} X_i$. Then $\prod_{i \le n} X_i = 1$ always, while $\prod_{i \le n} \mathbb{E}[X_i] = 0$. What may be slightly less known is that this parity counterexample also shows that the error term in the theorem is tight up to the constant in the $\Omega$. This is because you can write any function on $t$ bits as the disjoint union of at most $2^t$ rectangles.
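The parity counterexample can be verified by brute force. The Python sketch below (function names and the choice $n = 5$ are illustrative) builds the distribution over $\{-1,1\}^n$ in which the first $n-1$ coordinates are uniform and the last is their product. It then checks that the product over every nonempty proper subset of coordinates has expectation 0, which for $\pm 1$ variables is equivalent to $(n-1)$-wise independence with uniform marginals, while the product of all $n$ coordinates has expectation 1.

```python
import itertools


def parity_distribution(n):
    """Support of (X_1, ..., X_n) over {-1, 1}: uniform prefix, last coordinate is its product."""
    support = []
    for prefix in itertools.product([-1, 1], repeat=n - 1):
        last = 1
        for bit in prefix:
            last *= bit
        support.append(prefix + (last,))
    return support  # every point has probability 2^-(n-1)


def expectation_of_product(support, subset):
    """Expectation of the product of the coordinates indexed by `subset`."""
    total = 0
    for x in support:
        prod = 1
        for i in subset:
            prod *= x[i]
        total += prod
    return total / len(support)


n = 5
support = parity_distribution(n)
proper_subsets = [
    s for size in range(1, n) for s in itertools.combinations(range(n), size)
]
max_proper = max(abs(expectation_of_product(support, s)) for s in proper_subsets)
full = expectation_of_product(support, tuple(range(n)))
print("largest |E[product]| over proper subsets:", max_proper)
print("E[product of all n variables]:", full)
```

Under full independence the last expectation would be 0, so the gap is as large as it can be.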

A side aim of this post is to try a new system I put together to put math on wordpress. I’ll test it some more and then post about it later, comparing it with the alternatives. Hopefully, my thinking that it was the lack of this system that prevented me from writing math-heavier posts was not just an excuse to procrastinate.

### References

[1] Guy Even, Oded Goldreich, Michael Luby, Noam Nisan, and Boban Velickovic. Approximations of general independent distributions. In ACM Symp. on the Theory of Computing (STOC), pages 10–16, 1992.

Isn’t this a simpler proof: Use triangle inequality and the fact that 0 <= AND(x_1 … x_n) <= AND(x_1 .. x_k).

| E[\prod_i^n X_i] - E[\prod_i^n U_i] | <= | E[\prod_i^k X_i] | + | E[\prod_i^n U_i] | = 2^-k + 2^-n

Your derivation is for the case in which the expectations are half. That is not the general or the most interesting case.

Thanks. I see. That was not clear.