In a historic effort, the residents of Newton, MA have collected 6,000+ signatures in a very short time; as a result, a forthcoming ballot will include a question on banning recreational marijuana sales in the city. (For background see the previous post and comments.)
There are many classes of functions on bits that we know are fooled by bounded independence, including small-depth circuits, halfspaces, etc. (See this previous post.)
On the other hand the simple parity function is not fooled. It’s easy to see that you require independence at least . However, if you just perturb the bits with a little noise , then parity will be fooled. You can find other examples of functions that are not fooled by bounded independence alone, but are if you just perturb the bits a little.
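To make the parity example concrete, here is a minimal sketch in Python. It is only an illustration under my own choices: the function names, the length n = 8 and the noise rate p = 0.25 are assumptions, not taken from the papers. It uses the uniform distribution over strings of even parity, which looks uniform on any n-1 coordinates yet is told apart from uniform by parity. After XOR-ing independent noise, the bias of parity drops to (1-2p)^n.

```python
from itertools import combinations, product

def even_parity_distribution(n):
    """Uniform distribution over n-bit strings with an even number of ones."""
    support = [x for x in product((0, 1), repeat=n) if sum(x) % 2 == 0]
    return {x: 1.0 / len(support) for x in support}

def is_k_wise_uniform(dist, k):
    """True if the projection of dist on every k coordinates is uniform."""
    n = len(next(iter(dist)))
    for coords in combinations(range(n), k):
        marginal = {}
        for x, px in dist.items():
            key = tuple(x[i] for i in coords)
            marginal[key] = marginal.get(key, 0.0) + px
        if len(marginal) != 2 ** k:
            return False
        if any(abs(v - 2.0 ** -k) > 1e-12 for v in marginal.values()):
            return False
    return True

def parity_bias(dist):
    """E[(-1)^(x_1 + ... + x_n)] under dist; the uniform distribution gives 0."""
    return sum(px * (-1) ** sum(x) for x, px in dist.items())

def add_noise(dist, p):
    """Distribution of x XOR e, each bit of e being 1 independently with probability p."""
    n = len(next(iter(dist)))
    out = {}
    for x, px in dist.items():
        for e in product((0, 1), repeat=n):
            pe = 1.0
            for bit in e:
                pe *= p if bit else 1 - p
            y = tuple(a ^ b for a, b in zip(x, e))
            out[y] = out.get(y, 0.0) + px * pe
    return out

n, p = 8, 0.25  # illustrative values only
d = even_parity_distribution(n)
bias_plain = parity_bias(d)                # parity sees the difference
bias_noisy = parity_bias(add_noise(d, p))  # equals (1 - 2p)^n
```

The brute-force enumeration is exponential in n, so this is only meant for tiny n.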
In  we proved that any distribution with independence about fools space-bounded algorithms, if you perturb it with noise. We asked, both in the paper and many people, if the independence could be lowered. Forbes and Kelley have recently proved  that the independence can be lowered all the way to , which is tight . Shockingly, their proof is nearly identical to !
This exciting result has several interesting consequences. First, we now have almost the same generators for space-bounded computation in a fixed order as we do for any order. Moreover, the proof greatly simplifies a number of works in the literature. And finally, an approach in  to prove limitations for the sum of small-bias generators won’t work for space (possibly justifying some optimism in the power of the sum of small-bias generators).
My understanding of all this area is inseparable from the collaboration I have had with Chin Ho Lee, with whom I co-authored all the papers I have on this topic.
Let be a function. We want to show that it is fooled by , where has independence , is the noise vector of i.i.d. bits coming up with probability say , and is bit-wise XOR.
The approach in  is to decompose as the sum of a function with Fourier degree , and a sum of functions where has no Fourier coefficient of degree less than , and and are bounded. The function is immediately fooled by , and it is shown in  that each is fooled as well.
To explain the decomposition it is best to think of as the product of functions on bits, on disjoint inputs. The decomposition in  is as follows: repeatedly decompose each in low-degree and high-degree . To illustrate:
This works, but the problem is that even if each time has degree , the function increases the degree by at least per decomposition; and so we can afford at most decompositions.
The decomposition in  is instead: pick to be the degree part of , and are all the Fourier coefficients which are non-zero in the inputs to and whose degree in the inputs of is . The functions can be written as , where is the high-degree part of and is .
Once you have this decomposition you can apply the same lemmas in  to get improved bounds. To handle space-bounded computation they extend this argument to matrix-valued functions.
In  we asked for tight “bounded independence plus noise” results for any model, and the question remains. In particular, what about high-degree polynomials modulo ?
Former EPA chief’s resignation confession-of-faith letter according to Breitbart (a website I didn’t know but that I started consulting semi-regularly):
It has been an honor to serve you in the Cabinet as Administrator of the EPA. Truly your confidence in me has blessed me personally and enabled me to advance your agenda beyond what anyone anticipated at the beginning of your administration. Your current steadfastness and resolute commitment to get results for the American people both with regard to improved environmental obstacles and historical regulatory reform is a fact occurring at an unprecedented pace and I thank you for the opportunity to serve you and the American people in helping to achieve those ends. That is why it is hard for me to advise you I am stepping down as administrator of the EPA as of July 6. It is extremely difficult for me to cease serving you in this role, first because I count it as a blessing to be serving you in any capacity, but also because of the transformative work that is occurring; however, the unrelenting attacks on me personally, my family are unprecedented and have taken a sizable toll on all of us. My desire in service to you has always been to bless you as you make important decisions for the American people. I believe you are serving as president today because of God’s providence. I believe that same providence brought me in to your service. I pray as I have served you that I have blessed you and enabled you to effectively lead the American people. Thank you again Mr. President for the honor of serving you and I wish you Godspeed in all that you put your hand to.
It has been an honor to serve you in the Cabinet as Administrator of the EPA. Truly, your confidence in me has blessed me personally and enabled me to advance your agenda beyond what anyone anticipated at the beginning of your Administration. Your courage, steadfastness and resolute commitment to get results for the American people, both with regard to improved environmental outcomes as well as historical regulatory reform, is in fact occurring at an unprecedented pace and I thank you for the opportunity to serve you and the American people in helping achieve those ends.
That is why it is hard for me to advise you I am stepping down as Administrator of the EPA effective as of July 6. It is extremely difficult for me to cease serving you in this role first because I count it a blessing to be serving you in any capacity, but also, because of the transformative work that is occurring. However, the unrelenting attacks on me personally, my family, are unprecedented and have taken a sizable toll on all of us.
My desire in service to you has always been to bless you as you make important decisions for the American people. I believe you are serving as President today because of God’s providence. I believe that same providence brought me into your service. I pray as I have served you that I have blessed you and enabled you to effectively lead the American people. Thank you again Mr. President for the honor of serving you and I wish you Godspeed in all that you put your hand to.
The letter also makes me think that I should have added “to worship God” to this list.
If you are a resident of Newton, MA, sign this petition.
In 2016 Massachusetts voters voted to legalize marijuana. Except they didn’t know what they were voting for! In Colorado and Washington, the questions of legalization and commercialization were completely separate. The marijuana industry apparently learned from that and rigged the Massachusetts ballot question so that a voter legalizing marijuana would also be mandating communities to open marijuana stores. For Newton, MA, this means at least 8 stores. When voters were recently polled, it became clear that the vast majority did not know that this was at stake, and that the majority of them in fact do not want to open marijuana stores in their communities. For example, when I voted I didn’t know that this was at stake. Read the official Massachusetts document to inform voters, see especially the summary on pages 12-13. There is no hint that a community would be mandated by state law to open marijuana stores unless it goes through an additional legislative crusade. Instead it says that communities can choose. I think I even read the summary back then.
Now to avoid opening stores in Newton, MA, we need a new ballot question. The City Council could have put this question on the ballot easily, but a few days ago decided that it won’t by a vote of 13 to 8. You can find the list of names of councilors and how they voted here.
Note that the council was not deciding whether or not to open stores, it was just deciding whether or not we should have a question about this on the ballot.
Instead now we are stuck doing things the hard way. To put this question on the ballot, we need to collect 6000 signatures, or 9000 if the city is completely uncooperative, a possibility which now unfortunately cannot be dismissed.
However we must do it, for the alternative is too awful. Most of the surrounding towns (Wellesley, Weston, Needham, Dedham, etc.) have already opted out. So if Newton opens stores, it basically becomes the hub for west suburban marijuana users, at least some of whom would drive under the influence of marijuana (conveniently undetectable). Proposed store locations include sites on the way to elementary schools, and there is an amusing proposal to open a marijuana store in a prime Newton Center Location, after Peet’s Coffee moves out (they lost the bid for renewal of the lease). The owners of the space admit that people have asked them for a small grocery store instead, but they think that a marijuana store would bring more traffic and business to Newton Center. I told them to open a gym instead. That too would bring traffic and business, but in addition it would have other benefits that cannabis does not have.
This is the post about l2w version 1.0, a LaTeX to WordPress converter painstakingly put together by me with big help from the LaTeX community. Click here to download it. Below is an example of what you can do, taken at random from my class notes which were compiled with this script. I also used this in conjunction with LyX for several posts such as I believe P=NP, so you can also call this a LyX to WordPress converter. I just export to LaTeX and then run l2w.
This might work out of the box. In more detail, it needs tex4ht (which is included e.g. in MiKTeX distributions) and Perl (the script only uses minimalistic, shell perl commands). Simply unzip l2w.zip, which contains four files. The file post.tex is this document, which you can edit. To compile, run l2w.bat (which calls myConfig5.cfg). This will create the output post.html which you can copy and paste into the WordPress HTML editor. I have tested it on an old Windows XP machine, and on a more recent Windows 7 machine with MiKTeX 2.9. I haven’t tested it on Linux, which might require some simple changes to l2w.bat. For LyX I add certain commands in the preamble, and as an example the .lyx source of the post I believe P=NP is included in the zip archive.
The non-math source is compiled using full-fledged LaTeX, which means you can use your own macros and bibliography. The math source is not compiled, but more or less left as is for wordpress, which has its own LaTeX interpreter. This means that you can’t use your own macros in math mode. For the same reason, label and ref of equations are a problem. To make them work, the script fetches their values from the .aux file and then crudely applies them. This is a hack with a rather unreadable script; however, it works for me. One catch: your labels should start with eq:.
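To illustrate the label and ref hack, here is a rough sketch in Python. This is not the actual l2w script (which uses Perl), and the function names are hypothetical. It assumes the usual .aux line shape \newlabel{key}{{number}{page}...} and, following the catch above, only looks at labels starting with eq:.

```python
import re

# Matches lines such as \newlabel{eq:main}{{2}{1}} and captures key and number.
NEWLABEL = re.compile(r"\\newlabel\{(eq:[^}]*)\}\{\{([^}]*)\}")

def read_equation_labels(aux_text):
    """Map each eq: label found in the .aux text to its equation number."""
    return {m.group(1): m.group(2) for m in NEWLABEL.finditer(aux_text)}

def apply_references(source, labels):
    """Replace \\ref{eq:...} by the number and \\eqref{eq:...} by (number)."""
    def substitute(match):
        command, key = match.group(1), match.group(2)
        number = labels.get(key)
        if number is None:
            return match.group(0)  # unknown label: leave the command untouched
        return f"({number})" if command == "eqref" else number
    return re.sub(r"\\(ref|eqref)\{(eq:[^}]*)\}", substitute, source)

# Tiny made-up example of an .aux file and a source fragment.
aux_example = "\\relax\n\\newlabel{eq:main}{{2}{1}}\n\\newlabel{sec:intro}{{1}{1}}\n"
labels = read_equation_labels(aux_example)
resolved = apply_references("By Equation \\eqref{eq:main} and \\ref{eq:other}.", labels)
```

Packages such as hyperref append extra fields to \newlabel; the pattern above only reads the first one, which is the number.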
I hope this will spare you the enormous amount of time it took me to arrive at this solution. Let me know if you use it!
First, some of the problematic math references:
Lemma 1. Suppose that distributions over are -wise indistinguishable distributions; and distributions over are -wise indistinguishable distributions. Define over as follows:
: draw a sample from , and replace each bit by a sample of (independently).
Then and are -wise indistinguishable.
To finish the proof of the lower bound on the approximate degree of the AND-OR function, it remains to see that AND-OR can distinguish well the distributions and . For this, we begin with observing that we can assume without loss of generality that the distributions have disjoint supports.
Claim 2. For any function , and for any -wise indistinguishable distributions and , if can distinguish and with probability then there are distributions and with the same properties (-wise indistinguishability yet distinguishable by ) and also with disjoint supports. (By disjoint support we mean for any either or .)
Proof. Let distribution be the “common part” of and . That is to say, we define such that multiplied by some constant that normalizes into a distribution.
Then we can write and as
where , and are two distributions. Clearly and have disjoint supports.
Then we have
Therefore if can distinguish and with probability then it can also distinguish and with such probability.
Similarly, for all such that , we have
Hence, and are -wise indistinguishable.
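The construction in this proof can be sketched in a few lines of Python. This is only an illustration under my own naming: the common part is taken to be the pointwise minimum of the two probability functions, c is its total mass, and the new distributions are what is left after subtracting it and renormalizing by 1 - c. The example distributions and the test function are made up.

```python
from fractions import Fraction

def remove_common_part(a0, a1):
    """Split two distributions (dicts from outcomes to probabilities) into a
    shared part of mass c and two renormalized leftovers with disjoint supports."""
    support = set(a0) | set(a1)
    common = {x: min(a0.get(x, 0), a1.get(x, 0)) for x in support}
    c = sum(common.values())
    if c == 1:
        raise ValueError("the two distributions are identical")
    b0 = {x: (a0[x] - common[x]) / (1 - c) for x in a0 if a0[x] > common[x]}
    b1 = {x: (a1[x] - common[x]) / (1 - c) for x in a1 if a1[x] > common[x]}
    return b0, b1, c

def expectation(f, dist):
    """Expected value of f under dist."""
    return sum(p * f(x) for x, p in dist.items())

# Made-up example: two distributions on 2 bits with uniform single-bit marginals.
half, quarter = Fraction(1, 2), Fraction(1, 4)
a0 = {(0, 0): half, (1, 1): half}
a1 = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
b0, b1, c = remove_common_part(a0, a1)

xor = lambda x: x[0] ^ x[1]
gap_a = expectation(xor, a0) - expectation(xor, a1)
gap_b = expectation(xor, b0) - expectation(xor, b1)
```

In this example the distinguishing gap gets divided by 1 - c, so it can only grow in absolute value, matching the claim.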
Theorem 3. AND-OR.
Proof. Let be -wise indistinguishable distributions for AND with advantage , i.e. . Let be -wise indistinguishable distributions for OR with advantage . By the above claim, we can assume that have disjoint supports, and the same for . Compose them by the lemma, getting -wise indistinguishable distributions . We now show that AND-OR can distinguish :
- : First sample . As there exists a unique such that , . Thus by disjointness of support . Therefore when sampling we always get a string with at least one “”. But then “” is replaced with sample from . We have , and when , AND-OR.
- : First sample , and we know that with probability at least . Each bit “” is replaced by a sample from , and we know that by disjointness of support. Then AND-OR.
Therefore we have AND-OR.
Definition 4. The surjectivity function SURJ, which takes input where for all , has value if and only if .
First, some history. Aaronson first proved that the approximate degree of SURJ and other functions on bits including “the collision problem” is . This was motivated by an application in quantum computing. Before this result, even a lower bound of had not been known. Later Shi improved the lower bound to , see [AS04]. The instructor believes that the quantum framework may have blocked some people from studying this problem, though it may have very well attracted others. Recently Bun and Thaler [BT17] reproved the lower bound, but in a quantum-free paper, and introducing some different intuition. Soon after, together with Kothari, they proved [BKT17] that the approximate degree of SURJ is .
We shall now prove the lower bound, though one piece is only sketched. Again we present some things in a different way from the papers.
For the proof, we consider the AND-OR function under the promise that the Hamming weight of the input bits is at most . Call the approximate degree of AND-OR under this promise AND-OR. Then we can prove the following theorems.
Theorem 6. AND-OR for some suitable .
In our settings, we consider . Theorem 5 shows surprisingly that we can somehow “shrink” bits of input into bits while maintaining the approximate degree of the function, under some promise. Without this promise, we just showed in the last subsection that the approximate degree of AND-OR is instead of as in Theorem 6.
Proof of Theorem 5. Define an matrix s.t. the 0/1 variable is the entry in the -th row -th column, and iff . We can prove this theorem in the following steps:
- SURJAND-OR under the promise that each row has weight ;
- let be the sum of the -th column, then AND-OR under the promise that each row has weight , is at least AND-OR under the promise that ;
- AND-OR under the promise that , is at least AND-OR;
- we can change “” into “”.
Now we prove this theorem step by step.
- Let be a polynomial for SURJ, where . Then we have
Then the polynomial for AND-OR is the polynomial with replaced as above, thus the degree won’t increase. Correctness follows by the promise.
- This is the most extraordinary step, due to Ambainis [Amb05]. In this notation, AND-OR becomes the indicator function of . Define
Clearly it is a good approximation of AND-OR. It remains to show that it’s a polynomial of degree in ’s if is a polynomial of degree in ’s.
Let’s look at one monomial of degree in : . Observe that all ’s are distinct by the promise, and by over . By chain rule we have
By symmetry we have , which is linear in ’s. To get , we know that every other entry in row is , so we give away row , average over ’s such that under the promise and consistent with ’s. Therefore
In general we have
which has degree in ’s. Therefore the degree of is not larger than that of .
- Note that , . Hence by replacing ’s by ’s, the degree won’t increase.
- We can add a “slack” variable , or equivalently ; then the condition actually means .
Proof idea for Theorem 6. First, by the duality argument we can verify that if and only if there exist -wise indistinguishable distributions such that:
- can distinguish ;
- and are supported on strings of weight .
Claim 7. OR.
The proof needs a little more information about the weight distribution of the indistinguishable distributions corresponding to this claim. Basically, their expected weight is very small.
Now we combine these distributions with the usual ones for And using the lemma mentioned at the beginning.
What remains to show is that the final distribution is supported on Hamming weight . Because by construction the copies of the distributions for Or are sampled independently, we can use concentration of measure to prove a tail bound. This gives that all but an exponentially small measure of the distribution is supported on strings of weight . The final step of the proof consists of slightly tweaking the distributions to make that measure .
Groups have many applications in theoretical computer science. Barrington [Bar89] used the permutation group to prove a very surprising result, which states that the majority function can be computed efficiently using only a constant number of bits of memory (something which was conjectured to be false). More recently, catalytic computation [BCK14] shows that if we have a lot of memory, but it’s full of junk that cannot be erased, we can still compute more than if we had little memory. We will see some interesting properties of groups in the following.
Some famous groups used in computer science are:
- with bit-wise addition;
- with addition mod ;
- , which are permutations of elements;
- Wreath product , whose elements are of the form where is a “flip bit”, with the following multiplication rules:
- in ;
- is the operation;
An example is . Generally we have
- matrices over with determinant in other words, group of matrices such that .
The group was invented by Galois. (If you haven’t, read his biography on Wikipedia.)
Quiz. Among these groups, which is the “least abelian”? The latter can be defined in several ways. We focus on this: If we have two high-entropy distributions over , does their product have more entropy? For example, if and are uniform over some elements, is close to uniform over ? By “close to” we mean that the statistical distance is less than a small constant from the uniform distribution. For , if uniform over , then is the same, so there is no entropy increase even though and are uniform on half the elements.
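The abelian example at the end of the quiz can be checked by brute force. Here is a small sketch in Python, with my own choice of parameters: the group is bit strings of length 4 under XOR, and both distributions are uniform over the strings whose first bit is 0, which is half the group. The function names are made up for the illustration.

```python
from fractions import Fraction
from itertools import product

def convolve(a, b, op):
    """Distribution of op(x, y) for independent x drawn from a and y from b."""
    out = {}
    for x, px in a.items():
        for y, py in b.items():
            z = op(x, y)
            out[z] = out.get(z, 0) + px * py
    return out

def xor(x, y):
    """Group operation of bit strings under bit-wise addition."""
    return tuple(s ^ t for s, t in zip(x, y))

def collision_probability(dist):
    """Probability that two independent samples from dist are equal."""
    return sum(p * p for p in dist.values())

def distance_from_uniform(dist, group_size):
    """Statistical distance between dist and the uniform distribution on the group."""
    u = Fraction(1, group_size)
    on_support = sum(abs(p - u) for p in dist.values())
    off_support = (group_size - len(dist)) * u
    return (on_support + off_support) / 2

n = 4  # illustrative size
group_size = 2 ** n
subgroup = [x for x in product((0, 1), repeat=n) if x[0] == 0]
a = {x: Fraction(1, len(subgroup)) for x in subgroup}
ab = convolve(a, a, xor)  # product of two independent samples
```

Here ab equals a exactly, so the collision probability stays at 2 over the group size and the distance from uniform stays at 1/2.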
Definition 8.[Measure of Entropy] For , we think of for “high entropy”.
Note that is exactly the “collision probability”, i.e. . We will consider the entropy of the uniform distribution as very small, i.e. . Then we have
where is the minimum dimension of irreducible representation of .
By this theorem, for high entropy distributions and , we get , thus we have
If is large, then is very close to uniform. The following table shows the ’s for the groups we’ve introduced.
|should be very small|
Here is the alternating group of even permutations. We can see that for the first groups, Equation (2) doesn’t give non-trivial bounds.
But for we get a non-trivial bound, and for we get a strong bound: we have .
[BCK14] Harry Buhrman, Richard Cleve, Michal Koucký, Bruno Loff, and Florian Speelman. Computing with a full memory: catalytic space. In ACM Symp. on the Theory of Computing (STOC), pages 857–866, 2014.
To dismantle environmental regulations in exchange for gifts.
To channel taxpayers’ money towards a luxurious and extravagant lifestyle for its administrator.
To repudiate its own mission.
All of the above.
To protect human health and the environment.
The Italian newspaper Il fatto quotidiano just published online an interview with me, part of a series about Italian expats. You can read it in English by pasting it into Google Translate. Please do not take every sentence, including the opening, as absolute. Besides what is lost in translation, some thoughts have been de-contextualized, without my opposition, I think to make the narrative more gripping.
The main difference? “That in America, the degree you buy it. In Italy you must deserve it “. Emanuele Viola left Italy in 2001, during his doctorate at the Sapienza University of Rome. “I gave up a scholarship for a PhD at Harvard – he recalls -. Then I moved to Princeton, Columbia and Boston. ” Today he is a professor of theoretical computer science at Northeastern University in Boston. Return? “Yes, I hope to come back one day”.
Emanuele, born in 1977, was born in Rome. At 14, he programmed the video game Nathan Never, followed by Black Viper. At the age of 24, he traveled to the United States for a doctorate in computer science at Harvard University, followed by a postdoc at the Institute of Advanced Study in Princeton and one at Columbia University. “Then I became a professor at Northeastern University in Boston, where I received my professorship a few years ago.”
The typical day may vary based on academic work. “Personally, I work better if I spend a lot of time at home in almost complete isolation – explains Emanuele -. If I do not have to teach, I usually stand in front of a blank sheet trying to solve some problems – continues – until finally it’s time for my walk in the woods, so at least in one thing I can feel close to Einstein and Darwin, “he smiles. “I go to university a few days a week to teach or to attend various meetings. But I often connect via Skype “.
Italy misses him a lot, has less time to visit and the difference with the American academic world is drastic: “American universities are direct as companies in competition with each other, constantly looking for more money, better teachers and better students. Here, after you’ve been admitted, it’s almost as if you already had a degree in your pocket. It’s not exactly like this in Italy: of the 200 of my course – he recalls – I was the only one who graduated in five years, that is, not going out of course “.
For Emanuele then, the academic world and Italian research has not only fund problems. Rather. “A hundred years ago, it was typical for an American scholar to spend a period of training in Europe – continues Emanuele -. In a few generations, the situation has exactly reversed “. In this sense the problem of Italy is also that of the rest of Europe and other parts of the world. “America has amassed so many brilliant minds from all over the world that it is very difficult for another nation to be competitive, regardless of funding. Indeed, those in the European community are substantial and competitive. Right now “in America there are not many funds – he specifies – especially for the theory”.
The situation is reversed for the doctorate. “Here it does not have a fixed duration: if you do not throw yourself out, you go out when you have competitive publications, so it can take you even six or seven years. In Italy, the pre-established duration is three years, once absolutely insufficient to produce competitive publications “. This difference is also due to the fact that in the United States the salary of the student comes from the advisor, in Italy mainly from a government grant.
If we talk about training, in short, the subject changes. “Personally, I consider the instruction I received almost gratuitously at Sapienza, much more solid than the typical American preparation. This however reverses completely for advanced studies. Here there are more chances for deserving students. In Italy there is very little research in my field “.
The most beautiful memories? The rare moments when the clear sensation of solving a mathematical problem arrives. “It happened to me once while rolling on my ball and three times while walking through cemeteries,” he smiles. The goal for Emanuele is to return to Italy, even if with the family in America it is not easy. “For some time I have been planning a sabbatical year in Italy. I hope to get in touch with the contacts and that maybe one day not too far they will translate into a return “.
The environment in a private university where taxes exceed 50 thousand dollars a year is completely different from “what I remember from my student days”. Yet Emanuele is keen to say something: “No, I do not want to give the impression that money makes a big difference. The fact is that America has succeeded in attracting the best minds from all over the world – he concludes -. And no other country has succeeded “.
Guest post by Abhishek Bhrushundi.
I would like to thank Emanuele for giving me the opportunity to write a guest post here. I recently stumbled upon an old post on this blog which discussed two papers: Nonclassical polynomials as a barrier to polynomial lower bounds by Bhowmick and Lovett, and Anti-concentration for random polynomials by Nguyen and Vu. Towards the end of the post, Emanuele writes:
“Having discussed these two papers in a sequence, a natural question is whether non-classical polynomials help for exact computation as considered in the second paper. In fact, this question is asked in the paper by Bhowmick and Lovett, who conjecture that the answer is negative: for exact computation, non-classical polynomials should not do better than classical.”
In a joint work with Prahladh Harsha and Srikanth Srinivasan from last year, On polynomial approximations over Z/2^kZ, we study exact computation of Boolean functions by nonclassical polynomials. In particular, one of our results disproves the aforementioned conjecture of Bhowmick and Lovett by giving an example of a Boolean function for which low degree nonclassical polynomials end up doing better than classical polynomials of the same degree in the case of exact computation.
The counterexample we propose is the elementary symmetric polynomial of degree in . (Such elementary symmetric polynomials also serve as counterexamples to the inverse conjecture for the Gowers norm [LMS11, GT07], and this was indeed the reason why we picked these functions as candidate counterexamples),
where is the Hamming weight of . One can verify (using, for example, Lucas’s theorem) that if and only if the least significant bit of is .
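For readers who want to check the Lucas's theorem step by computer, here is a small sketch in Python. The exact degree is missing from the text above, so I assume it is a power of two, d = 2^k. In that case the elementary symmetric polynomial of degree d over the two-element field equals the binomial coefficient of the Hamming weight and d modulo 2, and that in turn equals bit number k of the Hamming weight (counting from 0). The function names are mine.

```python
from itertools import combinations, product
from math import comb

def elementary_symmetric_mod2(x, d):
    """Sum over all d-subsets of the product of the chosen bits, modulo 2."""
    total = 0
    for subset in combinations(range(len(x)), d):
        if all(x[i] for i in subset):
            total += 1
    return total % 2

def check_lucas(n, k):
    """Check on all n-bit inputs that S_d(x) = C(|x|, d) mod 2 = bit k of |x|, d = 2^k."""
    d = 2 ** k
    for x in product((0, 1), repeat=n):
        w = sum(x)  # Hamming weight of x
        value = elementary_symmetric_mod2(x, d)
        if value != comb(w, d) % 2 or value != ((w >> k) & 1):
            return False
    return True

# Exhaustive check for 7 input bits and degrees 1, 2, 4.
results = [check_lucas(7, k) for k in range(3)]
```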
Theorem 1. Let be a polynomial of degree at most in . Then
[Emanuele’s note. Let me take advantage of this for a historical remark. Green and Tao first claimed this fact and sent me and several others a complicated proof. Then I pointed out the paper by Alon and Beigel [AB01]. Soon after they and I independently discovered the short proof reported in [GT07].]
The constant functions (degree polynomials) can compute any Boolean function on half of the points in and this result shows that even polynomials of higher degree don’t do any better as far as is concerned. What we prove is that there is a nonclassical polynomial of degree that computes on of the points in .
Theorem 2. There is a nonclassical polynomial of degree such that
A nonclassical polynomial takes values on the torus and in order to compare the output of a Boolean function (i.e., a classical polynomial) to that of a nonclassical polynomial it is convenient to think of the range of Boolean functions to be . So, for example, if , and otherwise. Here denotes the least significant bit of .
We show that the nonclassical polynomial that computes on of the points in is
The degree of this nonclassical polynomial is but I won’t go into much detail as to why this is the case (see [BL15] for a primer on the notion of degree in the nonclassical world).
Understanding how behaves comes down to figuring out the largest power of two that divides for a given : if the largest power of two that divides is then , otherwise if the largest power is at least then . Fortunately, there is a generalization of Lucas’s theorem, known as Kummer’s theorem, that helps characterize this:
Theorem 3.[Kummer’s theorem] The largest power of dividing for , , is equal to the number of borrows required when subtracting from in base .
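Kummer's theorem is easy to test numerically. The following Python sketch, with function names of my own choosing, counts the borrows in the schoolbook subtraction of m from n in base p and compares the count with the exponent of p in the binomial coefficient C(n, m), for small n and a few primes.

```python
from math import comb

def borrows(n, m, p):
    """Number of borrows when subtracting m from n in base p (assumes 0 <= m <= n)."""
    count = 0
    borrow = 0
    while n > 0 or m > 0:
        digit_n, digit_m = n % p, m % p
        if digit_n - digit_m - borrow < 0:
            borrow = 1
            count += 1
        else:
            borrow = 0
        n //= p
        m //= p
    return count

def valuation(value, p):
    """Largest e such that p^e divides value (assumes value > 0)."""
    e = 0
    while value % p == 0:
        value //= p
        e += 1
    return e

# Compare the two quantities for all 0 <= m <= n < 60 and p in {2, 3, 5}.
kummer_holds = all(
    borrows(n, m, p) == valuation(comb(n, m), p)
    for p in (2, 3, 5)
    for n in range(1, 60)
    for m in range(n + 1)
)
```

For instance, subtracting 1 from 4 in base 2 takes two borrows, and C(4, 1) = 4 is divisible by exactly the second power of 2.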
Equipped with Kummer’s theorem, it doesn’t take much work to arrive at the following conclusion.
Lemma 4. if either or , where denotes the least significant bit of .
If is uniformly distributed in then it’s not hard to verify that the bits are almost uniformly and independently distributed in , and so the above lemma proves that computes on of the points in . It turns out that one can easily generalize the above argument to show that is a counterexample to Bhowmick and Lovett’s conjecture for every .
We also show in our paper that it is not the case that nonclassical polynomials always do better than classical polynomials in the case of exact computation — for the majority function, nonclassical polynomials do as badly as their classical counterparts (this was also conjectured by Bhowmick and Lovett in the same work), and the Razborov-Smolensky bound for classical polynomials extends to nonclassical polynomials.
We started out trying to prove that is a counterexample but couldn’t. It would be interesting to check if it is one.
Sometimes you see quantum popping up everywhere. I just did the opposite and gave a classical talk at a quantum workshop, part of an AMS meeting held at Northeastern University, which poured yet another avalanche of talks onto the Boston area. I spoke about the complexity of distributions, also featured in an earlier post, including a result I posted two weeks ago which gives a boolean function such that the output distribution of any AC^0 circuit has statistical distance from for uniform . In particular, no AC^0 circuit can compute much better than guessing at random even if the circuit is allowed to sample the input itself. The slides for the talk are here.
The new technique that enables this result I’ve called entropy polarization. Basically, for every AC^0 circuit mapping any number of bits into bits, there exists a small set of restrictions such that:
(1) the restrictions preserve the output distribution, and
(2) for every restriction , the output distribution of the circuit restricted to either has min-entropy or . Whence polarization: the entropy will become either very small or very large.
Such a result is useless and trivial to prove with ; the critical feature is that one can obtain a much smaller of size .
Entropy polarization can be used in conjunction with a previous technique of mine that works for high min-entropy distributions to obtain the said sampling lower bound.
It would be interesting to see if any of this machinery can yield a separation between quantum and classical sampling for constant-depth circuits, which is probably a reason why I was invited to give this talk.
The organizers asked me to advertise this and I sympathize:
We are pleased to announce that we will provide pooled, subsidized child care at STOC 2018. The cost will be $40 per day per child for regular conference attendees, and $20 per day per child for students.
For more detailed information, including how to register for STOC 2018 childcare, see http://acm-stoc.org/stoc2018/childcare.html
Ilias Diakonikolas and David Kempe (local arrangements chairs)