Eliminate all formatting requirements (+ survival tip)

Our conference submission was just desk-rejected because the PC is unwilling to stop reading at page 12.

We were asked to format submissions according to the LIPIcs style file, which we did, and to limit the submission length to 12 pages, excluding references; omitted proofs could be placed in the appendix. Our submission was 14.5 pages, excluding references, with no appendix.

Over the years I have submitted and also reviewed many papers that went slightly beyond the page limit; and nobody paid attention. In a few cases I have also seen PC chairs asking for the resubmission of papers which egregiously violated the formatting requirements, such as crammed, 10-point, 2-column submissions. In the present case, despite the good intentions evidenced by our use of the LIPIcs style file, our misplacement of the \bibliography command was just too severe to be offered a second chance.

In general, there have been many discussions about formatting requirements, for example here.  The summary of my position is in the title, and details follow.

Of all the useless, time-consuming rules, imposing formatting requirements strikes me as particularly outrageous, because it is one that could actually be fixed. I think it is clear to everyone that the proper formatting of a paper has zero relevance compared to the myriad of actual problems that can affect a submission, such as a poorly written introduction, no intuitive exposition of the proof techniques, or missing citations. The only definite outcome of formatting requirements is that authors waste time.

I like the following paragraph from this article:

Science fiction novels of a half-century ago dramatized conflicts between humans and robots, asking if people were controlling their technologies, or if the machines were actually in charge. A few decades later, with the digital revolution in juggernaut mode, the verdict is in. The robots have won. Although the automatons were supposedly going to free people by taking on life’s menial, repetitive tasks, frequently, technological innovation actually offloads such jobs onto human beings.

I ponder it every time I need to waste 30 minutes with LaTeX, instead of being given the obvious option of submitting a .pdf file and possibly a .tex file from which a computer would automatically extract author names, title, and all other relevant information, including categories. They too can be quite accurately deduced from the text of the paper, can’t they? The waste of time is magnified if your paper has pictures (ours had two, carefully prepared with LaTeX extensions). No wonder we don’t see many pictures in papers.

I am not aware of any benefit of forcing papers into different LaTeX styles, except one: publishers can better monetize our donations if they are properly formatted. So this is one more reason to kill the current editorial system. The role of a conference/journal should be to place quality stamps on online papers, not to drag people into a battle with LaTeX.

Survival tip: One day I felt particularly vexed by being forced to convert all my LaTeX single-line equations, which fit nicely in a single-column format, into align environments that could fit the 2-column proceedings format. So I offered $100 for a package that would make the process easier. I wanted a package that could recognize whether I was using the “&” or the “\\” commands and, if so, switch to align or to multline. It should also recognize whether I was using labels (if not, I need align*, not align), and so on.
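
In (hypothetical) LaTeX, the kind of dispatch I had in mind looks roughly like the toy sketch below. This is my own illustration, not the package I was eventually given; it only handles the align/multline/equation choice and ignores labels and the hooking into \[ \].

    % Toy sketch (mine, for illustration; not the package mentioned below).
    % Collect the body of a display and pick align*/multline*/equation*
    % depending on whether it contains "&" or "\\".
    \documentclass{article}
    \usepackage{amsmath}
    \usepackage{xparse}
    \ExplSyntaxOn
    \NewDocumentEnvironment{autodisplay}{+b}
      {
        \tl_if_in:nnTF {#1} {&}
          { \begin{align*} #1 \end{align*} }           % alignment tabs: align
          {
            \tl_if_in:nnTF {#1} {\\}
              { \begin{multline*} #1 \end{multline*} } % only line breaks: multline
              { \begin{equation*} #1 \end{equation*} } % otherwise: plain display
          }
      }
      {}
    \ExplSyntaxOff
    \begin{document}
    \begin{autodisplay}
      a &= b + c \\
        &= d
    \end{autodisplay}
    \end{document}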

Of course, a LaTeX saint quickly provided a package, and turned down my $100. I have used this package ever since. I only write \[ \] and then magically the computer knows what I need. At least for this skirmish, I have won and have offloaded a mindless task onto a computer…

…though good luck making this work with the next formatting requirements.

Communication Complexity in Banff

I am back from the first ever workshop on communication complexity, held in Banff.  As if to prove the great unifying perspective of communication complexity, the workshop had talks on a wide variety of topics, from data structures to circuit lower bounds to streaming algorithms.

On the first day Larsen gave a survey talk on static data structures.  I hadn’t realized that it is still open to prove a superlogarithmic lower bound for any explicit problem with polynomially many queries and linear space.  Larsen mentioned a sampling technique which he refined to obtain such logarithmic bounds, and which he patiently re-explained to me after his talk and even after the workshop.

So consider the problem of storing a polynomial of degree d over a finite field F of size |F| = d^2 so that you can answer evaluation queries by probing only t cells, possibly adaptively.  All these bounds are in the cell-probe model, with cells of log|F| = 2 log d bits.  We will show that a data structure using linear space, S = O(d) cells, and t = o(log d) probes implies an encoding of the polynomial of length less than (d+1) log |F| bits, which is impossible because there are |F|^{d+1} such polynomials.

The idea for the encoding is simply to select from the S cells of the data structure a uniform set of pS cells, and to write them down together with their indexes.  (Note that adding the indexes at most doubles the number of cells, because an index into the S = O(d) cells takes about log d bits, which fits in one cell of 2 log d bits.)  The length 2 pS log |F| of this encoding will be less than (d+1) log |F| as soon as p is a small enough constant.  To see that this is sufficient to reconstruct a polynomial, fix one.  For each evaluation query, the encoding will contain all the t cells needed to answer it with probability (S – t choose pS – t) / (S choose pS) = Omega(p^t).  There are |F| queries, so you expect to be able to answer |F| p^t of them, which in turn means that there exists a set of pS cells allowing you to answer that many.  If t = o(log d), the quantity |F| p^t is > d, and you get your polynomial back by simple interpolation.
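
Here is a small numeric sanity check of the two inequalities; the concrete values of d, p, and t are my own, chosen purely for illustration (they are not from the talk).

    # Toy sanity check of the encoding argument for polynomial evaluation.
    # Check (i): the encoding of 2*p*S cells is shorter than (d+1)*log|F| bits.
    # Check (ii): in expectation, more than d of the |F| evaluation queries can be
    # answered from the sampled cells, enough to interpolate a degree-d polynomial.
    from math import log2

    d = 10_000            # polynomial degree (hypothetical concrete value)
    F = d * d             # field size |F| = d^2
    S = 2 * d             # linear space: S = O(d) cells
    t = 3                 # cell probes per query; the argument needs t = o(log d)
    p = 0.1               # sampling rate: keep a uniform set of p*S cells
    kept = int(p * S)

    cell_bits = log2(F)                     # each cell holds log|F| = 2 log d bits
    encoding_bits = 2 * kept * cell_bits    # sampled cells plus their indexes
    trivial_bits = (d + 1) * cell_bits      # bits needed to specify the polynomial
    print(encoding_bits < trivial_bits)     # (i) True

    # Probability that the t cells probed by a fixed query all survive the
    # sampling: (S-t choose pS-t)/(S choose pS) = prod_i (pS-i)/(S-i) = Omega(p^t).
    pr = 1.0
    for i in range(t):
        pr *= (kept - i) / (S - i)
    print(F * pr > d)                       # (ii) True: expected answerable queries > d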

This can be extended to boolean queries by considering the d^4 queries (x,a) with meaning: does the polynomial evaluated at x equal a?  One can run the same argument as above, but focusing only on the queries whose answer is “yes.”

Beating this logarithmic bound strikes me as a fundamental question. Though after being in Banff the real question is why isn’t life one very long Banff workshop, where you have an endless stream of interesting talks and amazing companions for hikes, and the cafeteria serves an uncommon variety of foods — always including a choice of plain vegetables and starch with absolutely no seasoning — to consume within glass walls overlooking the Rocky Mountains.

Weinstein gave a survey talk about information complexity, which was followed by a great talk on the very recent work by Ganor, Kol, and Raz showing a separation between information and communication.  I hadn’t thought much about information complexity, so I can’t say I am surprised by this result, but I heard there were some researchers who believed that information equals communication.  This result has the nice philosophical interpretation that sometimes long conversations in which little information is exchanged — an experience which may be common — are necessary.

Yehudayoff gave a nice presentation about the communication complexity of disjointness, covering his simplification with Rao of a recent lower bound by Sherstov.  Sherstov’s argument first bounds by 2^c (4^k/l)^m the correlation of c-bit protocols with the parity of m disjoint copies of k-party disjointness, each on l bits.  This step is similar to the classical Babai-Nisan-Szegedy lower bound for the parity of ANDs, and although it requires a bit of extra work, it is as expected.

From this, an Omega(n/4^k) lower bound for deterministic protocols follows immediately, because a protocol for disjointness implies one with correlation 2^{-m} with the above parity.  (Just set l = 10 x 4^k and note this yields the lower bound c > m = Omega(n / 4^k), for otherwise the bound in the previous paragraph would be much smaller than the bias 2^{-m}.)
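
To make the parenthetical arithmetic concrete, here is the calculation with toy values of k and n (my own choices, purely illustrative):

    # Plugging numbers into the parameter setting above (toy values of my own).
    from math import log2

    k, n = 3, 10**6
    l = 10 * 4**k                 # block length per copy
    m = n // l                    # number of copies, m = Omega(n / 4^k)
    # correlation bound: 2^c * (4^k / l)^m = 2^c * 10^(-m); it must be at least
    # the protocol's correlation 2^(-m), hence 2^c >= 5^m, i.e. c >= m * log2(5) > m.
    min_c = m * log2(5)
    print(l, m, round(min_c))     # 640 1562 3627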

The bound for randomized protocols is more involved.  Roughly, the next step is to infer from the above bound various bounds on the correlation conditioned on the number of copies of disjointness being one.  For this, Sherstov defines and makes use of a distribution on each copy that is rerandomizable: given an instance, the players can sample fresh instances with the same value.  This step is also not too hard to grasp.  What appears to be very ingenious is the last step, where you relate these bounds to disjointness itself by showing that these protocols can be turned into a good low-degree approximation, in the infinity norm, of the AND function on n bits, which is known to be impossible.  I understand it is in this last step that the bulk of the simplification lies; Sherstov’s argument is replaced with a reportedly more direct argument which is basically just an appropriate linear transformation.

McGregor presented a surprising result.  Suppose n players correspond to the n nodes of a graph, and player i knows the nodes adjacent to i but knows nothing else about the graph.  It is possible for each player to send only polylog(n) bits to a referee, who can then determine whether the graph is connected (with a small error probability, taken over public coins).  The proof is just as neat as the result.  They use a classical algorithm for deciding connectivity which works by growing a forest: in each of O(log n) rounds, each (super)node selects a neighbor arbitrarily, thus forming supernodes and shrinking the graph.  They then show how this algorithm can be run with only sketches of the neighborhoods; for this they need sketches that are linear and from which an element in the support of the original vector can be recovered, and such sketches were already known.
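
Here is a minimal sketch of the contraction idea (my own toy code, not the authors’ algorithm): every node keeps its exact signed edge-incidence vector, whereas in the real protocol each player would only send a polylog(n)-size linear sketch of this vector from which one support element can be recovered.  Because the vectors add, the internal edges of a supernode cancel and any surviving nonzero coordinate is an outgoing edge.

    # Boruvka-style connectivity via signed incidence vectors (toy version with
    # exact vectors; the real protocol replaces them by small linear sketches).
    def count_components(n, edges):
        edges = sorted({(min(a, b), max(a, b)) for a, b in edges})
        def incidence(v):                            # signed incidence vector of node v
            return [+1 if v == a else -1 if v == b else 0 for (a, b) in edges]

        comp = {v: v for v in range(n)}              # current supernode label of each node
        vecs = {v: incidence(v) for v in range(n)}   # one vector per supernode
        for _ in range(max(1, n.bit_length())):      # O(log n) contraction rounds
            # each supernode picks one outgoing edge: any nonzero coordinate works
            chosen = []
            for vec in vecs.values():
                for i, x in enumerate(vec):
                    if x != 0:
                        chosen.append(edges[i])
                        break
            # contract the chosen edges; by linearity the new supernode's vector
            # is just the sum of the merged vectors
            for (a, b) in chosen:
                ca, cb = comp[a], comp[b]
                if ca == cb:
                    continue
                for v in comp:
                    if comp[v] == cb:
                        comp[v] = ca
                vecs[ca] = [x + y for x, y in zip(vecs[ca], vecs.pop(cb))]
        return len(set(comp.values()))

    # a path 0-1-2 plus a disjoint edge 3-4: two connected components
    print(count_components(5, [(0, 1), (1, 2), (3, 4)]))   # -> 2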

Chattopadhyay spoke about his work with Edmonds, Ellen, and Pitassi on a conjecture by Patrascu.  Patrascu considered a 3-party number-on-forehead model where Alice holds k n-bit strings x_1, …, x_k, Bob holds one n-bit string y, and Charlie holds an index i <= k.  The goal is to determine f(x_i, y) for a boolean function f, with the restriction that Charlie sends an o(k)-bit private message to Alice alone, who then exclusively talks to Bob.  An inspiring connection by Patrascu shows that strong bounds in this model for functions f such as disjointness would result in breakthrough lower bounds for dynamic data structures.  Patrascu made a bold conjecture that if the communication complexity of f is high, the communication for the above task remains high.

I had been thinking about this conjecture myself, making only negligible headway, and I did not even know of their beautiful paper, which disproves it in a very strong way: they exhibit a function f which requires linear randomized communication, but can be solved with polylogarithmic communication in Patrascu’s model.  But it is still open whether strong enough bounds hold for, say, disjointness, which would be sufficient for Patrascu’s intended application to data structures.  (For deterministic protocols it is quite easy to show a strong separation for equality — you may want to try that for fun.)

And I talked about the communication complexity of addition.  Suppose each of k = O(1) players holds an n-bit integer, and they want to know whether their integers sum to 0.  Nisan lets each player send an O(log n)-bit hash of its integer, obtained by taking the remainder modulo a random prime.  This yields a protocol with O(log n) communication, whose correctness relies on the linearity of the hashing.
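
A toy rendering of this hashing protocol (my own sketch; in particular I allow negative integers so that a zero sum is possible, and the prime range and repetition count are only illustrative):

    import random

    def random_prime(lo, hi):
        """Uniformly random prime in [lo, hi), by rejection sampling
        (trial division is fine for the small ranges used here)."""
        def is_prime(m):
            if m < 2:
                return False
            d = 2
            while d * d <= m:
                if m % d == 0:
                    return False
                d += 1
            return True
        while True:
            c = random.randrange(lo, hi)
            if is_prime(c):
                return c

    def sums_to_zero(xs, n, trials=20):
        """Public-coin protocol: xs are the players' n-bit integers (possibly
        negative).  Each player's message x % p has O(log n) bits; by linearity
        the hashes sum to 0 mod p whenever the integers sum to 0, and a nonzero
        sum (of magnitude < k * 2^n) has few prime divisors, so a random prime
        of Theta(log n) bits catches it with good probability."""
        for _ in range(trials):
            p = random_prime(n * n, 2 * n * n)       # random Theta(log n)-bit prime
            if sum(x % p for x in xs) % p != 0:
                return False                          # the sum is certainly nonzero
        return True                                   # sum is 0, up to a small error

    print(sums_to_zero([5, -12, 7], n=8))    # True:  5 - 12 + 7 = 0
    print(sums_to_zero([5, -12, 8], n=8))    # False (a nonzero sum is detected w.h.p.)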

We obtain an O(1)-communication protocol using a different hash function which is almost linear. This improvement is useful for other problems, such as deciding whether the sum is greater than 0 or evaluating polynomial threshold functions.