Flu, by Gina Kolata

I enjoyed reading this book. Three themes are dear to me:

First, the macabre. Not that there is any shortage in the news, but the book also contains macabreness of a kind I did not expect.

Second, the drive for scientific research. It always makes for a compelling read when the reader is a scientist. And here it is also tinted with adventure.

Third, (recent) history. As I hinted in some previous posts, I started reading history in my late 20’s, as an unbelievably blank slate, and I haven’t stopped. It is not clear it is very consequential for the present, but sometimes it’s good to know how things “evolve.”

Chess.com: 5|5 > 1000

Today for the first time, I surpassed score 1000 on Chess.com playing 5|5, which means you start with a 5-minute budget, and every move you get 5 more seconds. For a while I also played 3|2 (2-second increments), but it takes me about 2 seconds to move a piece, which means I lost games in which I knew exactly what to do, but simply couldn’t move the pieces fast enough, which I found frustrating. Longer games I tried but I don’t seem to have the patience for.

I won’t reveal my id, because I feel bad about how much time I am spending losing at chess (and I think you could see all my games with my id, but I am not sure). My self-imposed limit is losing no more than one game a day, which means on average playing 2 games per day. (I had to stop and Google $\sum_i i/2^i = 2$; there’s a neat calculation-free proof of it which hopefully will make me remember this fact next time.)

However not being a robot, I sometimes get upset at the way I lose. Most of my games are classified as giveaway, which means I was winning according to the computer (and myself), but then because of some stupid mistake I end up losing the game. And so what the heck, I am better than this!, I break the rule and start another match — only to lose again, chess seems not to forgive hot heads.

The main reason why I play seems to be that fast-paced chess has the ability to completely absorb my mind, so it’s a good quick escape. Of course, there are also the little feel-good voices reminding me that it’s better than watching TV and that by playing I sharpen my mind.

While 1000 can of course be a ridiculously low bar by some standard, I found reaching it more difficult than I expected, and I like to think that the 5|5 format attracts stronger players, so that the competition is tougher, even though it may not be true. (But it does seem true that a certain score in a certain format does not correspond to the same score in a different format.) For one thing, I had to familiarize myself with several basic openings. I bought a little cute book Chess openings for kids which is good for people like me whose knowledge of chess openings was “e4 e5.” I don’t do anything fancy, but it was fun to read about common openings. I think I also wouldn’t mind playing random chess, but it seems harder to find opponents.

So why don’t you try and see what is your 5|5 score? And if you want to play sometimes, drop me a line.

Windows never changes

For months, Windows 10 complained that it didn’t have enough space on the hard disk, but the options it gave me to clean up space were ridiculous. Worse, the “storage” function that supposedly tells you what’s taking space wasn’t even close to the truth. This became so bad that I was forced to remove some things I didn’t want to remove, often with a lot of effort, because space was so tight that Windows didn’t even have enough to run the uninstaller! In the end I became so desperate that I installed TreeSize Free. It quickly revealed that crash plan was taking up a huge amount of space. This revealed to be associated to the Code42 program — a program that the system was listing as taking 200MB. Well, uninstalling Code42 freed SIXTY PERCENT of the hard disk space, 140GB.

Data-structure lower bounds without encoding arguments

I have recently posted the paper [Vio21] (download) which does something that I have been trying to do for a long time, more than ten years, on and off. Consider the basic data-structure problem of storing $m$ bits of data $x\in \{0,1\}^{m}$ into $m+r$ bits so that the prefix-sum queries

\begin{aligned} \mathbb {\text {\textsc {Rank}}}(i):=\sum _{j\le i}x_{j} \end{aligned}

can be computed by probing $q$ cells (or words) of $w$ bits each. (You can think $w=\log m$ throughout this post.) The paper [PV10] with Pǎtraşcu shows that $r\ge m/w^{O(q)}$, and this was recently shown to be tight by Yu [Yu19] (building on the breakthrough data structure [Pǎt08] which motivated the lower bound and is not far from it).

As is common in data-structure lower bounds, the proof in [PV10] is an encoding argument. In the recently posted paper, an alternative proof is presented which avoids the encoding argument and is perhaps more in line with other proofs in complexity lower bounds. Of course, everything is an encoding argument, and nothing is an encoding argument, and this post won’t draw a line.

The new proof establishes an intrinsic property of efficient data structures, whereas typical proofs including [PV10] are somewhat tailored to the problem at hand. The property is called the separator and is a main technical contribution of the work. At the high level the separator shows that in any efficient data structure you can restrict the input space a little so that many queries are nearly pairwise independent.

Also, the new proof rules out a stronger object: a sampler (see previous post here on sampling lower bounds). Specifically, the distribution Rank$(U)$ where $U$ is the uniform distribution cannot be sampled, not even slightly close, by an efficient cell-probe algorithm. This implies the data-structure result, and it can be informally interpreted as saying that the “reason” why the lower bound holds is not that the data is compressed, but rather that one can’t generate the type of dependencies occurring in Rank via an efficient cell-probe algorithm, regardless of what the input is.

Building on this machinery, one can prove several results about sampling, like showing that cell-probe samplers are strictly weaker than AC0 samplers. While doing this, it occurred to me that one gets a corollary for data structures which I had not seen in the literature. The corollary is a probe hierarchy, showing that some problem can be solved with zero redundancy ($r=0$) with $O(q)$ probes, while it requires almost linear $r$ for $q$ probes. For example I don’t know of a result yielding this for small $q$ such as $q=O(1)$; I would appreciate a reference. (As mentioned in the paper, the sampling viewpoint is not essential and just like for Rank one can prove the data-structure corollaries directly. Personally, and obviously, I find the sampling viewpoint useful.)

One of my favorite open problems in the area still is: can a uniform distribution over $[m]$ be approximately sampled by an efficient cell-probe algorithm? I can’t even rule out samplers making two probes!

References

[Pǎt08]   Mihai Pǎtraşcu. Succincter. In 49th IEEE Symp. on Foundations of Computer Science (FOCS). IEEE, 2008.

[PV10]   Mihai Pǎtraşcu and Emanuele Viola. Cell-probe lower bounds for succinct partial sums. In 21th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 117–122, 2010.

[Vio21]   Emanuele Viola. Lower bounds for samplers and data structures via the cell-probe separator. Available at http://www.ccs.neu.edu/home/viola/, 2021.

[Yu19]    Huacheng Yu. Optimal succinct rank data structure via approximate nonnegative tensor decomposition. In Moses Charikar and Edith Cohen, editors, ACM Symp. on the Theory of Computing (STOC), pages 955–966. ACM, 2019.

Questions on the future of lower bounds

Will any of the yellow books be useful?

Book 576 (not pictured) was saved just in time from the paper mill. It was rumored that Lemma 76.7.(ii) could have applications to lower bounds. Upon closer inspection, that lemma has a one-line proof by linearity of expectation if you change the constant 17 to 19. This change does not affect the big-Oh.

Will the study of randomness lead to the answer to any of the questions that are open since before randomness became popular? I think it’s a coin-toss.

Will there be any substance to the belief that algebraic lower bounds must be proved first?

Will the people who were mocked for working on DLOGTIME uniformity, top fan-in k circuits, or ZFC independence have the last laugh?

Will someone switch the circuit breaker and lit up CRYPT, DERAND, and PCPOT, or will they remain unplugged amusement parks where you sit in the roller coaster, buckle up, and pretend?

Will diagonalization be forgotten, or will it continue to frustrate combinatorialists with lower bounds they can’t match for functions they don’t care about?

Will decisive progress be made tonight, or will it take centuries?

Only Ketan Mulmuley knows for sure.

Submitting to ICALP 2021

I am taking advantage of the pandemic to participate in conferences where it’s usually hard for me to participate because they require intercontinental travel. I have a paper in CSR 2021, and now am submitting a paper to ICALP 2021 (I count submission as participation, even in case the paper gets rejected). The latter requires submissions to be formatted in a specific way, a topic discussed at length on this blog.

Begin 12:48

Try to compile their sample paper.

Google solution. Says to install packages.

Unfortunately, I am using windows but I only have LyX, and the solution expects MixTeX.

Google how to install packages in LyX.

Can’t find anything simple.

Create a new document on overleaf.

Copy all the LIPIcs files there.

Try to compile their sample.

It works!

Paste my latex.

Usual avalanche of problems to be fixed at the speed of light.

Add dummy section “Introduction” which wasn’t in my paper, otherwise theorem numbers look weird.

Numbers still look weird. Something’s wrong with theorem statements.

Replace {thm} with {theorem}

Looks better. Still some wrong stuff all around, however.

No it wasn’t that. Remove the dummy section. It seems their “paragraph” environment puts strange numbers like 0.0.0.1

Replace \paragaph with \paragaph* everywhere

Actually, looks weird the way they put the ack– put back the dummy Introduction section.

Check page limit: no more than 12 pages, excluding references

I’m a little over. Does this really matter? Apparently, it does! Move last proof to the appendix. Actually, last proof is kind of short, I should move the penultimate proof. Update paper organization (next time I shouldn’t put it).

Final look. Fix a few indentations.

OK, time to actually submit. Go to the easychair website. They want me to re-enter all the information!? Why, after forcing me to enter title, keywords, etc. in their format, are they asking me to do this again? Can’t we just send the .tex file and extract it from there?

Oh come one, it’s just a few seconds of copy-paste.

OK, done, paper submitted.

End: 3:05

Well, next time it will be easier. Perhaps easier, but not easy because as the reader knows there will be another missing package, another incompatible system, etc. And of course, if the paper is rejected, then I won’t even save the time to convert it into camera-ready format. On the other hand, the benefit is non-existent. It would be better for everyone if in order to submit a paper you have to complete a random 1-hour task on Amazon mechanical Turk and donate the profit to charity.

Et Al. II

From Thoughts, :

The et al. citation style favors scholars whose last name comes early in the dictionary. For example, other things equal, a last name like Aaron would circulate a lot more than Zuck. This problem is compounded by the existence of highly-cited papers which deviate from alphabetical ordering of authors. They carry the message: order matters, and some of you can’t use this trick, vae victis!

My suggestion is to avoid et al. and instead spell out every name (as in Aaron and Zuck) or every initial (as in AZ). It isn’t perfect, but improvements like randomly permuting the order still aren’t easy to implement. The suggestion actually cannot be implemented in journals like computational complexity which punish the authors into using an idiosyncratic style which has et al. But it doesn’t matter too much; nobody reads papers in those formats anyway, as we discussed several times.

From the STOC 2021 call for papers:

Authors are asked to avoid “et al.” in citations in favor of an equal mention of all authors’ surnames (unless the number of authors is very large, and if it is large, consider just using \cite{} with no “et al.”). When not listing authors’ names, citations should preferably include the first letters of the authors’ surnames (or at least the first three followed by a +, and possibly the year of publication). If using BibTeX, this can be accomplished by using \bibliographystyle{alpha}.

The role of government

Is there anything in this world which works well, which is not a series of time-consuming, useless steps inherited from the before-Internet era, of endless waits to speak to someone who behaves like a robot in none of the good ways and all the bad?

I hear linear algebra.

Yes, I think that’s as good as it gets.

If I had one penny for every time I said “V like Victor” over the phone… A game-changing innovation of the unfolding millennium will be a button which, upon pressure, instantly releases all your information to wh(o/at)ever needs to know it. Instead, they now do us the favor of “safeguarding our privacy” by forcing us to enter and re-enter and re-re-renter our information; us who don’t have someone who does it in our stead.

In yet another example of adding insult to injury, much of this is cloaked under “Consumer protection.” While in fact its effect is precisely the opposite: it is to cut the consumer off the market. For example, there’s a rule that you must wait three days to close on a loan. This is supposed to help you sleep over it and think if you really want to go for it and if you can afford it. How nice of them to help us avoid rushed decisions! Obviously, sellers don’t want to deal with people subject to such delays. If one really wanted to help, the rule should be that everybody has to wait three days to complete a transaction, regardless if it’s cash or financed, so that those for whom the transaction is more significant than drinking a tea can indeed sleep over it.

More absurdities. Person A currently has loan with interest rate X. Now interests have been set to Y < X. However, A cannot refinance their loan because A does not have enough cash reserves (or some other suspicious condition). So A will be forbidden from refinancing and will be forced to keep their loan at old, unfavorable interest rate X > Y.

Question: Under which interest rate is A more likely to default?

The only possible deduction is that the lender wants A to default. You can’t say that by refinancing the lender will take a greater risk. They are just using this loophole to make more money. Wasn’t this supposed to help citizens? This situation has another name as well.

Special torture awaits serial debtors. Another loan in effect can delay processing by… months. In the meanwhile, the documents for the new loan will expire, and re-expire, and re-re-expire yet again, while the hapless debtor will be forced to continuously work to keep them up to date — similar to what happens to Sisyphus or the Danaides, but the classics weren’t perverse enough to conceive the endless task of collecting documents from disconnected websites, each with its own distinctive and rapidly-mutating user interface, each requiring a new password or dual-factor authentication at each download, each in turn disappearing and being replaced by yet another system hosted by yet another party which will require yet another type of authentication, and so on forever — do you want this money or not?

There is a (probably well-known) quote which I read in The Instinct for Cooperation : A Graphic Novel Conversation with Noam Chomsky (not a terrible read — I fittingly found it in a free exchange library and returned it to another) which stuck with me. I don’t recall it exactly (see, I shouldn’t have returned it), and I can’t find it online precisely, but it basically says that “The role of the government is to protect property from those who don’t have it.” (An online search seems to reveal that something along these lines was expressed by James Madison.) Regardless of whether it was said or not, and regardless of whether you agree with it or not, it is quite useful to keep this quote in mind, because it explains much of the world we live in.

Marijuana brings good to your community

Subsequence of official communication from Newton:

My monitor setup

In the not-distant future there will be a single monitor that gets you the best of both worlds. For the contemporaneous, I maintain that the above is the best monitor setup available to us in 2020. I use the tiny E-ink monitor as much as possible, including now, for my blitz matches on chess.com, and of course for writing and sometimes reviewing papers. But as I mentioned earlier unfortunately for certain bureaucratic tasks that not all of us can skip altogether you just need a bigger monitor with color. So I push a button on the hdmi switch, and the image blasts open on the 30-inch screen, m’illumino d’immenso, and suddenly the mouse feels like the interface from Minority Reports.