The only things that matter in a theoretical study are those that you can prove, but it’s always fun to speculate. After worrying about P vs. NP for half my life, and having carefully reviewed the available “evidence” I have decided I believe that P = NP.
A main justification for my belief is history:
- In the 1950’s Kolmogorov conjectured that multiplication of -bit integers requires time . That’s the time it takes to multiply using the method that mankind has used for at least six millennia. Presumably, if a better method existed it would have been found already. Kolmogorov subsequently started a seminar where he presented again this conjecture. Within one week of the start of the seminar, Karatsuba discovered his famous algorithm running in time . He told Kolmogorov about it, who became agitated and terminated the seminar. Karatsuba’s algorithm unleashed a new age of fast algorithms, including the next one. I recommend Karatsuba’s own account  of this compelling story.
- In 1968 Strassen started working on proving that the standard algorithm for multiplying two matrices is optimal. Next year his landmark algorithm appeared in his paper “Gaussian elimination is not optimal” .
- In the 1970s Valiant showed that the graphs of circuits computing certain linear transformations must be a super-concentrator, a graph which certain strong connectivity properties. He conjectured that super-concentrators must have a super-linear number of wires, from which super-linear circuit lower bounds follow . However, he later disproved the conjectured : building on a result of Pinsker he constructed super-concentrators using a linear number of edges.
- At the same time Valiant also defined rigid matrices and showed that an explicit construction of such matrices yields new circuit lower bounds. A specific matrix that was conjectured to be sufficiently rigid is the Hadamard matrix. Alman and Williams recently showed that, in fact, the Hadamard matrix is not rigid .
- After finite automata, a natural step in lower bounds was to study sightly more general programs with constant memory. Consider a program that only maintains bits of memory, and reads the input bits in a fixed order, where bits may be read several times. It seems quite obvious that such a program could not compute the majority function in polynomial time. This was explicitly conjectured by several people, including . Barrington  famously disproved the conjecture by showing that in fact those seemingly very restricted constant-memory programs are in fact equivalent to log-depth circuits, which can compute majority (and many other things).
- [Added 2/18] Mansour, Nisan, and Tiwari conjectured  in 1990 that computing hash functions on bits requires circuit size . Their conjecture was disproved in 2008  where a circuit of size was given.
And these are just some of the more famous ones. The list goes on and on. In number-on-forehead communication complexity, the function Majority-of-Majorities was a candidate for being hard for more than logarithmically many parties. This was disproved in  and subsequent works, where many other counter-intuitive protocols are presented. In data structures, would you think it possible to switch between binary and ternary representation of a number using constant time per digit and zero space overhead? Turns out it is [11, 7]. Do you believe factoring is hard? Then you also believe there are pseudorandom generators where each output bit depends only on input bits . Known algorithms for directed connectivity use either super-polynomial time or polynomial memory. But if you are given access to polynomial memory full of junk that you can’t delete, then you can solve directed connectivity using only logarithmic (clean) memory and polynomial time . And I haven’t even touched on the many broken conjectures in cryptography, most recently related to obfuscation.
On the other hand, arguably the main thing that’s surprising in the lower bounds we have is that they can be proved at all. The bounds themselves are hardly surprising. Of course, the issue may be that we can prove so few lower bounds that we shouldn’t expect surprises. Some of the undecidability results I do consider surprising, for example Hilbert’s 10th problem. But what is actually surprising in those results are the algorithms, showing that even very restricted models can simulate more complicated ones (same for the theory of NP completeness). In terms of lower bounds they all build on diagonalization, that is, go through every program and flip the answer, which is boring.
The evidence is clear: we have grossly underestimated the reach of efficient computation, in a variety of contexts. All signs indicate that we will continue to see bigger and bigger surprises in upper bounds, and P=NP. Do I really believe the formal inclusion P=NP? Maybe, let me not pick parameters. What I believe is that the idea that lower bounds are obviously true and we just can’t prove them is not only baseless but even clashes with historical evidence. It’s the upper bounds that are missing.
 Josh Alman and R. Ryan Williams. Probabilistic rank and matrix rigidity. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 641–652, 2017.
 Allan Borodin, Danny Dolev, Faith E. Fich, and Wolfgang J. Paul. Bounds for width two branching programs. In Proceedings of the 15th Annual ACM Symposium on Theory of Computing, 25-27 April, 1983, Boston, Massachusetts, USA, pages 87–93, 1983.
 Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In 6th Symposium on Mathematical Foundations of Computer Science, volume 53 of Lecture Notes in Computer Science, pages 162–176. Springer, 1977.