Computational Geometrization

"Given Tri Dao, why not Quad Dao?"
Danielle Fong · March 2026 · Map Reduce for Thought
14 proven · 8 conjectures · 5 refuted · 12 physicists · 4 AI models · 151 KB evidence
The 8 Thurston Geometries as building blocks of computation

The 8 Shapes of Computation

Thurston conjectured, and Perelman proved, that every closed 3D shape can be cut into pieces that each carry one of exactly 8 types of geometric clay. No more. No less. The pieces are glued together along spheres and along donut-shaped boundaries called tori.

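Four of these clays let you spread out slowly: the volume within reach grows at most polynomially with distance (S³ is the extreme case, since it is finite). Four let you spread out fast (exponential growth). If you're a colony of tiny computers living inside one of these shapes, the geometry bounds how many neighbors you can recruit in r steps, and so how much you can compute in a given time. This is a theorem, not an analogy.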

Geometry · picture · volume growth · complexity class
📐 E³ · flat grid · r³ · P
🔮 S³ · sphere · finite · FINITE
🌐 S²×R · globe+line · r · P
🌀 Nil · spiral stairs · r⁴ · P
🪸 H³ · coral reef · eʳ · PSPACE
〰️ H²×R · saddle layers · eʳ · PSPACE
♾️ SL̃(2,R) · Möbius flow · eʳ · PSPACE
🍬 Sol · taffy pull · eʳ (!) · ???

Blue = polynomial growth = class P  ·  Red = exponential growth = PSPACE  ·  Gold = Sol (the genuine open problem)
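The polynomial versus exponential split can be checked by brute force on discrete stand-ins. The sketch below counts how many group elements lie within r generator steps of the identity. It assumes Z³ as the lattice for E³ and the discrete Heisenberg group as a lattice for Nil; the free group on two generators is used only as a generic example of exponential growth and is not claimed to be a lattice in any of the four red geometries. All function names are illustrative.

```python
def ball_sizes(identity, neighbors, radius):
    """Breadth-first search: sizes[r] = number of elements within r steps."""
    seen = {identity}
    frontier = [identity]
    sizes = [1]
    for _ in range(radius):
        next_frontier = []
        for g in frontier:
            for h in neighbors(g):
                if h not in seen:
                    seen.add(h)
                    next_frontier.append(h)
        frontier = next_frontier
        sizes.append(len(seen))
    return sizes


def z3_neighbors(g):
    """Z^3 with the six unit steps (stand-in for E^3, growth ~ r^3)."""
    x, y, z = g
    return [(x + 1, y, z), (x - 1, y, z), (x, y + 1, z),
            (x, y - 1, z), (x, y, z + 1), (x, y, z - 1)]


def heisenberg_neighbors(g):
    """Discrete Heisenberg group (stand-in for Nil, growth ~ r^4).

    Product rule: (a, b, c) * (a2, b2, c2) = (a + a2, b + b2, c + c2 + a * b2).
    We right-multiply by the generators (+-1, 0, 0) and (0, +-1, 0).
    """
    a, b, c = g
    return [(a + 1, b, c), (a - 1, b, c), (a, b + 1, c + a), (a, b - 1, c - a)]


def free_neighbors(word):
    """Free group on two generators, stored as reduced words (exponential growth)."""
    out = []
    for s in (1, -1, 2, -2):
        if word and word[-1] == -s:
            out.append(word[:-1])      # the new letter cancels the last one
        else:
            out.append(word + (s,))
    return out


if __name__ == "__main__":
    print("Z^3       ", ball_sizes((0, 0, 0), z3_neighbors, 6))
    print("Heisenberg", ball_sizes((0, 0, 0), heisenberg_neighbors, 6))
    print("Free group", ball_sizes((), free_neighbors, 6))
```

For small radii the three counts look similar; the difference is in the long run, where the first two are bounded by a polynomial in r and the free-group count, 2·3ʳ − 1, is not.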

The Socratic Argument

SOCRATES
When you tile FlashAttention into SRAM blocks — what do you throw away at each boundary?
DANIELLE
The attention scores. You keep (m, l): the running max and the running sum of exponentials, which together give the log-sum-exp.
SOCRATES
And why does that work?
DANIELLE
Because the log-sum-exp merge is associative. Split and merge in any grouping, with no error beyond floating-point rounding.
SOCRATES
Could every computation be split this way without error?
DANIELLE
...no. Most operations aren't associative.
SOCRATES
So what's special about the ones that are?
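The associativity in this exchange can be made concrete. A minimal sketch, assuming m is a tile's maximum score and l is the sum of exp(score − m) over that tile; `tile_stats`, `merge`, and `logsumexp` are illustrative names, not functions from any FlashAttention codebase.

```python
import math


def tile_stats(scores):
    """Summarize one tile of scores as (m, l)."""
    m = max(scores)
    l = sum(math.exp(s - m) for s in scores)
    return (m, l)


def merge(a, b):
    """Combine the summaries of two tiles into the summary of their union."""
    m_a, l_a = a
    m_b, l_b = b
    m = max(m_a, m_b)
    # Rescale each partial sum to the shared maximum before adding.
    return (m, l_a * math.exp(m_a - m) + l_b * math.exp(m_b - m))


def logsumexp(stats):
    m, l = stats
    return m + math.log(l)


scores = [0.5, -1.25, 3.0, 2.0, -0.75, 1.5, 0.0, 4.25]
tiles = [scores[i:i + 2] for i in range(0, len(scores), 2)]
a, b, c, d = (tile_stats(t) for t in tiles)

left = merge(merge(merge(a, b), c), d)    # ((a b) c) d
right = merge(a, merge(b, merge(c, d)))   # a (b (c d))
full = tile_stats(scores)                 # no tiling at all

if __name__ == "__main__":
    print(logsumexp(left), logsumexp(right), logsumexp(full))
```

Both groupings agree with the untiled result up to floating-point rounding, which is what lets the tiles be processed in any order or tree shape.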

FlashAttention = Topological Surgery

Manifold cutting maps to GPU SRAM tile layout
Topology ↔ FlashAttention
Flat piece (E³) ↔ SRAM tile
Full manifold (too big) ↔ Full attention matrix
Torus boundary (Z²) ↔ Tile boundary → (m, l)
Incompressible torus ↔ Can't eliminate this boundary
π₁ trivial inside each piece ↔ log-sum-exp is associative
JSJ = fewest possible tori ↔ Optimal tiling strategy
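The tiling side of this mapping can be sketched for a single query row. This is a simplified illustration, not the actual FlashAttention kernel: it assumes scalar values instead of vectors, runs on the CPU in pure Python, and carries an output accumulator `acc` across tile boundaries alongside (m, l). The function names and the `tile` parameter are hypothetical.

```python
import math


def attention_row_full(scores, values):
    """Reference: softmax-weighted average using the whole row at once."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def attention_row_tiled(scores, values, tile=2):
    """Same result, visiting the row one tile at a time.

    Only (m, l, acc) crosses each tile boundary; the scores themselves
    are discarded once their tile has been processed.
    """
    m, l, acc = -math.inf, 0.0, 0.0
    for start in range(0, len(scores), tile):
        s_tile = scores[start:start + tile]
        v_tile = values[start:start + tile]
        m_new = max(m, max(s_tile))
        scale = math.exp(m - m_new)   # exp(-inf) == 0.0 on the first tile
        l = l * scale + sum(math.exp(s - m_new) for s in s_tile)
        acc = acc * scale + sum(math.exp(s - m_new) * v
                                for s, v in zip(s_tile, v_tile))
        m = m_new
    return acc / l


scores = [0.5, -1.25, 3.0, 2.0, -0.75, 1.5, 0.0, 4.25]
values = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 4.0, 0.25]

if __name__ == "__main__":
    print(attention_row_full(scores, values))
    for tile in (1, 2, 3, 8):
        print(tile, attention_row_tiled(scores, values, tile))
```

The tile size changes how many boundaries are crossed but not the answer, up to floating-point rounding.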

The Thermodynamic Tax

The Landauer Torus — information compressed through the boundary
LANDAUER
Every bit you erase costs at least kT ln 2 of heat. My principle (1961). Verified experimentally (Bérut et al., 2012).
LANDAUER
Each FlashAttention tile boundary destroys ~7 bits per query row. 256 scores compressed to 2 statistics. That's genuine information erasure.
SOCRATES
Does the canonical topological decomposition minimize this cost?
LANDAUER
Yes. Non-trivially. Incompressibility equals information-theoretic necessity. Every JSJ torus carries irreducible information. The canonical decomposition IS thermodynamically optimal. That surprised me.
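For scale, the Landauer bound can be evaluated for the ~7 bits quoted above. A back-of-the-envelope sketch: the temperature of 300 K is an assumption (the source does not give one), the bit count is taken from the dialogue as stated, and `landauer_joules` is an illustrative name.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_joules(bits, temperature_k=300.0):
    """Minimum heat, in joules, dissipated by erasing `bits` bits at temperature T."""
    return bits * K_B * temperature_k * math.log(2)


per_bit = landauer_joules(1)        # about 2.87e-21 J at 300 K
per_boundary = landauer_joules(7)   # ~7 bits per query row per tile boundary

if __name__ == "__main__":
    print(f"per bit:      {per_bit:.3e} J")
    print(f"per boundary: {per_boundary:.3e} J per query row")
```

This is a lower bound only; it says nothing about what real hardware dissipates per operation.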

The Punchline: Why 3?

2D (too simple) · 3D (the sweet spot) · 4D (undecidable)
2D · Genus classification. Too simple. Nothing interesting happens.
3D · Thurston's 8 geometries. The sweet spot. Rich enough for PSPACE. Structured enough to classify.
4D · Markov 1958: telling 4-manifolds apart is undecidable. Can never find optimal cuts. Uncountably many exotic smooth structures, even on R⁴.
Given Tri Dao, why not Quad Dao?

Because 3 is the richest dimension in which the shapes, and so the computation, can still be classified. FlashAttention is the existence proof: it tiles attention along JSJ boundaries and achieves exact results because the algebra is associative. The 8 Thurston geometries are the periodic table of computational atoms. The torus boundaries are where you pay the thermodynamic tax. The optimal decomposition (minimum tax, exact computation) is JSJ.

In 4D, none of this works. You can never classify the shapes. You can never find the optimal cuts.

π₁ is the Dao. The fundamental group — what doesn't change when everything else does.
The Way persists through all transformation.

Nominative determinism stays winning.