Computational Geometrization

"Given Tri Dao, why not Quad Dao?"

Danielle Fong · March 2026 · Map Reduce for Thought

14 proven · 8 conjectures · 5 refuted · 12 physicists · 4 AI models · 151 KB evidence

The 8 Thurston Geometries as building blocks of computation

The 8 Shapes of Computation

Thurston conjectured, and Perelman proved, that every closed 3D shape can be cut into pieces, each made of exactly one of 8 types of geometric clay. No more. No less. The pieces are glued together along spheres and along donut-shaped boundaries called tori.

Four of these clays let you spread out slowly (polynomial growth). Four let you spread out fast (exponential growth). If you're a colony of tiny computers living inside one of these shapes, the geometry determines how much you can compute. This is a theorem, not an analogy.

📐 E³ · flat grid · r³

🔮 S³ · sphere · finite · FINITE

🌐 S²×R · globe+line · r

🌀 Nil · spiral stairs · r⁴

🪸 H³ · coral reef · e^r · PSPACE

〰️ H²×R · saddle layers · e^r · PSPACE

♾️ SL̃(2,R) · Möbius flow · e^r · PSPACE

🍬 Sol · taffy pull · e^r (!) · ???

Blue = polynomial growth = class P · Red = exponential growth = PSPACE · Gold = Sol (the genuine open problem)
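The polynomial-versus-exponential split can be made concrete by counting how many sites a colony reaches in r steps. The sketch below is a discrete stand-in, not the continuous geometries themselves: it assumes the integer grid Z³ as a model of E³, the integer Heisenberg group as a model of Nil, and a free group on two generators (a tree) as a generic example of exponential growth. The helper name `ball_sizes` and the choice of generators are illustrative assumptions.

```python
def ball_sizes(identity, generators, multiply, max_radius):
    """Return [|B(0)|, ..., |B(max_radius)|]: sites reachable in <= r steps (BFS)."""
    seen = {identity}
    frontier = [identity]
    sizes = [1]
    for _ in range(max_radius):
        next_frontier = []
        for g in frontier:
            for s in generators:
                h = multiply(g, s)
                if h not in seen:
                    seen.add(h)
                    next_frontier.append(h)
        frontier = next_frontier
        sizes.append(len(seen))
    return sizes

# Z^3: flat grid, steps along +/- each axis.
def add3(g, s):
    return (g[0] + s[0], g[1] + s[1], g[2] + s[2])

GRID_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

# Integer Heisenberg group: (a, b, c) * (a', b', c') = (a + a', b + b', c + c' + a * b').
def heisenberg_mul(g, s):
    return (g[0] + s[0], g[1] + s[1], g[2] + s[2] + g[0] * s[1])

HEISENBERG_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]

# Free group on two generators: reduced words over {1, -1, 2, -2}.
def free_mul(word, letter):
    if word and word[-1] == -letter[0]:
        return word[:-1]          # cancel a generator against its inverse
    return word + letter

FREE_STEPS = [(1,), (-1,), (2,), (-2,)]

grid = ball_sizes((0, 0, 0), GRID_STEPS, add3, 6)
nil = ball_sizes((0, 0, 0), HEISENBERG_STEPS, heisenberg_mul, 6)
tree = ball_sizes((), FREE_STEPS, free_mul, 6)
print("grid:", grid)
print("nil: ", nil)
print("tree:", tree)
```

The grid and Heisenberg counts grow like a power of r, while the tree count roughly triples at every step, which is the slow-versus-fast distinction the cards above encode.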

The Socratic Argument

SOCRATES

When you tile FlashAttention into SRAM blocks — what do you throw away at each boundary?

DANIELLE

The attention scores. You keep (m, l): the running max and the running sum of exponentials.

SOCRATES

And why does that work?

DANIELLE

Because the rule for merging two (m, l) pairs is associative. Split and merge in any grouping and you get the same answer, up to floating-point rounding.

SOCRATES

Could every computation be split this way without error?

DANIELLE

...no. Most operations aren't associative.

SOCRATES

So what's special about the ones that are?
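The split-and-merge claim in the dialogue can be checked directly. The following is a minimal sketch, assuming each block of scores is summarized as a pair (m, l) with m the block max and l the sum of exp(score − m); the names `summarize` and `merge` are illustrative and not taken from any FlashAttention codebase.

```python
import math
from functools import reduce

def summarize(scores):
    """Collapse a block of scores to (m, l): max and sum of exp(score - max)."""
    m = max(scores)
    l = sum(math.exp(s - m) for s in scores)
    return m, l

def merge(a, b):
    """Combine two (m, l) summaries into the summary of the concatenated block."""
    (m1, l1), (m2, l2) = a, b
    m = max(m1, m2)
    # Rescale each partial sum to the shared max before adding.
    return m, l1 * math.exp(m1 - m) + l2 * math.exp(m2 - m)

scores = [0.3, -1.2, 4.0, 2.5, -0.7, 1.1, 3.3, 0.0]
blocks = [scores[i:i + 2] for i in range(0, len(scores), 2)]
parts = [summarize(b) for b in blocks]

left = reduce(merge, parts)                                           # ((a.b).c).d
right = merge(parts[0], merge(parts[1], merge(parts[2], parts[3])))   # a.(b.(c.d))
whole = summarize(scores)                                             # no splitting

print(left, right, whole)
```

All three results agree up to floating-point rounding, so the grouping of the blocks does not matter. That freedom is what lets the tiles be processed in any order or in parallel.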

FlashAttention = Topological Surgery

Manifold cutting maps to GPU SRAM tile layout

Flat piece (E³) ↔ SRAM tile

Full manifold (too big) ↔ Full attention matrix

Torus boundary (Z²) ↔ Tile boundary → (m, l)

Incompressible torus ↔ Can't eliminate this boundary

π₁ trivial inside each piece ↔ log-sum-exp is associative

JSJ = fewest possible tori ↔ Optimal tiling strategy
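The right-hand column of this mapping can be sketched for a single query row. The code below is an illustrative pure-Python sketch, not the actual GPU kernel: it assumes scalar values, and besides (m, l) it carries a running output accumulator `acc`, which the mapping above does not list but which a tiled pass needs in order to produce the final output. The function names are hypothetical.

```python
import math

def attention_row(scores, values):
    """Reference: softmax(scores) . values, with every score held in memory."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total

def tiled_attention_row(scores, values, tile):
    """Same quantity, visiting one tile at a time and carrying only (m, l, acc)."""
    m, l, acc = float("-inf"), 0.0, 0.0
    for start in range(0, len(scores), tile):
        s_tile = scores[start:start + tile]
        v_tile = values[start:start + tile]
        m_new = max(m, max(s_tile))
        scale = math.exp(m - m_new)            # re-express old state under the new max
        p = [math.exp(s - m_new) for s in s_tile]
        l = l * scale + sum(p)
        acc = acc * scale + sum(pi * vi for pi, vi in zip(p, v_tile))
        m = m_new                              # the tile's scores are dropped here
    return acc / l

scores = [0.3, -1.2, 4.0, 2.5, -0.7, 1.1, 3.3, 0.0]
values = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.25, 4.0]

full = attention_row(scores, values)
for tile in (1, 3, 8):
    print(tile, tiled_attention_row(scores, values, tile), full)
```

For every tile size the tiled result matches the full computation up to rounding, even though each tile's scores are discarded as soon as the boundary state is updated.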

The Thermodynamic Tax

LANDAUER

Every bit you destroy costs at least kT ln 2 of heat. My principle (1961). Verified experimentally (Bérut et al. 2012).

LANDAUER

Each FlashAttention tile boundary destroys ~7 bits per query row. 256 scores compressed to 2 statistics. That's genuine information erasure.

SOCRATES

Does the canonical topological decomposition minimize this cost?

LANDAUER

Yes. Non-trivially. Incompressibility equals information-theoretic necessity. Every JSJ torus carries irreducible information. The canonical decomposition IS thermodynamically optimal. That surprised me.
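The numbers in this exchange can be worked through explicitly. The sketch below assumes one particular reading of the "~7 bits" figure, namely log₂(256 / 2), and an assumed operating temperature of 300 K; neither assumption is stated in the text, and the function name `landauer_joules` is illustrative.

```python
import math

BOLTZMANN_K = 1.380649e-23  # J/K, exact SI value

def landauer_joules(bits, temperature_kelvin):
    """Minimum heat in joules to erase `bits` bits at the given temperature."""
    return bits * BOLTZMANN_K * temperature_kelvin * math.log(2)

scores_per_tile = 256   # from the text
stats_kept = 2          # (m, l), from the text

# Assumed reading of "~7 bits": log2 of the compression ratio 256 / 2.
bits_per_row = math.log2(scores_per_tile / stats_kept)

room_temperature = 300.0  # kelvin, an assumed operating point
per_bit = landauer_joules(1, room_temperature)
per_row = landauer_joules(bits_per_row, room_temperature)

print(f"{bits_per_row:.0f} bits/row, {per_bit:.3e} J/bit, {per_row:.3e} J/row")
```

At 300 K the bound is roughly 2.9 × 10⁻²¹ J per bit, so under these assumptions the floor per query row per tile boundary is on the order of 10⁻²⁰ J. This is the Landauer floor only; it says nothing about what real hardware dissipates.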

The Punchline: Why 3?

2D (too simple) — 3D (the sweet spot) — 4D (undecidable)

2D · Genus classification

Too simple.
Nothing interesting happens.

3D · Thurston's 8 geometries

The sweet spot.
Rich enough for PSPACE.
Structured enough to classify.

4D · Markov 1958

Undecidable.
Can never find optimal cuts.
Uncountably many exotic smooth structures, already on R⁴.

Given Tri Dao, why not Quad Dao?

Because dimension 3 has the richest topology in which computation is still classifiable. FlashAttention is the existence proof: it tiles attention along JSJ boundaries and achieves exact results because the algebra is associative. The 8 Thurston geometries are the periodic table of computational atoms. The torus boundaries are where you pay the thermodynamic tax. The optimal decomposition (minimum tax, exact computation) is JSJ.

In 4D, none of this works. You can never classify the shapes. You can never find the optimal cuts.

π₁ is the Dao. The fundamental group — what doesn't change when everything else does.
The Way persists through all transformation.

Nominative determinism stays winning.