- The Case of the Frankenstein Key
· One key, two secrets, and the checksum that isn't there.
- Too Much Attention?
· 128k tokens of context. Perfect retrieval. Zero reasoning.
- How Far Can Attention Reach?
· Stretching a 4k model to 128k: what changes, what breaks, and what it costs.
- What Is Attention?
· Tracing a real transformer, one tensor at a time.
- XOR, Clifford, and Bottled Nonlinearity
· Stabilizers, magic states, and why T gates are expensive.
- Leanly, Lightly: Proving IsZero
· What a tiny program can teach us about machines and correctness