ciphertext.blog

Slow notes on unraveling opaque things: cryptography, typography, cartography, and whatever else looks like ciphertext until it doesn't.

  1. Too Much Attention?
    · 128k tokens of context. Perfect retrieval. Zero reasoning.
  2. How Far Can Attention Reach?
    · Stretching a 4k model to 128k: what changes, what breaks, and what it costs.
  3. What Is Attention?
    · Tracing a real transformer, one tensor at a time.
  4. XOR, Clifford, and Bottled Nonlinearity
    · Stabilizers, magic states, and why T gates are expensive.
  5. Leanly, Lightly: Proving IsZero
    · What a tiny program can teach us about machines and correctness.