A technical blog

Notes on language models, agents, and the ways they break.

I write about the internals of LLM systems (decoding, evals, memory, and multi-agent coordination), usually with a benchmark or a worked example to anchor each post. Long-form, occasionally interactive, mostly things I wish someone had written before I had to figure them out.

Featured

Worth reading first