Essays and articles on AI reliability.
Loop engineering moved the work from prompting the agent to designing the system that prompts it. It also moved the risk. When the loop runs while you sleep, “done” becomes the most expensive word in the system.
An accuracy score is an average, and the average is where the failure hides. One public autopsy, every number rerunnable.
Why every claim we make ships with code you can run.
The silent cost of AI-assisted engineering.
Why the most dangerous failure mode in machine learning is also the most dangerous failure mode in thinking.