Section 1 — The Model
Next-token prediction
Matt Pocock
What the model actually does. Given a context, it samples one next token, appends it, and runs again. Every output — a sentence, a tool call, a thousand-line file — is built one token at a time. The model has no other mode of operation.
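As a rough illustration, the sample-append-repeat loop can be sketched in a few lines of Python. Everything here is hypothetical: `TOY_MODEL`, `sample_next_token`, and `generate` are made-up names, and the lookup table stands in for a real network. A real model conditions on the whole context rather than only the last token.

```python
import random

# Hypothetical toy "model": maps the last token to candidate next tokens
# and their weights. A real model scores every token in its vocabulary
# using the entire context, not just the final token.
TOY_MODEL = {
    "<start>": (["the"], [1.0]),
    "the": (["cat", "dog"], [0.5, 0.5]),
    "cat": (["sat"], [1.0]),
    "dog": (["sat"], [1.0]),
    "sat": (["<end>"], [1.0]),
}


def sample_next_token(context, rng):
    """Sample exactly one next token given the context so far."""
    candidates, weights = TOY_MODEL[context[-1]]
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate(context, rng, max_tokens=10):
    """Sample a token, append it, and run again until <end> or the limit."""
    tokens = list(context)
    for _ in range(max_tokens):
        token = sample_next_token(tokens, rng)
        if token == "<end>":
            break
        tokens.append(token)  # the new token becomes part of the next context
    return tokens


output = generate(["<start>"], random.Random(0))
print(" ".join(output[1:]))
```

The loop has no separate branch for sentences, tool calls, or files. Whatever the output turns out to be, it is produced by the same append step.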
Usage:
"How does the agent 'decide' to call a tool?"
"It doesn't — it's next-token prediction all the way down. The tool call is just a structured string the harness parses out of the output stream."
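To make the harness side of that answer concrete, here is a minimal sketch of pulling a tool call out of generated text. The `<tool_call>` tag format, the JSON shape, and the `extract_tool_calls` helper are all assumptions made for illustration. Real harnesses and model APIs each define their own formats.

```python
import json
import re

# Assumed, illustrative format: the model emits a JSON object wrapped in
# <tool_call>...</tool_call> tags somewhere in its ordinary text output.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(output_text):
    """Return every well-formed tool call found in the model's output."""
    calls = []
    for match in TOOL_CALL_PATTERN.finditer(output_text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            # Malformed JSON is still just text the model produced; skip it.
            continue
    return calls


model_output = (
    "I'll check the file first. "
    '<tool_call>{"name": "read_file", "arguments": {"path": "README.md"}}</tool_call>'
)
calls = extract_tool_calls(model_output)
print(calls)
```

The model never executes anything in this sketch. It emits characters, and the harness decides which of them count as a request to run a tool.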