Section 1 — The Model

Next-token prediction

Matt Pocock

What the model actually does. Given a context, it samples one next token, appends it, and runs again. Every output — a sentence, a tool call, a thousand-line file — is built one token at a time. The model has no other mode of operation.
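
The loop described above can be sketched in a few lines. This is a toy illustration, not how any real model is implemented: `TOY_MODEL`, `next_token_distribution`, and `generate` are hypothetical names, and the "model" is a hand-written lookup table that only looks at the last token, where a real model conditions on the whole context.

```python
import random

# Hypothetical toy "model": maps the last token to a probability
# distribution over possible next tokens. Invented for illustration.
TOY_MODEL = {
    "<bos>": {"the": 1.0},
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "<eos>": 0.3},
    "dog": {"sat": 0.7, "<eos>": 0.3},
    "sat": {"<eos>": 1.0},
}


def next_token_distribution(context):
    # A real model would use the entire context; this toy uses only the last token.
    return TOY_MODEL[context[-1]]


def generate(context, max_new_tokens=10, seed=0):
    rng = random.Random(seed)  # seeded so a run is repeatable
    tokens = list(context)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(tokens)
        candidates = list(dist.keys())
        weights = list(dist.values())
        # Sample exactly one next token...
        token = rng.choices(candidates, weights=weights, k=1)[0]
        # ...append it, and run again on the longer context.
        tokens.append(token)
        if token == "<eos>":
            break
    return tokens


if __name__ == "__main__":
    print(generate(["<bos>"]))
```

Nothing in the loop knows whether it is producing a sentence or a file. The only operation is: sample one token, append it, repeat.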

Usage:

"How does the agent 'decide' to call a tool?"

"It doesn't — it's next-token prediction all the way down. The tool call is just a structured string the harness parses out of the output stream."
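
A minimal sketch of the harness side of that exchange. Everything here is an assumption made for illustration: the `<tool_call>…</tool_call>` wrapper, the JSON shape with `name` and `arguments`, and the `get_weather` stub are hypothetical, and real harnesses each define their own format.

```python
import json
import re

# Assumed, hypothetical wire format: the model emits JSON between these tags.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_call(output_text):
    """Return the first tool call found in the model's text, or None."""
    match = TOOL_CALL_PATTERN.search(output_text)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # the model produced a malformed string
    if not isinstance(call, dict) or "name" not in call:
        return None
    return call


def get_weather(city):
    # Stub tool; a real one would call an external service.
    return f"(stub) weather for {city}"


# Hypothetical registry mapping tool names to Python functions.
TOOLS = {"get_weather": get_weather}


def run_harness_step(output_text):
    """Parse the output, and run the named tool if there is one."""
    call = extract_tool_call(output_text)
    if call is None:
        return None
    tool = TOOLS.get(call["name"])
    if tool is None:
        return None
    return tool(**call.get("arguments", {}))


if __name__ == "__main__":
    sample = (
        "Let me check. "
        '<tool_call>{"name": "get_weather", "arguments": {"city": "Oslo"}}</tool_call>'
    )
    print(run_harness_step(sample))
```

The model only ever emitted text. The parsing and the actual function call both happen in the harness, outside the model.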
