AIHero

    Next-token prediction

    Matt Pocock

    What the model actually does. Given a context, it samples one next token, appends it, and runs again. Every output — a sentence, a tool call, a thousand-line file — is built one token at a time. The model has no other mode of operation.
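To make the loop concrete, here is a minimal sketch of sample, append, repeat. Everything in it is an assumption for illustration: `TOY_MODEL` is a hypothetical lookup table standing in for a real network, and the `<start>` and `<end>` markers are invented. A real model conditions on the whole context, while this toy only looks at the last token.

```python
import random

# Hypothetical toy "model": maps the last token to a probability
# distribution over possible next tokens. Not a real network.
TOY_MODEL = {
    "<start>": {"the": 1.0},
    "the": {"model": 0.6, "token": 0.4},
    "model": {"samples": 1.0},
    "samples": {"the": 0.5, "<end>": 0.5},
    "token": {"<end>": 1.0},
}


def next_token_distribution(context):
    # A real model would use the entire context; this toy uses only the last token.
    return TOY_MODEL[context[-1]]


def sample_next_token(context, rng):
    # Draw exactly one token according to the predicted probabilities.
    dist = next_token_distribution(context)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(prompt, rng, max_tokens=20):
    context = list(prompt)
    for _ in range(max_tokens):
        token = sample_next_token(context, rng)
        if token == "<end>":
            break
        # Append the sampled token, then run again on the longer context.
        context.append(token)
    return context


output = generate(["<start>"], random.Random(0))
print(" ".join(output[1:]))
```

The only thing `generate` ever does is call `sample_next_token` and append the result. Longer outputs come from running the same step more times, not from a different mechanism.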

    Usage:

    "How does the agent 'decide' to call a tool?"

    "It doesn't — it's next-token prediction all the way down. The tool call is just a structured string the harness parses out of the output stream."
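The harness side of that exchange might look roughly like the sketch below. The `<tool_call>` wrapper, the JSON shape with `name` and `arguments`, and the `read_file` tool are all hypothetical choices made for this example; real providers each define their own format.

```python
import json
import re

# Hypothetical wire format: the model emits JSON wrapped in <tool_call> tags.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(output_text):
    """Return every well-formed tool call found in the model's text output."""
    calls = []
    for match in TOOL_CALL_PATTERN.finditer(output_text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            # The model produced text that only looks like a tool call; skip it.
            continue
        if isinstance(payload, dict) and "name" in payload:
            calls.append(payload)
    return calls


# The model only ever produced these tokens, one after another.
streamed_tokens = [
    "Let me check. ",
    "<tool_call>",
    '{"name": "read_file", ',
    '"arguments": {"path": "notes.txt"}}',
    "</tool_call>",
]
output_text = "".join(streamed_tokens)
calls = extract_tool_calls(output_text)
```

Nothing in the model "called" anything. It emitted a string, and `extract_tool_calls` (the harness) recognized the structure and turned it into data that other code can act on.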
