AIHero

    AI Hero · Dictionary

    The vocabulary of AI coding, in plain English.

    Skimmable definitions for the terms that make agentic coding click. Search 62 entries below, or jump into a section.

    1,683mattpocock/dictionary-of-ai-coding

    The Model

    14 terms

    Model

    The parameters. Stateless — does next-token prediction and nothing else. Cannot do anything agentic on its own.

    Parameters

    The numbers inside a model — often billions — tuned during training. Everything the model knows lives in them. Also called weights.

    Training

    The process that sets a model's parameters by exposing it to vast amounts of text and adjusting to improve next-token prediction.

    Inference

    Running a trained model to generate output — what happens on every model provider request. Parameters stay fixed.

    Token

    The atomic unit a model reads and writes. Roughly word-sized but not exactly. Context window size, cost, and latency all count tokens.

    Next-token prediction

    What the model actually does. Samples one next token from the context, appends it, and runs again. Its only mode of operation.

    Non-determinism

    The same input can produce different output. A property of how models generate text and how providers serve requests.

    Model provider

    Whatever serves a model for inference. Usually remote (Anthropic, OpenAI, Google), but can also be local (Ollama, llama.cpp).

    Harness

    Everything around the model that turns it into an agent: tools, system prompt, context-window management, permissions, hooks.

    Model provider request

    One round-trip from the harness to the model provider. The harness sends context; the provider returns one response.

    Input tokens

    Tokens the harness sends on each model provider request. Billed at a lower rate than output tokens.

    Output tokens

    Tokens the model generates back. Billed at a higher rate than input tokens, since they cost more compute to produce.

    Prefix cache

    The provider-side store that lets consecutive requests skip re-processing a shared prefix, billing those tokens at a lower rate.

    Cache tokens

    Input tokens the provider has cached from a previous request via its prefix cache, billed at a much lower rate.

    Sessions, Context Windows & Turns

    8 terms

    Tools & Environment

    10 terms

    Failure Modes

    9 terms

    Handoffs

    7 terms

    Memory and Steering

    6 terms

    Patterns of Work

    8 terms

    Want more than vocabulary?

    Join AI Hero for practical skills, thinking on AI engineering, and resources that keep you ahead of the curve.

    I respect your privacy. Unsubscribe at any time.