AIHero

    Context window

    Matt Pocock
    Everything the model sees on each model provider request. Finite, model-specific, and the only surface through which the model perceives anything.

    It's a single sequence of tokens: the system prompt, the conversation so far, every tool result the harness has fed back in. If something is in that sequence, the model can use it; if it isn't, the model doesn't know it exists — not your codebase, not the file you edited yesterday, not the instruction you gave three sessions ago. Anything outside the window has to be brought in, usually via a tool call, before it can affect anything.
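    To make the "single sequence" idea concrete, here is a minimal sketch of a harness assembling the window before each request. The `ContextItem` shape and both helper functions are hypothetical names invented for illustration; real provider SDKs use their own message formats, and a real harness would also record the assistant's tool call before the tool result.

```typescript
// Hypothetical shapes for illustration only; real provider SDKs differ.
type Role = "system" | "user" | "assistant" | "tool";

interface ContextItem {
  role: Role;
  content: string;
}

// Flatten everything the harness holds into the one sequence sent per request.
function buildContextWindow(
  systemPrompt: string,
  conversation: ContextItem[],
): ContextItem[] {
  return [{ role: "system", content: systemPrompt }, ...conversation];
}

// The model can only use text that appears somewhere in the sequence.
function isVisibleToModel(ctx: ContextItem[], needle: string): boolean {
  return ctx.some((item) => item.content.includes(needle));
}

const conversation: ContextItem[] = [
  { role: "user", content: "Why does login fail?" },
];

const before = buildContextWindow("You are a coding assistant.", conversation);
const visibleBefore = isVisibleToModel(before, "verifyPassword");

// A tool call reads the file; the harness appends the result to the conversation.
conversation.push({
  role: "tool",
  content: "src/auth.ts: export function verifyPassword() { /* ... */ }",
});

const after = buildContextWindow("You are a coding assistant.", conversation);
const visibleAfter = isVisibleToModel(after, "verifyPassword");

console.log(visibleBefore, visibleAfter); // false true
```

    The file on disk never changed between the two requests. What changed is that a tool result put its contents into the sequence, which is the only way the model could come to know about it.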

    Finite means it fills up. Every turn appends more (your messages, the model's responses, tool results), and a long session will eventually hit the limit, forcing compaction or clearing. It also means everything in the window competes: each token you load is one fewer available for the rest, and content you didn't need still occupies the model's attention. The practical stance is to treat the window as a budget: load what the task needs and leave the rest out.
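    Treating the window as a budget can be sketched as a small planning step that runs before anything is loaded. Everything here is an assumption made for illustration: the four-characters-per-token estimate is only a rough heuristic (real tokenizers are model-specific), and `Candidate`, `LoadPlan`, `planContext`, and the `relevant` flag are hypothetical names, not part of any library.

```typescript
// Rough heuristic, labeled as an assumption: about 4 characters per token.
// Real counts come from the model's own tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface Candidate {
  path: string;
  content: string;
  relevant: boolean; // hypothetical flag: does the task touch this file?
}

interface LoadPlan {
  loaded: string[];
  skipped: string[];
  tokensUsed: number;
}

// Load only relevant files, in order, while they still fit in the budget.
function planContext(candidates: Candidate[], budgetTokens: number): LoadPlan {
  const plan: LoadPlan = { loaded: [], skipped: [], tokensUsed: 0 };
  for (const file of candidates) {
    const cost = estimateTokens(file.content);
    if (file.relevant && plan.tokensUsed + cost <= budgetTokens) {
      plan.loaded.push(file.path);
      plan.tokensUsed += cost;
    } else {
      plan.skipped.push(file.path); // stays outside the window, behind a tool call
    }
  }
  return plan;
}

const files: Candidate[] = [
  { path: "src/auth.ts", content: "x".repeat(400), relevant: true }, // ~100 tokens
  { path: "src/billing.ts", content: "x".repeat(800), relevant: false }, // ~200 tokens
  { path: "src/session.ts", content: "x".repeat(600), relevant: true }, // ~150 tokens
  { path: "src/big-generated.ts", content: "x".repeat(2000), relevant: true }, // ~500 tokens
];

const plan = planContext(files, 300);
console.log(plan.loaded); // [ 'src/auth.ts', 'src/session.ts' ]
console.log(plan.tokensUsed); // 250
```

    The skipped files are not lost. They stay outside the window until a tool call brings one in, at which point it spends budget like everything else.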

    Avoid: "memory" — the context window is working state and doesn't persist across sessions. Memory is a separate concept layered on top.

    Usage:

    "Can I just paste the whole monorepo into the prompt?"

    "The context window's 200k tokens — that's maybe a fifth of the repo. Pick the files the task touches, leave the rest behind a tool call."
