The model's output naming a tool and its arguments — just structured text. It doesn't do anything on its own; the harness has to read it and execute. Produced by the model in one model provider request.
The lifecycle of a tool call:
| Step | Who | What happens |
|---|---|---|
| 1 | Model | Learns which tools exist from descriptions in the system prompt |
| 2 | Model | Emits a call — tool name plus arguments, usually JSON — and stops |
| 3 | Harness | Parses the call and checks it against the permission mode |
| 4 | Harness | Executes it if allowed |
| 5 | Harness | Sends the outcome back as a tool result in the next request |
One turn of agent work is usually many of these round trips chained together.
Because the call is generated by next-token prediction like everything else, it can be wrong the way any model output can be wrong: a path that doesn't exist, a flag the command doesn't have, arguments that are plausible rather than correct. The harness executes what was written, not what was meant — a mistyped path doesn't error gracefully, it edits the wrong file.
Usage:
"It said it ran the tests but the file timestamps haven't changed."
"Look at the transcript — did it actually emit a tool call, or just describe running them? The model produces the call, but if the harness didn't execute it, nothing happened."