Okay, the solution here is to await output.usage
and log it to the console.
console.log(await output.usage)
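For context, here's a minimal sketch of the full setup, assuming an OpenAI model used through streamText. The model and prompt aren't shown in this lesson, so treat them as placeholders; the key detail is that usage is a promise that only resolves once the stream has finished.

```ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai"; // assumed provider; any AI SDK model works

// Assumed setup: in this lesson the result of streamText is called `output`.
const output = streamText({
  model: openai("gpt-4o-mini"), // placeholder model
  prompt: "Tell me a short story about a robot.", // placeholder prompt
});

// Print the streamed text as it arrives.
for await (const chunk of output.textStream) {
  process.stdout.write(chunk);
}

// usage resolves once the stream has finished.
console.log(await output.usage);
```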
We can see that output.usage
has several different properties on it.
```ts
/*
  cachedInputTokens?
  inputTokens
  outputTokens
  reasoningTokens?
  totalTokens
*/
```
We have the inputTokens
. This is the number of tokens in our input: our prompt, our system prompt, and so on.
Then we have the outputTokens
. These are sometimes called completion tokens. This is the number of tokens that the LLM generated.
Some models also produce reasoningTokens
. This is for quote-unquote thinking models, and providers may bill you at different rates for reasoning, output, and input tokens.
Finally you've got the totalTokens
used here too.
When we run this, we can see the streaming output come through, and then the usage logged at the end.
{ inputTokens: 13, outputTokens: 127, totalTokens: 140, reasoningTokens: undefined, cachedInputTokens: undefined }
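As a quick sketch, you can also pull the individual fields out of the resolved usage object. Note that reasoningTokens and cachedInputTokens may be undefined, as in the output above, so handle them as optional:

```ts
const usage = await output.usage;

// inputTokens and outputTokens are the prompt and completion counts;
// reasoningTokens and cachedInputTokens are only reported by some models.
console.log(`input:     ${usage.inputTokens}`);
console.log(`output:    ${usage.outputTokens}`);
console.log(`reasoning: ${usage.reasoningTokens ?? "n/a"}`);
console.log(`cached:    ${usage.cachedInputTokens ?? "n/a"}`);
console.log(`total:     ${usage.totalTokens}`);
```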
By the way, for those intrigued, I'm going to talk about cachedInputTokens
pretty soon.
Understanding how many tokens your application is using is obviously really important, and the AI SDK makes it really, really simple to observe.
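If you want to turn those counts into a rough cost estimate, something like the sketch below works. The per-token prices here are placeholder assumptions, so substitute your provider's actual rates, and account for reasoning or cached tokens separately if your model reports them.

```ts
const usage = await output.usage;

// Placeholder prices in USD per token; replace with your provider's real rates.
const INPUT_PRICE = 0.15 / 1_000_000;
const OUTPUT_PRICE = 0.6 / 1_000_000;

const estimatedCost =
  (usage.inputTokens ?? 0) * INPUT_PRICE +
  (usage.outputTokens ?? 0) * OUTPUT_PRICE;

console.log(`Estimated cost: $${estimatedCost.toFixed(6)}`);
```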
Nice work and I will see you in the next one.