
LLM Lab 1: Next-Token Arcade

See how an LLM predicts the very next token and why small prompt changes shift the whole response.

Published January 8, 2026

Why this matters

LLMs pick the next token over and over. This lab makes that process visible so you can see why wording matters.

Can You Guess?

You type: 'The capital of France is'. Which next token is the model most confident about?

How the model thinks

From prompt to next token

Follow the steps below to trace each checkpoint in the loop from prompt to next token.


Step 1: You provide context

The prompt sets the starting state: recent tokens and any system instructions.
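Here is a minimal sketch of that starting state being assembled. Real chat models use model-specific prompt templates; the tags and strings below are purely illustrative:

```python
# Illustrative only: real chat models use their own prompt templates,
# but the idea is the same -- system instructions and recent messages
# are flattened into one sequence the model conditions on.
system = "You are a concise geography tutor."
messages = [("user", "The capital of France is")]

prompt = f"<system>{system}</system>\n"
for role, text in messages:
    prompt += f"<{role}>{text}</{role}>\n"

print(prompt)  # this whole string, as tokens, is the model's starting state
```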


Step 2: Tokens, not words

Text is broken into tokens, which are often subword pieces rather than whole words: a less common word may be split into chunks like "Par" + "is".
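You can inspect this yourself. The sketch below assumes the tiktoken package is installed; other models use other tokenizers, so the exact splits will differ:

```python
# Peek at how a prompt becomes token ids. Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # one specific tokenizer

text = "The capital of France is"
token_ids = enc.encode(text)

# Print each token id next to the text fragment it covers.
for tid in token_ids:
    print(f"{tid:>6}  {enc.decode([tid])!r}")
```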


Step 3: Probability stack

The model assigns a score to every token in its vocabulary and converts those scores into probabilities. High-probability tokens float to the top.
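A minimal sketch of that scoring step, with made-up scores for a handful of candidates (a real model scores every token in its vocabulary, typically tens of thousands):

```python
import numpy as np

# Hypothetical raw scores (logits) for a few candidate next tokens after
# "The capital of France is" -- the numbers are invented for illustration.
candidates = [" Paris", " the", " located", " a", " Lyon"]
logits = np.array([9.1, 5.3, 4.8, 3.9, 2.2])

# Softmax turns raw scores into a probability distribution.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# High-probability tokens float to the top of the stack.
for tok, p in sorted(zip(candidates, probs), key=lambda x: -x[1]):
    print(f"{tok!r:12} {p:.3f}")
```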


Step 4: Sampling happens

One token is sampled from that distribution (or, with greedy decoding, the top token is always picked). The chosen token is appended to the context.
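The difference between greedy decoding and sampling, in a toy sketch (the candidates and probabilities are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

context = ["The", " capital", " of", " France", " is"]
candidates = [" Paris", " the", " located"]
probs = np.array([0.90, 0.06, 0.04])  # hypothetical distribution

# Greedy decoding: always take the single most likely token.
greedy_next = candidates[int(np.argmax(probs))]

# Sampling: draw one token in proportion to its probability.
sampled_next = candidates[rng.choice(len(candidates), p=probs)]

# Whichever token is chosen is appended and becomes part of the
# context for the next prediction step.
context.append(sampled_next)
print(greedy_next, sampled_next, "".join(context), sep="\n")
```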


Step 5: Repeat

The process loops until the model hits a stop signal or token limit.
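Putting the whole loop together with a stand-in for the model (the candidate lists and probabilities below are invented; a real LLM scores its full vocabulary at every step):

```python
import numpy as np

rng = np.random.default_rng(1)
STOP = "<eos>"  # stand-in stop token

def next_token_probs(context):
    """Toy stand-in for a real model: returns (candidates, probabilities)."""
    if context[-1] == " is":
        return [" Paris", " a"], np.array([0.9, 0.1])
    if context[-1] == " Paris":
        return [".", ","], np.array([0.8, 0.2])
    return [STOP], np.array([1.0])

context = ["The", " capital", " of", " France", " is"]
max_new_tokens = 10  # token limit

for _ in range(max_new_tokens):
    candidates, probs = next_token_probs(context)
    token = candidates[rng.choice(len(candidates), p=probs)]
    if token == STOP:  # stop signal
        break
    context.append(token)

print("".join(context))
```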

Try this now

Feel the next-token loop

  • Open your favorite chat model.
  • Send: Continue this list: apple, banana,
  • Then try: Continue this list of colors: apple, banana,
  • Compare how a tiny prefix change moves the next token.

What to watch for

  • Prefixes steer the next token more than you might expect (see the sketch after this list).
  • Every new token becomes part of the prompt for the next step.
  • Greedy decoding (always picking the top token) versus sampling changes whether the model ever surprises you.
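To make the prefix effect concrete, here is a toy sketch with made-up probabilities (not measured from any real model) showing how the two exercise prompts could pull the next-token distribution in different directions:

```python
# Hypothetical next-token distributions for the two exercise prompts.
# The tokens and numbers are invented purely to illustrate the idea.
after_plain  = {" cherry": 0.35, " orange": 0.30, " grape": 0.20, " red": 0.05}
after_colors = {" red": 0.40, " orange": 0.30, " green": 0.15, " cherry": 0.05}

for prompt, dist in [("Continue this list: apple, banana,", after_plain),
                     ("Continue this list of colors: apple, banana,", after_colors)]:
    top = max(dist, key=dist.get)
    print(f"{prompt!r:48} -> most likely next token: {top!r}")
```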

Key Takeaways

  • ✓ LLMs generate one token at a time using probabilities.
  • ✓ Your wording nudges which token is most likely.
  • ✓ Small changes to the prefix can cascade through the whole answer.
