
LLM Lab 1: Next-Token Arcade

See how an LLM predicts the very next token and why small prompt changes shift the whole response.

Published January 8, 2026

Why this matters

LLMs pick the next token over and over. This lab makes that process visible so you can see why wording matters.

Can You Guess?

You type: 'The capital of France is'. Which next token is the model most confident about?

How the model thinks

From prompt to next token

Follow the steps below to trace each checkpoint in the loop from prompt to next token.


Step 1: You provide context

The prompt sets the starting state: recent tokens and any system instructions.
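Here is a minimal sketch of that starting state being assembled. Real chat models use model-specific prompt templates; the tags and strings below are purely illustrative:

```python
# Illustrative only: real chat models use their own prompt templates,
# but the idea is the same -- system instructions and recent messages
# are flattened into one sequence the model conditions on.
system = "You are a concise geography tutor."
messages = [("user", "The capital of France is")]

prompt = f"<system>{system}</system>\n"
for role, text in messages:
    prompt += f"<{role}>{text}</{role}>\n"

print(prompt)  # this whole string, as tokens, is the model's starting state
```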


Step 2: Tokens, not words

Text is broken into tokens, which are often subword pieces rather than whole words: a less common word may be split into chunks like "Par" + "is".
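You can inspect this yourself. The sketch below assumes the tiktoken package is installed; other models use other tokenizers, so the exact splits will differ:

```python
# Peek at how a prompt becomes token ids. Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # one specific tokenizer

text = "The capital of France is"
token_ids = enc.encode(text)

# Print each token id next to the text fragment it covers.
for tid in token_ids:
    print(f"{tid:>6}  {enc.decode([tid])!r}")
```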


Step 3: Probability stack

The model assigns a score to every token in its vocabulary and converts those scores into probabilities. High-probability tokens float to the top.
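A minimal sketch of that scoring step, with made-up scores for a handful of candidates (a real model scores every token in its vocabulary, typically tens of thousands):

```python
import numpy as np

# Hypothetical raw scores (logits) for a few candidate next tokens after
# "The capital of France is" -- the numbers are invented for illustration.
candidates = [" Paris", " the", " located", " a", " Lyon"]
logits = np.array([9.1, 5.3, 4.8, 3.9, 2.2])

# Softmax turns raw scores into a probability distribution.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# High-probability tokens float to the top of the stack.
for tok, p in sorted(zip(candidates, probs), key=lambda x: -x[1]):
    print(f"{tok!r:12} {p:.3f}")
```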


Step 4: Sampling happens

One token is sampled from that distribution (or, with greedy decoding, the top token is always picked). The chosen token is appended to the context.
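The difference between greedy decoding and sampling, in a toy sketch (the candidates and probabilities are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

context = ["The", " capital", " of", " France", " is"]
candidates = [" Paris", " the", " located"]
probs = np.array([0.90, 0.06, 0.04])  # hypothetical distribution

# Greedy decoding: always take the single most likely token.
greedy_next = candidates[int(np.argmax(probs))]

# Sampling: draw one token in proportion to its probability.
sampled_next = candidates[rng.choice(len(candidates), p=probs)]

# Whichever token is chosen is appended and becomes part of the
# context for the next prediction step.
context.append(sampled_next)
print(greedy_next, sampled_next, "".join(context), sep="\n")
```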


Step 5: Repeat

The process loops until the model hits a stop signal or token limit.
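Putting the whole loop together with a stand-in for the model (the candidate lists and probabilities below are invented; a real LLM scores its full vocabulary at every step):

```python
import numpy as np

rng = np.random.default_rng(1)
STOP = "<eos>"  # stand-in stop token

def next_token_probs(context):
    """Toy stand-in for a real model: returns (candidates, probabilities)."""
    if context[-1] == " is":
        return [" Paris", " a"], np.array([0.9, 0.1])
    if context[-1] == " Paris":
        return [".", ","], np.array([0.8, 0.2])
    return [STOP], np.array([1.0])

context = ["The", " capital", " of", " France", " is"]
max_new_tokens = 10  # token limit

for _ in range(max_new_tokens):
    candidates, probs = next_token_probs(context)
    token = candidates[rng.choice(len(candidates), p=probs)]
    if token == STOP:  # stop signal
        break
    context.append(token)

print("".join(context))
```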

Try this now

Feel the next-token loop

  • Open your favorite chat model.
  • Send: Continue this list: apple, banana,
  • Then try: Continue this list of colors: apple, banana,
  • Compare how a tiny prefix change moves the next token.

What to watch for

  • Prefixes steer the next token more than you might expect (see the sketch after this list).
  • Every new token becomes part of the prompt for the next step.
  • Greedy decoding (always picking the top token) versus sampling changes whether the model ever surprises you.
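To make the prefix effect concrete, here is a toy sketch with made-up probabilities (not measured from any real model) showing how the two exercise prompts could pull the next-token distribution in different directions:

```python
# Hypothetical next-token distributions for the two exercise prompts.
# The tokens and numbers are invented purely to illustrate the idea.
after_plain  = {" cherry": 0.35, " orange": 0.30, " grape": 0.20, " red": 0.05}
after_colors = {" red": 0.40, " orange": 0.30, " green": 0.15, " cherry": 0.05}

for prompt, dist in [("Continue this list: apple, banana,", after_plain),
                     ("Continue this list of colors: apple, banana,", after_colors)]:
    top = max(dist, key=dist.get)
    print(f"{prompt!r:48} -> most likely next token: {top!r}")
```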

Key Takeaways

  • ✓ LLMs generate one token at a time using probabilities.
  • ✓ Your wording nudges which token is most likely.
  • ✓ Small changes to the prefix can cascade through the whole answer.
