LLM Lab 1: Next-Token Arcade
See how an LLM predicts the very next token and why small prompt changes shift the whole response.
Why this matters
LLMs generate text by picking the next token over and over. This lab makes that loop visible so you can see why your wording matters.
Can You Guess?
You type: 'The capital of France is'. Which next token is the model most confident about?
How the model thinks
From prompt to next token
Follow the steps and watch each checkpoint light up as you progress.
Step 1: You provide context
The prompt sets the starting state: recent tokens and any system instructions.
Step 2: Tokens, not words
Text is broken into tokens, which are often sub-word pieces: depending on the tokenizer, a word may stay whole or be split into chunks like "Par" + "is".
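Here is a minimal sketch of tokenization using the tiktoken package (an assumption; the lab does not name a tokenizer, and every model ships its own). The encoding name and the exact splits are illustrative.

```python
# A sketch of tokenization, assuming the tiktoken package is installed.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # illustrative choice of encoding
ids = enc.encode("The capital of France is")
print(ids)                                   # token IDs (numbers vary by tokenizer)
print([enc.decode([i]) for i in ids])        # the text piece each ID maps to
```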
Step 3: Probability stack
The model scores every token in its vocabulary and turns those scores into probabilities. High-probability tokens float to the top of the stack.
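A toy sketch of that conversion: made-up scores (logits) for a handful of candidate tokens are turned into probabilities with a softmax. A real model scores its entire vocabulary, typically tens of thousands of tokens.

```python
import math

# Made-up scores (logits) for a handful of candidate next tokens.
logits = {" Paris": 9.1, " the": 5.2, " a": 4.7, " located": 3.9, " Lyon": 2.0}

# Softmax: exponentiate each score and normalise so the values sum to 1.
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}

# The probability stack, highest first.
for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok!r}: {p:.3f}")
```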
Step 4: Sampling happens
One token is sampled (or the top one is picked). That new token is appended to the context.
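A small sketch contrasting the two decoding choices, using made-up probabilities: greedy decoding always takes the top token, while sampling draws a token in proportion to its probability.

```python
import random

# Toy probabilities, roughly what the previous step might produce.
probs = {" Paris": 0.92, " the": 0.03, " a": 0.02, " located": 0.02, " Lyon": 0.01}

greedy = max(probs, key=probs.get)            # top-1: always the most likely token
sampled = random.choices(list(probs), weights=list(probs.values()), k=1)[0]

print("greedy: ", greedy)                     # always ' Paris'
print("sampled:", sampled)                    # usually ' Paris', occasionally not
```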
Step 5: Repeat
The process loops until the model hits a stop signal or token limit.
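Putting the steps together, here is a sketch of the outer loop. The next_token_distribution function is a hypothetical stand-in for the model; a real LLM would compute the distribution from the full context with a neural network.

```python
import random

def next_token_distribution(context):
    # Hypothetical stand-in for the model: returns a made-up distribution
    # over next tokens instead of running a neural network.
    if context.endswith("France is"):
        return {" Paris": 0.9, " the": 0.1}
    return {".": 0.6, " and": 0.4}

def generate(prompt, max_tokens=5, stop="."):
    context = prompt
    for _ in range(max_tokens):               # token limit
        dist = next_token_distribution(context)
        token = random.choices(list(dist), weights=list(dist.values()), k=1)[0]
        context += token                      # the new token joins the context
        if token == stop:                     # stop signal
            break
    return context

print(generate("The capital of France is"))
```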
Try this now
Feel the next-token loop
What to watch for
- Prefixes steer the next token more than you think; the sketch after this list shows it on a small real model.
- Every new token becomes part of the prompt for the next step.
- Deterministic decoding (top-1) versus sampling changes whether the model ever surprises you.
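If you want to see real probability stacks, the sketch below prints the top five candidate next tokens for two slightly different prefixes. It assumes the transformers and torch packages are installed and uses GPT-2 as a small stand-in model; this lab may run something different behind the scenes.

```python
# Assumes the transformers and torch packages; GPT-2 is just a small stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def top_next_tokens(prompt, k=5):
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]     # scores for the very next token
    probs = torch.softmax(logits, dim=-1)
    values, indices = torch.topk(probs, k)
    return [(tok.decode([int(i)]), round(float(p), 3)) for p, i in zip(values, indices)]

# Two slightly different prefixes: compare the probability stacks.
print(top_next_tokens("The capital of France is"))
print(top_next_tokens("The capital city of France,"))
```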
Key Takeaways
- ✓ LLMs generate one token at a time using probabilities.
- ✓ Your wording nudges which token is most likely.
- ✓ Small changes to the prefix can cascade through the whole answer.