LLM fundamentals · AI / ML · Code with Animation

What is an LLM?

A large language model (LLM) predicts the next token, roughly the next chunk of text (a word, part of a word, or a punctuation mark), given everything before it, and builds its output one token at a time. Trained on vast amounts of text, it learns patterns of language and knowledge well enough to answer questions, summarize, and write. Understanding tokens and context is the key to using it well.
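
The predict-append-repeat loop can be sketched with a toy model. This is only an illustration: a bigram counter stands in for the neural network, whitespace splitting stands in for a real tokenizer, and the corpus and the names train_bigram and generate are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    return follows


def generate(follows, start, max_new_tokens):
    """Greedy autoregressive loop: predict one token, append it, repeat."""
    out = [start]
    for _ in range(max_new_tokens):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # no continuation was ever seen for this token
        out.append(candidates.most_common(1)[0][0])
    return out


# Toy corpus (an assumption for illustration); splitting on spaces
# stands in for real tokenization.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the", 4)))  # prints "the cat sat on the"
```

A real LLM replaces the lookup table with a transformer that scores every token in its vocabulary, but the outer loop has the same shape: each new token is chosen from what came before, then becomes part of the input for the next step.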

Why it matters

LLMs are reshaping software, and "AI engineer" roles increasingly mean building on them. Even classical ML practitioners need to understand them. Knowing how they actually work, as prediction over tokens within a context window, dispels the magic and tells you why they hallucinate (they produce likely text, not checked facts), forget (anything outside the context window is invisible to the model), and cost what they do (usage is typically billed per token).
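
A rough sketch of the "forget" and "cost" parts, under stated assumptions: the application keeps only the most recent tokens that fit the window (one simple policy among several), tokens are whitespace-split words, and the price is a made-up number rather than any provider's rate.

```python
def fit_to_window(tokens, window_size):
    """Keep only the most recent tokens that fit; older ones are dropped."""
    return tokens[-window_size:] if window_size > 0 else []


def estimate_cost(n_tokens, price_per_1k_tokens):
    """Cost grows linearly with token count under a flat per-token price."""
    return n_tokens / 1000 * price_per_1k_tokens


# Hypothetical conversation, split on spaces for illustration.
conversation = (
    "my name is Ada . what is the capital of France ? "
    "and what is my name ?"
).split()

visible = fit_to_window(conversation, window_size=8)
print(visible)
print("Ada" in visible)  # prints False: the name fell out of the window

# Hypothetical price of 0.5 currency units per 1,000 tokens.
print(estimate_cost(len(conversation), price_per_1k_tokens=0.5))
```

The model is never shown the dropped tokens, so it cannot answer the final question from the conversation alone. Real context windows are far larger than eight tokens, but the limit works the same way.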

What to learn

Tokens and tokenization
Next-token prediction as the core mechanism
The context window and its limits
Temperature and sampling
Why models hallucinate
The transformer at a high level
Capabilities versus reliability
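
For the temperature and sampling item, a minimal sketch of how temperature reshapes the next-token distribution. The candidate tokens and their scores (logits) are invented for illustration, and real APIs usually combine temperature with other controls such as top-p, which this sketch leaves out.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature rescales the scores first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Draw one token according to the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical candidates and scores for the prompt "The sky is".
tokens = ["blue", "clear", "falling"]
logits = [3.0, 2.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])

rng = random.Random(0)  # seeded so repeated runs give the same draw
print(sample_token(tokens, logits, 1.0, rng))
```

At low temperature almost all probability lands on the top-scoring token, so outputs are near-deterministic. At high temperature the distribution flattens and unlikely tokens are picked more often, which adds variety and also raises the chance of an odd continuation.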

Common pitfall

Treating an LLM as a database of facts. It generates plausible text, not verified truth, so it can state wrong things with total confidence, a failure known as hallucination. Use LLMs for language tasks and reasoning over provided context, verify any factual claim, and ground them in real data (the RAG node) when accuracy matters.
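
One way to ground a model is to put the relevant text in the prompt and ask it to answer only from that text. The sketch below shows the prompt-assembly step alone; the function name, the wording of the instruction, and the passages are all hypothetical, and the retrieval step that would find the passages is omitted.

```python
def build_grounded_prompt(question, passages):
    """Assemble a prompt that asks the model to answer only from given passages."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical passages standing in for retrieved documents.
passages = [
    "The refund window is 30 days from delivery.",
    "Refunds are issued to the original payment method.",
]
prompt = build_grounded_prompt("How long do I have to request a refund?", passages)
print(prompt)
```

An instruction like this lowers the chance of an invented answer but does not remove it, so claims that matter still need checking against the sources.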

Resources

Primary (free):

Andrej Karpathy — Intro to LLMs · video
Hugging Face — LLM course · course
Jay Alammar — The illustrated transformer · article

Practice

Use a tokenizer tool to see how a sentence breaks into tokens, then call an LLM API with the same prompt at two different temperatures and compare the outputs. Deliberately ask something it is likely to get wrong and observe a confident hallucination. Done when you can explain tokens, context, and why it hallucinated.
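
As a warm-up before using a real tokenizer tool, a toy splitter shows the idea of breaking text into sub-word pieces. This is a sketch under assumptions: the vocabulary is hand-picked for the example, and greedy longest-match is a simplification. Production tokenizers learn their vocabulary from data and will split the same sentences differently.

```python
def tokenize(text, vocab):
    """Greedy longest-match split; unknown characters become single-char tokens."""
    tokens = []
    i = 0
    longest = max(len(v) for v in vocab)
    while i < len(text):
        # Try the longest possible piece first, then shrink.
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hand-picked vocabulary (an assumption for illustration).
vocab = {"token", "iz", "ation", " is", " fun", "un", "believ", "able"}

print(tokenize("tokenization is fun", vocab))
# prints ['token', 'iz', 'ation', ' is', ' fun']
print(tokenize("unbelievable", vocab))
# prints ['un', 'believ', 'able']
```

Note that one word can become several tokens and that a leading space can be part of a token. Both effects show up in real tokenizers and explain why token counts differ from word counts.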

Outcomes

Explain tokens, next-token prediction, and context windows.
Adjust temperature and sampling for a task.
Explain why LLMs hallucinate.
Decide when an LLM is and is not the right tool.