Transfer learning · AI / ML · Code with Animation

What is transfer learning?

Transfer learning reuses a model already trained on a large dataset and adapts it to your smaller task. Instead of training from scratch, which needs enormous amounts of data and compute, you take the pretrained model's learned features and fine-tune or extend them for your problem.

Why it matters

Almost nobody trains large models from zero; it is too expensive. Transfer learning is how practitioners get strong results with modest data and budgets, and it is the bridge to working with pretrained language models in the next stage. It is the default approach for most applied deep learning work.

What to learn

Pretrained models and what they have learned
Feature extraction: freeze the base, train a new head
Fine-tuning: unfreezing and training gently
Choosing a learning rate for fine-tuning
Model hubs like Hugging Face
When transfer learning helps and when it does not
Avoiding catastrophic forgetting
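
As a rough illustration of the feature-extraction item above, here is a minimal PyTorch sketch. The small `base` network is a hypothetical stand-in for a real pretrained model (which you would normally load from a model hub), and the input size, the 3-class head, and the random data are assumptions made only so the example runs on its own.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained base; a real one would be loaded
# with pretrained weights rather than randomly initialised like this.
base = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8), nn.ReLU())

# Feature extraction: freeze every base parameter so it receives no updates.
for p in base.parameters():
    p.requires_grad_(False)
base.eval()  # keep layers such as dropout or batch norm in inference mode

# New task-specific head (3 classes is an assumption for this sketch).
head = nn.Linear(8, 3)
model = nn.Sequential(base, head)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random placeholder data standing in for a small labelled dataset.
x = torch.randn(64, 16)
y = torch.randint(0, 3, (64,))

base_before = [p.detach().clone() for p in base.parameters()]
head_before = head.weight.detach().clone()

for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# The base should be untouched; only the head should have moved.
base_unchanged = all(
    torch.equal(before, p.detach()) for before, p in zip(base_before, base.parameters())
)
head_changed = not torch.equal(head_before, head.weight.detach())
```

Freezing with `requires_grad_(False)` and passing only `head.parameters()` to the optimizer are two separate safeguards; either one alone keeps the base fixed, but using both also avoids computing gradients that would never be applied.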

Common pitfall

Fine-tuning the whole pretrained model with a high learning rate. Large updates overwrite the valuable features the model already learned, a failure known as catastrophic forgetting. Start by training only a new head on top, and if you then fine-tune the base, use a much smaller learning rate so you nudge the features rather than wipe them out.
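
One common way to give the base a smaller learning rate than the head is PyTorch's optimizer parameter groups. The sketch below is illustrative only: `base`, `head`, the helper `base_drift`, the specific learning rates, and the random data are all assumptions, and the "drift" it measures (how far the base weights move) is just a simple proxy for how much of the pretrained features is being overwritten.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: a small "pretrained" base and a new 3-class head.
base = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
head = nn.Linear(8, 3)
model = nn.Sequential(base, head)

# Random placeholder data standing in for a small labelled dataset.
x = torch.randn(64, 16)
y = torch.randint(0, 3, (64,))


def base_drift(base_lr: float, head_lr: float = 1e-2, steps: int = 5) -> float:
    """Fine-tune a copy of the model and return how far the base weights moved."""
    m = copy.deepcopy(model)
    m_base, m_head = m[0], m[1]
    start = [p.detach().clone() for p in m_base.parameters()]
    # Parameter groups give the base and the head separate learning rates.
    opt = torch.optim.SGD(
        [
            {"params": m_base.parameters(), "lr": base_lr},
            {"params": m_head.parameters(), "lr": head_lr},
        ]
    )
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(m(x), y).backward()
        opt.step()
    # L2 distance between the base weights before and after fine-tuning.
    squared = sum(
        ((p.detach() - s) ** 2).sum() for p, s in zip(m_base.parameters(), start)
    )
    return float(squared.sqrt())


gentle = base_drift(base_lr=1e-4)      # small nudges to the base
aggressive = base_drift(base_lr=1e-1)  # large updates that move the base far
print(f"base drift with low lr: {gentle:.6f}, with high lr: {aggressive:.6f}")
```

The head keeps the same learning rate in both runs, so any difference in drift comes from the base learning rate alone.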

Resources

Primary (free):

PyTorch — Transfer learning tutorial · docs
Hugging Face — Models · docs
CS231n — Transfer learning · docs

Practice

Take a pretrained model and adapt it to a small dataset two ways: first as a feature extractor with a frozen base and a new head, then by fine-tuning the base with a low learning rate. Compare results. Done when you can explain why a high learning rate on the base would have hurt.
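
For the comparison, it can help to switch between the two regimes with a small helper and confirm what is actually trainable before each run. The names `set_trainable` and `count_trainable` below are hypothetical helpers written for this sketch, not library functions, and the tiny `base` and `head` modules again stand in for a real pretrained model and its new head.

```python
import torch
from torch import nn


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Freeze (False) or unfreeze (True) every parameter in a module."""
    for p in module.parameters():
        p.requires_grad_(trainable)


def count_trainable(module: nn.Module) -> int:
    """Number of scalar parameters that will receive gradient updates."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Hypothetical stand-ins for a pretrained base and a new 3-class head.
base = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
head = nn.Linear(8, 3)
model = nn.Sequential(base, head)

# Run 1: feature extraction, so only the head is trainable.
set_trainable(base, False)
frozen_count = count_trainable(model)

# Run 2: fine-tuning, so the base is unfrozen as well
# (pair this with a low learning rate for the base).
set_trainable(base, True)
full_count = count_trainable(model)

print(frozen_count, full_count)
```

Checking the trainable count before each run is a cheap way to catch a base that was accidentally left frozen or unfrozen, which would otherwise make the two results hard to compare.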

Outcomes

Explain why transfer learning usually beats training from scratch when data and compute are limited.
Use a pretrained model as a frozen feature extractor.
Fine-tune gently with a low learning rate.
Avoid catastrophic forgetting of pretrained features.