Chapter 23

Writes › Book › Deep Learning with PyTorch › Part VII › Chapter 23 ›

Pretraining Objectives

A large language model is trained in two broad phases. The first phase is pretraining.

Scaling Laws for Language Models

Scaling laws describe how model performance changes as we increase compute, parameter count, dataset size, and training tokens.

Instruction Tuning

Pretraining teaches a language model to predict text. It does not directly teach the model to follow user instructions, answer safely, maintain dialogue structure, or format outputs in a useful way.

Reinforcement Learning from Human Feedback

Instruction tuning teaches a model to imitate demonstrations.

Constitutional Alignment

Reinforcement learning from human feedback improves model behavior using preference data. However, collecting large amounts of human feedback is expensive, slow, and difficult to scale consistently.