Chapter 10

Writes › Book › Deep Learning with PyTorch › Part III › Chapter 10 ›

Parameter Initialization

A neural network begins training with parameters that have not yet been learned from data.

Vanishing and Exploding Gradients

Deep networks train by sending information in two directions: activations flow forward through the layers, and gradients flow backward.

Batch Normalization

Batch normalization is a layer that normalizes activations using statistics computed from a mini-batch.
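As a minimal sketch of the idea, the snippet below compares `nn.BatchNorm1d` in training mode against a manual per-feature normalization over the batch dimension. The tensor shape, the seed, and the variable names are illustrative assumptions, not taken from the chapter.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical mini-batch: 8 examples, 4 features, deliberately shifted and scaled.
x = torch.randn(8, 4) * 3.0 + 5.0

# A freshly constructed module is in training mode, so it uses batch statistics.
bn = nn.BatchNorm1d(num_features=4)
y = bn(x)

# Manual equivalent: per-feature mean and biased variance across the batch.
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)
y_manual = (x - mean) / torch.sqrt(var + bn.eps)

# The learnable scale and shift start at 1 and 0, so the two results should agree.
batchnorm_matches = torch.allclose(y, y_manual, atol=1e-5)
print(batchnorm_matches)
```

In evaluation mode (`bn.eval()`), the layer switches to running averages collected during training instead of the current batch statistics.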

Layer Normalization

Layer normalization normalizes activations across the features of each individual example, independent of the other examples in the batch.
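A short sketch, under assumed shapes and names, showing that `nn.LayerNorm` computes its statistics along the feature dimension of each example rather than across the batch:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical input: 2 examples, 6 features each.
x = torch.randn(2, 6) * 2.0 + 1.0

ln = nn.LayerNorm(6)
y = ln(x)

# Manual equivalent: mean and biased variance over the features of each example.
mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + ln.eps)

# The learnable scale and shift start at 1 and 0, so the two results should agree.
layernorm_matches = torch.allclose(y, y_manual, atol=1e-5)
print(layernorm_matches)
```

Because each row is normalized on its own, the output for one example does not change when the other examples in the batch change.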

Group and Instance Normalization

Batch normalization and layer normalization are the two most common normalization layers, but they do not cover every setting well.

Residual Connections

Residual connections allow a layer or block to add its input directly to its output. Instead of forcing a block to learn a complete transformation from scratch, the block learns a correction to the input.
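The following is a minimal sketch of one way to write such a block in PyTorch. The class name `ResidualBlock`, the two-layer body, and the zeroing of the last layer are illustrative assumptions, not the chapter's own implementation.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Hypothetical block that computes x + f(x), where f is a small MLP."""

    def __init__(self, dim: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The skip path carries the input unchanged; the body adds a correction.
        return x + self.body(x)


torch.manual_seed(0)
block = ResidualBlock(16)

# Illustration only: zero the last layer so the correction is exactly zero.
nn.init.zeros_(block.body[-1].weight)
nn.init.zeros_(block.body[-1].bias)

x = torch.randn(4, 16)
out = block(x)

# With a zero correction, the block reduces to the identity function.
is_identity = torch.equal(out, x)
print(is_identity)
```

Zeroing the final layer here only makes the "correction" reading concrete: when the body outputs zero, the block passes its input through unchanged.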

Stable Training in Deep Networks

Stable training means that a model can make steady progress without numerical collapse, uncontrolled gradients, or large oscillations in the loss.

Sections

Parameter Initialization

Vanishing and Exploding Gradients

Batch Normalization

Layer Normalization

Group and Instance Normalization

Residual Connections

Stable Training in Deep Networks