# Installing and Configuring PyTorch

A PyTorch installation must match three things: the Python environment, the operating system, and the available hardware. A CPU-only installation is enough for small examples and early chapters. GPU support becomes important once models, datasets, and training loops grow.

The goal of installation is not only to make `import torch` work. The goal is to produce a reproducible environment where code runs consistently, dependencies are isolated, and hardware acceleration is available when needed.

### Choosing an Environment

Use a virtual environment for each project. This prevents one project’s dependencies from breaking another project.

Common choices are:

| Tool | Typical use |
|---|---|
| `venv` | Simple Python standard-library environments |
| Conda | Python plus native libraries and CUDA packages |
| `uv` | Fast Python package management |
| Docker | Reproducible system-level environments |
| Cloud notebooks | Temporary managed environments |

For local learning, `venv`, Conda, or `uv` is enough. For production or shared research systems, Docker often gives better reproducibility.

A simple `venv` environment:

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
```

On Windows PowerShell:

```powershell
python -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
```
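To confirm that the environment is actually active before installing anything, a short check can help. This is a minimal sketch that applies to `venv`-style environments only; the helper name `in_virtual_environment` is hypothetical, and Conda environments are usually identified differently (for example through the `CONDA_PREFIX` environment variable).

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtual environment active:", in_virtual_environment())
```

If the printed interpreter path does not sit inside `.venv`, packages installed with `pip` will land somewhere else.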

### Installing CPU PyTorch

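The default pip command installs a working build on most machines:

```bash
pip install torch torchvision torchaudio
```

On Linux, the default wheels usually include CUDA support; to force a CPU-only build, append `--index-url https://download.pytorch.org/whl/cpu` to the command. Either build is sufficient for small tensor examples, automatic differentiation, linear models, and small neural networks.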

After installation, check that PyTorch imports correctly:

```python
import torch

print(torch.__version__)
print(torch.cuda.is_available())
```

For a CPU-only installation, `torch.cuda.is_available()` should return `False`.

### Installing GPU PyTorch

For NVIDIA GPUs, PyTorch uses CUDA. The pip wheels bundle their own CUDA runtime libraries, so the installed NVIDIA driver must be recent enough to support the CUDA version the wheel was built for.

A typical CUDA installation with pip looks like this:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

The `cu121` suffix selects wheels built for CUDA 12.1. The available suffixes change between PyTorch releases, so use the official PyTorch installation selector when preparing a real machine.

After installation:

```python
import torch

print(torch.__version__)
print(torch.cuda.is_available())

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```

A successful GPU installation should print `True` for CUDA availability and show the GPU name.
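When CUDA availability is `False` on a machine with a GPU, it helps to tell a CPU-only build apart from a driver problem. The sketch below collects the relevant fields; the function name `describe_cuda_build` is hypothetical. `torch.version.cuda` is `None` when the installed build has no CUDA support.

```python
import torch


def describe_cuda_build() -> dict:
    """Collect the fields that distinguish a CPU build from a driver issue."""
    return {
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if the build, the driver, and a visible GPU all line up.
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


info = describe_cuda_build()
for key, value in info.items():
    print(f"{key}: {value}")
```

A non-`None` `built_with_cuda` combined with `cuda_available: False` points toward the driver or the GPU visibility rather than the PyTorch package.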

### Apple Silicon

On Apple Silicon, PyTorch can use the Metal Performance Shaders backend, called `mps`.

Check for it with:

```python
import torch

print(torch.backends.mps.is_available())
```

A simple device selector:

```python
device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)

print(device)
```

The `mps` backend is useful for local experimentation. Some operations are unimplemented or slower on `mps` than on CUDA, so behavior and performance can differ between the two.
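A quick way to exercise the backend is a small matrix product that falls back to the CPU when `mps` is missing. This is a sketch, not a full compatibility test. It sticks to `float32` on the assumption that 64-bit floats are not supported on `mps`.

```python
import torch

# Fall back to the CPU so the same snippet runs on any machine.
device = "mps" if torch.backends.mps.is_available() else "cpu"

# float32 is used explicitly; float64 tensors are assumed unsupported on mps.
x = torch.randn(8, 4, dtype=torch.float32, device=device)
w = torch.randn(4, 2, dtype=torch.float32, device=device)

# Copy the result back to the CPU before inspecting it.
y = (x @ w).cpu()

print(device, tuple(y.shape), y.dtype)
```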

### Verifying Tensor Operations

After installation, run a small tensor test:

```python
import torch

x = torch.randn(4, 3)
w = torch.randn(3, 2)

y = x @ w

print(y)
print(y.shape)
```

Then test gradients:

```python
x = torch.tensor(2.0, requires_grad=True)

y = x * x + 3 * x + 1
y.backward()

print(x.grad)
```

The gradient should be `tensor(7.)`.

If a GPU is available, test device execution:

```python
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(1024, 1024, device=device)
y = x @ x

print(y.device)
```

This confirms that tensor operations run on the selected device.
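Printing the device shows where the result lives, but not how long the work took. CUDA kernels launch asynchronously, so a naive timer can stop before the GPU has finished. The sketch below, with a hypothetical helper `timed_matmul`, synchronizes before reading the clock.

```python
import time

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"


def timed_matmul(n: int = 512, repeats: int = 5) -> float:
    """Return the average seconds per n-by-n matrix product on `device`."""
    x = torch.randn(n, n, device=device)
    _ = x @ x  # warm-up run so one-time setup cost is not measured

    if device == "cuda":
        torch.cuda.synchronize()  # wait for the warm-up to finish
    start = time.perf_counter()

    for _ in range(repeats):
        _ = x @ x

    if device == "cuda":
        torch.cuda.synchronize()  # wait for all queued kernels
    return (time.perf_counter() - start) / repeats


print(f"{device}: {timed_matmul():.6f} s per matmul")
```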

### Installing Common Libraries

Most PyTorch projects need more than the core package.

A practical learning environment:

```bash
pip install torch torchvision torchaudio
pip install numpy pandas matplotlib scikit-learn tqdm
```

For transformer and NLP work:

```bash
pip install transformers datasets tokenizers accelerate
```

For experiment tracking and configuration:

```bash
pip install tensorboard pyyaml rich
```

For notebooks:

```bash
pip install jupyter ipykernel
```

For graph neural networks, installation depends on the PyTorch and CUDA version. PyTorch Geometric should be installed using its official instructions because it may require version-specific wheels.
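Because several of these packages must match the installed PyTorch version, it is useful to list what is actually present. The following sketch uses only the standard library; the helper name `installed_versions` and the package list are illustrative choices.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["torch", "torchvision", "torchaudio", "numpy"])
for name, version in report.items():
    print(f"{name:12s} {version or 'not installed'}")
```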

### Reproducibility

A reproducible project records its dependencies.

For a small pip project:

```bash
pip freeze > requirements.txt
```

Install later with:

```bash
pip install -r requirements.txt
```

A more controlled project can use `pyproject.toml`:

```toml
[project]
name = "deep-learning-pytorch"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "torch",
    "torchvision",
    "torchaudio",
    "numpy",
    "matplotlib",
    "scikit-learn",
    "tqdm",
]
```

Dependency files matter because deep learning libraries change. A training script that works with one version may behave differently with another version.

### Random Seeds

Deep learning uses random numbers for initialization, shuffling, dropout, augmentation, and sampling. Set random seeds when you need repeatable runs.

```python
import random
import numpy as np
import torch

seed = 1234

random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)

if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)
```

For stricter determinism:

```python
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```

These settings can reduce performance. Full determinism is not always possible across all operations, devices, and library versions.
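Shuffling is one source of randomness that global seeds do not always pin down cleanly, since other code may consume random numbers first. One option, sketched here with a hypothetical helper `make_loader`, is to give the `DataLoader` its own seeded `torch.Generator`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(seed: int) -> DataLoader:
    """Build a shuffling loader whose order depends only on `seed`."""
    data = TensorDataset(torch.arange(10))
    generator = torch.Generator()
    generator.manual_seed(seed)
    return DataLoader(data, batch_size=5, shuffle=True, generator=generator)


# Two loaders built from the same seed should yield identical batch order.
first = [batch[0].tolist() for batch in make_loader(1234)]
second = [batch[0].tolist() for batch in make_loader(1234)]

print(first == second)
```

For stricter checks, PyTorch also provides `torch.use_deterministic_algorithms(True)`, which raises an error when an operation has no deterministic implementation. It may need extra configuration on CUDA, so it is best enabled deliberately rather than by default.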

### Project Layout

A small PyTorch project can start with a few files:

```text
project/
  train.py
  model.py
  data.py
  eval.py
  requirements.txt
  README.md
```

A larger project benefits from a package layout:

```text
project/
  pyproject.toml
  src/
    dlbook/
      __init__.py
      data.py
      models.py
      train.py
      eval.py
  scripts/
    train_mnist.py
  configs/
    mnist.yaml
  checkpoints/
  runs/
  tests/
```

Keep generated files such as checkpoints and logs out of source control unless there is a specific reason to track them.

A typical `.gitignore` includes:

```gitignore
.venv/
__pycache__/
*.pyc
checkpoints/
runs/
data/
```

### Device Configuration in Code

Most examples in this book use a device variable:

```python
import torch

device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)
```

Move the model to the device:

```python
model = model.to(device)
```

Move batches inside the training loop:

```python
for x, y in loader:
    x = x.to(device)
    y = y.to(device)

    logits = model(x)
```

A common error is creating new tensors on the CPU inside the model while the input is on the GPU.

Poor pattern:

```python
bias = torch.zeros(x.shape[-1])
```

Better pattern:

```python
bias = torch.zeros(x.shape[-1], device=x.device, dtype=x.dtype)
```

New tensors created during model computation should usually inherit device and dtype from existing tensors.
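Two idioms make this automatic, shown here in a hypothetical layer called `ScaledShift`: `register_buffer` for constant tensors that should follow the module, and `new_zeros` for temporary tensors that should match the input.

```python
import torch
from torch import nn


class ScaledShift(nn.Module):
    """Illustrative layer: multiplies by a fixed scale and adds a zero shift."""

    def __init__(self, features: int):
        super().__init__()
        # Buffers are moved together with the module by .to(device).
        self.register_buffer("scale", torch.ones(features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # new_zeros copies device and dtype from x.
        shift = x.new_zeros(x.shape[-1])
        return x * self.scale + shift


device = "cuda" if torch.cuda.is_available() else "cpu"
layer = ScaledShift(3).to(device)
x = torch.randn(2, 3, device=device)
out = layer(x)

print(out.device == x.device, out.dtype == x.dtype)
```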

### Mixed Precision Configuration

For CUDA training, mixed precision can improve speed and reduce memory usage.

A common pattern, using the `torch.amp` API available in recent PyTorch releases:

```python
scaler = torch.amp.GradScaler("cuda")

for x, y in loader:
    x = x.to("cuda")
    y = y.to("cuda")

    optimizer.zero_grad(set_to_none=True)

    with torch.amp.autocast("cuda"):
        logits = model(x)
        loss = loss_fn(logits, y)

    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

Mixed precision should be introduced after a full-precision version works. It can change numerical behavior and make debugging harder.

### Basic Sanity Check Script

A useful installation check is a complete tiny training script.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)

x = torch.randn(1024, 20)
y = (x.sum(dim=1) > 0).long()

loader = DataLoader(TensorDataset(x, y), batch_size=64, shuffle=True)

model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
).to(device)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

for epoch in range(5):
    total_loss = 0.0

    for xb, yb in loader:
        xb = xb.to(device)
        yb = yb.to(device)

        logits = model(xb)
        loss = loss_fn(logits, yb)

        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()

        total_loss += loss.item() * xb.size(0)

    avg_loss = total_loss / len(loader.dataset)
    print(f"epoch={epoch} loss={avg_loss:.4f}")
```

The loss should generally decrease. This verifies tensors, modules, data loading, autograd, optimization, and device placement.

### Common Installation Problems

| Symptom | Likely cause | Fix |
|---|---|---|
| `ModuleNotFoundError: No module named 'torch'` | PyTorch installed in a different environment | Activate the correct environment |
| `torch.cuda.is_available()` is `False` | CPU-only build or outdated NVIDIA driver | Install a CUDA-enabled PyTorch build or update the driver |
| Device mismatch error | Model and data on different devices | Move both to the same device |
| Out-of-memory error | Batch or model too large | Reduce batch size |
| Import error for vision or audio | Version mismatch | Install matching package versions |
| Very slow training | Running on CPU unexpectedly | Print device and tensor locations |

Most setup problems are environment problems. When debugging, always print the Python interpreter path, the PyTorch version, and the device status.

```python
import sys
import torch

print(sys.executable)
print(torch.__version__)
print(torch.cuda.is_available())
```

### Working Style

A reliable workflow is:

1. Start with a clean environment.
2. Install PyTorch and core dependencies.
3. Verify tensor operations.
4. Verify gradients.
5. Verify device execution.
6. Run a tiny training script.
7. Add project-specific libraries.
8. Freeze or record dependencies.

This avoids debugging several layers of failure at once.
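Steps 3 to 5 of the workflow above can be sketched as a small runner that stops at the first failure. The function names and the `checks` list are illustrative, not part of PyTorch.

```python
import torch


def check_tensor_ops() -> bool:
    """Step 3: a matrix product yields the expected shape."""
    x = torch.randn(4, 3)
    w = torch.randn(3, 2)
    return (x @ w).shape == (4, 2)


def check_gradients() -> bool:
    """Step 4: d/dx (x^2 + 3x + 1) at x = 2 equals 7."""
    x = torch.tensor(2.0, requires_grad=True)
    y = x * x + 3 * x + 1
    y.backward()
    return x.grad.item() == 7.0


def check_device() -> bool:
    """Step 5: a result stays on the device where its inputs live."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(64, 64, device=device)
    return (x @ x).device.type == device


checks = [
    ("tensor ops", check_tensor_ops),
    ("gradients", check_gradients),
    ("device execution", check_device),
]

results = {}
for name, check in checks:
    results[name] = check()
    print(f"{name}: {'ok' if results[name] else 'FAILED'}")
    if not results[name]:
        break  # fix this layer before testing the next one
```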

### Summary

Installing PyTorch means configuring Python, dependencies, and hardware support together. A CPU setup is enough for small examples. A GPU setup requires compatible PyTorch packages, drivers, and device placement.

A good project records its dependencies, isolates its environment, tests gradients, verifies device execution, and uses a consistent project layout. These habits reduce friction before the real work begins: building and training models.

