Why PyTorch
PyTorch is the framework I use to actually build and train neural networks. The mental model that helped most: a PyTorch tensor is like a NumPy array, but with two superpowers - it can live on a GPU, and it can track the operations done to it so gradients can be computed automatically.
Tensors
import torch
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.zeros((2, 3))
z = torch.randn((3, 3)) # random normal
x.shape # torch.Size([3])
x = x.to("cuda") # returns a copy on the GPU, so assign it; needs a CUDA device (check torch.cuda.is_available())
Most NumPy habits carry straight over - indexing, reshaping with .reshape(...)
(or PyTorch's own .view(...)), broadcasting. That overlap made PyTorch much less intimidating.
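To make that overlap concrete, here is a small sketch of indexing, reshaping, and broadcasting on tensors. The variable names are only illustrative, and the device check is one possible way (an assumption, not the only approach) to guard the GPU move on a machine without CUDA.

```python
import torch

a = torch.arange(6.0)            # tensor([0., 1., 2., 3., 4., 5.])
m = a.reshape(2, 3)              # same values arranged as 2 rows x 3 columns

first_row = m[0]                 # NumPy-style indexing: tensor([0., 1., 2.])
last_col = m[:, -1]              # NumPy-style slicing:  tensor([2., 5.])

# Broadcasting: the shape-(3,) tensor is stretched across both rows of m.
shifted = m + torch.tensor([10.0, 20.0, 30.0])

# .to() returns a new tensor instead of moving the original in place,
# so the result is assigned. Fall back to the CPU when no GPU is present.
device = "cuda" if torch.cuda.is_available() else "cpu"
m_dev = m.to(device)
```

The same code would run nearly unchanged with NumPy arrays, apart from the device line.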
Autograd: gradients for free
This is the magic. If a tensor has requires_grad=True, PyTorch records every
operation on it into a graph, and calling .backward() on a scalar result computes
the gradient of that result with respect to each such tensor, stored in its .grad.
w = torch.tensor(2.0, requires_grad=True)
loss = (w - 5) ** 2 # some function of w
loss.backward() # compute d(loss)/dw
print(w.grad) # tensor(-6.) -> slope at w=2
I never have to derive the backward pass by hand - autograd does the calculus.
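As a sanity check, the autograd result can be compared with a derivative worked out by hand. The sketch below uses a made-up two-parameter loss (w, b, and x are illustrative values, not from any real model) and also shows that gradients accumulate across backward() calls until they are cleared.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)
x = torch.tensor(3.0)                    # plain input, no gradient tracked

loss = (w * x + b - 10.0) ** 2           # (2*3 + 1 - 10)^2 = 9
loss.backward()

# By hand: d(loss)/dw = 2 * (w*x + b - 10) * x  and  d(loss)/db = 2 * (w*x + b - 10)
print(w.grad)   # tensor(-18.)
print(b.grad)   # tensor(-6.)
print(x.grad)   # None, because x does not require gradients

# A second backward pass adds to the stored gradients instead of replacing them.
loss2 = (w * x + b - 10.0) ** 2
loss2.backward()
print(w.grad)   # tensor(-36.)
```

That accumulation is the reason old gradients have to be cleared before each new backward pass during training.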
nn.Module: packaging a model
Models are Python classes that subclass nn.Module. You declare the layers in
__init__ and describe the data flow in forward.
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return self.fc2(x)
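A quick way to check a model like this is to push a random batch through it and look at the output shape. In the sketch below the batch size of 32 is arbitrary, and reading 784 as a flattened 28x28 image is my assumption rather than something the layer itself requires.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return self.fc2(x)


model = Net()
batch = torch.randn(32, 784)     # hypothetical batch: 32 flattened inputs
logits = model(batch)            # calling the model runs forward()
print(logits.shape)              # torch.Size([32, 10])

# Weights and biases of both layers: 784*128 + 128 + 128*10 + 10
n_params = sum(p.numel() for p in model.parameters())
print(n_params)                  # 101770
```

Note that the model is called directly as model(batch) rather than through model.forward(batch); nn.Module routes the call to forward.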
The training loop: five steps, always the same
Once this clicked, every PyTorch project looked familiar. Each batch repeats:
optimizer.zero_grad() # 1. clear old gradients
output = model(data) # 2. forward pass
loss = loss_fn(output, target) # 3. measure the error
loss.backward() # 4. backpropagate (autograd)
optimizer.step() # 5. update the weights
zero_grad → forward → loss → backward → step. That rhythm is the heart of
training any model, from a tiny MNIST classifier to something huge.
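Here is a minimal end-to-end sketch of those five steps. Everything around the loop is an assumption chosen for illustration: synthetic data following y = 3x + 1, a single nn.Linear layer as the model, MSE loss, plain SGD with a learning rate of 0.1, and manual slicing into batches of 32 instead of a DataLoader.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data: y = 3x + 1 plus a little noise.
X = torch.randn(256, 1)
y = 3.0 * X + 1.0 + 0.05 * torch.randn(256, 1)

model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(50):
    for i in range(0, len(X), 32):             # walk through the data in batches of 32
        data, target = X[i:i + 32], y[i:i + 32]

        optimizer.zero_grad()                  # 1. clear old gradients
        output = model(data)                   # 2. forward pass
        loss = loss_fn(output, target)         # 3. measure the error
        loss.backward()                        # 4. backpropagate (autograd)
        optimizer.step()                       # 5. update the weights
    losses.append(loss.item())                 # loss of the last batch in this epoch

print(losses[0], losses[-1])                   # the loss should shrink over training
print(model.weight.item(), model.bias.item())  # should end up close to 3 and 1
```

Swapping in a bigger model or a real dataset changes the setup lines, but the body of the inner loop stays the same.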
Where this connects
This is exactly what I used in my handwriting recognition project - the same
tensors, nn.Module, and five-step loop, just with a CNN.
A living note - I expand it as I learn more of PyTorch.