MNIST ===== Source: ``examples/mnist/main.py`` Overview -------- A small convolutional network trained on the MNIST handwritten-digit dataset. The example shows how to use ``nn.Conv2d`` and how to build a custom ``Module`` that mixes convolutions with a final linear classifier. Running ------- :: python -m examples.mnist.main # downloads data on first run python -m examples.mnist.main --help # see all options Selected options: - ``--num-steps`` — training steps (default 500) - ``--batch-size`` — mini-batch size (default 32) - ``--hidden-dim`` — number of conv channels (default 32) - ``--step-size`` — AdamW learning rate (default 1e-3) Code walkthrough ---------------- **Dataset** MNIST images are downloaded automatically on first run and cached under ``examples/mnist/data/``:: train_dataset = MNIST(split="train") # 60 000 images, 28×28 greyscale test_dataset = MNIST(split="test") # 10 000 images **Model** Two convolutional layers followed by a linear output head:: class MNISTClassifier(nn.Module): def __init__(self, input_shape, num_classes, hidden_dim): super().__init__() self.conv1 = nn.Conv2d(1, hidden_dim, kernel_size=3, stride=1, padding=1) self.conv2 = nn.Conv2d(hidden_dim, hidden_dim, kernel_size=3, stride=1, padding=1) self.linear_out = nn.Linear(hidden_dim * H * W, num_classes) def forward(self, x): x = npg.relu(self.conv1(x)) # (N, hidden, 28, 28) x = npg.relu(self.conv2(x)) # (N, hidden, 28, 28) x = x.reshape(x.shape[0], -1) # (N, hidden*28*28) return self.linear_out(x) # (N, 10) **Training loop** :: optimizer = npg.optim.AdamW(net.parameters(), lr=1e-3) for step in range(num_steps): x, y = next(iter(dataloader)) optimizer.zero_grad() loss = nn.cross_entropy_loss(net(x), y) loss.backward() optimizer.step()