Custom Modules
==============

All trainable components in numpygrad inherit from ``nn.Module``. Subclassing it
is the standard way to define new layers, loss functions, or complete
architectures.

Minimal example
---------------

Create the parameters in ``__init__`` and override ``forward`` with the
computation your module performs::

    import numpygrad as npg
    import numpygrad.nn as nn

    class Affine(nn.Module):
        def __init__(self, in_features: int, out_features: int) -> None:
            super().__init__()
            self.weight = nn.Parameter(npg.random.randn((in_features, out_features)))
            self.bias = nn.Parameter(npg.zeros((out_features,)))

        def forward(self, x: npg.array) -> npg.array:
            return x @ self.weight + self.bias

    layer = Affine(4, 8)
    out = layer(npg.random.randn((2, 4)))  # shape (2, 8)

Any attribute assigned as a ``Parameter`` is automatically included in
``module.parameters()`` and therefore in the optimizer's update step.

Composing modules
-----------------

Assign child modules as attributes and they are tracked recursively::

    class TwoLayer(nn.Module):
        def __init__(self, dim: int) -> None:
            super().__init__()
            self.fc1 = Affine(dim, dim)
            self.fc2 = Affine(dim, dim)

        def forward(self, x: npg.array) -> npg.array:
            return self.fc2(npg.relu(self.fc1(x)))

    net = TwoLayer(16)
    print(len(list(net.parameters())))  # 4: weight and bias for each layer

``parameters()`` walks the full module tree recursively, so you can nest modules
arbitrarily deep.

Using ``Sequential``
--------------------

For a simple chain of modules, ``nn.Sequential`` avoids boilerplate::

    model = nn.Sequential(
        nn.Linear(4, 32),
        nn.ReLU(),
        nn.Linear(32, 2),
    )
    x = npg.random.randn((2, 4))  # batch of 2 samples with 4 features each
    out = model(x)  # applies each module in order

Buffers
-------

If you need a non-trainable array stored on the module (e.g.
a running mean), assign it as a plain ``Array``. It will not appear in
``parameters()`` but is still accessible as an attribute::

    class BatchNorm1d(nn.Module):
        def __init__(self, num_features: int) -> None:
            super().__init__()
            self.scale = nn.Parameter(npg.ones((num_features,)))
            self.shift = nn.Parameter(npg.zeros((num_features,)))
            self.running_mean = npg.zeros((num_features,))  # not a Parameter

        def forward(self, x: npg.array) -> npg.array:
            mean = x.mean(axis=0)
            # Exponential moving average of the batch mean.
            self.running_mean = 0.9 * self.running_mean + 0.1 * mean
            # Adding eps inside the square root keeps the division (and the
            # gradient of the root) finite when a feature has zero variance.
            x_norm = (x - mean) / (x.var(axis=0) + 1e-5) ** 0.5
            return self.scale * x_norm + self.shift

Inspecting parameters
---------------------

``state_dict()`` returns a flat ``dict`` mapping parameter names to their
underlying NumPy arrays, which is useful for checkpointing::

    import numpy as np

    sd = net.state_dict()  # ``net`` is the TwoLayer instance defined earlier
    # {'fc1.weight': array(...), 'fc1.bias': array(...), ...}
    np.savez("checkpoint.npz", **sd)
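
To check that a checkpoint written this way can be read back, the round trip can
be sketched with NumPy alone. The dictionary below is a hypothetical stand-in
for the output of ``state_dict()`` (its names, shapes, and values are made up
for illustration). Copying the restored arrays back into a module is left out,
because that step depends on numpygrad's loading API, which this page does not
describe::

    import os
    import tempfile

    import numpy as np

    # Hypothetical stand-in for ``net.state_dict()``: a flat dict of NumPy
    # arrays keyed by dotted parameter names.
    sd = {
        "fc1.weight": np.arange(6, dtype=np.float64).reshape(2, 3),
        "fc1.bias": np.zeros(3),
    }

    # Write the archive to a temporary directory so the sketch leaves no
    # files behind in the working directory.
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.npz")
    np.savez(path, **sd)

    # np.load returns a lazy NpzFile; read the arrays out before it closes.
    with np.load(path) as ckpt:
        restored = {name: ckpt[name] for name in ckpt.files}

    # Every saved entry comes back under the same name with the same values.
    assert restored.keys() == sd.keys()
    for name, value in sd.items():
        assert np.array_equal(restored[name], value)

Dotted names such as ``fc1.weight`` survive the round trip unchanged, so the
restored dictionary has the same flat layout as the one that was saved.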