A Simple Neural Network Classifier using PyTorch, from Scratch (2024)

Jeril Kuriakose

In this article we will build a simple neural network classifier using PyTorch. We will cover the following steps:

Step 1: Generate and split the data
Step 2: Processing generated data
Step 3: Build neural network classifier from scratch
Step 4: Training the neural network classifier
Step 5: Saving the trained model
Step 6: Loading the saved model
Step 7: Testing the trained model

PyTorch v1.10.0
Scikit-learn v1.0.2
Numpy v1.19.5

Let's generate our classification dataset using Scikit-learn:

from sklearn.datasets import make_classification

X, Y = make_classification(
    n_samples=100, n_features=4, n_redundant=0,
    n_informative=3, n_clusters_per_class=2, n_classes=3
)

We generate only 100 samples here; this can be increased by changing the n_samples parameter.
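Because make_classification draws random data, every run produces a different dataset. A minimal sketch, assuming you want repeatable data: pass random_state (the seed value 0 is an arbitrary choice and is not part of the original call) and check the shapes and class labels.

```python
import numpy as np
from sklearn.datasets import make_classification

# random_state=0 is an assumed seed, added only to make the run repeatable
X, Y = make_classification(
    n_samples=100, n_features=4, n_redundant=0,
    n_informative=3, n_clusters_per_class=2, n_classes=3,
    random_state=0,
)

print(X.shape, X.dtype)   # feature matrix: 100 rows, 4 columns, float64
print(Y.shape)            # one integer label per row
print(np.unique(Y))       # the distinct class labels
```

The float64 dtype of X is why the features are converted to float32 before they are handed to PyTorch.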

Next, let’s split the data into training and testing sets. 33% of the data is used for testing.

from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42)
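With 100 samples and test_size=0.33, the split yields 67 training and 33 test samples. The sketch below checks this and also passes stratify=Y, which is an optional addition (not in the original call) that keeps the class proportions similar in both splits. The random_state=0 seed for the data is likewise an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, Y = make_classification(
    n_samples=100, n_features=4, n_redundant=0,
    n_informative=3, n_clusters_per_class=2, n_classes=3,
    random_state=0,  # assumed seed, for repeatability only
)

# stratify=Y is an optional extra: it preserves the class balance per split
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42, stratify=Y
)

print(len(X_train), len(X_test))  # 67 33
```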

Once we have the training and testing datasets, we process the data using the PyTorch Dataset and DataLoader classes. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class Data(Dataset):
    def __init__(self, X_train, y_train):
        # convert the features from float64 to float32 to match the dtype
        # of the nn.Linear weights, else we will get the following error
        # RuntimeError: expected scalar type Double but found Float
        self.X = torch.from_numpy(X_train.astype(np.float32))
        # nn.CrossEntropyLoss expects the class labels as Long (int64),
        # else we will get the following error
        # RuntimeError: expected scalar type Long but found Float
        self.y = torch.from_numpy(y_train).type(torch.LongTensor)
        self.len = self.X.shape[0]

    def __getitem__(self, index):
        return self.X[index], self.y[index]

    def __len__(self):
        return self.len

We created a class that inherits from torch.utils.data.Dataset. The training data is then passed to it as follows:

traindata = Data(X_train, Y_train)

Now the training data can be easily accessed using index:

print(traindata[25])
'''
# Output
(tensor([-0.9528, 1.6890, -0.6810, 0.7165]), tensor(1))
'''

We can also slice the training data as follows:

print(traindata[25:34])
'''
# Output
(tensor([[-0.9528, 1.6890, -0.6810, 0.7165],
 [ 0.6994, 4.5166, 0.5078, -2.0575],
 [ 0.8508, 1.6109, 0.3014, 0.9455],
 [ 1.1293, -0.8988, 1.6426, -0.0171],
 [-0.2316, 1.9337, -0.9727, -0.1864],
 [-1.0156, 1.1438, -0.0883, 0.6976],
 [ 1.2509, -1.6992, 1.8562, -1.7159],
 [ 1.1714, 0.9062, -1.5627, -0.5184],
 [-1.1780, -2.7274, -1.0570, 1.9610]]),
 tensor([1, 2, 0, 0, 1, 1, 0, 0, 1]))
'''

Next we load the training data using the DataLoader, with batch_size set to 4.

batch_size = 4
trainloader = DataLoader(traindata, batch_size=batch_size, 
 shuffle=True, num_workers=2)
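To see what the DataLoader actually yields, it helps to pull one mini-batch and look at its shapes and dtypes. The sketch below uses random stand-in data with the same layout as the training set (67 rows, 4 features, 3 classes) and TensorDataset instead of the Data class, purely to stay short; both are assumptions for illustration.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# stand-in data shaped like the article's training set; the values are random
rng = np.random.default_rng(0)
X_demo = torch.from_numpy(rng.normal(size=(67, 4)).astype(np.float32))
y_demo = torch.from_numpy(rng.integers(0, 3, size=67)).long()

# TensorDataset is used here only to keep the sketch short
loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=4, shuffle=True)

xb, yb = next(iter(loader))
print(xb.shape, xb.dtype)   # torch.Size([4, 4]) torch.float32
print(yb.shape, yb.dtype)   # torch.Size([4]) torch.int64
print(len(loader))          # 17 batches: 16 full ones plus a final batch of 3
```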

Now let's build our neural network classifier:

import torch.nn as nn

# number of features (len of X cols)
input_dim = 4
# number of units in the single hidden layer
hidden_dim = 25
# number of classes (unique of y)
output_dim = 3


class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.linear1 = nn.Linear(input_dim, hidden_dim)
        self.linear2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = torch.sigmoid(self.linear1(x))
        # return raw scores (logits); nn.CrossEntropyLoss applies softmax internally
        x = self.linear2(x)
        return x

We can initialize the classifier by simply instantiating it:

clf = Network()
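A quick way to confirm the network is wired correctly is to push a dummy batch through it and check the output shape. The sketch below is a self-contained variant of the Network class; passing the layer sizes as constructor arguments and the name hidden_dim are my assumptions, not the article's exact code.

```python
import torch
import torch.nn as nn

class Network(nn.Module):
    # constructor arguments are an illustrative variation on the article's class
    def __init__(self, input_dim=4, hidden_dim=25, output_dim=3):
        super().__init__()
        self.linear1 = nn.Linear(input_dim, hidden_dim)
        self.linear2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = torch.sigmoid(self.linear1(x))
        return self.linear2(x)   # raw scores, one per class

clf = Network()
dummy = torch.randn(8, 4)        # 8 made-up samples with 4 features each
logits = clf(dummy)
print(logits.shape)              # torch.Size([8, 3])
```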

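We can also inspect the layers of the network. Note that clf.parameters is printed here without being called, so the output is the representation of the bound method, which includes the layer structure: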

print(clf.parameters)
'''
# Output
<bound method Module.parameters of Network(
 (linear1): Linear(in_features=4, out_features=25, bias=True)
 (linear2): Linear(in_features=25, out_features=3, bias=True)
)>
'''
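To list the learnable tensors themselves rather than the layer summary, named_parameters() can be iterated. The sketch below uses an nn.Sequential stand-in with the same layer sizes as the network above (an assumption made to keep the example self-contained) and counts the parameters.

```python
import torch.nn as nn

# nn.Sequential stand-in with the same layer sizes as the article's Network
model = nn.Sequential(nn.Linear(4, 25), nn.Sigmoid(), nn.Linear(25, 3))

# each Linear layer contributes a weight matrix and a bias vector
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 203 = (4*25 + 25) + (25*3 + 3)
```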

Next let's define our loss function and the optimizer:

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(clf.parameters(), lr=0.1)
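nn.CrossEntropyLoss takes raw scores (logits) and integer class labels, which is why the network's forward method does not apply a softmax. As a small check on made-up values, the sketch below shows that the loss matches a log-softmax followed by the negative log-likelihood loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)              # made-up raw scores: 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 0])     # class indices, dtype int64

loss = nn.CrossEntropyLoss()(logits, labels)

# the same value computed step by step: log-softmax, then negative log-likelihood
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(torch.allclose(loss, manual))     # True
```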

Now we are all set for training. Let's code the training loop:

epochs = 2
for epoch in range(epochs):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        # reset the gradients accumulated from the previous batch
        optimizer.zero_grad()
        # forward propagation
        outputs = clf(inputs)
        loss = criterion(outputs, labels)
        # backward propagation
        loss.backward()
        # optimize
        optimizer.step()
        running_loss += loss.item()
    # display statistics once per epoch; 2000 is a fixed scaling constant,
    # so the printed value is not the mean loss per batch
    print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.5f}')

For demonstration purposes I am training for only 2 epochs; this can be changed as required. With 67 training samples and a batch size of 4, each epoch has 17 batches. The output will look like the following:

[1, 17] loss: 0.00522
[2, 17] loss: 0.00508
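The training loop above divides the accumulated loss by a fixed 2000, so the printed numbers are scaled sums rather than averages. A minimal sketch of reporting the mean loss per batch instead, assuming random stand-in data and an nn.Sequential model with the same layer sizes (both are illustrative, not the article's objects):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# random stand-in data shaped like the article's training set
X_demo = torch.randn(67, 4)
y_demo = torch.randint(0, 3, (67,))
loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=4, shuffle=True)

model = nn.Sequential(nn.Linear(4, 25), nn.Sigmoid(), nn.Linear(25, 3))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(2):
    running_loss = 0.0
    for inputs, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    # divide by the number of batches to get the mean loss per batch
    mean_loss = running_loss / len(loader)
    print(f'[{epoch + 1}] mean batch loss: {mean_loss:.5f}')
```

For a three-class problem an untrained model typically starts near ln(3) ≈ 1.1, which gives a reference point for judging whether the loss is going down.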

Now let's save our trained model:

# save the trained model
PATH = './mymodel.pth'
torch.save(clf.state_dict(), PATH)

The locally saved model can then be loaded for inference as follows:

clf = Network()
clf.load_state_dict(torch.load(PATH))
'''
# Output
<All keys matched successfully>
'''
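One way to gain confidence in the save and load steps is a round trip: save the weights, load them into a fresh model, and compare outputs on the same input. The sketch below assumes an nn.Sequential stand-in with the same layer sizes and writes to a temporary directory rather than ./mymodel.pth.

```python
import os
import tempfile

import torch
import torch.nn as nn

def make_model():
    # stand-in with the same layer sizes as the article's Network
    return nn.Sequential(nn.Linear(4, 25), nn.Sigmoid(), nn.Linear(25, 3))

trained = make_model()
x = torch.randn(5, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'mymodel.pth')
    torch.save(trained.state_dict(), path)

    restored = make_model()                  # fresh, randomly initialised weights
    restored.load_state_dict(torch.load(path))

restored.eval()
with torch.no_grad():
    same = torch.equal(trained(x), restored(x))
print(same)  # True
```

Because only the state_dict is saved, the model class has to be available and instantiated before the weights are loaded.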

Once the model is loaded, we can test it. Let's start with a single mini-batch.

testdata = Data(X_test, Y_test)
testloader = DataLoader(testdata, batch_size=batch_size, 
 shuffle=True, num_workers=2)

Get a single mini-batch from the DataLoader

dataiter = iter(testloader)
inputs, labels = next(dataiter)

The test inputs will look like the following:

print(inputs)
'''
# Output
tensor([[ 1.6876, -1.2382, 1.5971, -2.2628],
 [ 0.7683, 0.3534, 0.0460, -1.2109],
 [-1.0097, 1.1584, -0.0593, 0.7738],
 [ 1.7332, 0.1764, 0.5259, -2.3073]])
'''

The test labels will look like the following:

print(labels)
'''
# Output
tensor([0, 2, 1, 0])
'''

Now let's run the inference:

outputs = clf(inputs)
__, predicted = torch.max(outputs, 1)
print(predicted)
'''
# Output
tensor([0, 0, 1, 0])
'''
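The network outputs raw scores, and torch.max picks the index of the largest score in each row. If class probabilities are wanted as well, a softmax over the class dimension can be applied first; the predicted class does not change. A small sketch on made-up logits:

```python
import torch

# made-up logits for 2 samples and 3 classes
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])

probs = torch.softmax(logits, dim=1)              # each row sums to 1
confidence, predicted = torch.max(probs, dim=1)   # top probability and its index

print(predicted)                                      # tensor([0, 2])
print(torch.equal(predicted, logits.argmax(dim=1)))   # True
```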

The code is working as expected, so let's run the inference for the entire test dataset.

correct, total = 0, 0
# no need to calculate gradients during inference
with torch.no_grad():
    for data in testloader:
        inputs, labels = data
        # calculate output by running through the network
        outputs = clf(inputs)
        # get the predictions
        __, predicted = torch.max(outputs, 1)
        # update results
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f'Accuracy of the network on the {len(testdata)} test data: {100 * correct // total} %')
'''
# Output
Accuracy of the network on the 33 test data: 75 %
'''
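A single accuracy figure hides which classes get confused with each other. As a sketch, the labels and predictions from the single mini-batch shown earlier can be turned into a confusion matrix with torch.bincount; the same idea would apply to predictions collected over the whole test set.

```python
import torch

labels = torch.tensor([0, 2, 1, 0])      # true classes from the mini-batch above
predicted = torch.tensor([0, 0, 1, 0])   # predictions from the mini-batch above
num_classes = 3

accuracy = (predicted == labels).float().mean().item()
print(accuracy)                           # 0.75

# encode each (true, predicted) pair as one integer and count occurrences;
# rows are true classes, columns are predicted classes
confusion = torch.bincount(
    labels * num_classes + predicted, minlength=num_classes ** 2
).reshape(num_classes, num_classes)
print(confusion)
# tensor([[2, 0, 0],
#         [0, 1, 0],
#         [1, 0, 0]])
```

Here the only mistake is a class 2 sample predicted as class 0, which shows up in the bottom-left cell.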

The accuracy can likely be improved further, for example by training for more epochs, tuning the learning rate, or changing the size of the hidden layer.

Happy Coding !!!
