Building your own neural network in Python is super easy and fun. Let me walk you through the steps.

Yes, there are thousands of posts and websites that explain the theory of neural networks. Among them I found one post by Michael Nielsen where he clearly explains the theory behind neural nets and encourages readers to build one. Thanks to Michael Nielsen for the great tutorial.

Implementation of a Neural Network from Scratch

First, import all the necessary libraries.

import numpy as np
from statistics import mean 
import pandas as pd
import numpy.random as r
from random import randrange
from tqdm import tqdm_notebook as tqdm
from matplotlib import pyplot as plt

We need three important functions for an ANN to work: the sigmoid activation, its derivative, and a cost function.

def sigmoid(x):
    # Sigmoid activation function
    return 1 / (1 + np.exp(-x))

def sigmoid_prime(x):
    # Derivative of the sigmoid, used during backpropagation
    return sigmoid(x) * (1 - sigmoid(x))

def cost(y_hat, y):
    # Mean squared error between prediction and target
    return np.mean((y_hat - y) ** 2)
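
A quick sanity check of these helpers (just an illustrative snippet, not part of the training code) could look like this:

# Quick sanity check of the helper functions
print(sigmoid(0))         # 0.5
print(sigmoid_prime(0))   # 0.25, the sigmoid's maximum slope
print(cost(np.array([0.8, 0.1]), np.array([1.0, 0.0])))  # 0.025 = mean of (0.2^2, 0.1^2)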

Let's initialize an empty network with the desired number of layers, input size and output size.

def init_net(inputSize, layers, outputSize):
    # Build an empty network: random weights and biases for every layer
    input_size = inputSize
    output_size = outputSize
    params, metadata, theta = {}, {}, {}
    metadata["inputSize"] = input_size
    metadata["outputSize"] = output_size
    metadata["n_layers"] = layers
    params["metadata"] = metadata
    for idx in range(0, layers):
        if idx != layers - 1:
            # Hidden layers keep the input dimensionality (square weight matrices)
            theta["W_" + str(idx)] = r.rand(input_size, input_size)
            theta["B_" + str(idx)] = np.array([r.rand(input_size)]).T
        else:
            # Final layer maps from input_size down to output_size
            theta["W_" + str(idx)] = r.rand(outputSize, input_size)
            theta["B_" + str(idx)] = np.array([r.rand(outputSize)]).T
    params["theta"] = theta
    return params
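
For example, initializing a 2-layer network with two inputs and one output gives a 2×2 hidden layer and a 1×2 output layer (the values themselves will differ, since the weights are random). A quick illustrative check:

# Example: inspect the parameter shapes of a freshly initialized network
net = init_net(2, 2, 1)
for name, value in net["theta"].items():
    print(name, value.shape)
# W_0 (2, 2), B_0 (2, 1)  -> hidden layer
# W_1 (1, 2), B_1 (1, 1)  -> output layer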

The function forwardPass() takes an input vector, applies a linear transformation at each layer, and finally passes the result through the sigmoid activation function.
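
A minimal sketch of what forwardPass() can look like, consistent with the keys that backPropogation() below expects (the layer inputs "A_prev", the pre-activations "Z_" and the error, stored as target minus prediction), is shown here:

def forwardPass(x, y, params):
    # Minimal sketch: propagate a single sample (x, y) through every layer
    # and cache everything backPropogation() needs under params["forwardPass"]
    L = params["metadata"]["n_layers"]
    theta = params["theta"]
    fp = {}
    a = np.atleast_2d(x).T                      # column vector, shape (inputSize, 1)
    for l in range(L):
        fp["A_prev" + str(l)] = a               # activation feeding layer l
        z = np.matmul(theta["W_" + str(l)], a) + theta["B_" + str(l)]
        fp["Z_" + str(l)] = z                   # pre-activation of layer l
        a = sigmoid(z)                          # activation of layer l
    fp["y_hat"] = a                             # network prediction
    fp["error"] = np.atleast_2d(y).T - a        # target minus prediction
    params["forwardPass"] = fp
    return params

Now let's define the XOR data as the input (X) to the neural network and the desired output labels (y).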

X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1],
])
y = np.atleast_2d([0, 1, 1, 0]).T

Now pass an input sample from X into the forwardPass() function and store the result, together with all the weight parameters, in the variable “params”.

params = init_net(2, 2, 1)  # initialize the network before the first forward pass
params = forwardPass(X[1], y[1], params)

After the first forward pass we get the first prediction at the output of our neural network. Since the weights are randomly initialized, this prediction will most likely be far from the ground truth. The difference between the prediction and the actual value (called the error) is calculated and used to adjust the weights that we randomly assigned in the beginning.

Error = Desired Output − Predicted Output

Backpropagation is the essence of neural network training. It is the method of fine-tuning the weights of a neural network based on the error rate obtained in the previous epoch (i.e., iteration). Proper tuning of the weights allows you to reduce error rates and make the model reliable by increasing its generalization.

Backpropagation in a neural network is short for “backward propagation of errors.” It is a standard method of training artificial neural networks. This method helps calculate the gradient of a loss function with respect to all the weights in the network.

The backpropagation algorithm computes the gradient of the loss function for a single weight by the chain rule. It efficiently computes one layer at a time, unlike a naive direct computation. It computes the gradient, but it does not define how the gradient is used. It generalizes the computation in the delta rule. (source).
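
Concretely, with error = y − ŷ, sigmoid σ and learning rate lr, the quantities computed in backPropogation() below follow the standard delta rule (written here in the same notation as the code):

delta_L  = −(y − ŷ) · σ′(Z_L)              (delta at the output layer)
delta_l  = (W_{l+1}ᵀ · delta_{l+1}) · σ′(Z_l)   (deltas for the earlier layers)
dW_l     = delta_l · A_prev_lᵀ,   dB_l = delta_l
W_l ← W_l − lr · dW_l,   B_l ← B_l − lr · dB_l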

def backPropogation(params, lr, update=True, debug=False):
    # Walk backwards from the output layer, computing deltas and weight gradients
    L = params["metadata"]["n_layers"]
    params["metadata"]["learningRate"] = lr
    finalLayer = True
    deltas = {}
    d_theta = {}
    for l in reversed(range(0, L)):
        if finalLayer:
            # Output layer: delta = -(error) * sigmoid'(z)
            a_prev = params["forwardPass"]["A_prev"+str(l)].reshape(1,-1)
            delta = -params["forwardPass"]["error"]*sigmoid_prime(params["forwardPass"]["Z_"+str(l)])
            deltas["delta_"+str(l)] = delta
            dw = np.matmul(delta, a_prev)   # gradient w.r.t. this layer's weights
            d_theta["d_theta_"+str(l)] = dw

            if update:
                # Gradient descent step on weights and biases
                params["theta"]["W_"+str(l)] -= params["metadata"]["learningRate"]*dw
                params["theta"]["B_"+str(l)] -= params["metadata"]["learningRate"]*delta
            if debug:
                print(delta, dw)
            finalLayer = False
        else:
            # Hidden layers: propagate the delta backwards through the next layer's weights
            a_prev = params["forwardPass"]["A_prev"+str(l)].reshape(1,-1)
            w_next = params["theta"]["W_"+str(l+1)].T
            delta = np.matmul(w_next, delta)*sigmoid_prime(params["forwardPass"]["Z_"+str(l)])
            deltas["delta_"+str(l)] = delta
            dw = np.matmul(delta, a_prev)
            d_theta["d_theta_"+str(l)] = dw
            if update:
                params["theta"]["W_"+str(l)] -= params["metadata"]["learningRate"]*dw
                params["theta"]["B_"+str(l)] -= params["metadata"]["learningRate"]*delta
            if debug:
                print(delta)
    params["deltas"] = deltas
    params["d_theta"] = d_theta

    return params

Now we put it all together and run forwardPass() and backPropogation() for several cycles (also called epochs) and watch how the error decreases, in other words how our neural network learns the data and comes to accurately predict the outcome for a given input.

var = init_net(2, 2, 1)  # a network with the desired number of layers is initialized
epochs = 10000
epoch_error = []

for i in range(0, epochs):
    item = randrange(0, len(X))  # pick a random training sample (randrange's stop is exclusive)
    var = forwardPass(X[item], y[item], var)
    var = backPropogation(var, 0.125, debug=False)
    c = 0.5 * var["forwardPass"]["error"].item() ** 2  # squared error for this sample
    epoch_error.append(c)
    

Plot the error from each epoch.
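
A minimal way to visualize the learning curve with matplotlib:

# Plot the per-epoch squared error to watch the network learn
plt.plot(epoch_error)
plt.xlabel("epoch")
plt.ylabel("squared error")
plt.title("Training error per epoch")
plt.show()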
