Neural Networks With Minibatch Stochastic Gradient Descent and Adaptive Moments (ADAM)
In this notebook, we will walk through the design, training, and testing of neural networks with multiple hidden layers using minibatch stochastic gradient descent with adaptive moments (ADAM). These neural networks will be used for logistic regression, which is an archaic name for binary classification.
The binary classification will be performed on images of handwritten numerical digits. More specifically, the last numerical digit of my student ID. This digit happens to be 9. Therefore, the goal of our neural networks will be to output a value of 1 when the handwritten numerical digit image input is a 9, and 0 in all other cases.
The data set we will be using is the MNIST data set. This is a very popular data set among the machine learning community. The data set contains 60,000 images, and each image contains a handwritten numerical digit. Each of the images has been provided with a truth label that corresponds to the handwritten digit within the image from the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
In our case, we only care about when the image is 9. Therefore we will need to re-label the truth labels so that all truth labels with the value of 9 are given the value of 1, and all other truth labels are given the value of 0.