Facebook Made The Best Tool For Creating Neural Networks

It’s called PyTorch. And it’s a tool designed to work perfectly with Python libraries like NumPy and Jupyter notebooks to create deep neural networks. As it turns out, it is much easier to use and more intuitive than Google’s TensorFlow packages. In fact, I spent about a month trying to get TensorFlow working on my Mac laptop: each time I ran it, I got a new error, and when I fixed that error, I encountered another, and another, until I eventually resigned myself to never being able to train a neural network on my laptop.

Fortunately, compatibility is not the only thing that PyTorch has going for it. Since its release in 2017, it has been adopted by teams around the world in research and in business. It is extremely intuitive to use (for a high-level programming language targeted mostly at people with PhDs in Computer Science and Mathematics, admittedly). But seriously, it is designed with the structure of neural networks in mind, so the syntax and structure of your code can match the logical flow and linear-algebra model that a neural network has conceptually.

A neural network with two hidden layers, as shown above, can be coded in PyTorch with less than 10 lines of code. Quite impressive.

All the squishification functions are built into the PyTorch library, such as:

  • Sigmoid: S(x) = 1/(1+e^{-x}) ,
  • ReLU: f(x) = max(0, x) , and
  • Softmax: \sigma(z)_j = e^{z_j} / \sum_{k=1}^{K} e^{z_k} .
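Each of the three functions above is a one-liner in PyTorch. A minimal sketch (the sample values in `x` are just an illustration):

```python
import torch

# a toy tensor of raw activations
x = torch.tensor([-1.0, 0.0, 2.0])

sig = torch.sigmoid(x)           # squashes each value into (0, 1)
rel = torch.relu(x)              # clips negatives to 0
soft = torch.softmax(x, dim=0)   # exponentiates and normalizes so the values sum to 1

print(sig)
print(rel)
print(soft)
```

Note how softmax turns an arbitrary vector into something that behaves like a probability distribution, which is exactly what we want at the output layer.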

On top of that, you can define a 784-element input as a multi-dimensional matrix (AKA ‘tensor’) with one line of code.
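That one line looks like this (the batch size of 64 is my own choice for illustration; the 784 comes from flattening a 28×28 image):

```python
import torch

# a batch of 64 flattened 28x28 images, filled with random values
images = torch.randn(64, 784)
```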

Here’s the code for the above neural network that I created. It takes a 784-pixel image, pumps it through two hidden layers, and then to a 10-node output where each output node represents a digit (0–9). The images in the training set are all images of handwritten numbers between zero and nine, so, when trained, this neural network should be able to automatically identify the number that was written.

Jupyter notebook code for a neural network with 784 input nodes, two hidden layers, and a 10-node output
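For readers who can’t see the notebook screenshot, a minimal sketch of such a network fits in well under 10 lines. The hidden-layer sizes (128 and 64) are my own assumption, not necessarily the ones in the screenshot:

```python
import torch
from torch import nn

# 784 inputs -> two hidden layers -> 10 outputs, one per digit
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
    nn.Softmax(dim=1),  # turn the 10 outputs into a probability distribution
)
```

The structure of the code mirrors the structure of the network: one line per layer, read top to bottom, just like the data flows.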

These few lines of code, executed on a server (or local host), produce the following output:

The neural network trying to classify a handwritten 5. Notice that the probability distribution is fairly even; that’s because we haven’t trained the network yet.

See how cool that is!? Oh… right. The network seems to have no clue which number it is. That’s because all we’ve done so far is a feedforward operation on an untrained neural network with a bunch of random numbers as the weights and zeroes as the biases.
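You can see this for yourself by running a single forward pass through a freshly constructed (i.e. untrained) network. A sketch, reusing the same kind of model as above:

```python
import torch
from torch import nn

# an untrained network: random weights, default (zero) biases after init
model = nn.Sequential(
    nn.Linear(784, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
    nn.Softmax(dim=1),
)

with torch.no_grad():              # no gradients needed for a plain feedforward pass
    probs = model(torch.randn(1, 784))

print(probs)  # ten probabilities, roughly even, summing to 1
```

Because the weights are random, every digit gets roughly the same probability; the network is just guessing.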

In order to make this neural network do anything useful, we have to train it, and that involves another step: back-propagation. I’ll cover that in my next blog. For now, we’re left with a useless random distribution of numbers and weights and no idea what our output should be. Enjoy!