Unleashing the Revolutionary Power of AI in Data Entry and Processing: Anticipating Unprecedented Advances Before 2025

Data entry and processing is one of the key areas where Artificial Intelligence (AI) is expected to have a major impact in the coming years. With the increasing amount of data being generated every day, the demand for faster and more efficient data processing has never been higher. Fortunately, AI technology is here to help meet this demand and take data entry and processing to the next level.

One of the main advantages of AI in data processing is its ability to automate manual data entry. This means that instead of relying on human data entry clerks, AI algorithms can process and categorize vast amounts of data much more efficiently and accurately. AI algorithms can also identify patterns and relationships within the data, allowing for more comprehensive data analysis.

Another key area where AI is expected to enhance data entry and processing is in natural language processing (NLP). NLP is a subfield of AI that focuses on the interactions between computers and humans in natural language. With advancements in NLP, AI will soon be able to understand and interpret written and spoken human language, making data entry and processing even more seamless.

Before 2025, we can expect to see significant advancements in AI’s ability to process and analyze unstructured data, such as images, videos, and audio. AI algorithms will be able to automatically identify and categorize information within these types of data, making data entry and processing much easier and more efficient. Additionally, AI will be able to process multiple languages, further expanding its reach and impact on data entry and processing.

Another exciting development in the field of AI and data entry and processing is the use of machine learning. Machine learning is a type of AI that allows algorithms to learn and improve over time through experience. With machine learning, AI algorithms can become more accurate and efficient at processing and analyzing data, reducing the risk of human error and improving the overall accuracy of the data.

In conclusion, the next few years will bring significant advancements in the field of AI and data entry and processing. From automating manual data entry to processing unstructured data and utilizing machine learning, AI has the potential to greatly enhance the accuracy and efficiency of data processing. By embracing these changes, we can look forward to a future where data entry and processing is seamless and accurate, providing valuable insights and helping organizations make better data-driven decisions.

Turning Your Selfie Into a da Vinci

Neural style transfer. It’s a technique in deep learning that lets you transfer the style of one image onto the content of another (it leans on transfer learning, since it reuses a network pre-trained for image recognition). It seems like a straightforward concept: take my selfie and make it look like a Michelangelo painting. However, it is a fairly recent innovation in deep neural networks that has allowed us to separate the content of an image from its style, and in doing so, to combine multiple images in ways that were previously impossible. For example, taking a long-dead artist’s style and applying it to your weekend selfie.

Just to prove that this is pretty cool, I’m going to take my newly built style transfer algorithm and apply it to a ‘selfie’ of my good dog, Lawrence. Here’s the original:

And here’s the image that I’m going to apply the style of:

That’s right, it’s da Vinci’s Mona Lisa, one of the most iconic paintings of all time. I’m going to use machine learning to apply da Vinci’s characteristic style to my iPhone X photo of my, admittedly very handsome, pupper.

If you’re interested, here’s a link to the original paper describing how to use Convolutional Neural Networks (CNNs) to accomplish image style transfer (Gatys et al., “A Neural Algorithm of Artistic Style”). It’s written in relatively understandable language for such a technical paper, so I do recommend you check it out, given that you’re already reading a fairly technical blog.

So what are image content and style, and how can we separate the two? Well, neural networks are built in many layers, and it works out that some of the layers end up being responsible for detecting shapes and lines, as well as the arrangement of objects. These layers are responsible for understanding the ‘content’ of an image. Other layers are responsible for the style: the colors and textures.
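To make this concrete, here’s a minimal sketch of how the content and style representations can be pulled out of a pre-trained network. It isn’t my exact notebook code; the VGG19 layer indices and names are the ones commonly used for this technique (content from conv4_2, style from the conv*_1 layers), and the Gram matrix is the piece that captures style.

import torch
from torchvision import models

# Load a pre-trained VGG19 and freeze it -- we only use it to extract features
vgg = models.vgg19(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# One deep layer for content, several shallower layers for style
content_layers = {'21': 'conv4_2'}
style_layers = {'0': 'conv1_1', '5': 'conv2_1', '10': 'conv3_1',
                '19': 'conv4_1', '28': 'conv5_1'}

def get_features(image, model):
    # Run the image through VGG19, keeping only the activations we care about
    features = {}
    x = image
    for name, layer in model._modules.items():
        x = layer(x)
        if name in content_layers:
            features[content_layers[name]] = x
        if name in style_layers:
            features[style_layers[name]] = x
    return features

def gram_matrix(tensor):
    # Correlations between feature maps -- this is what encodes "style"
    _, depth, height, width = tensor.size()
    tensor = tensor.view(depth, height * width)
    return tensor @ tensor.t()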

Here’s the final result next to the original.

Pretty striking, if I do say so myself.

Using a pre-trained neural network called VGG19 and a few lines of my own code, I pull out the feature maps and what’s called a Gram matrix, and choose my style weights (how much I want each layer to contribute). Then, using a simple loss function to push us in the right direction, we apply the usual gradient descent algorithm and poof: Lawrence is forever immortalized as a da Vinci masterpiece.
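If you’re curious what that optimization actually looks like, here’s a rough sketch that continues the one above. It assumes content_image and style_image are already loaded and preprocessed as tensors, and the layer weights, learning rate and step count are illustrative rather than the exact values I used.

import torch
import torch.optim as optim

# How much each style layer contributes (earlier layers capture finer textures)
style_weights = {'conv1_1': 1.0, 'conv2_1': 0.8, 'conv3_1': 0.5,
                 'conv4_1': 0.3, 'conv5_1': 0.1}
content_weight = 1     # alpha: how strongly to preserve Lawrence
style_weight = 1e6     # beta: how strongly to impose the Mona Lisa's style

content_features = get_features(content_image, vgg)
style_features = get_features(style_image, vgg)
style_grams = {layer: gram_matrix(style_features[layer]) for layer in style_weights}

# Start from a copy of the content image and let gradient descent reshape it
target = content_image.clone().requires_grad_(True)
optimizer = optim.Adam([target], lr=0.003)

for step in range(2000):
    target_features = get_features(target, vgg)

    # Content loss: keep the shapes and arrangement of the original photo
    content_loss = torch.mean(
        (target_features['conv4_2'] - content_features['conv4_2']) ** 2)

    # Style loss: match the Gram matrices (color and texture correlations)
    style_loss = 0
    for layer, weight in style_weights.items():
        target_gram = gram_matrix(target_features[layer])
        _, d, h, w = target_features[layer].shape
        style_loss += weight * torch.mean(
            (target_gram - style_grams[layer]) ** 2) / (d * h * w)

    total_loss = content_weight * content_loss + style_weight * style_loss
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()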

Impressed? Not Impressed? Let me know in the comments below. If you have anything to add, or you think I could do better please chime in! This is a learning process for me and I’m just excited to share my newfound knowledge.

Here’s a link to my code in a Google Colab Notebook if you want to try it out for yourself!

I Built a Neural Net That Knows What Clothes You’re Wearing

Okay, maybe that is a bit of a click-baity headline. What I really did was program a neural network with PyTorch that is able to distinguish between ten different clothing items that could be present in a 28×28 image. To me, that’s still pretty cool.

Here’s an example of one of the images that gets fed into the program:

Yes, this is an image of a shirt.

Can you tell what this is? Looks kind of like a long-sleeve t-shirt to me, but it is so pixelated that I can’t really tell. But that doesn’t matter. What matters is what my trained neural net thinks it is, and whether that’s what it actually is.

After training on a subset of images like this (the training set is about 750 images) for about 2 minutes, my model was able to choose the correct classification for images I fed in about 84.3% of the time. Not bad for a first go at building a clothing-classifying deep neural net.

Below I have included the code that actually generates the network and runs a forward-pass through it:


from torch import nn
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self, input_size, output_size, hidden_layers, drop_p=0.5):
        ''' Builds a feedforward network with arbitrary hidden layers.
       
            Arguments
            ---------
            input_size: integer, size of the input
            output_size: integer, size of the output layer
            hidden_layers: list of integers, the sizes of the hidden layers
            drop_p: float between 0 and 1, dropout probability
        '''
        super().__init__()
        # Add the first layer, input to a hidden layer
        self.hidden_layers = nn.ModuleList([nn.Linear(input_size, hidden_layers[0])])
       
        # Add a variable number of more hidden layers
        layer_sizes = zip(hidden_layers[:-1], hidden_layers[1:])
        self.hidden_layers.extend([nn.Linear(h1, h2) for h1, h2 in layer_sizes])
       
        self.output = nn.Linear(hidden_layers[-1], output_size)
       
        self.dropout = nn.Dropout(p=drop_p)
       
    def forward(self, x):
        ''' Forward pass through the network, returns the output logits '''
       
        # Forward through each layer in `hidden_layers`, with ReLU activation and dropout
        for linear in self.hidden_layers:
            x = F.relu(linear(x))
            x = self.dropout(x)
       
        x = self.output(x)
       
        return F.log_softmax(x, dim=1)

After training the network using backpropagation and gradient descent (a rough sketch of the training loop is below), it successfully classified the vast majority of the images I fed in, each in less than half a second. Mind you, these were grayscale images, formatted in a simple way, and the model was trained with a large enough dataset to ensure reliability.
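For reference, the training loop for a network like this looks roughly like the sketch below. It assumes the standard torchvision FashionMNIST download rather than my exact subset, and the layer sizes, learning rate and epoch count are illustrative.

import torch
from torch import nn, optim
from torchvision import datasets, transforms

# Load Fashion-MNIST as normalized grayscale tensors
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
trainset = datasets.FashionMNIST('data/', download=True, train=True,
                                 transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

model = Network(input_size=784, output_size=10, hidden_layers=[256, 128, 64])
criterion = nn.NLLLoss()                      # pairs with the log_softmax output
optimizer = optim.Adam(model.parameters(), lr=0.001)

for epoch in range(2):
    for images, labels in trainloader:
        images = images.view(images.shape[0], -1)   # flatten 28x28 -> 784
        optimizer.zero_grad()
        loss = criterion(model(images), labels)     # forward pass + loss
        loss.backward()                             # backpropagation
        optimizer.step()                            # gradient descent step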

If you want a good resource to explain what backpropagation actually does, check out another great video by 3Blue1Brown below:

So, what does this all look like? Is it all sci-fi futuristic and with lots of beeps and boops? Well… not exactly. Here’s the output of the program:

Output of my clothing-classifier neural net. Provides a probability that the photo is one of the 10 items listed.

The software grabs each image in the test set, runs it through a forward pass of the network, and ends up spitting out a probability for each of the ten clothing categories. Above, you can see that the network thinks this image is likely a coat. I personally can’t tell whether it’s a coat, a pullover or just a long-sleeve shirt, but the software seems about 85% confident that it is, in fact, a coat.
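Since the forward pass returns log-probabilities, turning the output into the percentages shown above is just a matter of exponentiating it. Here’s a rough sketch, assuming image is a single 28×28 tensor and using the standard Fashion-MNIST class names:

import torch

classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

model.eval()
with torch.no_grad():
    logps = model(image.view(1, 784))   # forward pass on one flattened image
    probs = torch.exp(logps)            # log-probabilities -> probabilities

top_prob, top_class = probs.topk(1, dim=1)
print(f'{classes[top_class.item()]}: {top_prob.item():.1%}')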

Overall, it’s pretty awesome that after only a few weeks of practice (with most of that time spent learning how to program in Python) I can code my very own neural networks and they actually work!

If you’re interested, here’s a video of the neural network training itself and running through a few test images:

If you’d like to test out the code for yourself, here’s a link to my GitHub page where you can download all the files you need to get it running. Search Google if you can’t figure out how to install Python and run a Jupyter Notebook.

That’s all for now! See you soon 🙂

What AI Does Well

AI has become extremely adept at giving suggestions, sorting through huge volumes of data and providing summaries. Right now, I can log onto Google Photos and type any word I want and Google’s image classification algorithm will find me photos that contain whatever I search for. For example, I’m considering selling my 2013 VW Tiguan in order to help pay for another corporate vehicle (that happens to be a Tesla). Anyways, I typed Tiguan into the search bar on Google Photos to find images of the car that I could post online. Sure enough, every photo that I’ve ever taken of my car popped right up, and some photos showed up that had other people’s Tiguans in the background. I have around ten-thousand photos in my library, so finding those few is quite an impressive feat and would have been much more difficult had I tried to do it manually.

Some of the images Google’s AI found for me when I searched the word Tiguan

Most of the improvements in AI over the last 5 to 15 years have come from developments in a type of machine learning software called deep neural networks. They’re called neural networks because their structure is loosely analogous to the human brain.

Basically, they’re a huge array of neurons (input neurons, output neurons and hidden neurons) connected by lines that represent weights. The weights on these connections form matrices that transform the values passed from one layer of the network to the next. It all looks something like this:

Simplified neural network with only one hidden layer – Courtesy Udacity
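To make the picture above concrete, here’s a toy version of that network in code: nothing but NumPy, a couple of weight matrices, and a single forward pass. The layer sizes and random weights are arbitrary, just for illustration.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))        # squashes each value into (0, 1)

x = np.random.rand(4)                  # input layer: 4 input neurons
W1 = np.random.randn(4, 3)             # weights connecting input -> hidden (3 neurons)
W2 = np.random.randn(3, 2)             # weights connecting hidden -> output (2 neurons)

hidden = sigmoid(x @ W1)               # hidden layer: matrix multiply, then activation
output = sigmoid(hidden @ W2)          # output layer: another matrix multiply
print(output)                          # e.g. scores for 'cat' vs 'not cat'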

Typically, deep neural networks have multiple hidden layers (it’s why they’re called ‘deep’ neural networks). What happens in the hidden layers is obscured from view, and it isn’t always obvious what each of them is doing. Generally, each hidden layer performs a simple matrix operation on its input values; the result, weighted by the lines (scalars) connecting the layers, is eventually passed to the output layer. The goal of an image classifier, for example, is to take an input, let’s say an image of a cat, and then produce an output, the word cat. Pretty simple, right?

Well, it kind of is. As long as you know what the input is and what the output should be, it is relatively straightforward to ‘train’ a neural network to learn what weights to assign in order to transform a picture of a cat into the word cat. The problem arises when the network encounters something it didn’t train for. If all the network has ever seen are pictures of cats and we feed it an image of something else, say a mouse, the network might be able to tell you it’s not a cat (if it was trained with enough data), but more likely it will just think it’s a weird-looking cat. If the network is constantly rewarded for identifying everything as a cat, it’s probably going to call the next thing it sees a cat too.

A single-layer neural network acts like a linear function that draws a boundary between classes, in this case cat vs. not cat. Having a neural network with multiple layers allows the boundary to be ‘curvier’, so it can include more of the cats and exclude more of the non-cats.

This is why having large enough training and testing datasets is critical for neural networks. Neural networks need to train on large quantities of data. Google has billions (perhaps trillions) of photos stored in their servers, so they’ve been able to train their neural networks to be incredibly efficient at determining what is in an image.

In general, problems where there is a large enough training dataset and both the input and the answer are known for the training set are fairly tractable for AI programs today. One task that is generally more difficult for today’s AI software is explaining how and why it got the answer it did. Luckily, researchers and businesses are hard at work solving this problem. Hopefully soon, Google Photos will be able to not only show us images of all the cats in our photo library, but also be able to tell us why they’re so cute and yet so cold all at the same time.

‘Blackbox’ AI happens when a system can provide the correct answer but gives no indication of how it arrived at the solution.