Real Life Is Not Like Billions

Bobby Axelrod, the main character on the popular finance drama Billions, is a lot like Tesla CEO Elon Musk. They’re both billionaires. Both draw substantial public praise and criticism, and both are highly divisive figures with a large impact on their respective industries. Both were also investigated and charged by the SEC (and in Axelrod’s case, the US Justice Department) for actions related to securities law. The main difference between the two? Bobby Axelrod is a fictional character whose proclivity for conflict is superseded only by his complete lack of restraint when his life and freedom are on the line. In real life, the consequences of your actions are permanent, and making deals in the business world often means compromising, negotiating, and settling.

Today (September 29, 2018), Elon Musk settled with the SEC. He will step down as chairman of Tesla for at least three years and will pay a fine in excess of $20 million. In all, it is a lesser penalty than the lifetime ban from being CEO of a publicly traded company that the SEC was seeking, but a larger punishment than someone who has committed no wrongdoing deserves. Depending on your perspective, Musk either got off easy or was unfairly chastised by the state for a 60-character tweet.

Of course, the civil settlement does not preclude the Justice Department from filing criminal charges against Elon at a future date. However, a criminal trial has a much higher burden of proof than a civil case, which can be decided based on a balance of probabilities. In a criminal case, the prosecution must prove, beyond a reasonable doubt, that the defendant committed the alleged crimes, whereas, in a civil suit, all that is required is a greater than 50% probability that the act took place.

In a previous post from September 27, we discussed whether AI could play a role in predicting the outcome of cases like this, perhaps helping traders make appropriate investment decisions around companies with legal troubles. Despite a strong performance in short-term volume trading, automation has not yet played a large role in the fundamental analysis of a stock’s long-term viability. Most trading AIs today rely purely on technical analysis: rather than looking at the traits that make a company likely to succeed, they use historical price data to predict trading and movement patterns.

Fundamental analysis is complex and subjective. Even the smartest deep neural networks would have a difficult time distinguishing between the very human aspects that go into valuing a company. The problem with AI, in this particular application, is that it would require a broad knowledge of various domains to be combined in order to predict with any degree of accuracy. Right now, even the best deep neural networks are still very narrowly defined. They are trained to perform exceptionally well within certain contexts, however, beyond the confines of what they ‘understand’ they are unable to function at even a basic level.

Complexity in neural networks can result in ‘overfitting’ – the network fits the training set well but fails at more generalized tasks.

In the example above, we can see how more complicated neural networks might fail to understand situations that are even slightly different from what they have seen in the past. The model fits the data that the network has already encountered, but this data does not reflect what could happen in the future. When something happens that the model hasn’t encountered before (a CEO tweeting something about 420, for example), a human can immediately put it into context with everyday experience and understand that he’s likely talking about smoking weed. An AI trained to predict share prices based on discounted cash flow analysis, however, would have absolutely no clue what to do with that information.
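Overfitting is easy to reproduce. Here’s a minimal sketch (on made-up data) that fits both a straight line and a degree-9 polynomial to ten noisy points: the flexible model essentially memorizes the training set, which tells us nothing about how it behaves on points it hasn’t seen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy samples of a simple linear trend for training,
# and ten more from the same trend held out for testing.
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(0.0, 0.2, 10)
x_test = np.linspace(0.03, 0.97, 10)
y_test = 2.0 * x_test + rng.normal(0.0, 0.2, 10)

simple = np.poly1d(np.polyfit(x_train, y_train, deg=1))    # a straight line
complex_ = np.poly1d(np.polyfit(x_train, y_train, deg=9))  # wiggly enough to hit every point

def mse(model, x, y):
    """Mean squared error of a fitted model on data (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

train_simple = mse(simple, x_train, y_train)
train_complex = mse(complex_, x_train, y_train)
test_complex = mse(complex_, x_test, y_test)

# The degree-9 polynomial 'memorizes' the training set (near-zero error),
# but its error on fresh points is far larger than its training error.
print(train_complex < train_simple)   # True: lower training error...
print(test_complex > train_complex)   # True: ...but it doesn't generalize
```

The same effect is what the caption above describes: a complex network that specializes to its training data can do worse, not better, when the future doesn’t look like the past.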

It is likely that there are companies working on technology to help train neural networks to deal with the idiosyncratic information present in everyday business interactions. One possible answer is to have multiple neural networks working on different subsets of the problem. Similar to how deep neural networks have enabled advances in fields ranging from medical diagnosis to natural language processing, new organizations of these systems could enable the next generation of AI that is able to handle multiple tasks with a high level of competency. As we continue to build this technology, we’ll keep speculating on whether or not an executive is guilty, and traders and short-sellers will continue to make and lose billions based on the result.

Elon Musk Indicted by SEC, Can AI Help?

The big news from the tech and finance world on September 27, 2018, is that Elon Musk has been sued by the US Securities and Exchange Commission (SEC) for his tweets about taking Tesla private at $420 per share. 

The SEC is seeking to have Musk banned from serving as an officer or director of any public company. Their reasoning? Musk was lying about having funding secured, which implies that he was trying to manipulate Tesla’s share price upward. Well, it worked, for about a day. On the day of the tweet, Tesla’s share price rose from around $350 to a high of $379.87 US, before falling back to $352 the next day (August 8, 2018). If the markets had actually believed Musk’s tweet, Tesla’s share price likely would have climbed closer and closer to the mythical $420 as the take-private day neared.

Tesla’s share price peaking after Musk’s announcement.

Instead, Tesla’s share price dropped like a rock, because every savvy investor realized that Musk’s statement was either pure fanciful bluster, a joke about weed, or both. Of course, today has been much worse for Tesla’s share price than any of Musk’s recent ill-advised tweets: in after-hours trading, the stock is down as much as 13%, falling dangerously close to its 52-week low. This is all especially troubling considering that Tesla is expected to announce its best quarter ever, in terms of cash flow, in a few days.


So, what is the SEC doing, was it possible to predict this, and could AI make this type of situation any better? The answer to the first question is unclear; the answer to the other two is likely yes.

AI is already being used in the legal profession to help identify responsive documents that must be turned over to the opposing party during a lawsuit. MIT Professor Emeritus Frank Levy leads research that helps law firms apply machine-learning software to the practice of law.

If AI can predict what documents will be useful in a lawsuit, then whenever the CEO of a publicly traded company does something suspicious, it should be possible to use these same programs to parse historical cases and see what precedent there is for a lawsuit to be filed. At the very least, it could provide some insight into the likelihood of an indictment and, in the future, could even suggest potential courses of action for a company to take if it found itself in this type of situation.

Would AI be able to help predict whether or not Elon will be convicted? Possibly. While I am not aware of any AIs currently being used to predict the outcome of legal matters, in my September 24, 2018 column I covered an AI that perfectly predicted the outcome of last year’s Super Bowl. While legal cases may be more complicated than a football score, there are likely several orders of magnitude more data about the outcomes of lawsuits than about football games, simply because there are WAY more lawsuits than there are football teams.

From a financial perspective, we could use this type of AI to predict potential lawsuits and their results and train the AI to make trades based on these predictions. If these types of AI were already in use, we could expect much smoother and more predictable share prices as the effect/implications of a particular news story would become apparent almost immediately after the information surfaces.

For now, I’ve programmed a simple AI for Elon Musk to help him decide if he should tweet something or not. You can try it, too, if you’d like. It’s posted below:
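For readers who can’t load the widget, here’s a stand-in sketch in the same spirit. This is entirely made up (a keyword rule, not a real classifier), but it captures the idea:

```python
def should_elon_tweet(text: str) -> str:
    """A (very) simple rule-based 'AI': flag anything touching on
    share prices, going private, funding, or the number 420."""
    risky_words = {"420", "$420", "funding", "secured", "private", "share", "sec"}
    # Crude tokenization: lowercase and strip basic punctuation.
    cleaned = text.lower().replace(".", " ").replace(",", " ").replace("!", " ")
    if risky_words & set(cleaned.split()):
        return "Do NOT tweet this. Call your securities lawyer first."
    return "Tweet away!"

print(should_elon_tweet("Am considering taking Tesla private at $420. Funding secured."))
# -> Do NOT tweet this. Call your securities lawyer first.
print(should_elon_tweet("Excited about the new Roadster colour options"))
# -> Tweet away!
```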


The Best-Worst-Kept Secret in Machine Learning

Neural networks are pretty simple to build. Last time, we picked apart some of the fundamentals and how they really just boil down to linear algebra formulas. There is, however, one algorithm that is incredibly useful for machine learning, and I hadn’t heard of it until today: logistic regression. Training it boils down to a five-step process that underlies nearly all modern deep learning software. Here’s how it goes:

  1. Take your data
  2. Pick a random model
  3. Calculate the error
  4. Minimize the error, and obtain a better model
  5. Become sentient, destroy all humans, dominate universe!

… Okay, I took a little creative licence on that last one. But seriously, it’s that simple. The only complicated part is calculating the ‘error function’ and generalizing it for large and varied datasets.
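Steps one through four can be sketched in a few lines of NumPy. This is a toy example on made-up data (two clusters of 2-D points); the loop in the middle is the ‘minimize the error’ step, done by gradient descent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: take your data. Two clusters of 2-D points, labelled 0 and 1.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)), rng.normal(1.0, 0.5, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

# Step 2: pick a random model (a weight vector W and a bias b).
W = rng.normal(size=2)
b = 0.0

for epoch in range(200):
    # Predicted probabilities (clipped so log() never sees exactly 0 or 1).
    p = np.clip(sigmoid(X @ W + b), 1e-12, 1 - 1e-12)
    # Step 3: calculate the error (the cross-entropy error function).
    error = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    # Step 4: minimize the error with one gradient-descent step,
    # obtaining a slightly better model each pass.
    W -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

accuracy = float(np.mean((sigmoid(X @ W + b) > 0.5) == y))
print(accuracy)  # close to 1.0 on this easily separable data
```

Step 5 is, thankfully, not implemented.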

The error function itself is a long, somewhat scary formula that looks like this:

Error Function = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \ln(\sigma(Wx^{(i)}+b)) + (1-y_i) \ln(1-\sigma(Wx^{(i)}+b)) \right]

where \sigma is the sigmoid function, which squashes Wx^{(i)}+b into a value between 0 and 1.

What the error function is doing though, is really quite simple. The error function is looking at a set of points (usually, pixels of an image). We can represent an image like this:

Dots on a graph (oddly reminiscent of a smartphone or computer display, don’t you think?) 

The job of the logistic regression algorithm is to find a line that divides the red and blue pixels as best as it can. It shifts, translates and iterates, moving the line until it reaches the maximum possible percentage of blue pixels on one side of the line and the maximum possible percentage of red pixels on the other side of the line. As it iterates, it looks something like this:

The Logistic Regression algorithm starts by bisecting the graph at a random location and then it moves the line until it has maximized the number of blue pixels on one side and the number of red pixels on the other side. 

We call this ‘minimizing the error function’ because the algorithm is finding the smallest possible number of blue pixels on the red side and the smallest possible number of red pixels on the blue side. These pixel mismatches are the ‘errors’ if we’re trying to separate the two.

Here we can see the Error Plot as the algorithm iterates through the various stages and moves the dividing line. We can see that the percent error decreases with the number of epochs (iterations). It will not be able to get to zero in this case because there is no straight line that perfectly divides this particular plot, but it can surely reduce the errors to a minimum.

There, now you know about logistic regression, one of the foundational algorithms of machine learning and deep neural networks. Of course, things start getting much more interesting when we’re no longer using straight lines to divide the graph and we’re working with full-blown images instead of a few dots on a plot.

Let me know if you’ve learned something by reading this article. Soon we’ll start using these foundational principles and apply them to more complex tasks. Perhaps we’ll even be able to predict the next major market bubble before it bursts. But for now, that’s all!

What AI Does Well

AI has become extremely adept at giving suggestions, sorting through huge volumes of data and providing summaries. Right now, I can log onto Google Photos and type any word I want, and Google’s image-classification algorithm will find me photos containing whatever I searched for. For example, I’m considering selling my 2013 VW Tiguan to help pay for another corporate vehicle (that happens to be a Tesla). Anyways, I typed Tiguan into the search bar on Google Photos to find images of the car that I could post online. Sure enough, every photo I’ve ever taken of my car popped right up, along with some photos that had other people’s Tiguans in the background. I have around ten thousand photos in my library, so finding those few is quite an impressive feat and would have been much more difficult had I tried to do it manually.

Some of the images Google’s AI found for me when I searched the word Tiguan

Most of the improvements in AI over the last 5-15 years have come from developments in a type of machine-learning software called deep neural networks. They’re called neural networks because their structure is loosely analogous to that of the human brain.

Basically, they’re a huge array of neurons (input neurons, output neurons and hidden neurons) connected by lines that represent weights. The connections between the neurons form matrices that modify the subsequent layers of the neural network. It all looks something like this:

Simplified neural network with only one hidden layer – Courtesy Udacity
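A sketch of that picture in code (layer sizes and weights are made up): each set of connections is a weight matrix, and pushing data through the network is just matrix multiplication plus an activation function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Layer sizes: 3 input neurons -> 4 hidden neurons -> 2 output neurons.
W1 = rng.normal(size=(4, 3))   # weight matrix: input layer -> hidden layer
b1 = np.zeros(4)
W2 = rng.normal(size=(2, 4))   # weight matrix: hidden layer -> output layer
b2 = np.zeros(2)

x = np.array([0.2, -0.5, 0.9])       # one input, e.g. three pixel intensities

hidden = sigmoid(W1 @ x + b1)        # hidden layer: matrix multiply + activation
scores = W2 @ hidden + b2            # raw output scores
output = np.exp(scores) / np.exp(scores).sum()   # softmax: scores -> probabilities

print(output)   # two probabilities ('cat' vs 'not cat') that sum to 1
```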

Typically, deep neural networks have multiple hidden layers (it’s why they’re called ‘deep’ neural networks). What happens in the hidden layers is hidden from view, and it isn’t always obvious what each layer is doing. Generally, each hidden layer performs a simple matrix operation on its input values; the result, weighted by the lines (scalars) connecting the layers, is eventually passed to the output layer. The goal of an image classifier, for example, is to take an input, let’s say an image of a cat, and then produce an output, the word cat. Pretty simple, right?

Well, it kind of is. As long as you know what the input is and what the output should be, it is relatively straightforward to ‘train’ a neural network to learn what weights to assign in order to transform a picture of a cat into the word cat. The problem arises when the network encounters something it didn’t train for. If all the network has ever seen are pictures of cats and we feed it an image of something else, say a mouse, the network might be able to tell you it’s not a cat, provided it was trained with enough data, but more likely it will just think it’s a weird-looking cat. If the network gets constantly rewarded for identifying everything as a cat, it’s probably going to think everything it sees is a cat.

A neural network acts like a function that draws a dividing boundary, in this case between cat and not-cat. Adding more layers allows the boundary to be ‘curvier’, including more cats and excluding more non-cats.

This is why having large enough training and testing datasets is critical for neural networks. Neural networks need to train on large quantities of data. Google has billions (perhaps trillions) of photos stored on its servers, so it has been able to train its neural networks to be incredibly efficient at determining what is in an image.

In general, problems where there is a large enough training dataset and both the input and the answer are known for the training set are fairly tractable for AI programs today. One task that is generally more difficult for today’s AI software is explaining how and why it got the answer it did. Luckily, researchers and businesses are hard at work solving this problem. Hopefully soon, Google Photos will be able to not only show us images of all the cats in our photo library, but also be able to tell us why they’re so cute and yet so cold all at the same time.

‘Blackbox’ AI happens when a system can provide the correct answer but gives no indication of how it arrived at the solution.

The True Cost of an MBA

Everything has an opportunity cost. An MBA, for example, costs about fifty to eighty thousand dollars, but that’s just the face value. It turns out that by taking two years off work to go to school, you are also sacrificing the earnings you could have had over those two years, not to mention any promotions, raises or job experience that would have come along with them. If we’re thinking about lifetime earning potential, we can calculate the incremental earnings that you’d need from the MBA in order to break even on the investment. Of course, all of these calculations should be done ex-ante (prior to enrollment), because otherwise we’re falling prey to the sunk-cost fallacy, and that will only make us regret a decision we’ve already made.

For example, let’s say that your MBA will cost $75,000 up front and that you are currently making $50,000 per year at your current job. What incremental salary increase would you need in order to account for the opportunity cost of the MBA?

First, we have to calculate an appropriate discount rate for our money. In this case, we can probably use r_m , the market’s rate of return because if we choose not to put the money towards an MBA, we could instead put it in an Index Fund or another similar investment vehicle, where it would grow at around the market interest rate.

Source: Market-Risk-Premia.com

Based on the July 2018 numbers, the market risk premium is about 5.38%. Notice that we didn’t just use the implied market return of 7.69%; we need to subtract the risk-free rate r_f in order to account for the incremental risk.

Let’s round down to 5% for simplicity. Assume we start our MBA in January 2019 and finish in December 2020 (two years), with a cash outflow of $37,500 in each of 2019 and 2020 and sacrificed earnings of $50,000 in each of those years. We can calculate the future value (FV) of that money in 2021 as follows:

FV = P \left[ \frac{(1+r)^n - 1}{r} \right]

Future Value of an Annuity

Our periodic payment, P, is $87,500, our discount rate, r, is 5%, and our number of periods, n, is 2. That leaves us with the following:

FV = \$87,500*[((1+0.05)^2-1)/0.05]  = \$179,375
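A quick sanity check of that arithmetic in Python:

```python
def fv_annuity(payment, r, n):
    """Future value of an ordinary annuity: P * [((1 + r)^n - 1) / r]."""
    return payment * ((1 + r) ** n - 1) / r

# Two years of tuition ($37,500/yr) plus forgone salary ($50,000/yr).
fv = fv_annuity(87_500, 0.05, 2)
print(round(fv))  # 179375
```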

Assuming we’re able to land a job on day 1 after graduation, how much more do we have to make in our careers to make up for the opportunity cost of the MBA? For that, we can use another annuity formula to calculate the periodic payment required over a given number of years to equal a certain present-value amount.

P = \frac{r \times PV}{1 - (1+r)^{-n}}

Annuity payment formula

Let’s say that we will have a 30-year career and that our market risk premium stays the same at 5% (the historical average for Canada is closer to 8%, but let’s be conservative and stick with 5%). Substituting these values into our formula with PV = \$179,375, r = 5% and n = 30, we find that the payment, P, is:

P = {0.05*\$179,375}/{ 1 - (1+0.05)^{-30}} = \$11,670
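And the same payment calculation in Python:

```python
def annuity_payment(pv, r, n):
    """Periodic payment over n periods whose present value equals pv."""
    return r * pv / (1 - (1 + r) ** -n)

# Extra annual salary needed over a 30-year career to recover $179,375.
p = annuity_payment(179_375, 0.05, 30)
print(round(p))  # within a couple of dollars of the $11,670 above
```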

So, we need to make an additional $12,000 or so per year, every year for the rest of our career, because of the MBA, in order to make up for the opportunity cost of the program.

If that seems realistic to you, maybe you should consider an MBA.

Of course, if we’re being really clever, we should probably also include a risk premium for the MBA itself. There is not a lot of data out there on post-MBA outcomes, but we can assign some probabilities to our equation for reference. Let’s say there’s a 60% chance that the market will be strong when we complete the MBA and we’re able to find a job that pays $62,000 per year right out of the program. There is also a 20% chance that we’ll make the same $50,000 per year as before the MBA, a 10% chance that we’ll make $75,000 per year after the program, and a 10% chance that the market for MBAs tanks and we’ll make around $40,000 per year when we graduate.

Expected Value = 0.6 * \$62,000 + 0.2 * \$50,000 + 0.1 * \$75,000 + 0.1 * \$40,000 = \$58,700

How do we make a decision with all these different possible outcomes? Simply multiply each probability by its annual salary and add them together to find the expected result. If these numbers are correct, we’re looking at an expected salary of $58,700 per year coming out of the MBA program. Of course, these numbers are completely made up, but if we found numbers like these in a real-world evaluation, the logical decision from a financial perspective would be to reject the MBA, because the potential gains (about $8,700 per year over our current salary) are outweighed by the cost.

According to PayScale, the average salary in Calgary for an MBA with a finance specialization is $87,500 per year, but the average salary for someone with a bachelor of science degree is over $75,800 per year. Based on these numbers, it might not make sense for someone with a science degree to do an MBA.

Of course, there are other intangible factors that come into play including career preferences, lifestyle, and happiness. These are all important and should definitely be factored into your decision.

Graphs and iPads are an important part of any MBA

Yes, this is a very hard decision to make, but can machine-learning algorithms help make decisions like this easier for us? It should be possible to use machine learning to predict future earnings potential and even take into account qualitative variables like career preferences and working style to give us a better idea of which choices might be right for us.

It is my goal to understand the capabilities of machine learning models to assist in these types of financial predictions. Hopefully, in the next few weeks, I’ll have an update for you on whether this type of predictive capability exists and if it does, how to access it.

For now, good luck with your decision making! I did an MBA and I don’t regret it at all because it was the right decision for me. My hope is that this article has given you the tools to decide whether the decision might be right for you.

Robots Still Can’t Win at Golf

Tiger Woods won his 80th PGA tour title this Sunday, September 23. I was planning to delve deeper into my MIT course on AI and study the details of natural language processing, specifically syntactic parsing and the value of training data. Instead, I found myself glued to a browser window for four and a half hours this afternoon, watching my favourite golfer relive his glory days, winning by 2 shots and capturing the Tour Championship. It was totally worth every minute.

You see, even though humans are being vastly outpaced by AI and machines at every turn, humans are still better at many nuanced tasks. Sure, you can program a robot to swing a golf club and hit repeatable shots, but even the best golf robots still can’t beat the best humans over 18 holes with all the nuanced shots required for a round. Still, they can make a hole-in-one from time to time:

Despite humanity’s increasing incompetence compared to machines, it is still incredibly fun to watch a talented person, who has worked their entire life to perfect their craft, get out there and show the world what they’ve got. Doubly so if that person has recently recovered from spinal fusion surgery and hasn’t won for over five years on tour. Yeah, it’s just putting a little white ball in a hole, but the crowds and excitement that Tiger Woods is able to generate while he plays are unparalleled in golf, and possibly even in sports.

Tiger didn’t win the FedEx Cup, the PGA Tour’s season-long points-based title, but he came really close. If he had, he would’ve made $10,000,000 on the spot. Not too shabby. Regardless, with the highest viewership numbers in the history of the tournament and crowds so large that commentators said they’d never seen anything like it, Tiger Woods undoubtedly made the tour, its sponsors and network partners well over $10 million this weekend. The amount of value that he generates for the tour and for golf is almost incalculable.

Of course, if you’re not a golf fan, you probably think that it’s boring to watch. That can be said about just about any sport or event that one doesn’t understand. Something is boring to us because we don’t understand the context, the history, and the implications of a certain event happening. Once we understand the subject and can opine and converse with other people about the topic then it becomes much more real and tangible.

I think the same principle applies to artificial intelligence, as well as to finance. Few understand these topics. It takes time to learn the nuances that make them interesting and valuable. But once one builds the knowledge and expertise to apply skills in these areas, the results can be extraordinary.

So I’m going to pose a question for my future self and any would-be AI experts. In 2 years, will we be able to build software that can perfectly predict the outcome of major events in sports, specifically golf tournaments, with better results than the best human statisticians and algorithms?

Before you say pfft and walk away thinking I’m a complete idiot for saying that, consider that already this year an AI has perfectly predicted the outcome of the Super Bowl. Let that sink in.

It’s going to happen. My hope is that I’m the one building that software.

Google is Lightyears Ahead in AI

Google’s DeepMind is incredible. In 2016, AlphaGo (one of the DeepMind projects) bested the world champion Go player Lee Sedol at the 2,500-year-old Chinese game that many AI experts thought would not be cracked for at least another decade.

AlphaGo won that match against Sedol 4-1, smashing the belief of experts and Go fanatics around the world that a computer couldn’t yet beat a human. There’s a great Netflix documentary on the feat that chronicles the AlphaGo team’s quest to defeat the grandmasters of the ancient game. It reminded me of watching IBM’s Deep Blue defeat grandmaster Garry Kasparov at chess when I was younger, except this was much scarier.

Deep Blue was able to defeat the best chess players in the world because chess is a game with a relatively small number of possible moves each turn, so a computer can essentially ‘brute force’ its way through the potential moves a player could make and counter them. Go is a very different game. With an average of around 200 potential moves every turn, the number of possible configurations of a Go board quickly exceeds the number of atoms in the entire universe. There are approximately 10^{78} atoms in the observable universe, while Go has roughly 2.08 \times 10^{170} legal board positions and up to 361 legal moves available at any point. The number of possible games is at least 10^{10^{48}}, which is much, much larger than the number of atoms in the observable universe. This means that brute-forcing the outcome of a Go match is essentially impossible, even with the most powerful supercomputers in the world at your disposal.
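To get a feel for those magnitudes, here’s a back-of-the-envelope comparison. The branching factors and game lengths below are rough, assumed figures, not exact counts:

```python
import math

# Rough, assumed orders of magnitude:
# chess: ~35 legal moves per turn over ~80 plies;
# Go:    ~200 legal moves per turn over ~150 moves.
chess_tree = 35 ** 80
go_tree = 200 ** 150
atoms_in_universe = 10 ** 78

print(math.floor(math.log10(chess_tree)))  # 123 digits: already hopeless to brute force
print(math.floor(math.log10(go_tree)))     # 345 digits: unimaginably larger still
print(go_tree > atoms_in_universe)         # True, by hundreds of orders of magnitude
```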

In order to surmount this problem, AlphaGo combines three cutting-edge techniques to predict outcomes and play the game better than any human: two neural networks, a policy network trained with deep reinforcement learning and a value network that predicts the probability of a favourable outcome, together with a Monte Carlo tree search that works similarly to how Deep Blue worked. The deep reinforcement learning piece works by having the system play itself over and over again, improving the policy network and optimizing the system to win the game by any margin necessary. The value network was then trained on 30 million game positions that the system experienced while playing itself, to predict the probability of a positive outcome. Finally, the tree search procedure uses an evaluation function to build a ‘tree’ of possible game moves and select the best one based on the likely outcomes.

The architecture of the system is impressively advanced, and its ability to beat humans speaks for itself. However, the most amazing moment in the Netflix documentary comes when the world champion, Sedol, realizes that AlphaGo is making moves so incredible, so creative, that he has never seen a human player even think of playing them.

Since beating the 9-dan professional two years ago, Google has iterated on its AlphaGo platform with two new generations. The latest generation, AlphaGo Zero, beat the iteration of AlphaGo that defeated humanity’s best Go player by a margin of 100-0. That’s right: the newest version of AlphaGo destroyed the version that beat the best human player, 100 games to none. The craziest thing is that the new version was given nothing except the basic rules of the game. It simply played itself over and over again, millions of times, until it knew how to play better than any human or machine ever created.

Courtesy Google’s DeepMind Blog

This awesome video by Two Minute Papers talks about how Google’s DeepMind has iterated over the past few years and how AlphaGo Zero is now far beyond the strongest human players while training and running on a fraction of the hardware its predecessors required.

Courtesy Google’s DeepMind Blog

It is scary and incredible how fast DeepMind is advancing in multiple fields. Right now it is still considered a narrow AI (i.e. it doesn’t perform or learn well in areas outside of its primary domain), however, the same algorithms are being applied to best the greatest humans in every area from medicine to video gaming. In my opinion, it is only a matter of a few years before this system will be able to beat humans at nearly everything that we think we do well right now. We can only hope that Google built in the appropriate safeguards to protect us from its own software.

If you want to learn more about how our future robot overlords think, there’s no better way to get started than by racing an autonomous robocar around a track. Come check out @DIYRobocarsCanada on Instagram and join us on our Meetup group to get involved today!

Natural Language Processing – I’m Lookin’ at you, Siri

According to some experts, in 2018 natural language processing (NLP) and audio processing are the areas where AI excels most. It seems fairly obvious that machine audio processing would be good when you look at the number of US homes with a smart speaker like Amazon’s Echo, Apple’s HomePod, or Google Home (nearly 50%). That’s a lot of people with a device whose sole purpose is to collect what you’re saying and translate it into usable commands for you, and into monetizable data for the company that built it.

“Hey, Siri!”

— me, just now

In fact, if you read that out loud right now, there is a good chance that your phone just pinged at you wondering what you want from it. With so many devices listening constantly to our every conversation and vocal machination, is it any wonder that they’re starting to understand us better than we understand ourselves?

In my home, I have Apple’s HomeKit, and by proxy Siri, set up to control everything from my locks to my lights, smoke alarm, security system and thermostat. If Siri ever became sentient and went Skynet on my ass, she could do A LOT of damage. Although, based on the latest tests, I’m much less worried about Siri taking over than I am about Google. Or, more specifically, Google’s Duplex AI. It can literally trick humans working at shops into believing that it is a person. If that’s not scarily impressive, I don’t know what is.

Are we doomed? Probably not… Yet. Duplex is still very limited according to Google. All it can do is book appointments for you. But people got so scared that they’d be talking to a machine and not know it that the outcry forced Google to hamper its own system. Google has announced that Duplex will now announce itself as a computer at the beginning of every phone conversation so as not to creep people out. To me, this kind of defeats the purpose of the software to begin with. But hey, it’s probably bad form to test your AI on a bunch of unsuspecting hair salon and restaurant employees without them knowing about it.

This software is getting really good at understanding and communicating with humans via voice. This is mainly because it has a TON of really good data collected by millions of devices around the world. It’s also because engineers, mathematicians, and programmers have made some serious breakthroughs in Deep Learning in the past 10 years and Apple and Google have tens of billions of dollars invested into making it work.

This weekend I’ll be learning from MIT Professor Regina Barzilay about what it means for machines to understand something. We’ll also be covering which NLP problems have been solved, where progress is being made, and which tasks are still very difficult for computers to solve. Hopefully, I’ll come back with a better understanding of how it all works (I think it has something to do with phonemes and triphones) and what we can do with it. Once I do, I’ll report back here and let you know how close we are to the robopocalypse.

Matrix Madness

Linear algebra is an important tool used in modern deep learning algorithms. Unfortunately, when I did my undergrad in Electrical and Computer Engineering, I had no idea that the ability to transform vectors and matrices would ever be practically useful for anything (other than giving me migraines at 2AM the night before my midterm exams). So, once I had learned enough to pass the course, I immediately forgot everything.

It was only when I decided to pursue a deeper understanding of machine learning and AI, in order to apply it to my business and to my work in finance, that it dawned on me: I should have paid attention in Linear Algebra II back at the UofA in Engineering school! Well, since I didn’t (and even if I had, all my books on the subject mysteriously burned up in the great notes fire of ’09), I guess it’s time to re-learn me some matrix math.

Lucky for me, Udacity has brought on some of the best professional educators in the world for their AI Nanodegree program, including former Khan Academy animator and 3Blue1Brown creator Grant Sanderson.

So, now I’m super good at manipulating matrices thanks to the magic of a YouTube superstar and the LaTeX plugin for WordPress websites.

A = \begin{bmatrix} a_{11} & \cdots & a_{1j} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{i1} & \cdots & a_{ij} & \cdots & a_{in} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mj} & \cdots & a_{mn} \end{bmatrix}

…I spent 30 minutes trying to get this matrix to display correctly.

If you want to learn why vectors are cool and how to use matrix multiplication to rule the world, watch the video series that I’ve linked below.

We haven’t arrived at the part about why linear algebra is so important for creating neural networks and deep learning algorithms, but we will. If you’re still with me, keep plugging along and learn how to understand all matrix transformations using only the unit vectors and a 2×2 matrix. Eventually, we’ll discover how to program a computer to predict Apple’s share price the day after they launch a ‘new’ iPhone.
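To make that concrete, here’s a minimal sketch in Python (using NumPy, which is my choice here, not part of the course material) of the core idea behind matrix transformations: the columns of a 2×2 matrix tell you exactly where the unit vectors land.

```python
import numpy as np

# The unit (basis) vectors i-hat and j-hat.
i_hat = np.array([1, 0])
j_hat = np.array([0, 1])

# A 2x2 transformation matrix: its columns record where
# the unit vectors land after the transformation.
A = np.array([[2, 1],
              [0, 3]])

# Transforming the basis vectors recovers the columns of A.
print(A @ i_hat)  # [2 0] -- the first column of A
print(A @ j_hat)  # [1 3] -- the second column of A

# Any other vector transforms the same way: as a linear
# combination of where the basis vectors land.
v = np.array([1, 1])
print(A @ v)      # [3 3] == (A @ i_hat) + (A @ j_hat)
```

This “follow the basis vectors” view is exactly the one the video series below builds on.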

/insert corny iPhone XS Max joke here/.


That’s all for today. Now go out on your own and learn the basics of linear algebra! If you message me directly, I’ll even send you my notes. Tomorrow, we’ll talk about why this stuff is important for machine learning.

Linear algebra joke: One year for halloween I was ‘Snakes on a Plane’

If you’re still here and you’re wondering why yesterday we were talking about valuing a company and today we’re talking about undergraduate linear algebra, you’re probably not alone. So I’ll tell you why: It’s going to take a foundational understanding of programming, mathematics and finance to get where we need to go. To understand machine learning, we have to understand how the software is built and to build software that is capable of doing what a CFA can do, we’ll need to know what a CFA knows. I’m bringing you along on this journey as I learn the fundamentals on both ends, machine learning and finance. Let’s see how it goes!

Do you get it now?

How to Value a Business (Or a Project)

The best-kept secret of financial professionals is that it’s actually pretty easy to value a company; that is, to decide how much you should be willing to pay for the business or its shares. My goal is to automate this process using machine learning algorithms to select the appropriate data and apply the formulas in the correct manner. That level of sophistication is still a few months (years?) away, at least given my current skill set. For now, we’re going to cover the basics of project valuation via the discounted cash flow (DCF) methodology. Later on, we’ll see if we can get a computer to do the calculations for us.

Note: I’ll be using the terms company and project interchangeably here. However, for companies in more than one industry or market segment, you’ll need to use multiple discount rates, because the Beta (a measure of a segment’s systematic risk relative to the market as a whole) will vary depending on the industry.

You can probably find a lot of the information that I’m about to disclose (or all of it) in an introductory finance textbook or even from a free resource like Investopedia. That’s fine. Lots of people do not choose to read textbooks or financial-wiki sites in their free time, so I’m going to go over the basics here if you’re interested in the subject, but not quite interested enough to open a book.

Here we go. Are you ready?

All that is needed to value a company is:
    1.  Some revenue projections,
    2.  Some cost projections,
    3.  An appropriate discount rate (or cost of capital) for the company

That’s it!

Obviously, these things can be easy or very difficult to come by depending on several factors including the type of company (or project), the stability of the market, and the quality of the information available about the business.

Let’s assume //quite a big assumption, but hey, that’s what we’re going to do right now// that you’re able to come up with some reasonable revenue and cost projections for the business that you want to value and that you’re able to calculate an appropriate WACC (Weighted Average Cost of Capital) or discount rate.

Then what do you do?

Basically, you take the company’s projected revenue over a given period (let’s say every year for 5 years), subtract the cash costs of the business in each year, and you’ve got the company’s Free Cash Flows (we’re skipping a few steps here, like subtracting taxes, adding back depreciation, and subtracting Capital Expenditures (CapEx) and changes in Net Working Capital, but we’ll save those for later).
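Those skipped steps are easy to bolt on later. Here’s a rough sketch in Python of what the fuller free cash flow calculation looks like (the function name and the zeroed-out defaults are my own, just for illustration):

```python
def free_cash_flow(revenue, cash_costs, depreciation=0.0,
                   tax_rate=0.0, capex=0.0, change_in_nwc=0.0):
    """Unlevered free cash flow for one period.

    The simplified version used in the tables below is just
    revenue - cash_costs; the extra arguments cover the steps
    skipped above (taxes, depreciation, CapEx, working capital).
    """
    ebit = revenue - cash_costs - depreciation
    after_tax = ebit * (1 - tax_rate)
    # Depreciation is a non-cash expense, so add it back.
    return after_tax + depreciation - capex - change_in_nwc

# With the skipped steps zeroed out, this collapses to the
# simple revenue-minus-costs figure used in the example below:
print(free_cash_flow(20_000, 5_000))  # 15000.0
```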

Here’s an example of a company with some projected revenue and some projected costs going out 5 years:

Year              0           1           2           3           4           5
Revenue                       $20,000     $20,000     $20,000     $20,000     $20,000
Costs             ($50,000)   ($5,000)    ($5,000)    ($5,000)    ($5,000)    ($5,000)
Cash Flows        ($50,000)   $15,000     $15,000     $15,000     $15,000     $15,000

Next, we take the free cash flows that we calculated above, and we discount each of them by an appropriate ‘discount factor’ that we calculate using our discount rate.

DF_n = \frac{1}{(1+r)^n}

Where: r is the discount rate and n is the period (or year)

All of my finance professors are about to roll over in their beds right now (they’re not dead), but let’s say the discount rate that we found for the company is 10%. Here’s what we end up with for the discount factor over the 5-year period.

Year              0           1           2           3           4           5
Revenue                       $20,000     $20,000     $20,000     $20,000     $20,000
Costs             ($50,000)   ($5,000)    ($5,000)    ($5,000)    ($5,000)    ($5,000)
Cash Flows        ($50,000)   $15,000     $15,000     $15,000     $15,000     $15,000
Discount Factor   1.00        0.91        0.83        0.75        0.68        0.62
Discount Rate     10%

Now we just multiply our free cash flows by the discount factor for each year to get the present value (PV) of the future cash flows. Once we have the PV of the cash flows, we can add them all together to find out what the project is worth to us, also known as the project’s NPV or Net Present Value.
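Written out in the same LaTeX notation as the matrix earlier, the whole procedure collapses into one formula, where CF_n is the free cash flow in year n and N is the number of years:

NPV = \sum_{n=0}^{N} \frac{CF_n}{(1+r)^n}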

Year              0           1           2           3           4           5
Revenue                       $20,000     $20,000     $20,000     $20,000     $20,000
Costs             ($50,000)   ($5,000)    ($5,000)    ($5,000)    ($5,000)    ($5,000)
Cash Flows        ($50,000)   $15,000     $15,000     $15,000     $15,000     $15,000
Discount Factor   1.00        0.91        0.83        0.75        0.68        0.62
PV Cash Flows     ($50,000)   $13,636     $12,397     $11,270     $10,245     $9,314
Project NPV       $6,862
Discount Rate     10%

If you want a primer on what present value means, and what the time-value of money represents, here’s a good video on it from Khan Academy:

That’s it! We’ve valued a business. We now know that if this company were only going to operate for five years and then cease to exist, it would be worth about $6,862 to us in our pocket today.
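And since the whole point of this series is to make a computer do the work, here’s a tiny Python sketch of the same calculation (a hand-rolled function, not a library call), using the numbers from the tables above:

```python
def npv(rate, cash_flows):
    """Net present value: discount each period's cash flow
    back to today and sum. cash_flows[0] is year 0 (undiscounted)."""
    return sum(cf / (1 + rate) ** n for n, cf in enumerate(cash_flows))

# Year-0 outlay of $50,000, then $15,000 per year for 5 years.
cash_flows = [-50_000] + [15_000] * 5

print(round(npv(0.10, cash_flows)))  # 6862
```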

In general, we accept projects that have a positive NPV and reject projects that have a negative NPV. I’ll cover the reasons for this in another post down the line. For now, at least, we are able to value a company given only its revenue, costs, and an appropriate discount rate. Things are going to get a lot more complicated from here so enjoy the simplicity while it lasts.