I am a management consultant exploring the world of artificial intelligence.

How an online class in machine learning prepares you for the real world

Note: This article also appears on LinkedIn.

I recently graduated from Udacity after completing an online class in artificial intelligence. Udacity is an online school that offers various courses with a focus on tech; its longer programs are called Nanodegrees. For months we coded up all kinds of intelligent software, ending with several variants of neural networks and various other deep learning algorithms and intelligent programs. I thought implementing my own projects would now be easy. Well, who would have guessed: I was wrong. Let me explain.

I have been around computers for almost all my life. If I remember right, I started programming somewhere between the ages of 5 and 10. My dad majored in computer science and found it important to a) give me a computer as soon as I stopped putting the keyboard in my mouth and b) show me what one could do with programming languages. I learned Basic, then Delphi, lots more during university, and even more since. I have to admit I never became an expert in any of them. Still, I know my way around IDEs, compilers, debuggers, disassemblers and whatnot. It was always enough to implement what I wanted to do; Google, StackExchange, etc. did the trick for most projects.

StackExchange - saving lazy coders’ butts since 2008! (Image Credit: By Stack Exchange, Public Domain, Wikimedia Commons)

I decided to go for the Artificial Intelligence Nanodegree out of simple curiosity. I had played around a little with neural networks now and then since 2003, and I wanted to see what all the current buzz is about and to dive deeper into the field. The second term was almost entirely about neural networks: simple MLPs, convolutional networks, recurrent networks, and other exotic types and variants. The projects are organized and presented in Jupyter notebooks, which makes it easy to get coding and playing. You come to understand the basic concepts of the network types, what kinds of problems they solve, and how to work with them.

Once I completed the Nanodegree, I was eager to start my own projects. I find Recurrent Neural Networks (RNNs) especially interesting, so I started playing around with them. And that’s when I realized I was still far away from building even simple things.

Must make artificial intelligence! (Image Credit: Shutterstock / Zemler)

Things suddenly break. And you have no idea why.

In “traditional” programming (let’s call it that for lack of a better word), a breakage usually means you’ve coded something the wrong way. Although software can get as complex as you can imagine, in the end you have a simple pipeline of the sort “input → compute → output”. If the output is wrong, you’ve made a mistake in the “compute” part (or your data was wrong in the first place). The way to take care of that is to debug: you trace the specific faulty output back through the program, or sometimes you start with the input and see where it leads you. Basically, you follow the data until you find the specific place in your program where things go wrong.

Why is this important? Because with machine learning (and neural networks in particular), this approach gets you exactly nowhere.

(Image Credit: https://xkcd.com/1838/)

If things break during traditional programming and then suddenly are OK again, you know there’s something wrong - and in most cases, a lot more is wrong than you immediately realize. You follow the trail, you debug, and then you notice that one for-loop that should be outside that other for-loop. Boom, problem solved.
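To make that concrete, here is a hypothetical toy version of such a bug (not from any real project): an accumulator that gets reset inside the wrong loop. Traditional debugging finds this reliably, because you can follow the data step by step.

```python
# Toy example: computing row sums of a matrix.
matrix = [[1, 2, 3], [4, 5, 6]]

# Buggy version: `total` is reset for every element,
# so each "row sum" ends up being just the row's last element.
row_sums = []
for row in matrix:
    for value in row:
        total = 0          # <- this line belongs outside the inner loop
        total += value
    row_sums.append(total)
print(row_sums)  # [3, 6] instead of [6, 15]

# Fixed version: reset the accumulator once per row.
row_sums = []
for row in matrix:
    total = 0
    for value in row:
        total += value
    row_sums.append(total)
print(row_sums)  # [6, 15]
```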

In machine learning, things aren’t that easy. Sometimes I train a neural network on a small dataset and everything looks good: the loss goes down, the gradients behave, the output is reasonable. Those things. Then I train it on a big dataset. Fine again. And then I use a medium-sized dataset and out comes garbage. Why? I have zero idea. And if you then start to fiddle with hyperparameters, like the number of neurons in a certain layer, nothing will ever work again afterwards.

There is a very, very helpful blog post by Slav Ivanov about why your machine learning algorithm fails (and here is another one. Welcome.). It lists 37 reasons. Let me repeat that: THIRTY-SEVEN!

Compared to the way I debug normal software, trying to understand what goes wrong in a machine learning algorithm is very different. Things break in places and in ways you are not used to. For example, I coded up an RNN with training data I had scraped off websites by hand, copy-pasting huge chunks of text while watching TV. The network performed OK. Then I changed the network and also discovered I could use automatic scrapers to get more training data. My expectation was for things to improve significantly. But since we’re halfway through this article, you can guess what happened: out came almost nothing but blank spaces. I spent a whole weekend trying to figure out what I had done wrong with the new network structure. It wasn’t until much later, after hours of cursing and waiting, that I discovered the automatic scraper had added loads of extra white space to the training data - almost 20% of it. Of course the network would spit out lots of blank text.
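The painful part is that a few lines of sanity checking on the data would have caught this in minutes. Here is a minimal sketch of such a check; the file name is a placeholder for whatever your scraper produces:

```python
# Minimal data sanity check: what fraction of the corpus is whitespace?
# "corpus.txt" is a placeholder for your scraped training data.

with open("corpus.txt", encoding="utf-8") as f:
    text = f.read()

whitespace = sum(1 for ch in text if ch.isspace())
ratio = whitespace / len(text)
print(f"{whitespace} of {len(text)} characters ({ratio:.1%}) are whitespace")

# Compare the ratio against a known-clean sample of similar text;
# a noticeable jump is a red flag that the scraper injected extra blanks.
```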

You cannot follow the data through the program - or at least I haven’t found a way to do that yet. The only things you can do are to check whether your data loads correctly, then dumb down your model as much as possible and use a very small amount of training data. Does it work now? Good - now add back, piece by piece, what you think you need and see where it breaks. Maybe this will change once I get more experienced, but for now, it is frustrating.
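To sketch what that “dumb it down” step can look like in practice, here is a minimal example in Keras (the model shape and the synthetic data are placeholders, not my actual project): train a deliberately tiny model on a handful of samples and check that it can memorize them. If it can’t, the bug is in the pipeline, not in the model size.

```python
# A minimal "can the model overfit a tiny dataset?" sanity check.
import numpy as np
from tensorflow import keras

# Tiny synthetic dataset: 16 samples, 10 features, binary labels.
x_small = np.random.rand(16, 10)
y_small = np.random.randint(0, 2, size=(16, 1))

# Deliberately simple model: one small hidden layer.
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# A healthy pipeline should drive training accuracy close to 1.0 here.
# If the model can't even memorize 16 samples, suspect the code, not the model.
model.fit(x_small, y_small, epochs=500, verbose=0)
loss, acc = model.evaluate(x_small, y_small, verbose=0)
print(f"training accuracy on the tiny set: {acc:.2f}")
```

Once this check passes, you grow the model and the dataset step by step and re-check after each change, so that when things break, you at least know which change broke them.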

It takes ages to find out whether things work.

Pedro Domingos put it quite eloquently in his book “The Master Algorithm”:

"Machine Learning is like farming. Your seeds are the machine learning algorithms you apply, your soil is the data and programs are your grown crops.“

“And this is where we keep the finished… Oh, never mind.” (Image Credit: Shutterstock / Dale A Stork)

But as with crops, it takes time for the results to appear. I run a large, GPU-enabled Amazon EC2 instance for most of the heavy lifting, and that beast still needs a whole night to get through some of my medium-sized datasets with a simple model. Combined with the previous point, this means it may take you at least an hour to find out whether your newest idea for solving a problem works. It usually does not.

You’re gonna need a lot of data. And I mean A LOT.

It turns out that acquiring data for personal use without spending any money on it is almost impossible - especially if you want it formatted in a way you can work with. For RNNs, for example, you need a lot of text input. One idea I had for a large enough dataset was the protocols of the German Bundestag (https://www.bundestag.de/protokolle). Well, nice try: those are PDFs set in two columns, and there is no way to parse them into something machine-readable without spending a lot of time on it. This is one reason why companies like Amazon or Facebook are so far ahead of everyone else: they have tons of data. Google even makes you help train its machine learning algorithms. Ever noticed those new CAPTCHAs that ask you to select all storefronts in a set of images? Yeah, you’ve just given Google a new training data set. There are reasons people call data the “new oil”, and this is one of them. Without data, no machine learning.
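If you want to see the problem for yourself, a naive first attempt might look like this sketch using the pdfminer.six library (the file name is a placeholder). It runs, but on a two-column layout the extracted lines from both columns tend to come out interleaved, which is exactly the issue:

```python
# Naive text extraction from a PDF with pdfminer.six
# (pip install pdfminer.six). "protokoll.pdf" is a placeholder file name.
from pdfminer.high_level import extract_text

text = extract_text("protokoll.pdf")
print(text[:500])

# On a two-column layout like the Bundestag protocols, lines from the
# left and right columns come out mixed together or out of order,
# so the result still needs heavy manual cleanup before it is usable.
```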

My conclusion so far.

All in all, I’m working through these problems and yes, I’m getting somewhere. For example, I’ve learned my lesson about automatic scrapers, and I now understand much better why an RNN would produce the output it did after reading that many blank spaces. I’m also under the impression that I’m developing some kind of “gut feeling” for neuron counts and the like.

But “flying solo” is hard, once again. Was the Nanodegree worth it? Absolutely, and I will do another one. My experience with this fascinating new world is growing, but I have a lot more work ahead of me than I realized. I now understand much better why companies pay machine learning experts so much money: this experience is not easy to get, and I now know I have merely set foot on the path rather than made good progress along it. So here is my shout-out to all you other beginners, learners and students: the journey is what we’re here for - and I hear it’s worth it!
