Open AI: a beginner’s guide

AI is hot, and it’s real. There’s no doubt about it. And when it comes to openness in AI, we are lucky: thanks to UC Berkeley, Google, and Facebook, we have great libraries for building convolutional neural networks that will solve the world’s biggest problems, like “hot dog or not” 😃

Jokes aside, this article is intended to give you a quick introduction to the hottest convolutional neural network frameworks available. For those of you who don’t know, convolutional neural networks are software models loosely inspired by how the brain processes what we see. They were pioneered by Yann LeCun (a professor at New York University and now Director of AI Research at Facebook), building on earlier studies of the visual cortex in cats.

It turns out the mechanics of our brains have something in common with numerical limits. In mathematics, the shape of a polynomial function such as x^4 + x^3 is, for large values of x, determined almost entirely by its highest exponent.

Similarly, you may remember from L’Hôpital’s Rule in calculus that when a limit at infinity takes an indeterminate form, the only exponents that matter are the largest ones.
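To make the "highest exponent wins" idea concrete, here is a small numeric sketch in Python. It is only an illustration of the polynomial from the text; the function names `poly`, `leading_term`, and `ratio` are made up for this example.

```python
# Illustrative only: compare x^4 + x^3 with its leading term x^4 as x grows.

def poly(x):
    """The polynomial from the text: x^4 + x^3."""
    return x**4 + x**3

def leading_term(x):
    """Only the highest-exponent term, x^4."""
    return x**4

def ratio(x):
    """Full polynomial divided by its leading term (1.0 would mean identical)."""
    return poly(x) / leading_term(x)

for x in (10, 1_000, 1_000_000):
    print(f"x = {x:>9,}: (x^4 + x^3) / x^4 = {ratio(x):.6f}")
```

Algebraically the ratio is 1 + 1/x, so it drifts toward 1 as x grows: the x^3 term matters less and less.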

The theory behind convolutional neural networks is similar. You divide your input into small overlapping parts, apply a series of mathematical filters to each part (the filter values start out random and are adjusted during training), keep the strongest responses in each small region, and repeat this a number of times, depending on your needs. You end up with a label that lets you classify the input as either a hot dog or not, or something more useful. In other words, much like the highest exponent in a polynomial function, the boldest wins.
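The steps described above can be sketched in a few lines of plain Python. This is a toy, untrained forward pass under simplifying assumptions (one grayscale 6x6 "image", one random 3x3 filter, 2x2 max pooling), not how any of the frameworks below implement it; the helper names `convolve2d`, `relu`, and `max_pool` are made up for this example.

```python
import random

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Zero out negative responses."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep only the largest value in each size x size block ("the boldest wins")."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

random.seed(0)
image = [[random.random() for _ in range(6)] for _ in range(6)]         # toy input
kernel = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]  # untrained filter

features = relu(convolve2d(image, kernel))  # 6x6 input, 3x3 filter -> 4x4 map
pooled = max_pool(features)                 # 4x4 map, 2x2 pooling -> 2x2 map
print(len(features), "x", len(features[0]), "->", len(pooled), "x", len(pooled[0]))
```

A real network stacks many such filter-and-pool stages and learns the filter values from labeled data instead of leaving them random.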

OK, this may not be the most academically sound explanation of what a convolutional neural network is, but it is an intro. The 20-minute video below dives deeper:

If it’s still not crystal clear to you, you’re not alone. Feynman’s words on quantum mechanics apply here too, with one small substitution: “If you think you understand convolutional neural networks, then you don’t understand convolutional neural networks.”

Without further ado, let’s begin with some practical frameworks that you can use today to get started with AI:

TensorFlow:
For starters, TensorFlow is Google’s deep learning framework. It is well documented, has the backing of a very large community as well as Google, and some of the best and brightest minds in AI are working on it. At one point Google even hired Yangqing Jia, the creator of Caffe, which is one of the main frameworks on our list. TensorFlow is available in Python (most stable), Java, Go and C++.

Caffe: (recommended)
Caffe, born at UC Berkeley, has been around longer than TensorFlow and has strong academic backing. Many scientific papers ship with Caffe examples, but this has started to change since Google’s introduction of TensorFlow. Caffe has a successor, Caffe2, which Facebook is sponsoring. Caffe’s creator Yangqing Jia moved to Facebook in February this year to get back to work on his brainchild. Caffe is written in C++ but has interfaces in Python and Matlab as well.

Theano with Lasagne:
Theano with Lasagne makes it a breeze to program and run convolutional neural networks. Its Python-based interface is the easiest to use and get started with, but it does not have the same level of backing as the other two.

These are just some of the most popular ones. Facebook at one point used Torch heavily as well, but nowadays they seem to be investing in Caffe2 more than anything else.

Also, for those of you who just want to play with well-known networks without digging into the details, Nvidia offers DIGITS, a great web-based user interface for training publicly available networks such as AlexNet and GoogLeNet. It is free.

Last but not least, a successful convolutional neural network needs massive, high-quality datasets during training. Just remember how many years it takes a baby to recognize the most basic shapes, like a circle, a triangle, and a rectangle. Noise aside, we are talking about the most advanced neural network there is. Similarly, you’ll need to train your baby convolutional neural networks on high-quality datasets to fit them properly. When it comes to open data, your options are not as abundant, but we do have some. For image processing, you can take a look at ImageNet. For natural language (or text) processing applications, n-grams might be a good start.
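If n-grams are new to you: an n-gram is simply a run of n consecutive words. The article does not say which n-gram dataset it has in mind, so the snippet below is only a generic sketch of counting them yourself in Python; the helper name `ngrams` and the sample sentence are made up for this example.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = "hot dog or not hot dog"
tokens = text.lower().split()      # naive whitespace tokenization

bigram_counts = Counter(ngrams(tokens, 2))
for pair, count in bigram_counts.most_common():
    print(" ".join(pair), count)
```

Counts like these, gathered over a large body of text, are the kind of raw material a text model can be trained on.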

The choice is yours. Let the hack begin. This article is not a tutorial, so we won’t show you how to use these frameworks. If you prefer one over the others, please let us know by tweeting in response to this article. More articles on the topic of AI are coming soon…

