A lot of people ask me how to get started with deep learning. In this post I’ve listed a few resources I recommend for getting started. I’ve only chosen a few because I’ve found precise recommendations to be more helpful. Let me know if you have any comments or suggestions!
Prelude: If you’re new to machine learning
Deep learning is a kind of machine learning, so it’s best if you have some familiarity with machine learning first.
1. Udacity’s Intro to Machine Learning course gives a good big-picture overview of the machine learning process and key algorithms, as well as how to implement these processes in Python with sklearn.
- It’s fun and easy to get through (mostly short videos with interactive quizzes) – I recommend it especially if you find it hard to motivate yourself to read guides. 🙂
2. Machine Learning Mastery has lots of fantastic step-by-step guides. I review this resource in more depth below.
A. Resources with a Practical Emphasis
I put this first because most people seem keenest on building and using models. I also find the theory easier to grasp and more interesting once you’ve played with implementations.
A1. Machine Learning Mastery
This is my #1 pick (okay, maybe tied top pick) for people who want to learn machine learning. It’s also a great resource if you’re looking to solve a specific problem – you might find something you can pretty much lift, thanks to the results-first approach Jason takes.
Strengths: His guides are results-oriented and go step by step, and he provides all the code he uses. He’s also responsive to emails.
Jason’s LSTM e-book is also excellent – it discusses which parameters are important to tune, and what architectures and parameter settings usually work for different problems.
Topics: Neural networks, convolutional neural networks, recurrent neural networks (including a focus on LSTMs), deep learning for natural language processing (NLP), general machine learning
Tools: Mainly Keras (a high-level wrapper for TensorFlow).
Note: I’d advise you to supplement this with CS231n (below) or other resources with diagrams (e.g. in this post) when learning about CNNs. It’ll help your intuition.
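To give a flavour of the Keras workflow these tutorials use, here is a minimal sketch of my own (not lifted from Jason’s guides) of an LSTM for binary sequence classification – the layer size, sequence shape and hyperparameters are all illustrative assumptions:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Toy model: classify sequences of 10 timesteps with 8 features each.
model = Sequential([
    LSTM(32, input_shape=(10, 8)),   # 32 hidden units (illustrative choice)
    Dense(1, activation="sigmoid"),  # binary classification head
])
model.compile(optimizer="adam", loss="binary_crossentropy")

X = np.random.rand(4, 10, 8).astype("float32")  # dummy batch of 4 sequences
y = np.array([0, 1, 0, 1], dtype="float32")
model.fit(X, y, epochs=1, verbose=0)

preds = model.predict(X, verbose=0)
print(preds.shape)  # (4, 1): one probability per sequence
```

Swapping the LSTM layer for Dense or convolutional layers gives the other architectures the tutorials cover; most of the tuning Jason discusses happens in the arguments to these few calls.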
B. Resources with an Emphasis on Theory
Here are some resources that have a greater emphasis on theory. Learning theory helps you understand models, and so helps you build architectures and choose parameter settings that are more likely to work well. It also helps with debugging, e.g. by recognising when gradient descent is likely to fail in your use case.
B1. CS231n: Convolutional Neural Networks for Visual Recognition (Stanford)
This is a bridge between theory and practice, and is either tied #1 or #2 on my list. It covers much more theory than Jason’s tutorials but has fewer ‘real-world’ use cases, so the two complement each other well. The code is usually in raw Python (as opposed to e.g. TensorFlow) because the emphasis is on understanding the building blocks.
Strengths: The explanations of concepts are intuitive and detailed where it matters, and the visualisations are fantastic. In particular, you will learn what the optimisation methods actually are. They give great tips on what to watch out for when building or training models, too.
(The only reason this isn’t ‘the #1 resource’ is because most people who ask me are looking to get started fast, and you can get results much faster using Jason’s tutorials. But the quality of explanations and the understanding you get here is top-notch.)
Note: Online lectures may be available on YouTube (they seem to have been taken down at time of writing).
Topics: Neural networks, convolutional neural networks, tutorials on tools you’ll be using (Python/Numpy, AWS, Google Cloud).
Tools: Python with Numpy.
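In that raw-NumPy spirit, here is roughly the kind of thing you end up implementing – a sketch of vanilla gradient descent on a toy quadratic loss (the loss function, starting point and learning rate are my own illustrative choices, not taken from the course):

```python
import numpy as np

def loss(w):
    return np.sum((w - 3.0) ** 2)  # toy loss, minimised at w = 3

def grad(w):
    return 2.0 * (w - 3.0)         # analytic gradient of the loss

w = np.zeros(2)                     # start far from the optimum
lr = 0.1                            # step size (a hyperparameter you tune)
for _ in range(100):
    w -= lr * grad(w)               # the basic gradient descent update

print(w)  # ≈ [3. 3.]
```

The course builds from updates like this one up to momentum, RMSProp and Adam, which is where the intuition about optimisers comes from.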
B2. Deep Learning (Goodfellow, Bengio and Courville)
You’ve probably heard of this one. It’s a book written by top researchers Ian Goodfellow, Yoshua Bengio and Aaron Courville. The HTML content is available for free online.
Of the resources so far, this is definitely the most theory-heavy, with only some pseudocode. It does contain an entire chapter on practical deep learning as well as advice scattered throughout. The chapter covers how to select hyperparameters, whether you should gather more data, debugging strategies and more.
Strengths: It is beautifully written and gives detailed, intuitive theoretical exposition (much of which is mind-blowing, all of which I’ve found interesting) on many topics. It also discusses foundations in information theory that you might not be aware of.
If you’ve done some deep learning in practice and like maths, you might really enjoy this. It is harder to get through than the resources above (don’t expect to read through it chronologically in one go) but it could really add to your understanding.
Note: If you are only looking to casually implement models, I don’t think you need to read this book.
Topics: Neural networks (NNs), convolutional NNs, recurrent NNs, recursive NNs. It also goes into more advanced areas that the previous resources didn’t go into, such as autoencoders, graphical models, deep generative models (obviously) and representation learning.
Tools: Your brain. Haha.
C. Resources for Reference
These can be very helpful when you’re looking for something specific. I wouldn’t recommend using them as primary learning resources, though.
C1. Deep Learning Glossary (Denny Britz)
Denny gives short 1-2 sentence descriptions of terms from backprop to Adam (an optimizer) to CNNs (an architecture). It’s a nice alternative to Googling when you don’t know what a keyword means (and ending up Googling ten terms because the wiki definitions use terms you don’t understand). There are also links to relevant papers or resources for most terms.
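As an example of the kind of term the glossary pins down, here is a sketch of one Adam update step in NumPy, using the standard default hyperparameters; the toy loss w² and the iteration count are my own illustrative choices:

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; m and v are running first/second moment estimates."""
    m = b1 * m + (1 - b1) * g             # momentum-like average of gradients
    v = b2 * v + (1 - b2) * g ** 2        # average of squared gradients
    m_hat = m / (1 - b1 ** t)             # bias correction (t starts at 1)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0])                        # minimise the toy loss w**2
m, v = np.zeros(1), np.zeros(1)
for t in range(1, 201):
    g = 2 * w                              # gradient of w**2
    w, m, v = adam_step(w, g, m, v, t)
print(w)  # has moved from 1.0 toward 0
```

Note how the effective step size is roughly lr regardless of the gradient’s magnitude – that per-parameter scaling is what distinguishes Adam from plain gradient descent.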
C2. Code Examples
Examples of code are useful because you can adapt them for your own applications.
This is Aymeric Damien’s TensorFlow-Examples repository, a collection of models and techniques implemented in TensorFlow. The Neural Networks section will likely be of most interest to you.
The one downside is that it’s not always obvious what each argument corresponds to (since it’s just code rather than a full-blown tutorial). So I’ve written two posts based on his code for multilayer perceptrons and convolutional neural networks that explain the code in more detail.
Edit: Aymeric has recently converted his examples into Jupyter notebooks and added more explanations.
Aymeric is the author of tflearn, a TensorFlow wrapper like Keras.
Topics: Simple examples for MLPs, CNNs, RNNs, GANs, autoencoders.
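For a sense of what the simplest of these examples compute, here is a multilayer perceptron forward pass sketched in plain NumPy rather than TensorFlow (the layer sizes and random weights are illustrative, not taken from Aymeric’s code):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=1, keepdims=True)

# One hidden layer: 4 inputs -> 8 hidden units -> 3 output classes.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

X = rng.normal(size=(5, 4))                       # batch of 5 examples
probs = softmax(relu(X @ W1 + b1) @ W2 + b2)

print(probs.shape)        # (5, 3): class probabilities per example
print(probs.sum(axis=1))  # each row sums to 1
```

The TensorFlow versions do essentially this, plus automatic differentiation and training loops, which is why reading them alongside a plain-Python version helps.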
These are Jupyter notebooks with implementations of CNNs, RNNs and GANs (Generative Adversarial Networks).
The notebooks start with an introduction to what the network is before launching into a step-by-step walkthrough with code and discussion. You can clone Adit’s GitHub repository and run the code on your own computer.
Topics: CNNs, RNNs, GANs. There are also interesting examples like sentiment analysis with LSTMs.
Hope this has been helpful! I have also been building a deep learning map with paper summaries – the idea is to help people with limited experience understand what models are or what terms mean and to see how concepts connect with each other. It’s still very much a work in progress, but do check it out if you’re interested.
I will also likely post an even shorter list of resources for deep reinforcement learning soon – let me know in the comments if you’re interested.