Resources for getting started with ML and DL

January 3, 2018

The following is a collection of resources that I found useful when I started to learn machine learning and deep learning. In this post I’m using the term machine learning to refer to classical machine learning and the term deep learning to refer to machine learning with deep neural networks. There are numerous other good resources out there that are not mentioned here. This doesn’t mean I consider them inferior; I just haven’t used them and therefore can’t comment on them.

First steps

If you are completely new to machine learning, I recommend starting with the outstanding Stanford Machine Learning course by Andrew Ng. It is easy to follow and covers topics that every machine learning engineer should know. The course uses Octave (an open source alternative to MATLAB) for programming. Algorithms are implemented from scratch to give a better understanding of how they work. The course is also good preparation for the Deep Learning specialization at Coursera.
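
To give a flavour of what "from scratch" means, here is a minimal sketch of linear regression trained with batch gradient descent. It is written in plain Python rather than the Octave used in the course, and the function name, toy data, learning rate and iteration count are my own choices for illustration, not taken from the course material.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, n_iterations=2000):
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(n_iterations):
        # Residuals of the current model on every training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the cost J = (1 / (2m)) * sum(errors^2).
        grad_w = sum(e * x for e, x in zip(errors, xs)) / m
        grad_b = sum(errors) / m
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
print(w, b)
```

On this noise-free toy data the fitted slope and intercept should end up very close to 2 and 1. With a learning rate that is too large the updates diverge instead, which is one of the things an exercise like this lets you see first-hand.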

After taking the course I felt the need to learn Python and re-implement the exercises with scikit-learn. Scikit-learn is a Python machine learning library that provides optimized and easy-to-use implementations of all algorithms presented in the course. I published the results as the machine-learning-notebooks project on GitHub.

If you are new to Python, the Python tutorial is a great resource to start with. I also recommend working through at least the NumPy tutorial, SciPy tutorial, Pandas tutorial and Pyplot tutorial before starting with the scikit-learn tutorials. After working through these tutorials you should be prepared to implement the algorithms presented in the course with scikit-learn.

Further courses

  • Deep Learning specialization. This specialization consists of five courses, taught by Andrew Ng, covering deep neural network basics, regularization and optimization, and models for computer vision and sequences (text, speech, …). If you enjoyed the quality and accessibility of Andrew’s Machine Learning course you will probably like this specialization too. It provides you with the skills needed to follow more advanced literature in that field, including research papers.

    The initial programming exercises for the basics are in plain Python/NumPy to give a better understanding of how forward and backward propagation work. Models for computer vision are implemented with TensorFlow and Keras. Many examples cover recent research literature from 2014 or newer (ResNet, GoogLeNet, FaceNet, … and many more). The last course on sequence models wasn’t available yet at the time of writing this post.
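
To illustrate what forward and backward propagation look like when written by hand, here is a minimal sketch for a tiny network with one tanh hidden layer and a sigmoid output, evaluated on a single example. It is not taken from the course: the network shape, parameter values and function names are assumptions of mine, and I use plain Python lists instead of NumPy to keep the example dependency-free.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward pass: input -> tanh hidden layer -> sigmoid output."""
    W1, b1, W2, b2 = params
    # Hidden activations a1[j] = tanh(sum_i W1[j][i] * x[i] + b1[j]).
    a1 = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
          for row, b in zip(W1, b1)]
    # Output probability a2 = sigmoid(sum_j W2[j] * a1[j] + b2).
    a2 = sigmoid(sum(w * a for w, a in zip(W2, a1)) + b2)
    return a1, a2


def loss(params, x, y):
    """Binary cross-entropy loss for a single example."""
    _, a2 = forward(params, x)
    return -(y * math.log(a2) + (1 - y) * math.log(1 - a2))


def backward(params, x, y):
    """Backward pass: gradients of the loss obtained with the chain rule."""
    W1, b1, W2, b2 = params
    a1, a2 = forward(params, x)
    dz2 = a2 - y  # dL/dz2 for sigmoid output with cross-entropy loss
    dW2 = [dz2 * a for a in a1]
    db2 = dz2
    # Propagate through tanh, whose derivative is 1 - tanh(z)^2.
    dz1 = [dz2 * w * (1 - a * a) for w, a in zip(W2, a1)]
    dW1 = [[d * xi for xi in x] for d in dz1]
    db1 = dz1
    return dW1, db1, dW2, db2


# Hypothetical parameters (W1, b1, W2, b2) and one made-up training example.
params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, y = [1.0, 2.0], 1

dW1, db1, dW2, db2 = backward(params, x, y)

# Gradient check: compare dW2[0] with a central finite difference.
eps = 1e-6
W1, b1, W2, b2 = params
plus = (W1, b1, [W2[0] + eps, W2[1]], b2)
minus = (W1, b1, [W2[0] - eps, W2[1]], b2)
numeric = (loss(plus, x, y) - loss(minus, x, y)) / (2 * eps)
print(abs(numeric - dW2[0]) < 1e-6)
```

The finite-difference comparison at the end is a gradient check: if the hand-derived backward pass is right, the analytic and numeric gradients agree closely.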

A good understanding of statistical inference basics is important to get more out of the machine learning, deep learning and statistics literature listed further below. If you need a refresher on statistical inference basics then the following courses might be helpful:

  • Inferential statistics. This course covers the basics of inference for numerical and categorical data, hypothesis testing and statistical tests such as ANOVA and Chi-squared. It follows the frequentist approach to statistical inference and is part of the Statistics with R specialization. The course content (except R basics) is also covered by the freely available book OpenIntro Statistics.

  • Bayesian statistics. Many advanced machine learning and deep learning techniques are based on Bayesian inference. The course teaches the basics (Bayes’ rule, conjugate models, Bayesian inference on discrete and continuous data, …) and compares them to the frequentist approach. Other basics such as Markov Chain Monte Carlo (MCMC) and hierarchical models are not covered though. A good companion to this course is the book Doing Bayesian data analysis (see below). Before taking this course, familiarity with the frequentist approach is helpful.
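
As a small illustration of what a conjugate model is, here is a sketch of the Beta-Binomial update in plain Python. The example is my own and not taken from the course; the prior, the coin-flip data and the function names are made up for illustration.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: a Beta(alpha, beta) prior combined with binomial
    data gives a Beta(alpha + successes, beta + failures) posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior Beta(1, 1) on a coin's probability of heads,
# then 7 heads and 3 tails are observed (made-up data).
alpha_post, beta_post = update_beta(1, 1, successes=7, failures=3)
posterior_mean = beta_mean(alpha_post, beta_post)  # Bayesian point estimate
mle = 7 / (7 + 3)                                  # frequentist point estimate
print(alpha_post, beta_post, posterior_mean, mle)
```

The posterior is Beta(8, 4) with mean 2/3, slightly closer to 0.5 than the maximum likelihood estimate of 0.7, because the prior still carries some weight after only ten observations.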

Books

  • Machine learning - a probabilistic perspective. A comprehensive book on classical machine learning techniques. Its focus is rather theoretical and the descriptions are math-heavy. All concepts are explained in an excellent way and therefore rather easy to follow even for machine learning beginners, given basic familiarity with multivariate calculus, probability and linear algebra. The book covers both the frequentist and Bayesian approach to inferring parameters of statistical models. Code examples are in MATLAB but there is also a Python port available.

  • Deep learning. A comprehensive book on deep learning techniques. Part 1 covers machine learning basics. Part 2 covers deep neural network basics, convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The content is comparable to that of the Deep Learning specialization but is presented in a more academic way. Part 3 covers more advanced topics such as auto-encoders, representation learning and deep generative models. This is not a book for practitioners, but it is one of the best deep learning overview books I’ve seen.

  • Hands-on machine learning with scikit-learn and TensorFlow. If you’ve already taken a first machine learning and deep learning course, this book is for you. It is packed with useful code examples and guidelines for real-world machine learning projects. Part 1 focuses on the implementation of classical machine learning models with scikit-learn. Part 2 focuses on deep learning with TensorFlow. In addition to CNNs and RNNs, this part also has chapters on auto-encoders and reinforcement learning. Both theory and code examples are presented in a clear and concise way.

  • Deep learning with Python. Another excellent deep learning book for practitioners, with code examples using Keras. Keras is a deep learning framework with a higher-level API than TensorFlow that aims to enable rapid prototyping. In addition to detailed coverage of CNNs and RNNs, this book also has chapters on advanced deep learning best practices and generative deep learning. It is a good complement to part 2 of the previous book (from a tools perspective). If you are not sure which one is better to start with, I recommend this one, as first steps are easier with Keras than with TensorFlow in my opinion.

  • Introduction to statistical learning. If Machine learning - a probabilistic perspective is too math-heavy for you, this book is a good alternative. It covers statistical machine learning basics with a minimum of math and approaches them from a frequentist inference perspective. It is also an excellent introduction to R. If you want to go deeper after reading this book, both in terms of math and number of approaches, I recommend The elements of statistical learning. Both books are also freely available as PDF (ISL, ESL).

  • Doing Bayesian data analysis. An excellent introduction to Bayesian statistics that prepares you well for reading more advanced literature in that field. It covers the Bayesian analogues to traditional statistical tests (t, ANOVA, Chi-squared, …) and to multiple linear and logistic regression, among many others. It requires only a basic knowledge of calculus. For me, the book was a helpful companion to the books Machine learning - a probabilistic perspective and Deep learning. Code examples are written in R using the JAGS and Stan packages for MCMC sampling. There’s also a Python port available using PyMC3.

  • Data Science from Scratch. This book is about data science in its most distilled form. Don’t expect too much depth here; instead, it gives a great overview of data science topics such as probability and statistics, data preparation and machine learning basics. The book focuses on understanding fundamental data science tools by implementing them in plain Python from scratch. Well-known statistical and machine learning libraries are not used here, but each chapter contains references to libraries you should actually use for your own projects and links for further reading.

There is also a large number of useful blogs and survey papers about machine learning which I’ll leave for a separate post. I nevertheless hope you find this a useful guide for getting started with machine learning.


