The following is a collection of resources that I found useful when I started to learn machine learning and deep learning. In this post I’m using the term machine learning to refer to classical machine learning and the term deep learning to refer to machine learning with deep neural networks. There are numerous other good resources out there which are not mentioned here. This doesn’t mean that I don’t like them; I just haven’t used them and therefore can’t comment on them.
First steps
If you are completely new to machine learning, I recommend starting with the outstanding Stanford Machine Learning course by Andrew Ng. It is easy to follow and covers topics that every machine learning engineer really should know. The course uses Octave (an open-source alternative to MATLAB) for programming. Algorithms are implemented from scratch in order to get a better understanding of how they work. The course is also good preparation for Coursera’s Deep Learning specialization.
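To give a flavor of what "implemented from scratch" means, here is a minimal sketch of linear regression fitted by batch gradient descent. It is written in plain Python rather than Octave, and the function name, toy data and hyperparameters are my own illustrative assumptions, not material from the course.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, iterations=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Prediction errors under the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the cost J = (1 / (2n)) * sum(error^2).
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative values, not from the course).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

On this noise-free data the fitted parameters should approach the slope 2 and intercept 1 used to generate it.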
After having taken the course I felt the need to learn Python and reimplement the exercises with scikit-learn. Scikit-learn is a Python machine learning library that provides optimized and easy-to-use implementations for all algorithms presented in the course (and much more). I published the results as the machine-learning-notebooks project on GitHub.
If you are new to Python, the Python tutorial is a good resource to start with. I also recommend working at least through the NumPy tutorial, SciPy tutorial, Pandas tutorial and Pyplot tutorial before starting with the scikit-learn tutorials. After having worked through these tutorials you should be prepared for implementing the algorithms presented in the course with scikit-learn.
Further courses

Deep Learning specialization. This specialization consists of five courses, taught by Andrew Ng, covering deep neural network basics, regularization and optimization as well as models for computer vision and sequences (text, speech, …). If you enjoyed the quality and accessibility of Andrew’s Machine Learning course you will probably like this course too. It provides you with the skills needed to follow more advanced deep learning literature, including research papers.
The initial programming exercises are in plain Python/NumPy to get a better understanding of how forward and backward propagation work. Models for computer vision are implemented with TensorFlow and Keras. Many programming exercises are based on recent research literature from 2014 or newer (ResNet, GoogLeNet, FaceNet and many more). The last course on sequence models wasn’t available yet at the time of writing this post.
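As a rough illustration of forward and backward propagation, the following sketch runs both passes for a single sigmoid neuron with a cross-entropy loss and checks the analytic gradients against finite differences. It is my own minimal example, not an exercise from the specialization; it uses plain Python lists instead of NumPy to stay dependency-free, and all names and values are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x):
    """Forward pass: weighted sum followed by a sigmoid activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


def loss(w, b, x, y):
    """Binary cross-entropy loss for a single example."""
    a = forward(w, b, x)
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))


def backward(w, b, x, y):
    """Backward pass: gradients of the loss w.r.t. w and b via the chain rule."""
    a = forward(w, b, x)
    dz = a - y  # dL/dz for sigmoid combined with cross-entropy
    dw = [dz * xi for xi in x]
    db = dz
    return dw, db


def numerical_gradient(w, b, x, y, eps=1e-6):
    """Central finite differences, used only to check the analytic gradients."""
    dw = []
    for i in range(len(w)):
        w_plus = w[:i] + [w[i] + eps] + w[i + 1:]
        w_minus = w[:i] + [w[i] - eps] + w[i + 1:]
        dw.append((loss(w_plus, b, x, y) - loss(w_minus, b, x, y)) / (2 * eps))
    db = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
    return dw, db


# Illustrative parameters and a single training example.
w, b = [0.5, -0.3], 0.1
x, y = [1.0, 2.0], 1.0
dw, db = backward(w, b, x, y)
dw_num, db_num = numerical_gradient(w, b, x, y)
print("analytic: ", dw, db)
print("numerical:", dw_num, db_num)
```

The two printed gradients should agree closely; this kind of gradient check is a common way to validate a hand-written backward pass.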
A good understanding of statistical inference basics is important to get more out of the machine learning, deep learning and statistics books listed in the next section. If you need a refresher on statistical inference basics then the following courses might be helpful:

Inferential statistics. This course covers the basics of inference for numerical and categorical data, hypothesis testing and statistical tests such as ANOVA and chi-squared. It follows the frequentist approach to statistical inference and is part of the Statistics with R specialization. The course content (except R basics) is also covered by the freely available book OpenIntro Statistics.

Bayesian statistics. Many advanced machine learning and deep learning techniques are based on Bayesian inference. This course teaches the basics (Bayes’ theorem, conjugate models, Bayesian inference on discrete and continuous data, …) and compares them to the frequentist approach. Other topics like graphical models, approximate inference or sampling methods are not covered though. A good companion to this course is the book Doing Bayesian data analysis (see below). Before taking this course, you should be familiar with the frequentist approach.
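To make the idea of a conjugate model concrete, here is a small sketch of a Beta-Binomial update for a coin-flip probability, compared with the frequentist maximum-likelihood estimate. The prior, the observed counts and the helper names are hypothetical and chosen only for illustration; they are not taken from the course.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data under a Beta prior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior Beta(1, 1); hypothetical data: 7 heads and 3 tails.
prior = (1.0, 1.0)
posterior = beta_binomial_update(*prior, successes=7, failures=3)
posterior_mean = beta_mean(*posterior)  # (1 + 7) / (1 + 7 + 1 + 3)
mle = 7 / (7 + 3)  # frequentist maximum-likelihood estimate for comparison

print("posterior parameters:", posterior)
print("posterior mean:", round(posterior_mean, 4))
print("maximum-likelihood estimate:", mle)
```

Because the Beta prior is conjugate to the binomial likelihood, the posterior is again a Beta distribution and the update reduces to adding counts. The posterior mean is pulled slightly toward the prior mean of 0.5 relative to the maximum-likelihood estimate.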
Books

Machine learning - a probabilistic perspective. A comprehensive book on classical machine learning techniques. It’s a math-heavy book that also contains many useful examples as well as exercises at the end of each chapter. All concepts and methods are explained in an excellent and easy-to-understand way, even for machine learning beginners, given basic familiarity with multivariate calculus, probability and linear algebra. The book focuses on Bayesian methods for machine learning but also covers the frequentist approach to some extent. Code examples are in MATLAB but there is also a (limited) Python port available.

Deep learning. A comprehensive book on deep learning. Part 1 covers machine learning basics. Part 2 covers deep neural network basics, convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The content is comparable to that of the Deep Learning specialization but is presented in a more academic way. Part 3 covers more advanced topics such as autoencoders, representation learning, graphical models and deep generative models. Be prepared to use Machine learning - a probabilistic perspective as a companion ;) The Deep learning book is not for practitioners but a great theoretical overview of the field.

Hands-on machine learning with scikit-learn and TensorFlow. If you’ve already taken a first machine learning and deep learning course, this book is for you. It is packed with useful code examples and guidelines for real-world machine learning projects. Part 1 focuses on the implementation of classical machine learning models with scikit-learn. Part 2 focuses on deep learning with TensorFlow. In addition to CNNs and RNNs this part also has chapters on autoencoders and reinforcement learning. Both theory and code examples are presented in a clear and concise way.

Deep learning with Python. Another excellent deep learning book for practitioners with code examples in Keras. Keras is a deep learning framework with a higher-level API than TensorFlow (and other frameworks) and aims to enable rapid prototyping. In addition to a good coverage of CNNs and RNNs this book also has chapters on advanced deep learning best practices and generative deep learning. It is a good complement to part 2 of the previous book. If you are not sure which one is better to start with, I recommend this one, as first steps are easier with Keras than with TensorFlow and you can get very far with Keras alone.

Introduction to statistical learning. If Machine learning - a probabilistic perspective is too math-heavy for you, this book is a good alternative. It covers statistical learning basics with a minimum of math but focuses on the frequentist approach. It is also an excellent introduction to R. If you want to go deeper after having read this book, both in terms of mathematical details and number of approaches, I recommend The elements of statistical learning. Both books are also freely available as PDF (ISL, ESL).

Doing Bayesian data analysis. An excellent introduction to Bayesian statistics that prepares you well for reading more advanced literature in this field. It covers the Bayesian analogues to traditional statistical tests (t, ANOVA, chi-squared, …) and to multiple linear and logistic regression among many others. It requires only a basic knowledge of calculus. For me, the book was a helpful companion to Machine learning - a probabilistic perspective and Deep learning. Code examples are written in R using the packages JAGS and Stan for MCMC sampling. There’s also a Python port available using PyMC3.

Data Science from Scratch. This book is about data science in its most distilled form. Don’t expect too much depth here, but do expect a great overview of data science topics such as probability and statistics, data preparation and machine learning basics. The book focuses on understanding fundamental data science tools by implementing them in plain Python from scratch. Common statistical and machine learning libraries are not used here but each chapter contains references to libraries you should actually use for your own projects. Each chapter also contains useful links for further reading.
There is also a large number of useful blogs and survey papers about machine learning which I’ll leave for a separate post. I nevertheless hope you find this a useful guide for getting started with machine learning.