- Cost models in database query optimisation bibliography
- Detailed solutions to the first 30 Project Euler problems
Internship hand-ins
Hall of fame
The following is a hall of fame of papers, books, and blog posts that have a very high signal-to-noise ratio and that I thoroughly recommend.
- The Elements of Statistical Learning - Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman
- Machine Learning - Tom Mitchell – I think this wonderful textbook is underappreciated.
- Artificial Intelligence: A Modern Approach - Russell & Norvig
- mlcourse.ai – of all the introductions to machine learning, I think this is the one that strikes the best balance between theory and code.
- Machine learning cheat sheets - Shervine Amidi
- Kalman and Bayesian Filters in Python - Roger Labbe – Kalman filters are notoriously hard to grok; this tutorial nicely builds up the steps to understanding them.
- CS231n Convolutional Neural Networks for Visual Recognition - Stanford
- Algorithmes d’optimisation non-linéaire sans contrainte (French) - Michel Bergmann
- Graphical Models in a Nutshell - Koller et al.
- Rules of Machine Learning: Best Practices for ML Engineering - Martin Zinkevich – you should read this once a year.
- A Few Useful Things to Know about Machine Learning - Pedro Domingos – this short paper summarizes basic truths in machine learning.
- Choose Boring Technology - Dan McKinley
- How to Write a Spelling Corrector - Peter Norvig – magic in 36 lines of code.
- MCMC sampling for dummies - Thomas Wiecki
- Your Easy Guide to Latent Dirichlet Allocation
- An Intuitive Explanation of Convolutional Neural Networks - Ujjwal Karn
- An overview of gradient descent optimization algorithms - Sebastian Ruder
- How to explain gradient boosting - Terence Parr and Jeremy Howard – a very good introduction to vanilla gradient boosting with step-by-step examples.
- Why Does XGBoost Win “Every” Machine Learning Competition? - Didrik Nielsen – this Master’s thesis goes into some of the details of XGBoost without being too bloated.
- Good sleep, good learning, good life - Piotr Wozniak – extremely long and nothing to do with data science, but a very good essay on how to sleep properly.
- Make for data scientists - Paul Butler – I believe Makefiles are yet to be rediscovered for managing data science pipelines.