## Papers

- An Approach Based on Bayesian Networks for Query Selectivity Estimation - DASFAA, 2019
- Master's second-year internship at HelloFresh (report, slides)
- Master's first-year internship at Privateaser (report, slides)
- Undergraduate internship at INSA Toulouse (report, slides)
- Detailed solutions to the first 30 Project Euler problems

## Talks

- Online machine learning with decision trees - Toulouse AOC workgroup, 2020
- Our solution to the IDAO 2020 qualifiers - virtual seminar, 2020
- Global explanation of machine learning with sensitivity analysis - MASCOT-NUM, Paris, 2020
- The benefits of online learning - Quantmetry, Paris, 2019
- The benefits of online learning - Element AI, London, 2019
- The benefits of online learning - Airbus BizLab, Toulouse, 2019
- An approach based on Bayesian networks for query selectivity estimation - DASFAA, 2019
- Machine learning incrémental: des concepts à la pratique (Incremental machine learning: from concepts to practice) - Toulouse Data Science, 2019
- Online machine learning with creme - PyData, Amsterdam, 2019
- Docker for data science - HelloFresh, Berlin, 2017
- Challenge Big Data - Toulouse, 2017
- Forecasting bicycle-sharing usage - Toulouse Data Science, 2016

## Datasets

## Blogroll

This is a list of blogs I regularly scroll through.

- Tim Salimans on Data Analysis
- Randal Olson
- Sam & Max – French and NSFW!
- Sebastian Raschka
- Clean Coder
- Pythonic Perambulations
- Erik Bernhardsson
- otoro
- Terra Incognita
- Real Python
- Airbnb Engineering
- No Free Hunch
- The Unofficial Google Data Science Blog
- will wolf
- Edwin Chen
- Use the index, Luke!
- Jack Preston
- Agustinus Kristiadi
- DataGenetics
- Katherine Bailey
- Netflix Research
- inFERENce
- Hyndsight – Rob Hyndman is a time series specialist.
- While My MCMC Gently Samples
- Ines Montani – By one of the founders of spaCy.
- Stephen Smerity
- Peter Norvig
- IT Best Kept Secret Is Optimization – By Jean-Francois Puget, aka CPMP.
- explained.ai
- Better Explained
- Genetic Argonaut
- pandas blog
- Towards Data Science
- Linear Digressions – data science podcasts.
- Not So Standard Deviations – more podcasts.
- Probably Overthinking It
- Simply Statistics
- Practically Predictable
- koaning – By Vincent Warmerdam, who did this great presentation.
- blogarithms
- Possibly Wrong
- FastML
- Parameter-free Learning and Optimization Algorithms
- Todd W. Schneider – This guy is really good at exploratory data analysis.
- Yann Thaddée – Not directly related to data science but interesting nonetheless.
- Colins Blog
- Fabien Sanglard – Nothing to do with data science, but such good taste!
- The Glowing Python – By the creator of MiniSom, which is worth checking out too.

## Hall of fame

The following is a hall of fame of papers, books, and blog posts that have a very high signal-to-noise ratio – at least in my book. I highly recommend reading some of them when you have time.

- The Elements of Statistical Learning - Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie
- Machine Learning - Tom Mitchell – I think this wonderful textbook is under-appreciated.
- Artificial Intelligence: A Modern Approach - Russell & Norvig
- mlcourse.ai – Of all the introductions to machine learning I think this is the one that strikes the best balance between theory and practice.
- Machine learning cheat sheets - Shervine Amidi
- Kalman and Bayesian Filters in Python - Roger Labbe – Kalman filters are notoriously hard to grok; this tutorial nicely builds up the steps to understanding them.
- CS231n Convolutional Neural Networks for Visual Recognition - Stanford
- Algorithmes d’optimisation non-linéaire sans contrainte (French) - Michel Bergmann
- Graphical Models in a Nutshell - Koller et al.
- Rules of Machine Learning: Best Practices for ML Engineering - Martin Zinkevich – You should read this once a year.
- A Few Useful Things to Know about Machine Learning - Pedro Domingos – This short paper summarizes basic truths in machine learning.
- Choose Boring Technology - Dan McKinley
- How to Write a Spelling Corrector - Peter Norvig – Magic in 36 lines of code.
- MCMC sampling for dummies - Thomas Wiecki
- Your Easy Guide to Latent Dirichlet Allocation
- An Intuitive Explanation of Convolutional Neural Networks - Ujjwal Karn
- An overview of gradient descent optimization algorithms - Sebastian Ruder
- How to explain gradient boosting - Terence Parr and Jeremy Howard – A very good introduction to vanilla gradient boosting with step by step examples.
- Why Does XGBoost Win “Every” Machine Learning Competition? - Didrik Nielsen – This Master’s thesis goes into some of the details of XGBoost without being too bloated.
- Good sleep, good learning, good life - Piotr Wozniak – Extremely long and nothing to do with data science, but a very thorough essay nonetheless on how to properly sleep.
- Make for data scientists - Paul Butler – I believe Makefiles are yet to be rediscovered for managing data science pipelines.
- Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations – Just read it.
- The Cramér-Rao Lower Bound on Variance: Adam and Eve’s “Uncertainty Principle” - Michael Powers
- Kaggle contest on Observing Dark World - Cam Davidson-Pilon – If you’re not convinced about the power of Bayesian machine learning then read this and get your mind blown.
- A Concrete Introduction to Probability (using Python) - Peter Norvig – Extremely elegant Python coding.
- The Hungarian Maximum Likelihood Trick - Louis Abraham
- Machine Learning for Signal Processing - University of Illinois
- Don’t Call Yourself A Programmer, And Other Career Advice
- Tidy Data - Hadley Wickham – If you like playing with data then you need to be aware of this one.
- Gaussian Process, not quite for dummies - Yuge Shi – Gaussian processes are quite difficult to understand (at least, for me) but Yuge gives some great visual intuitions.
- Continuous Delivery for Machine Learning - Martin Fowler
- Memos - Sriram Krishnan
- Frequentism and Bayesianism: A Python-driven Primer - Jake VanderPlas
- Multiworld Testing Decision Service: A System for Experimentation, Learning, And Decision-Making
- Machine Learning: The High-Interest Credit Card of Technical Debt - Google
- Variational Inference: A Review for Statisticians - David Blei and his flock
- The Performance of Decision Tree Evaluation Strategies - Andrew Tulloch
- Hidden Technical Debt in Machine Learning Systems - Google
- Distill: Why do we need Flask, Celery, and Redis? (with McDonalds in Between) - Lj Miranda – A good example of the difference between abstract ideas and implementation details.
- Darts, Dice, and Coins: Sampling from a Discrete Distribution - Keith Schwarz
- Simplifying Graph Convolutional Networks - Felix Wu et al. – A nice example of putting the horse before the cart.
- MIT 6.867 machine learning course notes - Tommi Jaakkola – For people who enjoy concise mathematical notation.

## Eye candy

- Tyler Hobbs – The god of generative art.
- Some Jean Giraud stuff
- Mauro Martins
- A new way to knit by Petros Vrellis
- A fascinating article about Manolo Gamboa Naon
- Some Ukiyo-e
- Turtletoy
- Dwitter
- generated.space
- Pixel art by Marcus Blättermann
- Nick Barnes’ football bible
- Simon Stålenhag
- Syd Mead (who worked on Blade Runner)
- Michael Fogleman’s blog
- World of Warcraft art by Dreamwalker