Max Halford

Converting Amazon Textract tables to pandas DataFrames 2021-01-14
What my PhD was about 2021-01-06
Computing cross-correlations in SQL 2020-11-17
Classifying documents without any training data 2020-10-03
Focal loss implementation for LightGBM 2020-09-20
A few intermediate pandas tricks 2020-08-17
Online vs. stochastic learning 2020-06-22
The correct way to evaluate online machine learning models 2020-06-07
Server-sent events in Flask without extra dependencies 2020-05-04
I got plagiarized and Google didn't help 2020-04-17
Speeding up scikit-learn for single predictions 2020-03-31
Bayesian linear regression for practitioners 2020-02-26
Under-sampling a dataset with desired ratios 2019-12-17
Finding fuzzy duplicates with pandas 2019-09-16
A smooth approach to putting machine learning into production 2019-07-13
Skyline queries in Python 2019-05-21
SQL subquery enumeration 2019-05-06
Morellet crosses with JavaScript 2019-02-03
Streaming groupbys in pandas for big datasets 2018-12-05
Target encoding done the right way 2018-10-13
Stella triangles with JavaScript 2018-04-26
Unknown pleasures with JavaScript 2017-07-24
Subsampling a training set to match a test set - Part 1 2017-06-19
Halftoning with Go - Part 2 2017-03-20
Grid paintings à la Mondrian with JavaScript 2017-03-04
A short introduction and conclusion to the OpenBikes 2016 Challenge 2017-01-26
Halftoning with Go - Part 1 2016-11-27
Recursive polygons with JavaScript 2016-03-25
The Naïve Bayes classifier 2015-09-10
An introduction to genetic algorithms 2015-08-02
Setting up a droplet to host a Flask app 2015-07-14
Visualizing bike stations live data 2015-06-03