Max Halford 🦆

2024-04-04 A training set for bike sharing forecasting
2024-02-27 Fast Poetry and pre-commit with GitHub Actions
2023-12-14 Decomposing funnel metrics
2023-12-01 Efficient ELT refreshes
2023-10-26 Online machine learning on the road @ IDE+A, TH Köln
2023-10-16 Sh*t flows downhill, but not at Carbonfact
2023-08-09 Answering "Why did the KPI change?" using decomposition
2023-06-25 Measuring the carbon footprint of pizzas
2023-06-03 Graph components with DuckDB
2023-05-11 For analytics, don't use dynamic JSON keys
2023-04-28 Metric correctness doesn't matter, consistency does
2023-03-07 Online gradient descent written in SQL
2023-02-15 Using SymPy in Python doctests
2023-01-22 Online active learning in 80 lines of Python
2023-01-17 Are Airbnb guests less energy efficient than their host?
2022-12-13 The future of River
2022-11-20 Parsing garment descriptions with GPT-3
2022-09-25 Dynamic on-screen TV keyboards
2022-09-06 NLP at Carbonfact: how would you do it?
2022-08-24 Matrix inverse mini-batch updates
2022-06-28 A rant against dbt ref
2022-06-09 First IRL meetup with the River developers
2022-04-07 Online machine learning with River @ GAIA
2022-04-04 Fuzzy regex matching in Python
2022-03-06 OCR spelling correction is hard
2022-03-05 Comic book panel segmentation
2022-02-09 Online machine learning in practice @ PyData PDX
2022-01-06 The online machine learning predict/fit switcheroo
2021-12-24 Weighted sampling without replacement in pure Python
2021-12-17 Online machine learning in practice @ Applied AI
2021-12-10 Online machine learning in practice @ LVMH
2021-11-11 Web scraping, upside down
2021-10-26 One year at Alan
2021-10-07 Manipulating ephemeral data with git
2021-09-10 Dashboards and GROUPING SETS
2021-08-19 Homoglyphs: different characters that look identical
2021-06-10 Automated document processing at Alan
2021-06-08 Text classification by data compression
2021-04-11 Reducing the memory footprint of a scikit-learn text classifier
2021-04-07 An overview of dataset time travel
2021-02-26 The challenges of online machine learning in production @ Itaú Unibanco
2021-01-22 Quelle est l’empreinte écologique du Big Data? @ Toulouse Tech
2021-01-21 Organising a Kaggle InClass competition with a fairness metric
2021-01-14 Converting Amazon Textract tables to pandas DataFrames
2021-01-06 What my PhD was about
2020-11-17 Computing cross-correlations in SQL
2020-10-03 Unsupervised text classification with word embeddings
2020-09-20 Focal loss implementation for LightGBM
2020-08-17 A few intermediate pandas tricks
2020-06-10 A brief introduction to online machine learning @ Hong Kong Machine Learning Meetup
2020-06-07 The correct way to evaluate online machine learning models
2020-05-07 Online machine learning with decision trees @ Toulouse AOC workgroup
2020-05-04 Server-sent events in Flask without extra dependencies
2020-04-17 I got plagiarized and Google didn't help
2020-04-12 Our solution to the IDAO 2020 qualifiers
2020-03-31 Speeding up scikit-learn for single predictions
2020-03-26 Machine learning for streaming data with creme
2020-03-10 Global explanation of machine learning with sensitivity analysis @ MASCOT-NUM
2020-02-26 Bayesian linear regression for practitioners
2019-12-17 Under-sampling a dataset with desired ratios
2019-10-29 The benefits of online machine learning @ Quantmetry
2019-10-23 The benefits of online machine learning @ Element AI
2019-09-16 Finding fuzzy duplicates with pandas
2019-07-13 A smooth approach to putting machine learning into production
2019-06-28 The benefits of online machine learning @ Airbus Bizlab
2019-05-28 Machine learning incrémental: des concepts à la pratique @ Toulouse Data Science Meetup
2019-05-21 Skyline queries in Python
2019-05-11 Online machine learning with creme @ PyData Amsterdam
2019-05-06 SQL subquery enumeration
2019-04-22 An approach based on Bayesian networks for query selectivity estimation @ DASFAA
2019-02-03 Morellet crosses with JavaScript
2018-12-05 Streaming groupbys in pandas for big datasets
2018-10-13 Target encoding done the right way
2018-04-26 Stella triangles with JavaScript
2017-07-24 Unknown pleasures with JavaScript
2017-06-19 Subsampling a training set to match a test set - Part 1
2017-06-01 Docker for data science @ HelloFresh Berlin
2017-03-20 Halftoning with Go - Part 2
2017-03-04 Grid paintings à la Mondrian with JavaScript
2017-01-26 A short introduction and conclusion to the OpenBikes 2016 Challenge
2017-01-09 Challenge Big Data @ TSE
2016-11-27 Halftoning with Go - Part 1
2016-03-30 Predire la disponibilité des Velib' @ Toulouse Data Science Meetup
2016-03-25 Recursive polygons with JavaScript
2015-09-10 The Naïve Bayes classifier
2015-08-02 An introduction to genetic algorithms
2015-07-14 Setting up a droplet to host a Flask app
2015-06-03 Visualizing bike stations live data