Solving Détrak with brute force
optimization
llm
Nostalgia for a time I didn’t experience
showerthought
Row level lineage at Carbonfact
python
data-eng
No pain no startup
showerthought
Scraping Google Calendar events
python
scraping
Warmshowers sparks joy
bike-touring
showerthought
Do LLMs identify fonts?
llm
scraping
Thoughts on DuckLake
data-eng
The total derivative of a metric tree
data-science
Minimizing the runtime of a SQL DAG
data-engineering
python
Hard data integration problems at Carbonfact
data-science
Introducing icanexplain @ PyData Paris 2024
analytics-engineering
python
LCA software: exit the matrix
sustainability
python
Cutting up shoes to measure their footprint
sustainability
data-science
A training set for bike sharing forecasting
data-eng
machine-learning
Decomposing funnel metrics
data-science
Efficient ELT refreshes
data-eng
Online machine learning on the road @ IDE+A, TH Köln
online-machine-learning
Measuring the carbon footprint of pizzas
sustainability
python
Graph components with DuckDB
data-science
sql
For analytics, don't use dynamic JSON keys
data-eng
sql
Metric correctness doesn't matter, consistency does
data-science
Online gradient descent written in SQL
online-machine-learning
sql
Online active learning in 80 lines of Python
online-machine-learning
Are Airbnb guests less energy efficient than their host?
sustainability
data-science
The future of River
online-machine-learning
Parsing garment descriptions with GPT-3
text-processing
NLP at Carbonfact: how would you do it?
text-processing
Matrix inverse mini-batch updates
online-machine-learning
A rant against dbt ref
data-eng
sql
rant
First IRL meetup with the River developers
online-machine-learning
Online machine learning with River @ GAIA
online-machine-learning
Fuzzy regex matching in Python
text-processing
OCR spelling correction is hard
text-processing
Comic book panel segmentation
image-processing
Online machine learning in practice @ PyData PDX
online-machine-learning
The online machine learning predict/fit switcheroo
online-machine-learning
Online machine learning in practice @ Applied AI
online-machine-learning
Online machine learning in practice @ LVMH
online-machine-learning
Web scraping, upside down
scraping
One year at Alan
job-log
Manipulating ephemeral data with git
scraping
Dashboards and GROUPING SETS
data-eng
sql
Homoglyphs: different characters that look identical
text-processing
Automated document processing at Alan
text-processing
Text classification by data compression
machine-learning
text-processing
Reducing the memory footprint of a scikit-learn text classifier
machine-learning
text-processing
An overview of dataset time travel
data-eng
The challenges of online machine learning in production @ Itaú Unibanco
online-machine-learning
Converting Amazon Textract tables to pandas DataFrames
text-processing
What my PhD was about
job-log
Unsupervised text classification with word embeddings
machine-learning
text-processing
Focal loss implementation for LightGBM
machine-learning
A few intermediate pandas tricks
data-eng
A brief introduction to online machine learning @ Hong Kong Machine Learning Meetup
online-machine-learning
The correct way to evaluate online machine learning models
online-machine-learning
Online machine learning with decision trees @ Toulouse AOC workgroup
online-machine-learning
Our solution to the IDAO 2020 qualifiers
competitive-machine-learning
Speeding up scikit-learn for single predictions
machine-learning
Machine learning for streaming data with creme
online-machine-learning
Global explanation of machine learning with sensitivity analysis @ MASCOT-NUM
machine-learning
explainability
Bayesian linear regression for practitioners
machine-learning
Under-sampling a dataset with desired ratios
machine-learning
The benefits of online machine learning @ Quantmetry
online-machine-learning
The benefits of online machine learning @ Element AI
online-machine-learning
Finding fuzzy duplicates with pandas
data-eng
A smooth approach to putting machine learning into production
machine-learning
data-eng
The benefits of online machine learning @ Airbus Bizlab
online-machine-learning
Incremental machine learning: from concepts to practice @ Toulouse Data Science Meetup
online-machine-learning
Skyline queries in Python
data-eng
Online machine learning with creme @ PyData Amsterdam
online-machine-learning
An approach based on Bayesian networks for query selectivity estimation @ DASFAA
selectivity-estimation
phd
Morellet crosses with JavaScript
generative-art
Streaming groupbys in pandas for big datasets
online-machine-learning
Target encoding done the right way
machine-learning
Stella triangles with JavaScript
generative-art
Unknown pleasures with JavaScript
generative-art
Subsampling a training set to match a test set - Part 1
machine-learning
Docker for data science @ HelloFresh Berlin
data-science
Halftoning with Go - Part 2
image-processing
Grid paintings à la Mondrian with JavaScript
generative-art
Challenge Big Data @ TSE
competitive-machine-learning
Halftoning with Go - Part 1
image-processing
Predicting Velib' availability @ Toulouse Data Science Meetup
data-science
machine-learning
data-viz
Recursive polygons with JavaScript
generative-art
The Naïve Bayes classifier
machine-learning
An introduction to genetic algorithms
machine-learning
Visualizing bike stations live data
data-viz