Visualizing bike stations live data
Recently some friends and I decided to launch openbikes.co, a website for visualizing (and later on analyzing) urban bike traffic. We have a lot of ideas that we will progressively implement. Anyway, the point is that all of it started with me fiddling about with the JCDecaux API and the leaflet.js library and I would like to share it with you. Shall we?
In this post I want to show you the tools and the code to get a fully functional website for visualizing live data. In this particular case we will display bike stations in Toulouse, however I will keep the scripts as general as possible so they are easily modifiable for different data. Before starting here is a glimpse of the end result:
Pretty neat, right? The marker color represents how many bikes are available and the circle in the center of each marker shows how many bike stands are available. In both cases, the greener the marker the higher the amount (a red indicator means not a lot). Thus yellow means that the bike station is “balanced”. The cool thing is that you can set this up to update every minute or so.
I’ll be doing this in Python (3, but it shouldn’t matter). Only two modules are needed: pandas (manipulating data frames) and flask (making and hosting a website). Both should be very easy to install on any platform as they are extremely popular modules.
Download the leaflet vector marker library (click on
Data will be collected from the JCDecaux API. If you don’t know what an API is, don’t worry. However, you will have to obtain a registration key because I cannot give you mine. Most APIs require users to have a registration key so that bots can’t overload their servers and slow them down or crash them.
A primer on APIs
Very briefly, an API returns data to a user based on a query that was sent to it. The queries take the shape of URLs, and in each URL you can specify parameters so as to retrieve different data. Most of the time the API will return the data as JSON (I haven’t encountered any other formats yet but they do exist), so knowing how to parse it is important. What’s cool with Python is that a parsed JSON object behaves exactly like a dictionary.
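To make that last point concrete, here is a minimal sketch of parsing JSON text with Python’s built-in json module. The payload is made up for illustration and is not real API output.

```python
import json

# Made-up JSON text standing in for an API response (not real data).
raw = '{"name": "Example station", "available_bikes": 7, "position": {"lat": 43.6, "lng": 1.44}}'

# json.loads turns a JSON object into a dict (and a JSON array into a list).
station = json.loads(raw)

print(type(station).__name__)      # dict
print(station["available_bikes"])  # 7
print(station["position"]["lat"])  # 43.6
```

Nested JSON objects become nested dictionaries, which is why the position has to be accessed in two steps.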
Just like a website, an API is hosted on a server, which means that technically it could “break” if too many people query it too quickly. For the API there isn’t much difference between sending all of its data or just a piece of it. What matters is how many times you interrogate it, not how much you ask for each time. In our case we may as well query the JCDecaux API once for all of the stations in a city rather than once per station (it would take longer and they wouldn’t be happy).
As a matter of general politeness, it’s always a good idea to send an email to the people who made the API to ask them how robust it is and how often queries can be sent to it. Also you should thank them for their awesome work :). Sometimes they provide all of this information on their website. They should also indicate what each variable in the returned data corresponds to, because it isn’t always obvious.
First of all let’s define an elegant structure for the project:
bikes/
    static/
        js/
            Leaflet.vector-markers.min.js
            Leaflet.vector-markers.js
        data/
            Toulouse.csv
        css/
            Leaflet.vector-markers.css
            Leaflet.vector-markers.css.map
    lib/
        __init__.py
        JCDecaux.py
    templates/
        index.html
    serve.py
    update.py
The website will be hosted with Flask. With this framework, all the files related to the website live either in the static folder or in the templates folder. The static folder holds the JavaScript, CSS and data files, while the templates folder groups the pages of the website. In this case there is only one page. However, if we were to add more cities we could just add an HTML file for each city.
The rest of the files are Python scripts for manipulating the data, I’ll get to them now.
Collecting the data
JCDecaux provides a very nice API for obtaining data about bike stations in real time. My long-term idea was to create a package of scripts for collecting data from different APIs and making it all uniform. In this case the package is called lib. For it to be a real Python package you have to add an empty file called __init__.py. In this folder you can create a script called JCDecaux.py and then open it.
First of all let’s import the modules we need to deal with the API (all of them apart from pandas should be available by default with your Python distribution).
Next let’s define some variables.
Add the key (between quotes) that was given to you after you registered with JCDecaux. The base variable is the first part of the API’s URL; it doesn’t change from one query to the next, so it’s better to save it to a variable.
Now let’s define a function for interrogating the API.
If you pass a URL into this function, it will return all the data as a dictionary.
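A minimal sketch of such a function might look like the following. The name query_api and the opener parameter are my own choices for illustration; the opener argument only exists so that the function can be tried without touching the network.

```python
import io
import json
from urllib.request import urlopen


def query_api(url, opener=urlopen):
    """Send a GET request to `url` and return the decoded JSON payload.

    The result is a dict or a list, depending on what the endpoint sends.
    """
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


def fake_opener(url):
    """Stand-in for urlopen that returns a canned, made-up JSON body."""
    return io.BytesIO(b'[{"name": "Example station", "available_bikes": 3}]')


# Offline demo: the URL is a placeholder and is never actually contacted.
data = query_api("https://example.invalid/stations", opener=fake_opener)
print(data[0]["available_bikes"])  # 3
```

With the default opener the same function performs a real HTTP request, so the calling code stays identical.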
Next we need a function that interrogates the API for all the stations in a given city.
First we specify the URL to send to the API. In the URL we specify the city name (JCDecaux calls this a contract) and the API key that we stored previously. Then we can use the function we made before to send the URL to the API and get the data we want.
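A possible sketch of the URL construction is shown below. It assumes the endpoint layout from JCDecaux’s public developer documentation (a stations route taking contract and apiKey parameters). BASE_URL, API_KEY and stations_url are illustrative names, and the layout is worth checking against the current documentation before relying on it.

```python
from urllib.parse import urlencode

# Assumed values: the base URL follows JCDecaux's public documentation,
# and the key is a placeholder to replace with your own.
BASE_URL = "https://api.jcdecaux.com/vls/v1"
API_KEY = "your-api-key"


def stations_url(city, key=API_KEY, base=BASE_URL):
    """Build the URL that asks for every station of one contract (city)."""
    query = urlencode({"contract": city, "apiKey": key})
    return "{}/stations?{}".format(base, query)


print(stations_url("Toulouse"))
# https://api.jcdecaux.com/vls/v1/stations?contract=Toulouse&apiKey=your-api-key
```

Using urlencode rather than plain string concatenation keeps the URL valid even if a contract name contains spaces or accents.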
Here is an example of what it actually returns:
This is the data for one station; the query we send actually returns a list of these.
You might notice that JCDecaux’s API returns timestamps to indicate when each station was last updated. These can easily be converted to a
Now let’s put everything together!
I think the function and the comments speak for themselves, but I will go through it in plain English. The function takes a city as a parameter and returns a dataframe. We start by collecting the API’s response as JSON with the script we wrote before. As you can see in the example, the position of each station is nested, which means that converting the dictionary to a dataframe doesn’t work straight away: both coordinates end up in a single column. However, as the code shows, this isn’t difficult to work around. The dataframes provided by pandas are really useful: converting all the timestamps to a human-readable format is a one-liner with .apply. Finally we only return the columns that we will use, since, as you can see in the example above, the API returns more information than we need.
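The steps described above could be sketched as follows. The two stations are made-up values, and the field names (position, last_update in milliseconds, and so on) are my assumption about the shape of the response. The sketch uses pd.json_normalize to flatten the nested position and pd.to_datetime for the timestamps, which may differ from the approach taken in the original listing.

```python
import pandas as pd

# Two made-up stations shaped like the response described above (assumed fields).
stations = [
    {"name": "Station A", "position": {"lat": 43.60, "lng": 1.44},
     "bike_stands": 20, "available_bike_stands": 12, "available_bikes": 8,
     "last_update": 1420070400000},
    {"name": "Station B", "position": {"lat": 43.61, "lng": 1.45},
     "bike_stands": 15, "available_bike_stands": 1, "available_bikes": 14,
     "last_update": 1420070460000},
]

# Flatten the nested position into "position.lat" / "position.lng" columns.
df = pd.json_normalize(stations)
df = df.rename(columns={"position.lat": "lat", "position.lng": "lon"})

# Convert millisecond timestamps into readable datetimes (UTC, timezone-naive).
df["last_update"] = pd.to_datetime(df["last_update"], unit="ms")

# Keep only the columns the map needs.
df = df[["name", "lat", "lon", "available_bikes", "available_bike_stands", "last_update"]]
print(df)
```

Naming the coordinate columns lat and lon is a deliberate choice here, since the map side looks for columns resembling latitude and longitude.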
We’re done with the API interrogation! Now we can create an update.py script in the parent folder and add the following code.
In this script we use the code we wrote previously as a module, which keeps things tidy. If we had other scripts for interrogating different APIs we could just import them in this “main” script and voilà. The script itself is not too complicated: we simply loop forever and interrogate the API every 60 seconds (change it as you please). The dataframe we build is saved as a CSV file in the static folder, which Flask will be able to read. The architecture of the project is flexible: instead of handling just one city, we could add an inner for loop to the while loop and iterate over a list of cities without having to change the
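Here is a minimal sketch of such a polling loop, with the inner loop over cities included. To stay self-contained it writes rows with the standard csv module instead of a pandas dataframe, and fetch_city is a hypothetical stand-in for the function from lib/JCDecaux.py. The max_rounds and sleep parameters only exist so the loop can be stopped and tried without waiting.

```python
import csv
import os
import tempfile
import time


def fetch_city(city):
    """Hypothetical stand-in for the real API call; returns made-up rows."""
    return [{"name": city + " station", "available_bikes": 5}]


def update(cities, folder, fetch=fetch_city, delay=60, max_rounds=None, sleep=time.sleep):
    """Refresh one CSV per city, pausing `delay` seconds between rounds."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for city in cities:
            rows = fetch(city)
            path = os.path.join(folder, city + ".csv")
            with open(path, "w", newline="") as handle:
                writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
                writer.writeheader()
                writer.writerows(rows)
        rounds += 1
        sleep(delay)


# Demo: one round, no real waiting, written to a temporary folder.
out = tempfile.mkdtemp()
update(["Toulouse", "Lyon"], out, max_rounds=1, sleep=lambda seconds: None)
print(sorted(os.listdir(out)))  # ['Lyon.csv', 'Toulouse.csv']
```

Leaving max_rounds at None gives the "loop forever" behaviour described above, with one file per city rewritten on every round.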
Making a map
Let’s add a file called index.html to the templates folder. It will contain the map of Toulouse, but it will be easy to adapt for other cities.
First of all let’s import all the libraries we will need.
Leaflet is the “main” library we will use for displaying maps. It’s lightweight and has a lot of plugins. Most of the plugins and Leaflet itself can be imported from online sources; in other words, you don’t need to keep them as local files. The downside is that the scripts are slightly slower to load the first time you open the HTML file in your browser. However, the page will load faster the next times you open it because your browser caches the scripts. In our case only the vector markers library isn’t available online.
For the sake of showing leaflet’s possibilities let’s add two plugins.
These specific plugins add two buttons to the top-left of the map, one puts the map to fullscreen and the other pinpoints the user’s location on the map with a blue circle.
We create the map by creating a <div> object that will cover the whole page.
Next let’s edit the map.
We start by defining the map and giving it the emerald style available on Mapbox. On Mapbox you can also create your own maps, but that’s another matter altogether. Then we center it on Toulouse (use Google to find the coordinates of other cities). Finally we activate the two plugins we imported.
Now the last piece of the puzzle is to convert the CSV files we generate with Python into markers on the map. For this we use the omnivore library. omnivore is very smart: you just have to point it to a CSV file containing columns resembling “latitude” or “lat” (the same goes for the longitudes) and it will do all the work in the background.
The info variable stores the data from all the columns for each row. We define an icon variable where we use the getColor() function defined earlier to color the marker and the circle in the middle of the marker based on the relative number of bikes or bike stands available. Then we tell omnivore that the marker inherits its graphics from the icon we just created. Finally we create a popupText variable for when the user clicks on a marker and bind it to the marker.
Don’t forget to close the HTML tags :).
Running the website
We’re nearly done; we just have to run the website with Flask. The reason we can’t simply open index.html is that the script has to fetch data from a folder, and a browser generally won’t allow that for a page opened straight from disk. Another way of doing this would be to put everything in the htdocs folder of Apache and navigate to it in the browser. However, I find Apache tedious to configure, and it makes sharing code with other people more cumbersome.
Open the serve.py file and add the following code.
The point of this blog post isn’t to explain Flask; if you want to learn more about it, visit their website, which is very well made. If you run this script and type localhost:5000 in your browser, you should see the map!
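For reference, a minimal Flask application that serves a single template could look like the sketch below. It assumes index.html sits in the templates folder next to the script, as in the project structure; the original serve.py may differ in its details.

```python
from flask import Flask, render_template

# Flask looks for templates/ and static/ next to this file by default.
app = Flask(__name__)


@app.route("/")
def index():
    """Render templates/index.html, which loads the CSV from static/."""
    return render_template("index.html")


def main():
    """Start Flask's development server (localhost:5000 by default)."""
    app.run()
```

Calling main() at the end of the script, for instance under an if __name__ == "__main__": guard, starts the server.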
Flask really shines here: if I want to send this to a friend so they can edit it (for example to add more maps), they won’t have to bother with Apache.
To sum things up, if you want to run the website constantly you have to execute update.py and serve.py without stopping them. To do this properly, and to have a real URL point to the website, you will have to run all this on a Linux server with daemons. However, that is another realm and out of the scope of this post.
I hope you enjoyed the post.