In case you hadn't figured it out from the dedicated menu dropdown, we like stats. The most interesting (and most popular) stat we track is the number of concurrent players using a specific app at a specific time. As you can probably imagine, this is a lot of data (~19000 points every 10 minutes, ~3 million points daily), and it grows with every app added to Steam. On top of that, we also store information retrieved from our friends over at SteamSpy.
This is a little story about why we moved our graph store from RRDtool to InfluxDB. Both are time series databases, and both are open source, but they work in completely different ways.
One obvious limitation of using RRDtool is that, as the name implies, it is a round-robin database. The database size is defined on creation and is limited to that (resizing is not straightforward). We used to store daily data for 2 years, and high resolution data for one week. InfluxDB has no such limitation, and allows storing an unlimited amount of data at any resolution. Data aggregation and transformation is performed at SELECT time, which allows us to easily group data in different ways (e.g. daily, weekly, monthly, etc).
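To make the round-robin limitation concrete, here is a minimal Python sketch of a fixed-size archive. It is only an analogy, not how RRDtool is implemented: the `deque` and the names `POINTS_PER_DAY` and `archive` are our own illustration, sized to one week of 10-minute points.

```python
from collections import deque

POINTS_PER_DAY = 24 * 6  # one point every 10 minutes

# The capacity is fixed when the archive is created: one week of slots.
archive = deque(maxlen=7 * POINTS_PER_DAY)

# Write eight days of (day, slot) samples. Once the archive is full,
# every new write silently pushes out the oldest sample.
for day in range(1, 9):
    for slot in range(POINTS_PER_DAY):
        archive.append((day, slot))

oldest_day = archive[0][0]
newest_day = archive[-1][0]
print(len(archive), oldest_day, newest_day)  # 1008 2 8
```

After eight days of writes, the first day is gone for good, which is the behaviour we wanted to get away from.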
InfluxDB has a query language which is very similar to normal SQL, and allows performing some interesting transformations. Here is an example query which we use to display the daily players graph for a game:
SELECT time, MAX(value) FROM concurrent_players_daily WHERE appid = '730' AND time < now() - 1d GROUP BY time(1d) fill(linear)
This query groups data per day, and linearly fills in missing data (only useful for data older than 2015, as we imported that data from different sources). The concurrent_players_daily measurement holds the maximum peak for each day, which is downsampled from concurrent_players, which holds high resolution data (every 10 minutes). Data is automatically downsampled by a continuous query. Even though the daily measurement contains only one data point per day, using GROUP BY normalises it and fills in the holes.
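For illustration, the two steps described above (taking the daily maximum of high resolution points, then linearly filling missing days) can be sketched in plain Python. This is a rough approximation of the idea, not InfluxDB's implementation; the function names and the sample numbers are made up.

```python
from datetime import datetime, timedelta

def daily_peaks(points):
    """Reduce (timestamp, players) points to {date: highest players that day}."""
    peaks = {}
    for ts, players in points:
        day = ts.date()
        peaks[day] = max(players, peaks.get(day, players))
    return peaks

def fill_linear(peaks):
    """Return (date, value) for every day from first to last,
    linearly interpolating days that have no data."""
    days = sorted(peaks)
    if not days:
        return []
    filled = []
    for left, right in zip(days, days[1:]):
        span = (right - left).days
        for offset in range(span):
            value = peaks[left] + (peaks[right] - peaks[left]) * offset / span
            filled.append((left + timedelta(days=offset), value))
    filled.append((days[-1], float(peaks[days[-1]])))
    return filled

# Made-up sample: two points on Jan 1, one on Jan 2, nothing on Jan 3.
points = [
    (datetime(2014, 1, 1, 12, 0), 100),
    (datetime(2014, 1, 1, 12, 10), 140),
    (datetime(2014, 1, 2, 9, 0), 120),
    (datetime(2014, 1, 4, 18, 30), 160),
]
series = fill_linear(daily_peaks(points))
print([value for _, value in series])  # [140.0, 120.0, 140.0, 160.0]
```

The missing Jan 3 value ends up halfway between its neighbours, which is roughly what we expect a linear fill to do for holes in the imported data.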
Another downside of using RRDtool was the storage itself: we had one file per app, which means we had a little over 19,000 files in a folder.
And the final nail in the coffin of RRDtool was its handling of values as floats (or at least it appeared that way), which produced some interesting side effects on the site, like the overall players peak being higher than the 24 hour peak (off by one or two).
Even though using InfluxDB adds another service to keep alive, the overall graph tracking stack has been simplified, as we used to call RRDtool via a PHP module, which is not exactly the best API to work with.
With the switch to InfluxDB, we are now able to keep track of players for more than two years. In addition to this, we can now easily backfill historical data, something not easy to do in RRDtool. We imported data from various sources going back to around 2011, and for Steam back to 2008 by using archive.org data. If it proves worth it, we might try backfilling even more data in the future.
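Backfilling in InfluxDB mostly comes down to writing points with explicit timestamps in the past. As a rough sketch, this is how historical points could be formatted as InfluxDB line protocol in Python. The measurement, tag and field names are taken from the query shown earlier; the helper `to_line`, the sample values, and the choice of an integer field (the `i` suffix) are assumptions for illustration, not our actual import script.

```python
from datetime import datetime, timezone

def to_line(measurement, appid, players, when):
    """Format one point as line protocol: measurement,tag=value field=value timestamp.
    The timestamp is in nanoseconds since the Unix epoch."""
    ns = int(when.timestamp()) * 1_000_000_000
    # The trailing "i" marks the field as an integer (assumed schema).
    return f"{measurement},appid={appid} value={players}i {ns}"

# Made-up historical samples: (timestamp, concurrent players).
historical = [
    (datetime(2013, 1, 1, tzinfo=timezone.utc), 1000),
    (datetime(2013, 1, 2, tzinfo=timezone.utc), 1500),
]

lines = [to_line("concurrent_players", 730, players, when)
         for when, players in historical]
print("\n".join(lines))
# concurrent_players,appid=730 value=1000i 1356998400000000000
# concurrent_players,appid=730 value=1500i 1357084800000000000
```

Because each line carries its own timestamp, old data can be written in any order alongside live data.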