Description
I love the aeson-esque interface that you've built, but there's a glaring misstep: ToSeriesData treats its input a lot like tables of data instead of series of data.
Let's say I have a datatype that's an instance of ToSeriesData that looks something like this:
data Reading = Reading UTCTimeEpoch UUID DeviceType ReadingType Double
data DeviceType = Plug | Switch
data ReadingType = Watts | Volts | Temp
Let's say these readings go into a series named "device_readings". This works great at small scale, but the minute you reach millions of points you hit performance problems, because InfluxDB isn't designed to handle that type of querying (SELECT uuid FROM device_readings WHERE ...) and filtering on, say, the device type column. You'll traverse the entire key space of that series to do it, because underneath, Influx is just a dumb key-value store.
If there are 20 million keys in the device_readings series, that's really severe pain and you've just tremendously fucked yourself, because migrating that data to another schema could take quite a bit of time...
This is my major beef with InfluxDB: they wanted to keep a "SQL-like" interface to the data, but the underlying model definitely will not handle the kind of queries that you CAN run on it. It's also their fault for not urgently writing up a document on "Schema Design".
TempoDB got it right. Your series name should contain the key, category, and attributes you want to "query".
So instead of a series name like device_readings, it would look like device_readings.2c9e4570-9b35-0131-c7ce-48e0eb16f719.Watts.Dimmer. You can then efficiently query the data you want by constructing the key from known categories, IDs, and attributes. The datatype then looks extremely simple:
data Reading = Reading UTCTimeEpoch Double
What I would love is a data type that gives us a structured and easy way of building series names from a key, a category, and some attributes! That is what I wish this library was doing, instead of following a more table-like model.
I'm going to throw together my ideas in a fork and see what you think of them. Right now I'm building series names with functions and it's ugly; I would rather do it with specialized data types and instances of a class like ToSeriesName or something similar.
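A minimal sketch of what such a class could look like, assuming a dot-separated naming scheme like the one above. None of this exists in the library: `ToSeriesName`, `toSeriesName`, `ReadingSeries`, and `DeviceId` are hypothetical names, and `DeviceId` is a plain `String` wrapper standing in for a real UUID type so the example only needs `base`.

```haskell
import Data.List (intercalate)

-- Hypothetical stand-in for a real UUID type, to keep the sketch dependency-free.
newtype DeviceId = DeviceId String

data DeviceType = Plug | Switch deriving (Show)
data ReadingType = Watts | Volts | Temp deriving (Show)

-- Hypothetical class: anything that can be rendered as a series name.
class ToSeriesName a where
  toSeriesName :: a -> String

-- Everything you want to "query" on lives in the series key,
-- not in the columns of each point.
data ReadingSeries = ReadingSeries
  { seriesDevice      :: DeviceId
  , seriesReadingType :: ReadingType
  , seriesDeviceType  :: DeviceType
  }

-- Assumed layout: base name, then key, category, and attribute, joined by dots.
instance ToSeriesName ReadingSeries where
  toSeriesName (ReadingSeries (DeviceId uuid) readingType deviceType) =
    intercalate "." ["device_readings", uuid, show readingType, show deviceType]
```

For example, in GHCi, `toSeriesName (ReadingSeries (DeviceId "2c9e4570-9b35-0131-c7ce-48e0eb16f719") Watts Plug)` builds the series name from known values, so nothing has to be filtered by column at query time.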