Sample Datasets

We have created several sample datasets to help you get started using TimescaleDB. These datasets vary in database size, number of time intervals, and number of values for the partition field.

Each gzip archive contains a single .sql file to create the necessary (hyper)tables within the database, and several .csv files that contain the data to be copied into those tables. These files presume the database you are importing them to has already been set up with the TimescaleDB extension.

Device Ops: These datasets are designed to represent metrics (e.g. CPU, memory, network) collected from mobile devices. (Click on the name to download.)

  1. devices_small - 1,000 devices recorded over 1,000 time intervals - 39MB
  2. devices_med - 5,000 devices recorded over 2,000 time intervals - 390MB
  3. devices_big - 3,000 devices recorded over 10,000 time intervals - 1.2GB

For more details and example usage, see In-depth: Device ops datasets.

Weather: These datasets are designed to represent temperature and humidity data from a variety of locations. (Click on the name to download.)

  1. weather_small - 1,000 locations over 1,000 two-minute intervals - 8.1MB
  2. weather_med - 1,000 locations over 15,000 two-minute intervals - 115MB
  3. weather_big - 2,000 locations over 20,000 two-minute intervals - 305MB

For more details and example usage, see In-depth: Weather datasets.

Importing

Briefly, the import steps are:

  1. Set up a database with TimescaleDB.
  2. Unzip the archive.
  3. Import the .sql file to create the (hyper)tables via psql.
  4. Import the data from .csv files via psql.
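Step 1 is not shown in the commands later in this section. A minimal sketch, assuming a local PostgreSQL server with the TimescaleDB extension already installed (the database name devices_small is just an example):

```shell
# Create a database for the dataset (name is an example) and
# enable the TimescaleDB extension inside it.
createdb -U postgres devices_small
psql -U postgres -d devices_small -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"
```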

Each dataset is named [dataset]_[size].tar.gz. For example, devices_small.tar.gz is dataset devices and size small. Each dataset contains one .sql file named [dataset].sql and a few CSV files named in the format [dataset]_[size]_[table].csv.

For example, the devices_small dataset creates two tables (device_info and a hypertable named readings) from devices.sql. Accordingly, there are two CSV files: devices_small_readings.csv and devices_small_device_info.csv. To import this dataset into a TimescaleDB database named devices_small:

    # (1) unzip the archive
    tar -xvzf devices_small.tar.gz
    # (2) import the .sql file to the database
    psql -U postgres -d devices_small < devices.sql
    # (3) import data from .csv files to the database
    psql -U postgres -d devices_small -c "\COPY readings FROM devices_small_readings.csv CSV"
    psql -U postgres -d devices_small -c "\COPY device_info FROM devices_small_device_info.csv CSV"

The data is now ready for you to use.

TIP: The standard COPY command in PostgreSQL is single-threaded. To speed up importing the larger sample datasets, we recommend using our parallel importer instead.
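A hypothetical invocation of the timescaledb-parallel-copy tool is sketched below; the flag names are taken from the tool's documentation and may differ across versions, so check its --help output before relying on them:

```shell
# Load one CSV into the readings hypertable using 4 parallel workers.
# Flag names may vary by version of timescaledb-parallel-copy.
timescaledb-parallel-copy --db-name devices_small --table readings \
    --file devices_small_readings.csv --workers 4 --copy-options "CSV"
```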

    # To access your database (e.g., devices_small)
    psql -U postgres -h localhost -d devices_small

In-depth: Device Ops Datasets

After importing one of these datasets (devices_small, devices_med, devices_big), you will find a plain PostgreSQL table called device_info and a hypertable called readings. The device_info table has (static) metadata about each device, such as the OS name and manufacturer. The readings hypertable tracks data sent from each device, e.g. CPU activity, memory levels, etc. Because hypertables are exposed as a single table, you can query them and join them with the metadata as you would normal SQL tables (see Example Queries below).
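For illustration, the kind of DDL such a .sql file typically contains can be sketched as follows. This is a hypothetical sketch, not the actual contents of devices.sql; note that because readings uses a bigint time column, create_hypertable requires an explicit chunk_time_interval (the value below, and its units, are assumptions):

```sql
-- Hypothetical sketch of the schema setup (not the actual devices.sql).
CREATE TABLE device_info (
    device_id    TEXT,
    api_version  TEXT,
    manufacturer TEXT,
    model        TEXT,
    os_name      TEXT
);

CREATE TABLE readings (
    time          BIGINT,
    device_id     TEXT,
    battery_level DOUBLE PRECISION
    -- ... remaining columns as listed under Schemas
);

-- For an integer time column, an explicit chunk_time_interval
-- (in the column's own units) must be supplied.
SELECT create_hypertable('readings', 'time', chunk_time_interval => 86400000);
```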

Schemas

    Table "public.device_info"
        Column    | Type | Modifiers
    --------------+------+-----------
     device_id    | text |
     api_version  | text |
     manufacturer | text |
     model        | text |
     os_name      | text |

    Table "public.readings"
            Column       |       Type       | Modifiers
    ---------------------+------------------+-----------
     time                | bigint           |
     device_id           | text             |
     battery_level       | double precision |
     battery_status      | text             |
     battery_temperature | double precision |
     bssid               | text             |
     cpu_avg_1min        | double precision |
     cpu_avg_5min        | double precision |
     cpu_avg_15min       | double precision |
     mem_free            | double precision |
     mem_used            | double precision |
     rssi                | double precision |
     ssid                | text             |
    Indexes:
        "readings_device_id_time_idx" btree (device_id, "time" DESC)
        "readings_time_idx" btree ("time" DESC)

Example Queries

Note: Uses dataset devices_med

10 most recent battery temperature readings for charging devices

    SELECT time, device_id, battery_temperature
    FROM readings
    WHERE battery_status = 'charging'
    ORDER BY time DESC LIMIT 10;

              time          | device_id  | battery_temperature
    ------------------------+------------+---------------------
     2016-11-15 23:39:30-05 | demo004887 |                99.3
     2016-11-15 23:39:30-05 | demo004882 |               100.8
     2016-11-15 23:39:30-05 | demo004862 |                95.7
     2016-11-15 23:39:30-05 | demo004844 |                95.5
     2016-11-15 23:39:30-05 | demo004841 |                95.4
     2016-11-15 23:39:30-05 | demo004804 |               101.6
     2016-11-15 23:39:30-05 | demo004784 |               100.6
     2016-11-15 23:39:30-05 | demo004760 |                99.1
     2016-11-15 23:39:30-05 | demo004731 |                97.9
     2016-11-15 23:39:30-05 | demo004729 |                99.6
    (10 rows)

Busiest devices (by 1-min CPU average) whose battery level is below 33% and not charging

    SELECT time, readings.device_id, cpu_avg_1min,
           battery_level, battery_status, device_info.model
    FROM readings
    JOIN device_info ON readings.device_id = device_info.device_id
    WHERE battery_level < 33 AND battery_status = 'discharging'
    ORDER BY cpu_avg_1min DESC, time DESC LIMIT 5;

              time          | device_id  | cpu_avg_1min | battery_level | battery_status |  model
    ------------------------+------------+--------------+---------------+----------------+---------
     2016-11-15 23:30:00-05 | demo003764 |        98.99 |            32 | discharging    | focus
     2016-11-15 22:54:30-05 | demo001935 |        98.99 |            30 | discharging    | pinto
     2016-11-15 19:10:30-05 | demo000695 |        98.99 |            23 | discharging    | focus
     2016-11-15 16:46:00-05 | demo002784 |        98.99 |            18 | discharging    | pinto
     2016-11-15 14:58:30-05 | demo004978 |        98.99 |            22 | discharging    | mustang
    (5 rows)
Min and max battery levels per hour for devices with model 'pinto' or 'focus' (first 12 hours)

    SELECT date_trunc('hour', time) "hour",
           min(battery_level) min_battery_level,
           max(battery_level) max_battery_level
    FROM readings r
    WHERE r.device_id IN (
        SELECT DISTINCT device_id FROM device_info
        WHERE model = 'pinto' OR model = 'focus'
    ) GROUP BY "hour" ORDER BY "hour" ASC LIMIT 12;

              hour          | min_battery_level | max_battery_level
    ------------------------+-------------------+-------------------
     2016-11-15 07:00:00-05 |                17 |                99
     2016-11-15 08:00:00-05 |                11 |                98
     2016-11-15 09:00:00-05 |                 6 |                97
     2016-11-15 10:00:00-05 |                 6 |                97
     2016-11-15 11:00:00-05 |                 6 |                97
     2016-11-15 12:00:00-05 |                 6 |                97
     2016-11-15 13:00:00-05 |                 6 |                97
     2016-11-15 14:00:00-05 |                 6 |                98
     2016-11-15 15:00:00-05 |                 6 |               100
     2016-11-15 16:00:00-05 |                 6 |               100
     2016-11-15 17:00:00-05 |                 6 |               100
     2016-11-15 18:00:00-05 |                 6 |               100
    (12 rows)

In-depth: Weather Datasets

After importing one of these datasets (weather_small, weather_med, weather_big), you will find a plain PostgreSQL table called locations and a hypertable called conditions. The locations table has metadata about each of the locations, such as its name and environmental type. The conditions hypertable tracks readings of temperature and humidity from those locations. Because hypertables are exposed as a single table, you can query them and join them with the metadata as you would normal SQL tables (see Example Queries below).
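As with the device datasets, the schema setup can be sketched. This is hypothetical, not the actual contents of the weather .sql file; note that because conditions uses a timestamptz time column, create_hypertable can rely on its default chunk interval:

```sql
-- Hypothetical sketch of the schema setup (not the actual .sql file).
CREATE TABLE locations (
    device_id   TEXT,
    location    TEXT,
    environment TEXT
);

CREATE TABLE conditions (
    time        TIMESTAMPTZ NOT NULL,
    device_id   TEXT,
    temperature DOUBLE PRECISION,
    humidity    DOUBLE PRECISION
);

-- With a timestamptz column, the default chunk interval applies.
SELECT create_hypertable('conditions', 'time');
```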

Schemas

    Table "public.locations"
       Column    | Type | Modifiers
    -------------+------+-----------
     device_id   | text |
     location    | text |
     environment | text |

    Table "public.conditions"
       Column    |           Type           | Modifiers
    -------------+--------------------------+-----------
     time        | timestamp with time zone | not null
     device_id   | text                     |
     temperature | double precision         |
     humidity    | double precision         |
    Indexes:
        "conditions_device_id_time_idx" btree (device_id, "time" DESC)
        "conditions_time_idx" btree ("time" DESC)

Example Queries

Note: Uses dataset weather_med

Last 10 readings

    SELECT * FROM conditions c ORDER BY time DESC LIMIT 10;

              time          |     device_id      |    temperature     |      humidity
    ------------------------+--------------------+--------------------+--------------------
     2016-12-06 02:58:00-05 | weather-pro-000000 |  84.10000000000034 |  83.70000000000053
     2016-12-06 02:58:00-05 | weather-pro-000001 | 35.999999999999915 |  51.79999999999994
     2016-12-06 02:58:00-05 | weather-pro-000002 |  68.90000000000006 |  63.09999999999999
     2016-12-06 02:58:00-05 | weather-pro-000003 |  83.70000000000041 |  84.69999999999989
     2016-12-06 02:58:00-05 | weather-pro-000004 |  83.10000000000039 |  84.00000000000051
     2016-12-06 02:58:00-05 | weather-pro-000005 |  85.10000000000034 |  81.70000000000017
     2016-12-06 02:58:00-05 | weather-pro-000006 |  61.09999999999999 | 49.800000000000026
     2016-12-06 02:58:00-05 | weather-pro-000007 |   82.9000000000004 |  84.80000000000047
     2016-12-06 02:58:00-05 | weather-pro-000008 | 58.599999999999966 |               40.2
     2016-12-06 02:58:00-05 | weather-pro-000009 | 61.000000000000014 | 49.399999999999906
    (10 rows)

Last 10 readings from ‘outside’ locations

    SELECT time, c.device_id, location,
           trunc(temperature, 2) temperature, trunc(humidity, 2) humidity
    FROM conditions c
    INNER JOIN locations l ON c.device_id = l.device_id
    WHERE l.environment = 'outside'
    ORDER BY time DESC LIMIT 10;

              time          |     device_id      |   location    | temperature | humidity
    ------------------------+--------------------+---------------+-------------+----------
     2016-12-06 02:58:00-05 | weather-pro-000000 | field-000000  |       84.10 |    83.70
     2016-12-06 02:58:00-05 | weather-pro-000001 | arctic-000000 |       35.99 |    51.79
     2016-12-06 02:58:00-05 | weather-pro-000003 | swamp-000000  |       83.70 |    84.69
     2016-12-06 02:58:00-05 | weather-pro-000004 | field-000001  |       83.10 |    84.00
     2016-12-06 02:58:00-05 | weather-pro-000005 | swamp-000001  |       85.10 |    81.70
     2016-12-06 02:58:00-05 | weather-pro-000007 | field-000002  |       82.90 |    84.80
     2016-12-06 02:58:00-05 | weather-pro-000014 | field-000003  |       84.50 |    83.90
     2016-12-06 02:58:00-05 | weather-pro-000015 | swamp-000002  |       85.50 |    66.00
     2016-12-06 02:58:00-05 | weather-pro-000017 | arctic-000001 |       35.29 |    50.59
     2016-12-06 02:58:00-05 | weather-pro-000019 | arctic-000002 |       36.09 |    48.80
    (10 rows)

Hourly average, min, and max temperatures for “field” locations

    SELECT date_trunc('hour', time) "hour",
           trunc(avg(temperature), 2) avg_temp,
           trunc(min(temperature), 2) min_temp,
           trunc(max(temperature), 2) max_temp
    FROM conditions c
    WHERE c.device_id IN (
        SELECT device_id FROM locations
        WHERE location LIKE 'field-%'
    ) GROUP BY "hour" ORDER BY "hour" ASC LIMIT 24;

              hour          | avg_temp | min_temp | max_temp
    ------------------------+----------+----------+----------
     2016-11-15 07:00:00-05 |    73.80 |    68.00 |    79.09
     2016-11-15 08:00:00-05 |    74.80 |    68.69 |    80.29
     2016-11-15 09:00:00-05 |    75.75 |    69.39 |    81.19
     2016-11-15 10:00:00-05 |    76.75 |    70.09 |    82.29
     2016-11-15 11:00:00-05 |    77.77 |    70.79 |    83.39
     2016-11-15 12:00:00-05 |    78.76 |    71.69 |    84.49
     2016-11-15 13:00:00-05 |    79.73 |    72.69 |    85.29
     2016-11-15 14:00:00-05 |    80.72 |    73.49 |    86.99
     2016-11-15 15:00:00-05 |    81.73 |    74.29 |    88.39
     2016-11-15 16:00:00-05 |    82.70 |    75.09 |    88.89
     2016-11-15 17:00:00-05 |    83.70 |    76.19 |    89.99
     2016-11-15 18:00:00-05 |    84.67 |    77.09 |    90.00
     2016-11-15 19:00:00-05 |    85.64 |    78.19 |    90.00
     2016-11-15 20:00:00-05 |    86.53 |    78.59 |    90.00
     2016-11-15 21:00:00-05 |    86.40 |    78.49 |    90.00
     2016-11-15 22:00:00-05 |    85.39 |    77.29 |    89.30
     2016-11-15 23:00:00-05 |    84.40 |    76.19 |    88.70
     2016-11-16 00:00:00-05 |    83.39 |    75.39 |    87.90
     2016-11-16 01:00:00-05 |    82.40 |    74.39 |    87.10
     2016-11-16 02:00:00-05 |    81.40 |    73.29 |    86.29
     2016-11-16 03:00:00-05 |    80.38 |    71.89 |    85.40
     2016-11-16 04:00:00-05 |    79.41 |    70.59 |    84.40
     2016-11-16 05:00:00-05 |    78.39 |    69.49 |    83.60
     2016-11-16 06:00:00-05 |    78.42 |    69.49 |    84.40
    (24 rows)