Flux vs InfluxQL

Flux is an alternative to InfluxQL and other SQL-like query languages for querying and analyzing data. Flux uses functional language patterns, which make it powerful and flexible and let it overcome many of the limitations of InfluxQL. This article outlines many of the tasks possible with Flux but not InfluxQL and provides information about Flux and InfluxQL parity.

Possible with Flux

Joins

InfluxQL has never supported joins. They can be accomplished using TICKscript, but even TICKscript’s join capabilities are limited. Flux’s join() function allows you to join data from any bucket, any measurement, and on any columns as long as each data set includes the columns on which they are to be joined. This makes operations such as calculations across measurements possible.

    dataStream1 = from(bucket: "bucket1")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "network" and
            r._field == "bytes-transferred"
        )

    dataStream2 = from(bucket: "bucket1")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "httpd" and
            r._field == "requests-per-sec"
        )

    join(
        tables: {d1: dataStream1, d2: dataStream2},
        on: ["_time", "_stop", "_start", "host"]
    )

For an in-depth walkthrough of using the join() function, see How to join data with Flux.


Math across measurements

Being able to perform cross-measurement joins also allows you to run calculations using data from separate measurements – a highly requested feature from the InfluxData community. The example below takes two data streams from separate measurements, mem and processes, joins them, then calculates the average amount of memory used per running process:

    // Memory used (in bytes)
    memUsed = from(bucket: "telegraf/autogen")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "mem" and
            r._field == "used"
        )

    // Total processes running
    procTotal = from(bucket: "telegraf/autogen")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "processes" and
            r._field == "total"
        )

    // Join memory used with total processes and calculate
    // the average memory (in MB) used for running processes.
    join(
        tables: {mem: memUsed, proc: procTotal},
        on: ["_time", "_stop", "_start", "host"]
    )
        |> map(fn: (r) => ({
            _time: r._time,
            _value: (r._value_mem / r._value_proc) / 1000000
        }))

Sort by tags

InfluxQL’s sorting capabilities are very limited, allowing you only to control the sort order of time using the ORDER BY time clause. Flux’s sort() function sorts records based on a list of columns. Depending on the column type, records are sorted lexicographically, numerically, or chronologically.

    from(bucket: "telegraf/autogen")
        |> range(start: -12h)
        |> filter(fn: (r) =>
            r._measurement == "system" and
            r._field == "uptime"
        )
        |> sort(columns: ["region", "host", "_value"])
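
To mirror InfluxQL's ORDER BY time DESC, sort() also accepts a desc parameter. The following is a minimal sketch that reuses the bucket, measurement, and field names from the example above; they are placeholders rather than names your data is guaranteed to have.

```flux
from(bucket: "telegraf/autogen")
    |> range(start: -12h)
    |> filter(fn: (r) =>
        r._measurement == "system" and
        r._field == "uptime"
    )
    // Sort the rows of each table from newest to oldest.
    |> sort(columns: ["_time"], desc: true)
```

Sorting is applied within each output table, so the grouping of the data determines which rows are ordered relative to one another.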

Group by any column

InfluxQL lets you group by tags or by time intervals, but nothing else. Flux lets you group by any column in the dataset, including _value. Use the Flux group() function to define which columns to group data by.

    from(bucket: "telegraf/autogen")
        |> range(start: -12h)
        |> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
        |> group(columns: ["host", "_value"])

Window by calendar months and years

InfluxQL does not support windowing data by calendar months and years due to their varied lengths. Flux supports calendar month and year duration units (1mo, 1y) and lets you window and aggregate data by calendar month and year.

    from(bucket: "telegraf/autogen")
        |> range(start: -1y)
        |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
        |> aggregateWindow(every: 1mo, fn: mean)

Work with multiple data sources

InfluxQL can only query data stored in InfluxDB. Flux can query data from other data sources such as CSV, PostgreSQL, MySQL, Google BigTable, and more. Join that data with data in InfluxDB to enrich query results.

    import "csv"
    import "sql"

    // rawCSV is assumed to be a string of annotated CSV data defined elsewhere.
    csvData = csv.from(csv: rawCSV)
    sqlData = sql.from(
        driverName: "postgres",
        dataSourceName: "postgresql://user:password@localhost",
        query: "SELECT * FROM example_table"
    )
    data = from(bucket: "telegraf/autogen")
        |> range(start: -24h)
        |> filter(fn: (r) => r._measurement == "sensor")

    auxData = join(tables: {csv: csvData, sql: sqlData}, on: ["sensor_id"])
    enrichedData = join(tables: {data: data, aux: auxData}, on: ["sensor_id"])

    enrichedData
        |> yield(name: "enriched_data")

For an in-depth walkthrough of querying SQL data, see Query SQL data sources.


DatePart-like queries

InfluxQL doesn’t support DatePart-like queries that only return results during specified hours of the day. The Flux hourSelection() function returns only data with time values in a specified hour range.

    from(bucket: "telegraf/autogen")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "cpu" and
            r.cpu == "cpu-total"
        )
        |> hourSelection(start: 9, stop: 17)

Pivot

Pivoting data tables has never been supported in InfluxQL. The Flux pivot() function provides the ability to pivot data tables by specifying rowKey, columnKey, and valueColumn parameters.

    from(bucket: "telegraf/autogen")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "cpu" and
            r.cpu == "cpu-total"
        )
        |> pivot(
            rowKey: ["_time"],
            columnKey: ["_field"],
            valueColumn: "_value"
        )

Histograms

Histograms have been a highly requested feature for InfluxQL, but have never been supported. Flux’s histogram() function uses input data to generate a cumulative histogram, with support for other histogram types coming in the future.

    from(bucket: "telegraf/autogen")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "mem" and
            r._field == "used_percent"
        )
        // Bin upper bounds are floats. Current Flux releases name this
        // parameter `bins`; early releases called it `buckets`.
        |> histogram(
            bins: [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
        )

For an example of using Flux to create a cumulative histogram, see Create histograms.


Covariance

Flux provides functions for simple covariance calculation. The covariance() function calculates the covariance between two columns and the cov() function calculates the covariance between two data streams.

Covariance between two columns

    from(bucket: "telegraf/autogen")
        |> range(start: -5m)
        |> covariance(columns: ["x", "y"])

Covariance between two streams of data

    table1 = from(bucket: "telegraf/autogen")
        |> range(start: -15m)
        |> filter(fn: (r) =>
            r._measurement == "measurement_1"
        )

    table2 = from(bucket: "telegraf/autogen")
        |> range(start: -15m)
        |> filter(fn: (r) =>
            r._measurement == "measurement_2"
        )

    cov(x: table1, y: table2, on: ["_time", "_field"])

Cast booleans to integers

InfluxQL supports type casting, but only for numeric data types (floats to integers and vice versa). Flux type conversion functions provide much broader support for type conversions and let you perform some long-requested operations like casting boolean values to integers.

Cast boolean field values to integers

    from(bucket: "telegraf/autogen")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "m" and
            r._field == "bool_field"
        )
        |> toInt()
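
toInt() converts the _value column in place. When you want to keep the original boolean and add the integer alongside it, one option is to call the int() conversion function inside map(). This is only a sketch: the as_int column name is hypothetical, and the measurement and field names are the placeholders from the example above.

```flux
from(bucket: "telegraf/autogen")
    |> range(start: -1h)
    |> filter(fn: (r) =>
        r._measurement == "m" and
        r._field == "bool_field"
    )
    // Keep the boolean in _value and add an integer copy
    // (true becomes 1, false becomes 0) in a hypothetical as_int column.
    |> map(fn: (r) => ({r with as_int: int(v: r._value)}))
```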

String manipulation and data shaping

InfluxQL doesn’t support string manipulation when querying data. The Flux strings package is a collection of functions that operate on string data. When combined with the map() function, functions in the strings package allow for operations like string sanitization and normalization.

    import "strings"

    from(bucket: "telegraf/autogen")
        |> range(start: -1h)
        |> filter(fn: (r) =>
            r._measurement == "weather" and
            r._field == "temp"
        )
        |> map(fn: (r) => ({
            r with
            location: strings.toTitle(v: r.location),
            sensor: strings.replaceAll(v: r.sensor, t: " ", u: "-"),
            status: strings.substring(v: r.status, start: 0, end: 8)
        }))

Work with geo-temporal data

InfluxQL doesn’t provide functionality for working with geo-temporal data. The Flux Geo package is a collection of functions that let you shape, filter, and group geo-temporal data.

    import "experimental/geo"

    from(bucket: "geo/autogen")
        |> range(start: -1w)
        |> filter(fn: (r) => r._measurement == "taxi")
        |> geo.shapeData(latField: "latitude", lonField: "longitude", level: 20)
        |> geo.filterRows(
            region: {lat: 40.69335938, lon: -73.30078125, radius: 20.0},
            strict: true
        )
        |> geo.asTracks(groupBy: ["fare-id"])

InfluxQL and Flux parity

Flux is working towards complete parity with InfluxQL and new functions are being added to that end. The table below shows InfluxQL statements, clauses, and functions along with their equivalent Flux functions.

For a complete list of Flux functions, view all Flux functions.

InfluxQL | Flux functions
SELECT | filter()
WHERE | filter(), range()
GROUP BY | group()
INTO | to() *
ORDER BY | sort()
LIMIT | limit()
SLIMIT |
OFFSET |
SOFFSET |
SHOW DATABASES | buckets()
SHOW MEASUREMENTS | v1.measurements()
SHOW FIELD KEYS | keys()
SHOW RETENTION POLICIES | buckets()
SHOW TAG KEYS | v1.tagKeys(), v1.measurementTagKeys()
SHOW TAG VALUES | v1.tagValues(), v1.measurementTagValues()
SHOW SERIES |
CREATE DATABASE |
DROP DATABASE |
DROP SERIES |
DELETE |
DROP MEASUREMENT |
DROP SHARD |
CREATE RETENTION POLICY |
ALTER RETENTION POLICY |
DROP RETENTION POLICY |
COUNT | count()
DISTINCT | distinct()
INTEGRAL | integral()
MEAN | mean()
MEDIAN | median()
MODE | mode()
SPREAD | spread()
STDDEV | stddev()
SUM | sum()
BOTTOM | bottom()
FIRST | first()
LAST | last()
MAX | max()
MIN | min()
PERCENTILE | quantile()
SAMPLE | sample()
TOP | top()
ABS | math.abs()
ACOS | math.acos()
ASIN | math.asin()
ATAN | math.atan()
ATAN2 | math.atan2()
CEIL | math.ceil()
COS | math.cos()
CUMULATIVE_SUM | cumulativeSum()
DERIVATIVE | derivative()
DIFFERENCE | difference()
ELAPSED | elapsed()
EXP | math.exp()
FLOOR | math.floor()
HISTOGRAM | histogram()
LN | math.log()
LOG | math.logb()
LOG2 | math.log2()
LOG10 | math.log10()
MOVING_AVERAGE | movingAverage()
NON_NEGATIVE_DERIVATIVE | derivative(nonNegative: true)
NON_NEGATIVE_DIFFERENCE | difference(nonNegative: true)
POW | math.pow()
ROUND | math.round()
SIN | math.sin()
SQRT | math.sqrt()
TAN | math.tan()
HOLT_WINTERS | holtWinters()
CHANDE_MOMENTUM_OSCILLATOR | chandeMomentumOscillator()
EXPONENTIAL_MOVING_AVERAGE | exponentialMovingAverage()
DOUBLE_EXPONENTIAL_MOVING_AVERAGE | doubleEMA()
KAUFMANS_EFFICIENCY_RATIO | kaufmansER()
KAUFMANS_ADAPTIVE_MOVING_AVERAGE | kaufmansAMA()
TRIPLE_EXPONENTIAL_MOVING_AVERAGE | tripleEMA()
TRIPLE_EXPONENTIAL_DERIVATIVE | tripleExponentialDerivative()
RELATIVE_STRENGTH_INDEX | relativeStrengthIndex()

* The to() function only writes to InfluxDB 2.0.
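
To illustrate how several of these mappings combine, the sketch below translates a typical InfluxQL aggregate query into Flux. The bucket, measurement, field, and tag names are assumptions chosen for illustration, and the two queries are only approximately equivalent; details such as how empty windows are handled may differ.

```flux
// InfluxQL, for comparison:
//   SELECT MEAN("usage_user") FROM "cpu"
//   WHERE time > now() - 1h
//   GROUP BY time(5m), "host"

from(bucket: "telegraf/autogen")
    // WHERE time > now() - 1h
    |> range(start: -1h)
    // FROM "cpu" and SELECT "usage_user"
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user")
    // GROUP BY "host"
    |> group(columns: ["host"])
    // GROUP BY time(5m) combined with MEAN()
    |> aggregateWindow(every: 5m, fn: mean)
```

Each InfluxQL clause becomes one step in the pipeline, so the order of the steps, rather than the position of a clause, determines how the data is transformed.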