geo.filterRows() function
The geo.filterRows() function is experimental and subject to change at any time. By using this function, you accept the risks of experimental functions.
The geo.filterRows() function filters data by a specified geographic region, with optional strict filtering. It combines geo.gridFilter() and geo.strictFilter().
*Function type: Transformation*
import "experimental/geo"
geo.filterRows(
    region: {lat: 40.69335938, lon: -73.30078125, radius: 20.0},
    minSize: 24,
    maxSize: -1,
    level: -1,
    s2cellIDLevel: -1,
    correlationKey: ["_time"],
    strict: true
)
s2_cell_id must be part of the group key
To filter geo-temporal data with geo.filterRows(), s2_cell_id must be part of the group key. To add s2_cell_id to the group key, use experimental.group:
import "experimental"
// ...
|> experimental.group(columns: ["s2_cell_id"], mode: "extend")
Strict and non-strict filtering
In most cases, the specified geographic region does not perfectly align with S2 grid cells.
- Non-strict filtering returns points that may be outside of the specified region but inside S2 grid cells partially covered by the region.
- Strict filtering returns only points inside the specified region.
[Figure: strict versus non-strict filtering, showing S2 grid cells, the filter region, and returned points.]
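To compare the two modes described above, the same input can be filtered once with strict: false and once with strict: true. The following is a minimal sketch, not a definitive query: the bucket and measurement names are the placeholders used in the examples on this page, and the yield names non_strict and strict are arbitrary labels chosen for illustration.

```flux
import "experimental/geo"

// Circular region reused from the function signature on this page.
region = {lat: 40.69335938, lon: -73.30078125, radius: 20.0}

// Placeholder source; replace the bucket and measurement with your own.
data = from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")

// Non-strict: keeps points in any S2 grid cell that the region
// at least partially covers, so some points may lie outside the region.
data
    |> geo.filterRows(region: region, strict: false)
    |> yield(name: "non_strict")

// Strict: keeps only points whose lat and lon fall inside the region.
data
    |> geo.filterRows(region: region, strict: true)
    |> yield(name: "strict")
```

If the region lines up poorly with the S2 grid, the non_strict result would be expected to contain rows that are missing from the strict result.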
Parameters
region
The region containing the desired data points. Specify record properties for the shape. See Region definitions.
*Data type: Record*
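As a quick orientation, the three region shapes used in the examples on this page can be written as records like the ones below. This is only a sketch: the variable names box, circle, and polygon are hypothetical labels, and the coordinates are copied from the examples on this page.

```flux
// Box: bounded by minimum and maximum latitude and longitude.
box = {
    minLat: 40.51757813,
    maxLat: 40.86914063,
    minLon: -73.65234375,
    maxLon: -72.94921875,
}

// Circle: a center point plus a radius.
circle = {lat: 40.69335938, lon: -73.30078125, radius: 20.0}

// Polygon: a list of points that define the outline.
polygon = {
    points: [
        {lat: 40.671659, lon: -73.936631},
        {lat: 40.706543, lon: -73.749177},
        {lat: 40.791333, lon: -73.880327},
    ],
}
```

Any one of these records can then be passed as the region parameter, for example geo.filterRows(region: box).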
minSize
Minimum number of cells that cover the specified region. Default is 24.
*Data type: Integer*
maxSize
Maximum number of cells that cover the specified region. Default is -1.
*Data type: Integer*
level
S2 cell level of grid cells. Default is -1.
*Data type: Integer*
level is mutually exclusive with minSize and maxSize and must be less than or equal to s2cellIDLevel.
s2cellIDLevel
S2 cell level used in the s2_cell_id tag. Default is -1.
*Data type: Integer*
When set to -1, geo.filterRows() attempts to automatically detect the S2 Cell ID level.
correlationKey
List of columns used to uniquely identify a row for output. Default is ["_time"].
*Data type: Array of strings*
strict
Enable strict geographic data filtering, which filters points by longitude (lon) and latitude (lat). For S2 grid cells that are partially covered by the defined region, only points with coordinates in the defined region are returned. Default is true. See Strict and non-strict filtering above.
*Data type: Boolean*
Examples
Strictly filter data in a box-shaped region
import "experimental/geo"

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")
    |> geo.filterRows(
        region: {
            minLat: 40.51757813,
            maxLat: 40.86914063,
            minLon: -73.65234375,
            maxLon: -72.94921875
        }
    )
Approximately filter data in a circular region
The following example returns points located in S2 grid cells that are partially covered by the defined region, even though some of those points may be located outside of the region.
import "experimental/geo"

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")
    |> geo.filterRows(
        region: {
            lat: 40.69335938,
            lon: -73.30078125,
            radius: 20.0
        },
        strict: false
    )
Filter data in a polygonal region
import "experimental/geo"

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")
    |> geo.filterRows(
        region: {
            points: [
                {lat: 40.671659, lon: -73.936631},
                {lat: 40.706543, lon: -73.749177},
                {lat: 40.791333, lon: -73.880327}
            ]
        }
    )
Function definition
filterRows = (
    tables=<-,
    region,
    minSize=24,
    maxSize=-1,
    level=-1,
    s2cellIDLevel=-1,
    strict=true
) => {
    // Collect the input column labels to check whether lat is already a column.
    _columns =
        tables
            |> columns(column: "_value")
            |> tableFind(fn: (key) => true)
            |> getColumn(column: "_value")
    // Filter by S2 grid cells; convert to row-wise data only if lat is not yet a column.
    _rows =
        if contains(value: "lat", set: _columns) then
            tables
                |> gridFilter(
                    region: region,
                    minSize: minSize,
                    maxSize: maxSize,
                    level: level,
                    s2cellIDLevel: s2cellIDLevel
                )
        else
            tables
                |> gridFilter(
                    region: region,
                    minSize: minSize,
                    maxSize: maxSize,
                    level: level,
                    s2cellIDLevel: s2cellIDLevel
                )
                |> toRows()
    // When strict, drop points whose lat and lon fall outside the region.
    _result =
        if strict then
            _rows
                |> strictFilter(region: region)
        else
            _rows
    return _result
}