DISABLE_STREAMING_PREAGGREGATIONS Query Option (Impala 2.5 or higher only)
Turns off the “streaming preaggregation” optimization that is available in Impala 2.5 and higher. This optimization reduces unnecessary work performed by queries that perform aggregation operations on columns with few or no duplicate values, for example DISTINCT id_column or GROUP BY unique_column. If the optimization causes regressions in existing queries that use aggregation functions, you can turn it off as needed by setting this query option.
Type: Boolean; recognized values are 1 and 0, or true and false; any other value is interpreted as false
Default: false (shown as 0 in output of SET statement)
Note: In Impala 2.5.0, only the value 1 enables the option, and the value true is not recognized. This limitation is tracked by the issue IMPALA-3334, which shows the releases where the problem is fixed.
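As a minimal sketch of the accepted values described above, the following impala-shell statements turn the option on, check it, and restore the default. Using 1 and 0 rather than true and false is a choice made here so that the same statements also work on Impala 2.5.0.

```sql
-- Turn off the streaming preaggregation optimization for this session.
-- The value 1 is used instead of true so the setting also takes effect on Impala 2.5.0.
SET DISABLE_STREAMING_PREAGGREGATIONS=1;

-- With no arguments, SET lists the current query options and their values.
SET;

-- Restore the default, which leaves the optimization enabled.
SET DISABLE_STREAMING_PREAGGREGATIONS=0;
```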
Usage notes:
Typically, queries that would require enabling this option involve very large numbers of aggregated values, such as a billion or more distinct keys being processed on each worker node.
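The following sketch shows one way to apply the option to a single high-cardinality aggregation and then restore the default. The table web_events and the column session_id are hypothetical names used only for illustration; they stand in for any table whose grouping column has very few duplicate values.

```sql
-- Hypothetical schema: web_events(session_id, ...), where session_id is nearly unique.

-- Disable the optimization only for the query that shows a regression.
SET DISABLE_STREAMING_PREAGGREGATIONS=1;

SELECT session_id, COUNT(*) AS event_count
FROM web_events
GROUP BY session_id;

-- In impala-shell, SUMMARY reports per-operator timings for the query just run,
-- which can be compared against a run with the option left at its default.
SUMMARY;

-- Re-enable the optimization for the rest of the session.
SET DISABLE_STREAMING_PREAGGREGATIONS=0;
```

Scoping the setting to the affected query, rather than leaving it on for the whole session, keeps the optimization available to other aggregation queries that benefit from it.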
Added in: Impala 2.5.0