AGG_STATE

description

  1. An AGG_STATE column cannot be used as a key column, and the signature of its aggregate function must be declared when the table is created.
  2. Users do not need to specify a length or a default value. The actual stored data size depends on the function implementation.

AGG_STATE can only be used together with the state, merge, and union function combinators.

The signature of the aggregate function is part of the type, and agg_state values with different signatures cannot be mixed. For example, if the table creation statement declares max_by(int,int), then values of max_by(bigint,int) or group_concat(varchar) cannot be inserted. The nullable attribute of each parameter is also part of the signature. If you are sure that no null values will be inserted, you can declare the parameter as not null, which gives a smaller storage size and reduces serialization/deserialization overhead.
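
The signature rule can be illustrated with a minimal sketch. The table sig_demo and its column v are hypothetical names introduced only for this illustration, and the exact error reported for the rejected statements is not shown because it is not given here:

```sql
-- Hypothetical table used only for this illustration.
create table sig_demo(
    k1 int null,
    v agg_state max_by(int not null,int)
)
aggregate key (k1)
distributed BY hash(k1) buckets 3
properties("replication_num" = "1");

-- Matches the declared signature max_by(int not null,int).
insert into sig_demo values(1, max_by_state(3, 1));

-- First argument is bigint instead of int: expected to be rejected.
insert into sig_demo values(1, max_by_state(cast(3 as bigint), 1));

-- Different aggregate function: expected to be rejected.
insert into sig_demo values(1, group_concat_state('a'));
```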

example

Create table example:

  create table a_table(
      k1 int null,
      k2 agg_state max_by(int not null,int),
      k3 agg_state group_concat(string)
  )
  aggregate key (k1)
  distributed BY hash(k1) buckets 3
  properties("replication_num" = "1");

Here k2 and k3 use max_by and group_concat as aggregation types respectively.

Insert data example:

  insert into a_table values(1,max_by_state(3,1),group_concat_state('a'));
  insert into a_table values(1,max_by_state(2,2),group_concat_state('bb'));
  insert into a_table values(2,max_by_state(1,3),group_concat_state('ccc'));

For an agg_state column, the insert statement must use the state function to generate the agg_state data, and the function and its input parameter types must exactly match the signature declared for the column.
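
The same requirement would apply when loading from another table rather than from literal values. A minimal sketch, assuming a hypothetical source table src_table whose columns x, y, and s (illustrative names, not part of the original example) have types matching the signatures declared in a_table:

```sql
-- Hypothetical source table holding the raw values.
create table src_table(
    k1 int null,
    x int not null,
    y int null,
    s string null
)
duplicate key (k1)
distributed BY hash(k1) buckets 3
properties("replication_num" = "1");

-- Wrap each raw value in the matching state function before inserting.
insert into a_table
select k1, max_by_state(x, y), group_concat_state(s) from src_table;
```

Here x is declared not null because the k2 column was declared as max_by(int not null,int).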

Select data example:

  mysql [test]>select k1,max_by_merge(k2),group_concat_merge(k3) from a_table group by k1 order by k1;
  +------+--------------------+--------------------------+
  | k1   | max_by_merge(`k2`) | group_concat_merge(`k3`) |
  +------+--------------------+--------------------------+
  |    1 |                  2 | bb,a                     |
  |    2 |                  1 | ccc                      |
  +------+--------------------+--------------------------+

To get the actual result from an agg_state column, use the corresponding merge function.

  mysql [test]>select max_by_merge(u2),group_concat_merge(u3) from (
  select k1,max_by_union(k2) as u2,group_concat_union(k3) u3 from a_table group by k1 order by k1
  ) t;
  +--------------------+--------------------------+
  | max_by_merge(`u2`) | group_concat_merge(`u3`) |
  +--------------------+--------------------------+
  |                  1 | ccc,bb,a                 |
  +--------------------+--------------------------+

To aggregate agg_state values without computing the actual result at an intermediate step, use the union function.
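
Because union returns agg_state rather than a final value, its output could in principle be stored in another agg_state column and merged later. A minimal sketch, assuming a hypothetical table a_table_rollup (a name introduced only for this illustration) that declares the same signatures as a_table:

```sql
-- Hypothetical target table with the same agg_state signatures as a_table.
create table a_table_rollup(
    k1 int null,
    k2 agg_state max_by(int not null,int),
    k3 agg_state group_concat(string)
)
aggregate key (k1)
distributed BY hash(k1) buckets 3
properties("replication_num" = "1");

-- union combines the states per group and still yields agg_state.
insert into a_table_rollup
select k1, max_by_union(k2), group_concat_union(k3) from a_table group by k1;

-- merge produces the actual values only when they are needed.
select k1, max_by_merge(k2), group_concat_merge(k3)
from a_table_rollup group by k1 order by k1;
```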

For more examples, see datatype_p0/agg_state

keywords

  AGG_STATE