GROUPING_ID

Name

GROUPING_ID

Description

Computes the level of grouping. When a SQL statement contains a GROUP BY clause, GROUPING_ID can be used in the SELECT <select> list, HAVING, or ORDER BY clauses.

Syntax

    GROUPING_ID ( <column_expression>[ ,...n ] )

Arguments

<column_expression>

A column or expression that appears in the GROUP BY clause.

Return Type

BIGINT

Remarks

The <column_expression> arguments of GROUPING_ID must match the expressions in the GROUP BY clause. For example, if you group by user_id, write GROUPING_ID (user_id); if you group by name, write GROUPING_ID (name).

Comparing GROUPING_ID() to GROUPING()

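GROUPING_ID(<column_expression> [ ,...n ]) is computed as follows: GROUPING(<column_expression>) is evaluated for each column (or expression) in the argument list, and the results are concatenated into a string of 0s and 1s. That string is read as a binary number, and GROUPING_ID returns its decimal value. Taking the statement SELECT a, b, c, SUM(d), GROUPING_ID(a,b,c) FROM T GROUP BY <group by list> as an example, the table below shows the inputs and outputs of GROUPING_ID().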

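+--------------------+-----------------------------------------------------------------------+-----------------------+
| Columns aggregated | GROUPING_ID (a, b, c) input = GROUPING(a) + GROUPING(b) + GROUPING(c) | GROUPING_ID () output |
+--------------------+-----------------------------------------------------------------------+-----------------------+
| a                  | 100                                                                   | 4                     |
| b                  | 010                                                                   | 2                     |
| c                  | 001                                                                   | 1                     |
| ab                 | 110                                                                   | 6                     |
| ac                 | 101                                                                   | 5                     |
| bc                 | 011                                                                   | 3                     |
| abc                | 111                                                                   | 7                     |
+--------------------+-----------------------------------------------------------------------+-----------------------+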
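
The conversion from GROUPING() bits to the decimal result can be reproduced outside the database. The following is a minimal Python sketch of that arithmetic only, not Doris code; the helper name `grouping_id` and the `rows` dictionary are illustrative assumptions.

```python
def grouping_id(grouping_bits):
    """Combine per-column GROUPING() results (each 0 or 1) into one integer.

    The first argument becomes the most significant bit, the last argument
    the least significant bit.
    """
    value = 0
    for bit in grouping_bits:
        if bit not in (0, 1):
            raise ValueError("each GROUPING() result must be 0 or 1")
        value = (value << 1) | bit
    return value


# Aggregated columns -> (GROUPING(a), GROUPING(b), GROUPING(c)),
# copied from the table above.
rows = {
    "a": (1, 0, 0),
    "b": (0, 1, 0),
    "c": (0, 0, 1),
    "ab": (1, 1, 0),
    "ac": (1, 0, 1),
    "bc": (0, 1, 1),
    "abc": (1, 1, 1),
}

results = {name: grouping_id(bits) for name, bits in rows.items()}
for name, value in results.items():
    print(name, value)
```

Each printed value should match the output column of the table for the same row.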

Technical Definition of GROUPING_ID()

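Each GROUPING_ID argument must be a column (or column expression) from the GROUP BY clause. GROUPING_ID() returns an integer bitmap in which each bit corresponds to one of those columns (or column expressions): the lowest bit represents the Nth argument, the second-lowest bit represents argument N-1, and so on. A bit set to 1 means that the corresponding column is aggregated in that row rather than used for grouping.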
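
Reading the bitmap in the other direction can make the bit ordering easier to see. The sketch below is a hedged Python illustration, not part of Doris; `aggregated_columns` is a hypothetical helper that takes a GROUPING_ID value and the argument list and returns the columns whose bit is 1.

```python
def aggregated_columns(grouping_id_value, columns):
    """Return the columns whose bit is set to 1 in a GROUPING_ID bitmap.

    `columns` lists the GROUPING_ID arguments in order. The last argument
    maps to the lowest bit, the one before it to the second-lowest bit,
    and so on.
    """
    n = len(columns)
    if not 0 <= grouping_id_value < (1 << n):
        raise ValueError("value does not fit in a bitmap of this width")
    return [
        col
        for position, col in enumerate(columns)
        if (grouping_id_value >> (n - 1 - position)) & 1
    ]


decoded = aggregated_columns(5, ["a", "b", "c"])
print(decoded)  # 5 is binary 101, so this prints ['a', 'c']
```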

GROUPING_ID() Equivalents

When grouping by multiple columns (or column expressions), the following two statements are equivalent:

Statement A:

    SELECT GROUPING_ID(A,B)
    FROM T
    GROUP BY CUBE(A,B)

Statement B:

    SELECT 3 FROM T GROUP BY ()
    UNION ALL
    SELECT 1 FROM T GROUP BY A
    UNION ALL
    SELECT 2 FROM T GROUP BY B
    UNION ALL
    SELECT 0 FROM T GROUP BY A,B

When grouping by a single column (or column expression), GROUPING (<column_expression>) and GROUPING_ID (<column_expression>) are equivalent.
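
The constants 3, 1, 2, and 0 in Statement B follow from the bitmap rule: each grouping set produced by CUBE gets a 1 bit for every column it leaves out. A small Python sketch of that enumeration follows. It is an illustration rather than Doris code, and `cube_grouping_ids` is a hypothetical helper name.

```python
from itertools import combinations


def cube_grouping_ids(columns):
    """Map every grouping set of CUBE(columns) to its GROUPING_ID value."""
    result = {}
    for size in range(len(columns) + 1):
        for kept in combinations(columns, size):
            value = 0
            for col in columns:
                # The bit is 1 when the column is NOT in the grouping set,
                # that is, when the column is aggregated.
                value = (value << 1) | (0 if col in kept else 1)
            result[kept] = value
    return result


cube = cube_grouping_ids(["A", "B"])
for kept, value in cube.items():
    print("GROUP BY", ", ".join(kept) or "()", "->", value)
```

For the two columns A and B, the four grouping sets map to the same four constants that Statement B selects.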

Example

Before starting the examples, prepare the following data:

    CREATE TABLE employee (
        uid INT,
        name VARCHAR(32),
        level VARCHAR(32),
        title VARCHAR(32),
        department VARCHAR(32),
        hiredate DATE
    )
    UNIQUE KEY(uid)
    DISTRIBUTED BY HASH(uid) BUCKETS 1
    PROPERTIES (
        "replication_num" = "1"
    );

    INSERT INTO employee VALUES
        (1, 'Abby', 'Senior', 'President', 'Board of Directors', '1999-11-13'),
        (2, 'Bob', 'Senior', 'Vice-President', 'Board of Directors', '1999-11-13'),
        (3, 'Candy', 'Senior', 'System Engineer', 'Technology', '2005-3-7'),
        (4, 'Devere', 'Senior', 'Hardware Engineer', 'Technology', '2006-7-9'),
        (5, 'Emilie', 'Senior', 'System Analyst', 'Technology', '2003-8-28'),
        (6, 'Fredrick', 'Senior', 'Sales Manager', 'Sales', '2004-9-7'),
        (7, 'Gitel', 'Assistant', 'Business Executive', 'Sales', '2003-3-19'),
        (8, 'Haden', 'Trainee', 'Sales Assistant', 'Sales', '2007-6-30'),
        (9, 'Irene', 'Assistant', 'Business Executive', 'Sales', '2005-10-20'),
        (10, 'Jankin', 'Senior', 'Marketing Supervisor', 'Marketing', '2001-4-13'),
        (11, 'Louis', 'Trainee', 'Marketing Assistant', 'Marketing', '2007-8-2'),
        (12, 'Martin', 'Trainee', 'Marketing Assistant', 'Marketing', '2007-7-1'),
        (13, 'Nasir', 'Assistant', 'Marketing Executive', 'Marketing', '2004-9-3');

The table then contains the following rows:

    +------+----------+-----------+----------------------+--------------------+------------+
    | uid  | name     | level     | title                | department         | hiredate   |
    +------+----------+-----------+----------------------+--------------------+------------+
    |    1 | Abby     | Senior    | President            | Board of Directors | 1999-11-13 |
    |    2 | Bob      | Senior    | Vice-President       | Board of Directors | 1999-11-13 |
    |    3 | Candy    | Senior    | System Engineer      | Technology         | 2005-03-07 |
    |    4 | Devere   | Senior    | Hardware Engineer    | Technology         | 2006-07-09 |
    |    5 | Emilie   | Senior    | System Analyst       | Technology         | 2003-08-28 |
    |    6 | Fredrick | Senior    | Sales Manager        | Sales              | 2004-09-07 |
    |    7 | Gitel    | Assistant | Business Executive   | Sales              | 2003-03-19 |
    |    8 | Haden    | Trainee   | Sales Assistant      | Sales              | 2007-06-30 |
    |    9 | Irene    | Assistant | Business Executive   | Sales              | 2005-10-20 |
    |   10 | Jankin   | Senior    | Marketing Supervisor | Marketing          | 2001-04-13 |
    |   11 | Louis    | Trainee   | Marketing Assistant  | Marketing          | 2007-08-02 |
    |   12 | Martin   | Trainee   | Marketing Assistant  | Marketing          | 2007-07-01 |
    |   13 | Nasir    | Assistant | Marketing Executive  | Marketing          | 2004-09-03 |
    +------+----------+-----------+----------------------+--------------------+------------+
    13 rows in set (0.01 sec)

A. Using GROUPING_ID to identify grouping levels

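The following example counts employees by department and level. GROUPING_ID() computes the aggregation level of each row, and the Job Title column is derived from its result.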

    SELECT
        department,
        CASE
            WHEN GROUPING_ID(department, level) = 0 THEN level
            WHEN GROUPING_ID(department, level) = 1 THEN CONCAT('Total: ', department)
            WHEN GROUPING_ID(department, level) = 3 THEN 'Total: Company'
            ELSE 'Unknown'
        END AS 'Job Title',
        COUNT(uid) AS 'Employee Count'
    FROM employee
    GROUP BY ROLLUP(department, level)
    ORDER BY GROUPING_ID(department, level) ASC;

The result is as follows:

    +--------------------+---------------------------+----------------+
    | department         | Job Title                 | Employee Count |
    +--------------------+---------------------------+----------------+
    | Board of Directors | Senior                    |              2 |
    | Technology         | Senior                    |              3 |
    | Sales              | Senior                    |              1 |
    | Sales              | Assistant                 |              2 |
    | Sales              | Trainee                   |              1 |
    | Marketing          | Senior                    |              1 |
    | Marketing          | Trainee                   |              2 |
    | Marketing          | Assistant                 |              1 |
    | Board of Directors | Total: Board of Directors |              2 |
    | Technology         | Total: Technology         |              3 |
    | Sales              | Total: Sales              |              4 |
    | Marketing          | Total: Marketing          |              4 |
    | NULL               | Total: Company            |             13 |
    +--------------------+---------------------------+----------------+
    13 rows in set (0.01 sec)
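
To see why ROLLUP(department, level) yields only the GROUPING_ID values 0, 1, and 3 here, the grouping can be emulated in a few lines of Python. This is a rough sketch of the counting logic under the sample data above, not a description of how Doris executes the query; `rollup_counts` and `job_title` are hypothetical helpers.

```python
from collections import Counter

# (department, level) pairs copied from the employee table above.
employees = [
    ("Board of Directors", "Senior"),
    ("Board of Directors", "Senior"),
    ("Technology", "Senior"),
    ("Technology", "Senior"),
    ("Technology", "Senior"),
    ("Sales", "Senior"),
    ("Sales", "Assistant"),
    ("Sales", "Trainee"),
    ("Sales", "Assistant"),
    ("Marketing", "Senior"),
    ("Marketing", "Trainee"),
    ("Marketing", "Trainee"),
    ("Marketing", "Assistant"),
]


def rollup_counts(rows):
    """Count rows per ROLLUP(department, level) group.

    Keys are (grouping_id, department, level); rolled-up columns are None.
    """
    counts = Counter()
    for department, level in rows:
        counts[(0, department, level)] += 1  # grouped by both: binary 00
        counts[(1, department, None)] += 1   # level rolled up: binary 01
        counts[(3, None, None)] += 1         # both rolled up: binary 11
    return counts


def job_title(grouping_id, department, level):
    """Mirror the CASE expression from the query above."""
    if grouping_id == 0:
        return level
    if grouping_id == 1:
        return "Total: " + department
    if grouping_id == 3:
        return "Total: Company"
    return "Unknown"


counts = rollup_counts(employees)
# Sort by grouping_id only, like ORDER BY GROUPING_ID(department, level).
for (gid, department, level), n in sorted(counts.items(), key=lambda kv: kv[0][0]):
    print(department, "|", job_title(gid, department, level), "|", n)
```

The value 2 never appears because ROLLUP does not produce a grouping set that keeps level while rolling up department.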

B. Using GROUPING_ID to filter a result set

The following query returns only the rows for senior-level employees in each department.

    SELECT
        department,
        CASE
            WHEN GROUPING_ID(department, level) = 0 THEN level
            WHEN GROUPING_ID(department, level) = 1 THEN CONCAT('Total: ', department)
            WHEN GROUPING_ID(department, level) = 3 THEN 'Total: Company'
            ELSE 'Unknown'
        END AS 'Job Title',
        COUNT(uid)
    FROM employee
    GROUP BY ROLLUP(department, level)
    HAVING `Job Title` IN ('Senior');

The result is as follows:

    +--------------------+-----------+--------------+
    | department         | Job Title | count(`uid`) |
    +--------------------+-----------+--------------+
    | Board of Directors | Senior    |            2 |
    | Technology         | Senior    |            3 |
    | Sales              | Senior    |            1 |
    | Marketing          | Senior    |            1 |
    +--------------------+-----------+--------------+
    4 rows in set (0.01 sec)

Keywords

GROUPING_ID

Best Practice

For more information, see: