### 分析函数（窗口函数）

#### 分析函数介绍

function(args) OVER(partition_by_clause order_by_clause [window_clause])partition_by_clause ::= PARTITION BY expr [, expr ...]order_by_clause ::= ORDER BY expr [ASC | DESC] [, expr [ASC | DESC] ...]

window_clause: 见后面Window Clause)

Function

PARTITION BY从句

Partition By从句和Group By类似。它把输入行按照指定的一列或多列分组，相同值的行会被分到一组。

ORDER BY从句

Order By从句和外层的Order By基本一致。它定义了输入行的排列顺序，如果指定了Partition By，则Order By定义了每个Partition分组内的顺序。与外层Order By的唯一不同点是，OVER从句中的Order By n（n是正整数）相当于不做任何操作，而外层的Order By n表示按照第n列排序。

SELECTrow_number() OVER (ORDER BY date_and_time) AS id,c1, c2, c3, c4FROM events;

Window从句

Window从句用来为分析函数指定一个运算范围，以当前行为准，前后若干行作为分析函数运算的对象。Window从句支持的方法有：AVG(), COUNT(), FIRST_VALUE(), LAST_VALUE()和SUM()。对于 MAX()和MIN(), window从句可以指定开始范围UNBOUNDED PRECEDING

ROWS BETWEEN [ { m | UNBOUNDED } PRECEDING | CURRENT ROW] [ AND [CURRENT ROW | { UNBOUNDED | n } FOLLOWING] ]

  create table stock_ticker (stock_symbol string, closing_price decimal(8,2), closing_date timestamp);  ...load some data...  select * from stock_ticker order by stock_symbol, closing_date| stock_symbol | closing_price | closing_date ||--------------|---------------|---------------------|| JDR | 12.86 | 2014-10-02 00:00:00 || JDR | 12.89 | 2014-10-03 00:00:00 || JDR | 12.94 | 2014-10-04 00:00:00 || JDR | 12.55 | 2014-10-05 00:00:00 || JDR | 14.03 | 2014-10-06 00:00:00 || JDR | 14.75 | 2014-10-07 00:00:00 || JDR | 13.98 | 2014-10-08 00:00:00 |

  select stock_symbol, closing_date, closing_price,  avg(closing_price) over (partition by stock_symbol order by closing_date  rows between 1 preceding and 1 following) as moving_average  from stock_ticker;| stock_symbol | closing_date | closing_price | moving_average ||--------------|---------------------|---------------|----------------|| JDR | 2014-10-02 00:00:00 | 12.86 | 12.87 || JDR | 2014-10-03 00:00:00 | 12.89 | 12.89 || JDR | 2014-10-04 00:00:00 | 12.94 | 12.79 || JDR | 2014-10-05 00:00:00 | 12.55 | 13.17 || JDR | 2014-10-06 00:00:00 | 14.03 | 13.77 || JDR | 2014-10-07 00:00:00 | 14.75 | 14.25 || JDR | 2014-10-08 00:00:00 | 13.98 | 14.36 |

#### Function使用举例

##### AVG()

AVG([DISTINCT | ALL] expression) [OVER (analytic_clause)]

  select x, property,  avg(x) over  (  partition by property  order by x  rows between 1 preceding and 1 following  ) as 'moving average'  from int_t where property in ('odd','even');| x | property | moving average ||----|----------|----------------|| 2 | even | 3 || 4 | even | 4 || 6 | even | 6 || 8 | even | 8 || 10 | even | 9 || 1 | odd | 2 || 3 | odd | 3 || 5 | odd | 5 || 7 | odd | 7 || 9 | odd | 8 |
##### COUNT()

COUNT([DISTINCT | ALL] expression) [OVER (analytic_clause)]

  select x, property,  count(x) over  (  partition by property  order by x  rows between unbounded preceding and current row  ) as 'cumulative total'  from int_t where property in ('odd','even');| x | property | cumulative count ||----|----------|------------------|| 2 | even | 1 || 4 | even | 2 || 6 | even | 3 || 8 | even | 4 || 10 | even | 5 || 1 | odd | 1 || 3 | odd | 2 || 5 | odd | 3 || 7 | odd | 4 || 9 | odd | 5 |
##### DENSE_RANK()

DENSE_RANK()函数用来表示排名，与RANK()不同的是，DENSE_RANK()不会出现空缺数字。比如，如果出现了两个并列的1，DENSE_RANK()的第三个数仍然是2，而RANK()的第三个数是3。

DENSE_RANK() OVER(partition_by_clause order_by_clause)

   select x, y, dense_rank() over(partition by x order by y) as rank from int_t;| x | y | rank ||----|------|----------|| 1 | 1 | 1 || 1 | 2 | 2 || 1 | 2 | 2 || 2 | 1 | 1 || 2 | 2 | 2 || 2 | 3 | 3 || 3 | 1 | 1 || 3 | 1 | 1 || 3 | 2 | 2 |
##### FIRST_VALUE()

FIRST_VALUE()返回窗口范围内的第一个值。

FIRST_VALUE(expr) OVER(partition_by_clause order_by_clause [window_clause])

   select name, country, greeting from mail_merge;| name | country | greeting ||---------|---------|--------------|| Pete | USA | Hello || John | USA | Hi || Boris | Germany | Guten tag || Michael | Germany | Guten morgen || Bjorn | Sweden | Hej || Mats | Sweden | Tja |

   select country, name,   first_value(greeting)   over (partition by country order by name, greeting) as greeting from mail_merge;| country | name | greeting ||---------|---------|-----------|| Germany | Boris | Guten tag || Germany | Michael | Guten tag || Sweden | Bjorn | Hej || Sweden | Mats | Hej || USA | John | Hi || USA | Pete | Hi |
##### LAG()

LAG()方法用来计算当前行向前数若干行的值。

LAG (expr, offset, default) OVER (partition_by_clause order_by_clause)

   select stock_symbol, closing_date, closing_price,   lag(closing_price,1, 0) over (partition by stock_symbol order by closing_date) as "yesterday closing"   from stock_ticker   order by closing_date;| stock_symbol | closing_date | closing_price | yesterday closing ||--------------|---------------------|---------------|-------------------|| JDR | 2014-09-13 00:00:00 | 12.86 | 0 || JDR | 2014-09-14 00:00:00 | 12.89 | 12.86 || JDR | 2014-09-15 00:00:00 | 12.94 | 12.89 || JDR | 2014-09-16 00:00:00 | 12.55 | 12.94 || JDR | 2014-09-17 00:00:00 | 14.03 | 12.55 || JDR | 2014-09-18 00:00:00 | 14.75 | 14.03 || JDR | 2014-09-19 00:00:00 | 13.98 | 14.75
##### LAST_VALUE()

LAST_VALUE()返回窗口范围内的最后一个值。与FIRST_VALUE()相反。

LAST_VALUE(expr) OVER(partition_by_clause order_by_clause [window_clause])

   select country, name,   last_value(greeting)   over (partition by country order by name, greeting) as greeting   from mail_merge;| country | name | greeting ||---------|---------|--------------|| Germany | Boris | Guten morgen || Germany | Michael | Guten morgen || Sweden | Bjorn | Tja || Sweden | Mats | Tja || USA | John | Hello || USA | Pete | Hello

LEAD (expr, offset, default]) OVER (partition_by_clause order_by_clause)

   select stock_symbol, closing_date, closing_price,   case   (lead(closing_price,1, 0)   over (partition by stock_symbol order by closing_date)-closing_price) > 0   when true then "higher"   when false then "flat or lower"   end as "trending"   from stock_ticker   order by closing_date;| stock_symbol | closing_date | closing_price | trending ||--------------|---------------------|---------------|---------------|| JDR | 2014-09-13 00:00:00 | 12.86 | higher || JDR | 2014-09-14 00:00:00 | 12.89 | higher || JDR | 2014-09-15 00:00:00 | 12.94 | flat or lower || JDR | 2014-09-16 00:00:00 | 12.55 | higher || JDR | 2014-09-17 00:00:00 | 14.03 | higher || JDR | 2014-09-18 00:00:00 | 14.75 | flat or lower || JDR | 2014-09-19 00:00:00 | 13.98 | flat or lower |
##### MAX()

MAX([DISTINCT | ALL] expression) [OVER (analytic_clause)]

   select x, property,   max(x) over   (   order by property, x   rows between unbounded preceding and 1 following   ) as 'local maximum'   from int_t where property in ('prime','square');| x | property | local maximum ||---|----------|---------------|| 2 | prime | 3 || 3 | prime | 5 || 5 | prime | 7 || 7 | prime | 7 || 1 | square | 7 || 4 | square | 9 || 9 | square | 9 |
##### MIN()

MIN([DISTINCT | ALL] expression) [OVER (analytic_clause)]

   select x, property,   min(x) over   (   order by property, x desc   rows between unbounded preceding and 1 following   ) as 'local minimum'   from int_t where property in ('prime','square');| x | property | local minimum ||---|----------|---------------|| 7 | prime | 5 || 5 | prime | 3 || 3 | prime | 2 || 2 | prime | 2 || 9 | square | 2 || 4 | square | 1 || 1 | square | 1 |
##### RANK()

RANK()函数用来表示排名，与DENSE_RANK()不同的是，RANK()会出现空缺数字。比如，如果出现了两个并列的1， RANK()的第三个数就是3，而不是2。

RANK() OVER(partition_by_clause order_by_clause)

   select x, y, rank() over(partition by x order by y) as rank from int_t;| x | y | rank ||----|------|----------|| 1 | 1 | 1 || 1 | 2 | 2 || 1 | 2 | 2 || 2 | 1 | 1 || 2 | 2 | 2 || 2 | 3 | 3 || 3 | 1 | 1 || 3 | 1 | 1 || 3 | 2 | 3 |
##### ROW_NUMBER()

ROW_NUMBER() OVER(partition_by_clause order_by_clause)

   select x, y, row_number() over(partition by x order by y) as rank from int_t;| x | y | rank ||---|------|----------|| 1 | 1 | 1 || 1 | 2 | 2 || 1 | 2 | 3 || 2 | 1 | 1 || 2 | 2 | 2 || 2 | 3 | 3 || 3 | 1 | 1 || 3 | 1 | 2 || 3 | 2 | 3 |
##### SUM()

SUM([DISTINCT | ALL] expression) [OVER (analytic_clause)]

   select x, property,   sum(x) over   (   partition by property   order by x   rows between 1 preceding and 1 following   ) as 'moving total'   from int_t where property in ('odd','even');| x | property | moving total ||----|----------|--------------|| 2 | even | 6 || 4 | even | 12 || 6 | even | 18 || 8 | even | 24 || 10 | even | 18 || 1 | odd | 4 || 3 | odd | 9 || 5 | odd | 15 || 7 | odd | 21 || 9 | odd | 16 |