第八章_函数【窗口函数】

1. 开窗函数的作用
说明 : 开窗函数就是 根据指定的开窗规则 为表的每条记录,标记状态,不会增加或者减少表的记录数
2. 语法
分析函数、聚合函数 over(partition by 列名 order by 列名 rows between 开始位置 and 结束位置)
3. 分区 : partition by 列名
说明 : 按照指定字段分区,可以为多个字段,相当于 group by
不指定分区时,操作的是全表

4. 排序 : order by 列名
说明 : 按照指定字段,进行排序
如果指定分区,则是分区内排序,不指定分区,则是全表排序
5. 开窗范围 : rows between 开始位置 and 结束位置
说明 : 计算的范围
注意 : 不指定时,默认范围: 第一行到当前行

位置参数 :
current row:当前行
n preceding:往前 n 行数据
n following:往后 n 行数据
unbounded:起点,
unbounded preceding 表示该窗口最前面的行(起点)
unbounded following 表示该窗口最后面的行(终点)

示例 :
rows between unbounded preceding and current row(表示从起点到当前行)
rows between 2 preceding and 1 following(表示往前2行到往后1行)
rows between 2 preceding and 1 current row(表示往前2行到当前行)
rows between current row and unbounded following(表示当前行到终点)
第八章_函数【窗口函数】
with t1 as (
select 'huawei' as channel ,'2021-08-01' as credit_date,5 as score
union all
select 'huawei' as channel ,'2021-08-01' as credit_date,5 as score
union all
select 'huawei' as channel ,'2021-08-03' as credit_date,12 as score
union all
select 'huawei' as channel ,'2021-08-04' as credit_date,25 as score
union all
select 'huawei' as channel ,'2021-08-02' as credit_date,1 as score
union all
select 'huawei' as channel ,'2021-08-07' as credit_date,7 as score
union all
select 'huawei' as channel ,'2021-08-08' as credit_date,10 as score
union all
select 'huawei' as channel ,'2021-08-06' as credit_date,33 as score
union all
select 'vivo' as channel ,'2021-09-01' as credit_date,1 as score
union all
select 'vivo' as channel ,'2021-09-02' as credit_date,2 as score
union all
select 'vivo' as channel ,'2021-09-04' as credit_date,5 as score
union all
select 'vivo' as channel ,'2021-09-09' as credit_date,9 as score
union all
select 'vivo' as channel ,'2021-09-07' as credit_date,77 as score
union all
select 'vivo' as channel ,'2021-09-08' as credit_date,10 as score
union all
select 'vivo' as channel ,'2021-09-11' as credit_date,3 as score
)
select *
,sum(t1.score) over(partition by t1.channel
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as sum1 -- 开窗范围:首行~当前行(默认参数为此参数)
,sum(t1.score) over(partition by t1.channel
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
    ) as sum2 -- 开窗范围:当前行~末尾行
from t1
;
channel credit_date   score     sum1    sum2
huawei  2021-08-01      5       5       98
huawei  2021-08-06      33      38      93
huawei  2021-08-08      10      48      60
huawei  2021-08-07      7       55      50
huawei  2021-08-02      1       56      43
huawei  2021-08-04      25      81      42
huawei  2021-08-03      12      93      17
huawei  2021-08-01      5       98      5
vivo    2021-09-08      10      10      107
vivo    2021-09-07      77      87      97
vivo    2021-09-09      9       96      20
vivo    2021-09-04      5       101     11
vivo    2021-09-02      2       103     6
vivo    2021-09-01      1       104     4
vivo    2021-09-11      3       107     3
View Code

6. 聚合函数
count、sum、min、max、avg
7. 分析函数
rank : 跳跃排序 当排序字段值相同时, 自增序号重复,且会跳过下一个序号 示例(1,2,3,4,4,6)
row_number : 连续排序 当排序字段值相同时, 自增序号不会重复 示例(1,2,3,4,5,6)
dense_rank : 密集排序 当排序字段值相同时, 自增序号重复,且不会跳过下一个序号 示例(1,2,3,4,4,5)
cume_dist : <= 当前值的记录数据 / 分组内总记录数
percent_rank : 分组内当前行的RANK值-1/分组内总行数-1
ntile(x) : 将窗口内数据进行切片(根据序号),并返回该记录的切片编号
当切片不均时,默认增加第一个切片的分布
-- 7.1 排序函数示例
-- rank、dense_rank、row_number
第八章_函数【窗口函数】
-- 排序函数示例
-- rank、dense_rank、row_number
with t1 as (
select 'huawei' as channel ,'2021-08-01' as credit_date,5 as score
union all
select 'huawei' as channel ,'2021-08-01' as credit_date,5 as score
union all
select 'huawei' as channel ,'2021-08-03' as credit_date,12 as score
union all
select 'huawei' as channel ,'2021-08-04' as credit_date,25 as score
union all
select 'huawei' as channel ,'2021-08-02' as credit_date,1 as score
union all
select 'huawei' as channel ,'2021-08-07' as credit_date,7 as score
union all
select 'huawei' as channel ,'2021-08-08' as credit_date,10 as score
union all
select 'huawei' as channel ,'2021-08-06' as credit_date,33 as score
union all
select 'vivo' as channel ,'2021-09-01' as credit_date,1 as score
union all
select 'vivo' as channel ,'2021-09-02' as credit_date,2 as score
union all
select 'vivo' as channel ,'2021-09-04' as credit_date,5 as score
union all
select 'vivo' as channel ,'2021-09-09' as credit_date,9 as score
union all
select 'vivo' as channel ,'2021-09-07' as credit_date,77 as score
union all
select 'vivo' as channel ,'2021-09-08' as credit_date,10 as score
union all
select 'vivo' as channel ,'2021-09-11' as credit_date,3 as score
)
select *
,rank() over(partition by t1.channel order by t1.score) as rank
,dense_rank() over(partition by t1.channel order by t1.score) as dense_rank
,row_number() over(partition by t1.channel order by t1.score) as row_number

from t1;
channel credit_date   score    rank dense_rank      row_number
huawei  2021-08-02      1       1       1           1
huawei  2021-08-01      5       2       2           2
huawei  2021-08-01      5       2       2           3
huawei  2021-08-07      7       4       3           4
huawei  2021-08-08      10      5       4           5
huawei  2021-08-03      12      6       5           6
huawei  2021-08-04      25      7       6           7
huawei  2021-08-06      33      8       7           8
vivo    2021-09-01      1       1       1           1
vivo    2021-09-02      2       2       2           2
vivo    2021-09-11      3       3       3           3
vivo    2021-09-04      5       4       4           4
vivo    2021-09-09      9       5       5           5
vivo    2021-09-08      10      6       6           6
vivo    2021-09-07      77      7       7           7
View Code

-- 7.2 统计当前记录组内占比
-- cume_dist
















上一篇:中国神经酸市场预测与投资战略报告(2022版)


下一篇:中国中性粒细胞减少治疗市场趋势报告、技术动态创新及市场预测