1、首先使用之前的数据,查找订单信息要求按用户id分组,消费金额进行降序排序。
select *
FROM
( select id, order_money, member_id, create_time, row_number() over(partition by member_id ORDER BY order_money desc) as row_num
from dm_order_2018
) t;
结果:
image.png
2、窗口命名
select *
FROM
( select id,order_money,member_id,create_time,
row_number() over w as row_num
from dm_order_2018
window w as (partition by member_id order by order_money desc)
) t;
查询结果一样
3、窗口滑动示例
select *
FROM
( select id, order_money, member_id, create_time,
avg(order_money) over w as avg_num,
sum(order_money) over w as sum_num
from dm_order_2018
window w as (partition by member_id ORDER BY create_time DESC rows BETWEEN 1 preceding AND 1 following)
) t;
结果:
image.png
从结果可以看出,id为A001订单属于边界值,没有前一行,因此平均订单金额为(100.15+100.15)/2=100.15;
第二行id为A001的订 单前后都有订单,所以平均订单金额为(100.15+100.15+88.15/3=96.15;
以此类推就可以得到一个基于滑动窗口的动态 平均订单值。此例中,窗口函数用到了传统的聚合函数avg(),用来计算动态的平均值。
4、序号函数
查询排序小于等于3的订单
select *
FROM
( select id, order_money, member_id, create_time,
row_number() over(partition by member_id ORDER BY order_money desc) as row_num
from dm_order_2018
) t where row_num <= 3;
结果
image.png
多种排序方式
select *
FROM
( select id, order_money, member_id, create_time,
row_number() over(partition by member_id ORDER BY order_money desc) as row_num1,
rank() over(partition by member_id ORDER BY order_money desc) as row_num2,
dense_rank() over(partition by member_id ORDER BY order_money desc) as row_num3
from dm_order_2018
) t;
结果
image.png
总结:
即rank()会产生序号相同的记录,同时可能产生序号间隙;而dense_rank()也会产生序号相同的记录,但不会产生序号间隙。
5、分布函数
select *
FROM
( select id, order_money, member_id, create_time,
row_number() over w as row_num,
percent_rank() over w as per
from dm_order_2018
window w as (partition by member_id ORDER BY create_time DESC)
) t;
结果
image.png
select *
FROM
( select id, order_money, member_id, create_time,
row_number() over w as row_num,
cume_dist() over w as percent
from dm_order_2018
window w as (partition by member_id ORDER BY create_time DESC)
) t;
结果
image.png
总结:
CUME_DIST 小于等于当前值的行数/分组内总行数。
PERCENT_RANK 分组内当前行的RANK值-1/分组内总行数-1。