MySQL开窗函数

一，开窗函数：

开窗函数：为将要被操作的行的集合定义一个窗口，它对一组值进行操作，不需要使用GROUP BY子句对数据进行分组，能够在同一行中同时返回基础行的列和聚合列。

开窗函数又称OLAP函数（Online Analytical Processing），MySql在8.0的版本(2016年)增加了对开窗函数的支持。

注：在Oracle中称为分析函数。在MySQL中称为开窗函数。

二，开窗函数分类

1，开窗函数大体分两类：

a，聚合开窗函数：（sum,avg,count,max,min）
聚合函数()+over() ，聚合之后返回多行。
b，专用开窗函数：(Rank,Dense_Rank,Row_Number 等)

注意：聚合开窗函数只能使用PARTITION BY子句，ORDER BY不能与聚合开窗函数一同使用。

2，专用开窗函数(按功能划分)

-- 序号函数/排名函数：
ROW_NUMBER()                排序：1，2，3
RANK()                               排序：1，1，3
DENSE_RANK()                    排序：1，1，2

-- 分布函数：
PERCENT_RANK()              (rank-1)/(rows-1)
CUME_DIST()                     <=当前rank值的函数/总函数

-- 前后函数：
LAG(expr,n)                        返回当前行的前n行的expr值；
LEAD(expr,n)                        返回当前行的后n行的expr值；

-- 头尾函数：
FIRST_VALUE(expr)             返回第一个expr值；
LAST_VALUE(expr)                返回最后个expr值；

-- 其他函数：
NTH_VALU(expr,n)            返回第n个expr值；
NTILE(n)                                将有序数据分为n个桶，记录等级数。返回每一行在第几桶。

3，窗口函数和普通聚合函数的区别

普通的聚合函数，聚合的行集是组，开窗函数聚合的行集是窗口。因此，普通的聚合函数每组(Group by)只返回一个值，而开窗函数则可为窗口中的每行都返回一个值。简单理解，就是对查询的结果多出一列，这一列可以是聚合值，也可以是排序值。

窗口函数和普通聚合函数的区别:

1,聚合函数是将多条记录聚合为一条；窗口函数是每条记录都会执行，有几条记录执行完还是几条。
2,聚合函数也可以用于窗口函数。

4，row_number,cume_dist,percent_rank 的区别

row_number()：显示行号

cume_dist：计算某个值在一组有序的数据中累计的分布

 计算结果为相对位置/总行数,返回值为(0,1]
注意：对于重复值，计算的时候，取重复值的最后一行的位置

percent_rank：和cume_dist 的不同点在于计算分布结果的方法
计算方法为（相对位置-1）/（总行数-1）
```
注意：对于重复值，计算的时候，取重复值的第一行的位
```

三，开窗函数的语法结构

# Key word :Partiton by & order by
<开窗函数> over ([PARTITION by <列清单>]
                     Order by <排序用列清单>）
# 注意：[]中的内容可以不用写。

四，应用举例

1,准备数据：

创建表 student_scores：
mysql> create table student_scores(
    -> id int,
    -> studentId int,
    -> language int,
    -> math int,
    -> english int,
    -> classId varchar(30),
    -> departmentId varchar(30)
    -> );
    
    插入数据：
    mysql> insert into student_scores values 
    ->   (1,111,68,69,90,'class1','department1'),
    ->   (2,112,73,80,96,'class1','department1'),
    ->   (3,113,90,74,75,'class1','department1'),
    ->   (4,114,89,94,93,'class1','department1'),
    ->   (5,115,99,93,89,'class1','department1'),
    ->   (6,121,96,74,79,'class2','department1'),
    ->   (7,122,89,86,85,'class2','department1'),
    ->   (8,123,70,78,61,'class2','department1'),
    ->   (9,124,76,70,76,'class2','department1'),
    ->   (10,211,89,93,60,'class1','department2'),
    ->   (11,212,76,83,75,'class1','department2'),
    ->   (12,213,71,94,90,'class1','department2'),
    ->   (13,214,94,94,66,'class1','department2'),
    ->   (14,215,84,82,73,'class1','department2'),
    ->   (15,216,85,74,93,'class1','department2'),
    ->   (16,221,77,99,61,'class2','department2'),
    ->   (17,222,80,78,96,'class2','department2'),
    ->   (18,223,79,74,96,'class2','department2'),
    ->   (19,224,75,80,78,'class2','department2'),
    ->   (20,225,82,85,63,'class2','department2');
    
   -- mysql> select *from student_scores;
+------+-----------+----------+------+---------+---------+--------------+
| id   | studentId | language | math | english | classId | departmentId |
+------+-----------+----------+------+---------+---------+--------------+
|    1 |       111 |       68 |   69 |      90 | class1  | department1  |
|    2 |       112 |       73 |   80 |      96 | class1  | department1  |
|    3 |       113 |       90 |   74 |      75 | class1  | department1  |
|    4 |       114 |       89 |   94 |      93 | class1  | department1  |
|    5 |       115 |       99 |   93 |      89 | class1  | department1  |
|    6 |       121 |       96 |   74 |      79 | class2  | department1  |
|    7 |       122 |       89 |   86 |      85 | class2  | department1  |
|    8 |       123 |       70 |   78 |      61 | class2  | department1  |
|    9 |       124 |       76 |   70 |      76 | class2  | department1  |
|   10 |       211 |       89 |   93 |      60 | class1  | department2  |
|   11 |       212 |       76 |   83 |      75 | class1  | department2  |
|   12 |       213 |       71 |   94 |      90 | class1  | department2  |
|   13 |       214 |       94 |   94 |      66 | class1  | department2  |
|   14 |       215 |       84 |   82 |      73 | class1  | department2  |
|   15 |       216 |       85 |   74 |      93 | class1  | department2  |
|   16 |       221 |       77 |   99 |      61 | class2  | department2  |
|   17 |       222 |       80 |   78 |      96 | class2  | department2  |
|   18 |       223 |       79 |   74 |      96 | class2  | department2  |
|   19 |       224 |       75 |   80 |      78 | class2  | department2  |
|   20 |       225 |       82 |   85 |      63 | class2  | department2  |
+------+-----------+----------+------+---------+---------+--------------+

2,聚合类开窗函数

count() 开窗函数

-- 查询：
select studentId,math,departmentId,classId,
-- 以符合条件的所有行作为窗口
count(math) over() as count1,
 -- 以按classId分组的所有行作为窗口
count(math) over(partition by classId) as count2,
 -- 以按classId分组、按math排序的所有行作为窗口
count(math) over(partition by classId order by math) as count3,
 -- 以按classId分组、按math排序、按 当前行+往前1行+往后2行的行作为窗口
count(math) over(partition by classId order by math rows between 1 preceding and 2 following) as count4
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+--------+--------+--------+--------+
| studentId | math | departmentId | classId | count1 | count2 | count3 | count4 |
+-----------+------+--------------+---------+--------+--------+--------+--------+
|       111 |   69 | department1  | class1  |      9 |      5 |      1 |      3 |
|       113 |   74 | department1  | class1  |      9 |      5 |      2 |      4 |
|       112 |   80 | department1  | class1  |      9 |      5 |      3 |      4 |
|       115 |   93 | department1  | class1  |      9 |      5 |      4 |      3 |
|       114 |   94 | department1  | class1  |      9 |      5 |      5 |      2 |
|       124 |   70 | department1  | class2  |      9 |      4 |      1 |      3 |
|       121 |   74 | department1  | class2  |      9 |      4 |      2 |      4 |
|       123 |   78 | department1  | class2  |      9 |      4 |      3 |      3 |
|       122 |   86 | department1  | class2  |      9 |      4 |      4 |      2 |
+-----------+------+--------------+---------+--------+--------+--------+--------+

/*
结果解释:
studentid=115,count1为所有的行数9,count2为分区class1中的行数5,count3为分区class1中math值<=93的行数4,count4为分区class1中math值向前+1行向后+2行(实际只有1行)的总行数3。
*/

sum() 开窗函数

# sum开窗函数

select studentId,math,departmentId,classId,
-- 以符合条件的所有行作为窗口
sum(math) over() as sum1,
-- 以按classId分组的所有行作为窗口
sum(math) over(partition by classId) as sum2,
 -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口
sum(math) over(partition by classId order by math) as sum3,
 -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口
sum(math) over(partition by classId order by math rows between 1 preceding and 2 following) as sum4
from student_scores where departmentId='department1';

# 结果
+-----------+------+--------------+---------+------+------+------+------+
| studentId | math | departmentId | classId | sum1 | sum2 | sum3 | sum4 |
+-----------+------+--------------+---------+------+------+------+------+
|       111 |   69 | department1  | class1  |  718 |  410 |   69 |  223 |
|       113 |   74 | department1  | class1  |  718 |  410 |  143 |  316 |
|       112 |   80 | department1  | class1  |  718 |  410 |  223 |  341 |
|       115 |   93 | department1  | class1  |  718 |  410 |  316 |  267 |
|       114 |   94 | department1  | class1  |  718 |  410 |  410 |  187 |
|       124 |   70 | department1  | class2  |  718 |  308 |   70 |  222 |
|       121 |   74 | department1  | class2  |  718 |  308 |  144 |  308 |
|       123 |   78 | department1  | class2  |  718 |  308 |  222 |  238 |
|       122 |   86 | department1  | class2  |  718 |  308 |  308 |  164 |
+-----------+------+--------------+---------+------+------+------+------+

# 结果解释:同count开窗函数

min() 开窗函数

-- min 开窗函数

select studentId,math,departmentId,classId,
-- 以符合条件的所有行作为窗口
min(math) over() as min1,
-- 以按classId分组的所有行作为窗口
min(math) over(partition by classId) as min2,
 -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口
min(math) over(partition by classId order by math) as min3,
 -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口
min(math) over(partition by classId order by math rows between 1 preceding and 2 following) as min4
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+------+------+------+------+
| studentId | math | departmentId | classId | min1 | min2 | min3 | min4 |
+-----------+------+--------------+---------+------+------+------+------+
|       111 |   69 | department1  | class1  |   69 |   69 |   69 |   69 |
|       113 |   74 | department1  | class1  |   69 |   69 |   69 |   69 |
|       112 |   80 | department1  | class1  |   69 |   69 |   69 |   74 |
|       115 |   93 | department1  | class1  |   69 |   69 |   69 |   80 |
|       114 |   94 | department1  | class1  |   69 |   69 |   69 |   93 |
|       124 |   70 | department1  | class2  |   69 |   70 |   70 |   70 |
|       121 |   74 | department1  | class2  |   69 |   70 |   70 |   70 |
|       123 |   78 | department1  | class2  |   69 |   70 |   70 |   74 |
|       122 |   86 | department1  | class2  |   69 |   70 |   70 |   78 |
+-----------+------+--------------+---------+------+------+------+------+

# 结果解释:同count开窗函数

max() 开窗函数

-- min 开窗函数

select studentId,math,departmentId,classId,
-- 以符合条件的所有行作为窗口
min(math) over() as min1,
-- 以按classId分组的所有行作为窗口
min(math) over(partition by classId) as min2,
 -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口
min(math) over(partition by classId order by math) as min3,
 -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口
min(math) over(partition by classId order by math rows between 1 preceding and 2 following) as min4
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+------+------+------+------+
| studentId | math | departmentId | classId | max1 | max2 | max3 | max4 |
+-----------+------+--------------+---------+------+------+------+------+
|       111 |   69 | department1  | class1  |   94 |   94 |   69 |   80 |
|       113 |   74 | department1  | class1  |   94 |   94 |   74 |   93 |
|       112 |   80 | department1  | class1  |   94 |   94 |   80 |   94 |
|       115 |   93 | department1  | class1  |   94 |   94 |   93 |   94 |
|       114 |   94 | department1  | class1  |   94 |   94 |   94 |   94 |
|       124 |   70 | department1  | class2  |   94 |   86 |   70 |   78 |
|       121 |   74 | department1  | class2  |   94 |   86 |   74 |   86 |
|       123 |   78 | department1  | class2  |   94 |   86 |   78 |   86 |
|       122 |   86 | department1  | class2  |   94 |   86 |   86 |   86 |
+-----------+------+--------------+---------+------+------+------+------+

# 结果解释:同count开窗函数

avg() 开窗函数

-- avg 开窗函数

select studentId,math,departmentId,classId,
-- 以符合条件的所有行作为窗口
avg(math) over() as avg1,
-- 以按classId分组的所有行作为窗口
avg(math) over(partition by classId) as avg2,
 -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口
avg(math) over(partition by classId order by math) as avg3,
 -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口
avg(math) over(partition by classId order by math rows between 1 preceding and 2 following) as avg4
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+---------+---------+---------+---------+
| studentId | math | departmentId | classId | avg1    | avg2    | avg3    | avg4    |
+-----------+------+--------------+---------+---------+---------+---------+---------+
|       111 |   69 | department1  | class1  | 79.7778 | 82.0000 | 69.0000 | 74.3333 |
|       113 |   74 | department1  | class1  | 79.7778 | 82.0000 | 71.5000 | 79.0000 |
|       112 |   80 | department1  | class1  | 79.7778 | 82.0000 | 74.3333 | 85.2500 |
|       115 |   93 | department1  | class1  | 79.7778 | 82.0000 | 79.0000 | 89.0000 |
|       114 |   94 | department1  | class1  | 79.7778 | 82.0000 | 82.0000 | 93.5000 |
|       124 |   70 | department1  | class2  | 79.7778 | 77.0000 | 70.0000 | 74.0000 |
|       121 |   74 | department1  | class2  | 79.7778 | 77.0000 | 72.0000 | 77.0000 |
|       123 |   78 | department1  | class2  | 79.7778 | 77.0000 | 74.0000 | 79.3333 |
|       122 |   86 | department1  | class2  | 79.7778 | 77.0000 | 77.0000 | 82.0000 |
+-----------+------+--------------+---------+---------+---------+---------+---------+

# 结果解释:同count开窗函数

3,专用开窗函数

排序开窗函数

rank() 开窗函数

rank 开窗函数基于 over 子句中的 order by 确定一组值中一个值的排名。如果存在partition by ,则为每个分区组中的每个值排名。排名可能不是连续的。例如，如果两个行的排名为 1，则下一个排名为 3（即：1、2、2、4）。

-- rank 开窗函数

select *,
-- 对全部学生按数学分数排序 
rank() over(order by math) as rank1,
-- 对院系 按数学分数排序
rank() over(partition by departmentId order by math) as rank2,
-- 对每个院系每个班级 按数学分数排序
rank() over(partition by departmentId,classId order by math) as rank3
from student_scores;

-- 结果
+------+-----------+----------+------+---------+---------+--------------+-------+-------+-------+
| id   | studentId | language | math | english | classId | departmentId | rank1 | rank2 | rank3 |
+------+-----------+----------+------+---------+---------+--------------+-------+-------+-------+
|    1 |       111 |       68 |   69 |      90 | class1  | department1  |     1 |     1 |     1 |
|    3 |       113 |       90 |   74 |      75 | class1  | department1  |     3 |     3 |     2 |
|    2 |       112 |       73 |   80 |      96 | class1  | department1  |     9 |     6 |     3 |
|    5 |       115 |       99 |   93 |      89 | class1  | department1  |    15 |     8 |     4 |
|    4 |       114 |       89 |   94 |      93 | class1  | department1  |    17 |     9 |     5 |
|    9 |       124 |       76 |   70 |      76 | class2  | department1  |     2 |     2 |     1 |
|    6 |       121 |       96 |   74 |      79 | class2  | department1  |     3 |     3 |     2 |
|    8 |       123 |       70 |   78 |      61 | class2  | department1  |     7 |     5 |     3 |
|    7 |       122 |       89 |   86 |      85 | class2  | department1  |    14 |     7 |     4 |
|   15 |       216 |       85 |   74 |      93 | class1  | department2  |     3 |     1 |     1 |
|   14 |       215 |       84 |   82 |      73 | class1  | department2  |    11 |     5 |     2 |
|   11 |       212 |       76 |   83 |      75 | class1  | department2  |    12 |     6 |     3 |
|   10 |       211 |       89 |   93 |      60 | class1  | department2  |    15 |     8 |     4 |
|   12 |       213 |       71 |   94 |      90 | class1  | department2  |    17 |     9 |     5 |
|   13 |       214 |       94 |   94 |      66 | class1  | department2  |    17 |     9 |     5 |
|   18 |       223 |       79 |   74 |      96 | class2  | department2  |     3 |     1 |     1 |
|   17 |       222 |       80 |   78 |      96 | class2  | department2  |     7 |     3 |     2 |
|   19 |       224 |       75 |   80 |      78 | class2  | department2  |     9 |     4 |     3 |
|   20 |       225 |       82 |   85 |      63 | class2  | department2  |    13 |     7 |     4 |
|   16 |       221 |       77 |   99 |      61 | class2  | department2  |    20 |    11 |     5 |
+------+-----------+----------+------+---------+---------+--------------+-------+-------+-------+

# 结果解释

dense_rank() 开窗函数

dense_rank与rank有一点不同,当排名一样的时候,接下来的行是连续的。如两个行的排名为 1，则下一个排名为 2（即：1、2、2、3）。

-- dense_rank 开窗函数

select *,
-- 对全部学生按数学分数排序
dense_rank() over(order by math) as den_r1,
-- 对院系 按数学分数排序
dense_rank() over(partition by departmentId order by math) as den_r2,
-- 对每个院系每个班级 按数学分数排序
dense_rank() over(partition by departmentId,classId order by math) as den_r3
from student_scores;
-- 结果
+------+-----------+----------+------+---------+---------+--------------+--------+--------+--------+
| id   | studentId | language | math | english | classId | departmentId | den_r1 | den_r2 | den_r3 |
+------+-----------+----------+------+---------+---------+--------------+--------+--------+--------+
|    1 |       111 |       68 |   69 |      90 | class1  | department1  |      1 |      1 |      1 |
|    3 |       113 |       90 |   74 |      75 | class1  | department1  |      3 |      3 |      2 |
|    2 |       112 |       73 |   80 |      96 | class1  | department1  |      5 |      5 |      3 |
|    5 |       115 |       99 |   93 |      89 | class1  | department1  |     10 |      7 |      4 |
|    4 |       114 |       89 |   94 |      93 | class1  | department1  |     11 |      8 |      5 |
|    9 |       124 |       76 |   70 |      76 | class2  | department1  |      2 |      2 |      1 |
|    6 |       121 |       96 |   74 |      79 | class2  | department1  |      3 |      3 |      2 |
|    8 |       123 |       70 |   78 |      61 | class2  | department1  |      4 |      4 |      3 |
|    7 |       122 |       89 |   86 |      85 | class2  | department1  |      9 |      6 |      4 |
|   15 |       216 |       85 |   74 |      93 | class1  | department2  |      3 |      1 |      1 |
|   14 |       215 |       84 |   82 |      73 | class1  | department2  |      6 |      4 |      2 |
|   11 |       212 |       76 |   83 |      75 | class1  | department2  |      7 |      5 |      3 |
|   10 |       211 |       89 |   93 |      60 | class1  | department2  |     10 |      7 |      4 |
|   12 |       213 |       71 |   94 |      90 | class1  | department2  |     11 |      8 |      5 |
|   13 |       214 |       94 |   94 |      66 | class1  | department2  |     11 |      8 |      5 |
|   18 |       223 |       79 |   74 |      96 | class2  | department2  |      3 |      1 |      1 |
|   17 |       222 |       80 |   78 |      96 | class2  | department2  |      4 |      2 |      2 |
|   19 |       224 |       75 |   80 |      78 | class2  | department2  |      5 |      3 |      3 |
|   20 |       225 |       82 |   85 |      63 | class2  | department2  |      8 |      6 |      4 |
|   16 |       221 |       77 |   99 |      61 | class2  | department2  |     12 |      9 |      5 |
+------+-----------+----------+------+---------+---------+--------------+--------+--------+--------+

# 结果解释:同count开窗函数

row_number() 开窗函数

从1开始对分区内的数据排序。

-- row_number 开窗函数

select studentid,departmentid,classid,math,
-- 对分区departmentid,classid内的数据按math排序
row_number() over(partition by departmentid,classid order by math) as row_num
from student_scores;
-- 结果
+-----------+--------------+---------+------+---------+
| studentid | departmentid | classid | math | row_num |
+-----------+--------------+---------+------+---------+
|       111 | department1  | class1  |   69 |       1 |
|       113 | department1  | class1  |   74 |       2 |
|       112 | department1  | class1  |   80 |       3 |
|       115 | department1  | class1  |   93 |       4 |
|       114 | department1  | class1  |   94 |       5 |
|       124 | department1  | class2  |   70 |       1 |
|       121 | department1  | class2  |   74 |       2 |
|       123 | department1  | class2  |   78 |       3 |
|       122 | department1  | class2  |   86 |       4 |
|       216 | department2  | class1  |   74 |       1 |
|       215 | department2  | class1  |   82 |       2 |
|       212 | department2  | class1  |   83 |       3 |
|       211 | department2  | class1  |   93 |       4 |
|       213 | department2  | class1  |   94 |       5 |
|       214 | department2  | class1  |   94 |       6 |
|       223 | department2  | class2  |   74 |       1 |
|       222 | department2  | class2  |   78 |       2 |
|       224 | department2  | class2  |   80 |       3 |
|       225 | department2  | class2  |   85 |       4 |
|       221 | department2  | class2  |   99 |       5 |
+-----------+--------------+---------+------+---------+

# 结果解释:注意参数不要与关键字 row_number()重名。

分布函数：

percent_rank() 开窗函数

计算给定行的百分比排名。可以用来计算超过了百分之多少的人。如360小助手开机速度超过了百分之多少的人。
(当前行的rank值-1)/(分组内的总行数-1)

-- percent_rank 开窗函数

select studentid,departmentid,classid,math,
row_number() over(partition by departmentid,classid order by math) as row_num,
percent_rank() over(partition by departmentid,classid order by math) as per_rank
from student_scores;

-- 结果
+-----------+--------------+---------+------+---------+--------------------+
| studentid | departmentid | classid | math | row_num | per_rank           |
+-----------+--------------+---------+------+---------+--------------------+
|       111 | department1  | class1  |   69 |       1 |                  0 |
|       113 | department1  | class1  |   74 |       2 |               0.25 |
|       112 | department1  | class1  |   80 |       3 |                0.5 |
|       115 | department1  | class1  |   93 |       4 |               0.75 |
|       114 | department1  | class1  |   94 |       5 |                  1 |
|       124 | department1  | class2  |   70 |       1 |                  0 |
|       121 | department1  | class2  |   74 |       2 | 0.3333333333333333 |
|       123 | department1  | class2  |   78 |       3 | 0.6666666666666666 |
|       122 | department1  | class2  |   86 |       4 |                  1 |
|       216 | department2  | class1  |   74 |       1 |                  0 |
|       215 | department2  | class1  |   82 |       2 |                0.2 |
|       212 | department2  | class1  |   83 |       3 |                0.4 |
|       211 | department2  | class1  |   93 |       4 |                0.6 |
|       213 | department2  | class1  |   94 |       5 |                0.8 |
|       214 | department2  | class1  |   94 |       6 |                0.8 |
|       223 | department2  | class2  |   74 |       1 |                  0 |
|       222 | department2  | class2  |   78 |       2 |               0.25 |
|       224 | department2  | class2  |   80 |       3 |                0.5 |
|       225 | department2  | class2  |   85 |       4 |               0.75 |
|       221 | department2  | class2  |   99 |       5 |                  1 |
+-----------+--------------+---------+------+---------+--------------------+

# 结果解释:

cume_dist() 开窗函数

计算某个窗口或分区中某个值的累积分布。假定升序排序，则使用以下公式确定累积分布：
小于等于当前值x的行数 / 窗口或partition分区内的总行数。其中，x 等于 order by 子句中指定的列的当前行中的值。

-- cume_dist 开窗函数

select studentId,math,departmentId,classId,
-- 统计小于等于当前分数的人数占总人数的比例
cume_dist() over(order by math) as cume_dist1,
-- 统计大于等于当前分数的人数占总人数的比例
cume_dist() over(order by math desc) as cume_dist2,
-- 统计分区内小于等于当前分数的人数占总人数的比例
cume_dist() over(partition by classId order by math) as cume_dist3
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+--------------------+--------------------+------------+
| studentId | math | departmentId | classId | cume_dist1         | cume_dist2         | cume_dist3 |
+-----------+------+--------------+---------+--------------------+--------------------+------------+
|       111 |   69 | department1  | class1  | 0.1111111111111111 |                  1 |        0.2 |
|       113 |   74 | department1  | class1  | 0.4444444444444444 | 0.7777777777777778 |        0.4 |
|       112 |   80 | department1  | class1  | 0.6666666666666666 | 0.4444444444444444 |        0.6 |
|       115 |   93 | department1  | class1  | 0.8888888888888888 | 0.2222222222222222 |        0.8 |
|       114 |   94 | department1  | class1  |                  1 | 0.1111111111111111 |          1 |
|       124 |   70 | department1  | class2  | 0.2222222222222222 | 0.8888888888888888 |       0.25 |
|       121 |   74 | department1  | class2  | 0.4444444444444444 | 0.7777777777777778 |        0.5 |
|       123 |   78 | department1  | class2  | 0.5555555555555556 | 0.5555555555555556 |       0.75 |
|       122 |   86 | department1  | class2  | 0.7777777777777778 | 0.3333333333333333 |          1 |
+-----------+------+--------------+---------+--------------------+--------------------+------------+

# 结果解释:

前后函数：

lag(expr,n) 开窗函数

lag(col,n,default) 用于统计窗口内往上第n个值。
col:列名
n:往上第n行
default:往上第n行为NULL时候，取默认值,不指定则取NULL

-- lag 开窗函数

select studentId,math,departmentId,classId,
 -- 窗口内 往上取第二个 取不到时赋默认值60
lag(math,2,60) over(partition by classId order by math) as lag1,
 -- 窗口内 往上取第二个 取不到时赋默认值NULL
lag(math,2) over(partition by classId order by math) as lag2
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+------+------+
| studentId | math | departmentId | classId | lag1 | lag2 |
+-----------+------+--------------+---------+------+------+
|       111 |   69 | department1  | class1  |   60 | NULL |
|       113 |   74 | department1  | class1  |   60 | NULL |
|       112 |   80 | department1  | class1  |   69 |   69 |
|       115 |   93 | department1  | class1  |   74 |   74 |
|       114 |   94 | department1  | class1  |   80 |   80 |
|       124 |   70 | department1  | class2  |   60 | NULL |
|       121 |   74 | department1  | class2  |   60 | NULL |
|       123 |   78 | department1  | class2  |   70 |   70 |
|       122 |   86 | department1  | class2  |   74 |   74 |
+-----------+------+--------------+---------+------+------+

/*
结果解释:
    第3行(studentId=112),lag1:窗口内(69 74 80) 当前行80 向上取第二个值为69
    倒数第3行(studentId=121),lag2:窗口内(70 74) 当前行74 向上取第二个值为NULL
*/

lead(expr,n) 开窗函数

lead(col,n,default) 用于统计窗口内往下第n个值。
col:列名
n:往下第n行
default:往下第n行为NULL时候，取默认值,不指定则取NULL

-- lead开窗函数

select studentId,math,departmentId,classId,
 -- 窗口内 往下取第二个 取不到时赋默认值60
lead(math,2,60) over(partition by classId order by math) as lead1,
 -- 窗口内 往下取第二个 取不到时赋默认值NULL
lead(math,2) over(partition by classId order by math) as lead2
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+-------+-------+
| studentId | math | departmentId | classId | lead1 | lead2 |
+-----------+------+--------------+---------+-------+-------+
|       111 |   69 | department1  | class1  |    80 |    80 |
|       113 |   74 | department1  | class1  |    93 |    93 |
|       112 |   80 | department1  | class1  |    94 |    94 |
|       115 |   93 | department1  | class1  |    60 |  NULL |
|       114 |   94 | department1  | class1  |    60 |  NULL |
|       124 |   70 | department1  | class2  |    78 |    78 |
|       121 |   74 | department1  | class2  |    86 |    86 |
|       123 |   78 | department1  | class2  |    60 |  NULL |
|       122 |   86 | department1  | class2  |    60 |  NULL |
+-----------+------+--------------+---------+-------+-------+

# 结果解释:第4行lead1 窗口内向下第二个值为空，赋值60

头尾函数：

first_value(expr) 开窗函数

返回分区中的第一个值。

-- first_value 开窗函数

select studentId,math,departmentId,classId,
-- 以符合条件的所有行作为窗口
first_value(math) over() as f_v1,
-- 以按classId分组的所有行作为窗口
first_value(math) over(partition by classId) as f_v2,
 -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口
first_value(math) over(partition by classId order by math) as f_v3,
 -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口
first_value(math) over(partition by classId order by math rows between 1 preceding and 2 following) as f_v4
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+------+------+------+------+
| studentId | math | departmentId | classId | f_v1 | f_v2 | f_v3 | f_v4 |
+-----------+------+--------------+---------+------+------+------+------+
|       111 |   69 | department1  | class1  |   69 |   69 |   69 |   69 |
|       113 |   74 | department1  | class1  |   69 |   69 |   69 |   69 |
|       112 |   80 | department1  | class1  |   69 |   69 |   69 |   74 |
|       115 |   93 | department1  | class1  |   69 |   69 |   69 |   80 |
|       114 |   94 | department1  | class1  |   69 |   69 |   69 |   93 |
|       124 |   70 | department1  | class2  |   69 |   74 |   70 |   70 |
|       121 |   74 | department1  | class2  |   69 |   74 |   70 |   70 |
|       123 |   78 | department1  | class2  |   69 |   74 |   70 |   74 |
|       122 |   86 | department1  | class2  |   69 |   74 |   70 |   78 |
+-----------+------+--------------+---------+------+------+------+------+

/*
结果解释：
1，studentid=124 first_value1:第一个值是69,first_value2:classId=class1分区 math的第一个值是69。
2，studentId=124行，为什么 f_v2=74？
因为：f_v2，只是分组，并没有排序，而 studentId=121 行，在class2中是第一个出现的，所以f_v2就取了studentId=121行，math的值，正是，74。
student_scores 表单数据，参考 数据准备的数据输出。
3，注意：order by之后， rows between num1 preceding and num2 followin 的使用。
 // order by math rows between 1 preceding and 2 followin
*/

last_value(expr) 开窗函数

返回分区中的最后一个值。

-- last_value 开窗函数

select studentId,math,departmentId,classId,
-- 以符合条件的所有行作为窗口
last_value(math) over() as last_v1,
-- 以按classId分组的所有行作为窗口
last_value(math) over(partition by classId) as last_v2,
 -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口
last_value(math) over(partition by classId order by math) as last_v3,
 -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口
last_value(math) over(partition by classId order by math rows between 1 preceding and 2 following) as last_v4
from student_scores where departmentId='department1';

-- 结果
+-----------+------+--------------+---------+---------+---------+---------+---------+
| studentId | math | departmentId | classId | last_v1 | last_v2 | last_v3 | last_v4 |
+-----------+------+--------------+---------+---------+---------+---------+---------+
|       111 |   69 | department1  | class1  |      70 |      93 |      69 |      80 |
|       113 |   74 | department1  | class1  |      70 |      93 |      74 |      93 |
|       112 |   80 | department1  | class1  |      70 |      93 |      80 |      94 |
|       115 |   93 | department1  | class1  |      70 |      93 |      93 |      94 |
|       114 |   94 | department1  | class1  |      70 |      93 |      94 |      94 |
|       124 |   70 | department1  | class2  |      70 |      70 |      70 |      78 |
|       121 |   74 | department1  | class2  |      70 |      70 |      74 |      86 |
|       123 |   78 | department1  | class2  |      70 |      70 |      78 |      86 |
|       122 |   86 | department1  | class2  |      70 |      70 |      86 |      86 |
+-----------+------+--------------+---------+---------+---------+---------+---------+

/*
结果解释:
studentId=111 行，last_v1=70 参考student_scores表单原始数据。
studentId=111 行，last_v2=93 参考student_scores表单原始数据。
*/

其他函数：

ntile(n) 开窗函数

将分区中已排序的行划分为大小尽可能相等的指定数量的排名的组，并返回给定行所在的组的排名。

即：尽可能的平均分租，返回所在的组是第几组。

-- ntile 开窗函数

select studentId,math,classId,departmentId,
-- 对分区内的数据分成两组
ntile(2) over(partition by departmentid order by math) as n_1,
-- 对分区内的数据分成三组
ntile(3) over(partition by departmentid order by math) as n_2
from student_scores;

-- 结果
+-----------+------+---------+--------------+------+------+
| studentId | math | classId | departmentId | n_1  | n_2  |
+-----------+------+---------+--------------+------+------+
|       111 |   69 | class1  | department1  |    1 |    1 |
|       124 |   70 | class2  | department1  |    1 |    1 |
|       113 |   74 | class1  | department1  |    1 |    1 |
|       121 |   74 | class2  | department1  |    1 |    2 |
|       123 |   78 | class2  | department1  |    1 |    2 |
|       112 |   80 | class1  | department1  |    2 |    2 |
|       122 |   86 | class2  | department1  |    2 |    3 |
|       115 |   93 | class1  | department1  |    2 |    3 |
|       114 |   94 | class1  | department1  |    2 |    3 |
|       216 |   74 | class1  | department2  |    1 |    1 |
|       223 |   74 | class2  | department2  |    1 |    1 |
|       222 |   78 | class2  | department2  |    1 |    1 |
|       224 |   80 | class2  | department2  |    1 |    1 |
|       215 |   82 | class1  | department2  |    1 |    2 |
|       212 |   83 | class1  | department2  |    1 |    2 |
|       225 |   85 | class2  | department2  |    2 |    2 |
|       211 |   93 | class1  | department2  |    2 |    2 |
|       213 |   94 | class1  | department2  |    2 |    3 |
|       214 |   94 | class1  | department2  |    2 |    3 |
|       221 |   99 | class2  | department2  |    2 |    3 |
+-----------+------+---------+--------------+------+------+

/*
结果解释:
n_1:department1 被分为2组，第一组5个，第二组：4个。department2 被分为2组，第一组6个，第二组5个。
n_2:department1 被分为3组，第一组3个，第二组：3个，第三组3个。department2 被分为3组，第一组4个，第二组4个，第三组3个。
*/

NTH_VALU(expr,n) 开窗函数

-- NTH_VALU 开窗函数

SELECT  studentId ,math , classId ,departmentId,
-- 按照departmentid分区，math降序，取第二个的成绩。
nth_value(math, 2)OVER (PARTITION BY departmentid ORDER BY math DESC)  AS n_1 ,
-- 按照departmentid分区，math升序，取第一个的成绩。
nth_value(math, 1)OVER (PARTITION BY departmentid ORDER BY math asc)  AS n_2
FROM student_scores;

-- 结果
+-----------+------+---------+--------------+------+------+
| studentId | math | classId | departmentId | n_1  | n_2  |
+-----------+------+---------+--------------+------+------+
|       111 |   69 | class1  | department1  |   93 |   69 |
|       124 |   70 | class2  | department1  |   93 |   69 |
|       113 |   74 | class1  | department1  |   93 |   69 |
|       121 |   74 | class2  | department1  |   93 |   69 |
|       123 |   78 | class2  | department1  |   93 |   69 |
|       112 |   80 | class1  | department1  |   93 |   69 |
|       122 |   86 | class2  | department1  |   93 |   69 |
|       115 |   93 | class1  | department1  |   93 |   69 |
|       114 |   94 | class1  | department1  | NULL |   69 |
|       216 |   74 | class1  | department2  |   94 |   74 |
|       223 |   74 | class2  | department2  |   94 |   74 |
|       222 |   78 | class2  | department2  |   94 |   74 |
|       224 |   80 | class2  | department2  |   94 |   74 |
|       215 |   82 | class1  | department2  |   94 |   74 |
|       212 |   83 | class1  | department2  |   94 |   74 |
|       225 |   85 | class2  | department2  |   94 |   74 |
|       211 |   93 | class1  | department2  |   94 |   74 |
|       213 |   94 | class1  | department2  |   94 |   74 |
|       214 |   94 | class1  | department2  |   94 |   74 |
|       221 |   99 | class2  | department2  | NULL |   74 |
+-----------+------+---------+--------------+------+------+

# 结果解释:为什么会出现 NULL？

五，常见报错

1，ERROR 1064

MySQL ERROR 1064 (42000)：
-- 1，使用了系统保留的关键字，如果非要使用，对关键字家反斜杠：`keyw0rd`
-- 2，没有权限；
-- 3，在给列设置string类型的时候，如果指定列的类型为varchar则需要指定该类型的最大‘容量值‘，若为char类型则不必指定。

2，创建表字符串类型：

-- MySQL 声明字符串：
name varchar(10)
-- hive 声明字符串：
name string

3，插入数据：

MySQL insert into student_scores values 
-- 注意：insert into table student_scores values  //不能带 table。

4，关键字重名

-- 报错：
row_number()  over(partition by departmentid,classId order by math) as row_number 
-- 正确：
row_number() over(partition by departmentId,classId order by math) as row_num 

-- 原因：as row_number 与 row_number()  属于关键字row_number重名；

5，注释不规范

MySQL语句中，做注释，#、-- 和注释内容之间需要加空格。否则报错。

ERROR 1064 (42000): xxxxx; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘默认60
-- 注释

克己勤勉。

最后编辑于：2021.03.03 17:51:25

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 216,843评论 6赞 502
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 92,538评论 3赞 392
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 163,187评论 0赞 353
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 58,264评论 1赞 292
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 67,289评论 6赞 390
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 51,231评论 1赞 299
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 40,116评论 3赞 418
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 38,945评论 0赞 275
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 45,367评论 1赞 313
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 37,581评论 2赞 333
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 39,754评论 1赞 348
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 35,458评论 5赞 344
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 41,068评论 3赞 327
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 31,692评论 0赞 22
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 32,842评论 1赞 269
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 47,797评论 2赞 369
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 44,654评论 2赞 354