【Oracle学习09】分组函数
单行函数在结果集中返回各行的单值。 分组或聚合函数作用于多行。
9.1 分组函数
9.1.1 分组函数定义
分组函数做用于聚合数据,并且针对每个组返回单个结果。 分组函数对每个组返回一个结果。
9.1.2 使用分组函数
AVG,SUM,MIN,MAX,COUNT返回数字结果。 MIN,MAX还可以返回日期或字符结果。
SQL> SELECT AVG(salary), MAX(salary), MIN(salary), SUM(salary) FROM employees
WHERE job_id LIKE '%REP%';
AVG(SALARY) MAX(SALARY) MIN(SALARY) SUM(SALARY)
----------- ----------- ----------- -----------
8272.72727 11500 6000 273000
1)COUNT
- count(*) : 计算所有行(包括空值和重复值的行)的行数目。
- count(expr):若expr是列名,则会忽略空值。若不是列名,则会对每一行都计算。
- count(distinct expr): 去重,若是列名,会忽略空值。希望有null,可以用nvl(colum,0)来处理。
- count(all expr): 等同于count(expr) 。
#示例
# count(1)等同于count(*)会计算每一行。
SQL> select count(*) , count(1) ,count(MANAGER_ID),count(distinct manager_id) from employees;
COUNT(*) COUNT(1) COUNT(MANAGER_ID) COUNT(DISTINCTMANAGER_ID)
---------- ---------- ----------------- -------------------------
107 107 106 18
#忽略空值。
SQL> select count(distinct department_id),count(distinct nvl(department_id,0)) from employees;
COUNT(DISTINCTDEPARTMENT_ID) COUNT(DISTINCTNVL(DEPARTMENT_ID,0))
---------------------------- -----------------------------------
11 12
2)SUM
- SUM( expr) : expr只能为数字,并且忽略空值。
- SUM(DISTINCT expr): 对每一个唯一值进行相加计算。
#SUM( expr) 略过空值
SQL> select sum(2),count(*),count(MANAGER_ID) from employees;
SUM(2) COUNT(*) COUNT(MANAGER_ID)
---------- ---------- -----------------
214 107 106
#
SQL> select sum(salary),sum(distinct salary) from employees;
SUM(SALARY) SUM(DISTINCTSALARY)
----------- -------------------
691400 397900
3)AVG 函数
- AVG(all expr): 求和除以组中非空行数。
- AVG(DISTINCT expr): 先得到distinct的记录集,求和再除于distinct记录集数。
#空的处理
SQL> SELECT COUNT(DISTINCT department_id),count(distinct nvl(department_id,0)) FROM employees;
COUNT(DISTINCTDEPARTMENT_ID) COUNT(DISTINCTNVL(DEPARTMENT_ID,0))
---------------------------- -----------------------------------
11 12
#avg
SQL> select avg(2),sum(salary),sum(salary)/107,avg(salary),count(*) from employees;
AVG(2) SUM(SALARY) SUM(SALARY)/107 AVG(SALARY) COUNT(*)
---------- ----------- --------------- ----------- ----------
2 691400 6461.68224 6461.68224 107
#AVG(DISTINCT expr) 是除于 distinct的58行,不是107行。
SQL> select sum(salary),sum(salary)/107,avg(salary),sum(distinct salary)/107,avg(distinct salary) from employees;
SUM(SALARY) SUM(SALARY)/107 AVG(SALARY) SUM(DISTINCTSALARY)/107 AVG(DISTINCTSALARY)
----------- --------------- ----------- ----------------------- -------------------
691400 6461.68224 6461.68224 3718.69159 6980.70175
4)STDDEV/VARIANCE
- STDDEV: 样本标准偏差
- VARIANCE:
SQL> select salary from employees where department_id=90;
SALARY
----------
24000
17000
17000
SQL> select avg(salary),variance(salary),stddev(salary) from employees where department_id=90;
AVG(SALARY) VARIANCE(SALARY) STDDEV(SALARY)
----------- ---------------- --------------
19333.3333 16333333.3 4041.45188
5)MAX 和MIN
- MAX :作用于NUMBER,DATE,CHAR,VARCHAR2,取最大值,忽略空值。
- MIN : 取最小值,忽略空值.
SQL> select min(commission_pct),max(commission_pct) from employees;
MIN(COMMISSION_PCT) MAX(COMMISSION_PCT)
------------------- -------------------
.1 .4
SQL> select min(job_id),max(job_id) from employees;
MIN(JOB_ID) MAX(JOB_ID)
-------------------- --------------------
AC_ACCOUNT ST_MAN
6) LISTAGG函数
- LISTAGG : 返回列值的字符串汇总。
语法: LISTAGG(expr,['delimiter']) WITHIN GROUP (ORDER_BY_CLAUSE)
SQL> select * from regions order by region_name;
REGION_ID REGION_NAME
---------- --------------------------------------------------
2 Americas
3 Asia
1 Europe
4 Middle East and Africa
SQL> select listagg(region_name,',') within group (order by region_name desc ) from regions;
LISTAGG(REGION_NAME,',')WITHINGROUP(ORDERBYREGION_NAMEDESC)
---------------------------------------------------------------------------------------------------------------------
Middle East and Africa,Europe,Asia,Americas
#
select * from countries order by region_id ,country_name desc;
SQL> select listagg(country_name,',') within group (order by region_id,country_name desc ) countries from countries group by region_id;
COUNTRIES
----------------------------------------------------------------------------------------------------
United Kingdom,Switzerland,Netherlands,Italy,Germany,France,Denmark,Belgium
United States of America,Mexico,Canada,Brazil,Argentina
Singapore,Japan,India,HongKong,China,Australia
Zimbabwe,Zambia,Nigeria,Kuwait,Israel,Egypt
9.2 使用GROUP BY 子句分组数据
9.2.1 数据分组
SQL> select count(distinct department_id),count(distinct nvl(department_id,0)) from employees;
COUNT(DISTINCTDEPARTMENT_ID) COUNT(DISTINCTNVL(DEPARTMENT_ID,0))
---------------------------- -----------------------------------
11 12
SQL> select distinct department_id from employees group by department_id;
DEPARTMENT_ID
-------------
100
30
20
70
90
110
50
40
80
10
60
12 rows selected.
9.2.2 GROUP BY 子句
- Group by 出现在where 和order by之间。
- 列必须是分组特性(grouping attribute,即group by 后的列名或表达式)或分组函数中。
- Group by column 可以不出现在select 列中。
#列必须是分组特性(grouping attribute,即group by 后的列名或表达式)或分组函数中
#All the columns in the SELECT list that are not in group functions must be in the GROUP BY clause.
SQL> select end_date,count(*) from job_history; --end_date不该出现
ERROR at line 1:
ORA-00937: not a single-group group function
#start_date不该出现
SQL> select end_date,start_date,count(*) from job_history group by end_date;
select end_date,start_date,count(*) from job_history group by end_date
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
#不能用 WHERE clause to restrict groups. 该使用having限制分组
SQL> SELECT department_id, AVG(salary) FROM employees
WHERE AVG(salary) > 8000
GROUP BY department_id;
WHERE AVG(salary) > 8000
*
ERROR at line 2:
ORA-00934: group function is not allowed here
SQL> SELECT department_id, AVG(salary) FROM employees
having AVG(salary) > 8000
GROUP BY department_id;
DEPARTMENT_ID AVG(SALARY)
------------- -----------
100 8600
20 9500
70 10000
90 19333.3333
110 10150
80 8955.88235
6 rows selected.
#终止工作的年份和人数。
SQL> select to_char(end_date,'yyyy') year, count(*) "number of employees" from job_history group by to_char(end_date,'yyyy') order by count(*) desc;
YEAR number of employees
-------- -------------------
1999 4
1998 3
1993 2
1997 1
#The GROUP BY column does not have to be in the SELECT list.
SQL> SELECT AVG(salary) FROM employees
GROUP BY department_id ;
-----------
8600
4150
7000
9500
10000
19333.3333
10150
3475.55556
6500
8955.88235
4400
5760
9.2.3 按多列分组
可以数据集组进一步划分成多个子组。
SQL> select department_id,sum(commission_pct) from employees where commission_pct is not null group by department_id;
DEPARTMENT_ID SUM(COMMISSION_PCT)
------------- -------------------
.15
80 7.65
#
SQL> select department_id,job_id,sum(commission_pct) from employees where commission_pct is not null group by department_id,job_id;
DEPARTMENT_ID JOB_ID SUM(COMMISSION_PCT)
------------- -------------------- -------------------
80 SA_REP 6.15
80 SA_MAN 1.5
SA_REP .15
9.2.4 嵌套的分组函数
单行函数可以嵌套或者多层嵌套,而分组函数只能嵌套两层。
- G1(G2(group_item)) = result : 是可以的。 如sum(avg(group_item))。
- G1(G2(G3(group_item)))这样是不行的。 如max(sum(agv(group_item))).
#可以嵌套
SQL> SELECT MAX(AVG(salary)) FROM employees GROUP BY department_id;
MAX(AVG(SALARY))
----------------
19333.3333
SQL> select department_id,avg(salary) from employees GROUP BY department_id;
DEPARTMENT_ID AVG(SALARY)
------------- -----------
100 8600
...
90 19333.3333
110 10150
50 3475.55556
40 6500
80 8955.88235
SQL> SELECT department_id,MAX(AVG(salary)) FROM employees GROUP BY department_id;
SELECT department_id,MAX(AVG(salary)) FROM employees GROUP BY department_id
*
ERROR at line 1:
ORA-00937: not a single-group group function
#各部分last_name的平均长度之和
SQL> select sum(avg(length (last_name))) from employees group by department_id;
SUM(AVG(LENGTH(LAST_NAME)))
---------------------------
69.9261438
SQL> select department_id,avg(length (last_name)) from employees group by department_id;
DEPARTMENT_ID AVG(LENGTH(LAST_NAME))
------------- ----------------------
100 5.83333333
30 6.5
5
20 6
70 4
90 6
110 6
50 6.02222222
40 6
80 5.97058824
10 6
60 6.6
12 rows selected.
9.3 使用Having 子句包含或者排除分组行
9.3.1 限制分组结果
group by 子句将聚合函数应用于组,返回组结果,having子句可以用来限制组级结果。
SQL> select department_id,count(*) from job_history where department_id in (50,60,80,110) group by department_id;
DEPARTMENT_ID COUNT(*)
------------- ----------
50 2
60 1
80 2
110 2
#
SQL> select department_id,count(*) from job_history where department_id in (50,60,80,110) group by department_id having count(*) >1;
DEPARTMENT_ID COUNT(*)
------------- ----------
50 2
80 2
110 2
9.3.2 Having子句
- having 通常在 group by 和order by 之间,但也可能出现在group by 前。
- having 只能出现在有group by的SQL语句中。
SQL> select job_id ,avg(salary),count(*) from employees group by job_id having avg(salary)> 10000 order by job_id;
JOB_ID AVG(SALARY) COUNT(*)
-------------------- ----------- ----------
AC_MGR 12000 1
AD_PRES 24000 1
AD_VP 17000 2
FI_MGR 12000 1
MK_MAN 13000 1
PU_MAN 11000 1
SA_MAN 12200 5
7 rows selected.