力扣SQL刷题记录（1）

力扣题目资源来自“熊大的数据分析之路”，写下这系列练习记录是为了自己实践一遍，非商业用途。

1. 组合两个表

表1:Person

+-------------+---------+

| 列名 | 类型 |

+-------------+---------+

| PersonId | int |

| FirstName | varchar |

| LastName | varchar |

+-------------+---------+

PersonId 是上表主键

表2:Address

+-------------+---------+

| 列名 | 类型 |

+-------------+---------+

| AddressId | int |

| PersonId | int |

| City | varchar |

| State | varchar |

+-------------+---------+

AddressId 是上表主键

编写一个 SQL 查询，满足条件：无论 person 是否有地址信息，都需要基于上述两表提供 person 的以下信息：

FirstName, LastName, City, State

select FirstName, LastName, City, State

from Person left join AddressId

where Person.PersonId=Address.PersonId

这道题好基础，以Person为主表，进行左链接即可

2. 第二高的薪水

编写一个 SQL 查询，获取Employee表中第二高的薪水（Salary）。

+----+--------+

| Id | Salary |

+----+--------+

| 1 | 100 |

| 2 | 200 |

| 3 | 300 |

+----+--------+

例如上述Employee表，SQL查询应该返回200作为第二高的薪水。如果不存在第二高的薪水，那么查询应返回null。

+---------------------+

| SecondHighestSalary |

+---------------------+

| 200 |

+---------------------+

我的想法是：在排除掉最高薪资后的数据内，选取其最高薪资，即第二高薪资。所以有了这个代码：

select max(Salary) as SecondHighestSalary

from employee

where salary< (select max(salary) from employee)

另外学习到了一种排序函数的写法：

select min(salary) "SecondHighestSalary"

from(select salary,dense_rank()over(order by salary desc) as rn

from Employee)

where rn=2;

知识点：

※rank() over(partition by A order by B) ：按照A进行分组，分组里面的数据按照B进行排序，over即在什么之上，rank()即跳跃排序；

比如存在两个第一名，接下来就是第三名。

※dense_rank(): 连续排序，如果有两个第一名时，接下来第三个仍然是第二名。

附文章：https://www.jb51.net/article/20631.htm

3. 第N高的薪水

编写一个 SQL 查询，获取Employee表中第n 高的薪水（Salary）。

+----+--------+

| Id | Salary |

+----+--------+

| 1 | 100 |

| 2 | 200 |

| 3 | 300 |

+----+--------+

例如上述Employee表，n = 2 时，应返回第二高的薪水200。如果不存在第n 高的薪水，那么查询应返回null。

+------------------------+

| getNthHighestSalary(2) |

+------------------------+

| 200 |

+------------------------+

解1:

CREATE FUNCTION getNthHighestSalary(@N INT) RETURNS INT AS

BEGIN

RETURN (

/* Write your T-SQL query statement below. */

select

distinct salary

from

(select salary,dense_rank() over (order by salary desc) as rank

from employee) as t

where rank=@N

);

这个解法是先将薪资表进行连续排序（dense_rank函数），然后取题目要求的第N名，最后进行去重。

————————————————

解2：或者可以这样：

Write your MySQL query statement below：

select distinct Salary

from Employee as e1

where N = (select count(distinct e1.Salary)

from Employe e1，e2

where e1.Salary <= e2.Salary )

这个是不用连续排序（dense_rank函数）进行的薪资排序，把两张薪资表进行逐一的比较，把比较后的结果进行计数，比如薪资表有5个不同的数据：1000，500，2000，500，3000.表1内A薪资3000，和表2比较后，表1只有1个小于等于3000的薪资记录，那么count(distinct e1.Salary)则为1，3000这一薪资记录也排名第一;同理薪资500的话，和表2比较后，表1有5个小于等于500的薪资记录，去重后是4个；这个想了好久，枚举数据之后终于懂了 _(:з」∠)_

————————————————

解3：还有一个更简单的，学习到了

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT

BEGIN

declare M INT;

SET M=N-1;

RETURN (

# Write your MySQL query statement below.

select ifnull((select distinct Salary from Employee order by Salary desc limit M,1),null) as getNthHighestSalary

);

END

————————————————

知识点：*IFNULL() 函数用于判断第一个表达式是否为 NULL，如果为 NULL 则返回第二个参数的值，如果不为 NULL 则返回第一个参数的值。

*LIMIT M OFFSET L

select * from table limit m,n

其中m是指记录开始的index，从源0开始，表示第一条记录

n是指从第m+1条开始，取n条。

select * from tablename limit 2,4

即取出第3条至第6条，4条记录

如果想去薪资第三名的数据，那么N=3,M=2,题中即从第M(2)行起，取1行数据，即取第三名薪资记录。

4. 分数排名

编写一个 SQL 查询来实现分数排名。如果两个分数相同，则两个分数排名（Rank）相同。请注意，平分后的下一个名次应该是下一个连续的整数值。换句话说，名次之间不应该有“间隔”。

+----+-------+

| Id | Score |

+----+-------+

| 1 | 3.50 |

| 2 | 3.65 |

| 3 | 4.00 |

| 4 | 3.85 |

| 5 | 4.00 |

| 6 | 3.65 |

+----+-------+

例如，根据上述给定的Scores表，你的查询应该返回（按分数从高到低排列）：

+-------+------+

| Score | Rank |

+-------+------+

| 4.00 | 1 |

| 3.85 | 2 |

| 3.65 | 3 |

| 3.50 | 4 |

+-------+------+

解1：(还是也连续排序函数）

select score as "Score", dense_rank() over(order by score desc) as "Rank"

from Scores;

还有一种解法2：

(select count(distinct b.Score) from Scores b

where b.Score >= a.Score)asRank

from Scores a

order by a.Score DESC;

也是延续上一道题的两表比较后，计数大于等于的记录个数，作为排序

“ 提取出大于等于a.Score的所有分数集合H，将H去重后的元素个数就是a.Score的排名。select a.Score as Score”---大佬的解释好清楚，沿用了

5. 连续出现的数字

编写一个 SQL 查询，查找所有至少连续出现三次的数字。

+----+-----+

| Id | Num |

+----+-----+

| 1 | 1 |

| 2 | 1 |

| 3 | 1 |

| 4 | 2 |

| 5 | 1 |

| 6 | 2 |

| 7 | 2 |

+----+-----+

例如，给定上面的Logs表，1是唯一连续出现至少三次的数字。

+-----------------+

| ConsecutiveNums |

+-----------------+

| 1 |

+-----------------+

解：

select distinct a.Num as"ConsecutiveNums"

from Logsa, Logsb, Logsc

where a.Id=b.Id-1 and a.Id=c.Id-2

and a.Num=b.Num and a.Num=c.Num;

解法就是把logs表*3，再把三张表的ID进行连续三次的where子句筛选，这题应该是可以对应连续登陆、连续购买···这类的情况的。

6. 超过经理收入的员工

Employee表包含所有员工，他们的经理也属于员工。每个员工都有一个 Id，此外还有一列对应员工的经理的 Id。

+----+-------+--------+-----------+

+----+-------+--------+-----------+

| 1 | Joe | 70000 | 3 |

| 2 | Henry | 80000 | 4 |

| 3 | Sam | 60000 | NULL |

| 4 | Max | 90000 | NULL |

+----+-------+--------+-----------+

给定Employee表，编写一个 SQL 查询，该查询可以获取收入超过他们经理的员工的姓名。在上面的表格中，Joe 是唯一一个收入超过他的经理的员工。

+----------+

| Employee |

+----------+

| Joe |

+----------+

解：

select a.Name as "Employee"

from Employee a,Employee b

where a.ManagerId = b.Id and a.Salary > b.Salary;

这道题经常遇到，解法是两表用ManagerId = Id相连接后，再去比较薪资即可。

7. 查找重复的电子邮箱

编写一个 SQL 查询，查找Person表中所有重复的电子邮箱。

示例：

+----+---------+

| Id | Email |

+----+---------+

| 1 | a@b.com |

| 2 | c@d.com |

| 3 | a@b.com |

+----+---------+

根据以上输入，你的查询应返回以下结果：

+---------+

| Email |

+---------+

| a@b.com |

+---------+

说明：所有电子邮箱都是小写字母。

解：select Email"Email"

from Person

group by Email

having count(Email) >1;

简单题，按邮箱分组后，筛选出计数项大于1的记录，就是重复的记录。

8. 从不订购的客户

某网站包含两个表，Customers表和Orders表。编写一个 SQL 查询，找出所有从不订购任何东西的客户。

Customers表：

+----+-------+

| Id | Name |

+----+-------+

| 1 | Joe |

| 2 | Henry |

| 3 | Sam |

| 4 | Max |

+----+-------+

Orders表：

+----+------------+

| Id | CustomerId |

+----+------------+

| 1 | 3 |

| 2 | 1 |

+----+------------+

例如给定上述表格，你的查询应返回：

+-----------+

| Customers |

+-----------+

| Henry |

| Max |

+-----------+

解：从不订购=订单表无记录，无ID，算是一种排除法吧。

select Name as "Customers"

from Customers

where Id not in (select CustomerId from Orders)

order by Name;

9. 部门工资最高的员工

Employee表包含所有员工信息，每个员工有其对应的 Id, salary 和 department Id。

+----+-------+--------+--------------+

+----+-------+--------+--------------+

| 1 | Joe | 70000 | 1 |

| 2 | Henry | 80000 | 2 |

| 3 | Sam | 60000 | 2 |

| 4 | Max | 90000 | 1 |

+----+-------+--------+--------------+

Department表包含公司所有部门的信息。

+----+----------+

| Id | Name |

+----+----------+

| 1 | IT |

| 2 | Sales |

+----+----------+

编写一个 SQL 查询，找出每个部门工资最高的员工。例如，根据上述给定的表格，Max 在 IT 部门有最高工资，Henry 在 Sales 部门有最高工资。

+------------+----------+--------+

| Department | Employee | Salary |

+------------+----------+--------+

| IT | Max | 90000 |

| Sales | Henry | 80000 |

+------------+----------+--------+

解：

select b.Name"Department",a.Name"Employee",Salary"Salary"

from Employee a,Department b

where DepartmentId = b.Id

and(Salary,DepartmentId) in

(select max(Salary),DepartmentId

from Employee

groupby DepartmentId);

先在员工信息表里按部门分组找出最高工资的员工，再连接部门表，显示要输出的信息。

10. 部门工资前三高的员工

Employee表包含所有员工信息，每个员工有其对应的 Id, salary 和 department Id 。

+----+-------+--------+--------------+

+----+-------+--------+--------------+

| 1 | Joe | 70000 | 1 |

| 2 | Henry | 80000 | 2 |

| 3 | Sam | 60000 | 2 |

| 4 | Max | 90000 | 1 |

| 5 | Janet | 69000 | 1 |

| 6 | Randy | 85000 | 1 |

+----+-------+--------+--------------+

Department表包含公司所有部门的信息。

+----+----------+

| Id | Name |

+----+----------+

| 1 | IT |

| 2 | Sales |

+----+----------+

编写一个 SQL 查询，找出每个部门工资前三高的员工。例如，根据上述给定的表格，查询结果应返回：

+------------+----------+--------+

| Department | Employee | Salary |

+------------+----------+--------+

| IT | Max | 90000 |

| IT | Randy | 85000 |

| IT | Joe | 70000 |

| Sales | Henry | 80000 |

| Sales | Sam | 60000 |

+------------+----------+--------+

解：

select d.name"Department",s.name"Employee",s.salary"Salary"

from(select departmentid,name,salary

from

(select departmentid,

name,

salary,

dense_rank()over(partition by departmentid order by salary desc) rn

from employee)

where rn <4) s,department d

where s.departmentid = d.id

思路：每个部门工资前三高的员工，首先用dense_rank()把部门前3高的记录作为表S，再用表s和部门名称表d进行连接，那么每个部门前三高薪资的记录就出来了。如果不支持dense_rank()函数的话，又要用两表比较，计数去重元素个数做排名的方法了。

↑是熊大的举例图

11. 删除重复的电子邮箱

编写一个 SQL 查询，来删除Person表中所有重复的电子邮箱，重复的邮箱里只保留Id 最小的那个。

+----+------------------+

| Id | Email |

+----+------------------+

| 1 | john@example.com |

| 2 | bob@example.com |

| 3 | john@example.com |

+----+------------------+

Id 是这个表的主键。

例如，在运行你的查询语句之后，上面的Person表应返回以下几行:

+----+------------------+

| Id | Email |

+----+------------------+

| 1 | john@example.com |

| 2 | bob@example.com |

+----+------------------+

解：

delete d1

from Person d1,Person d2

where d1.Email = d2.Email and d1.Id > d2.Id;

知识点：删除语句，但是最好实际操作数据库的时候不要删除哦~

DELETE FROM table_name

WHERE some_column=some_value;

12. 上升的温度

给定一个Weather表，编写一个 SQL 查询，来查找与之前（昨天的）日期相比温度更高的所有日期的 Id。

+---------+------------------+------------------+

| Id(INT) | RecordDate(DATE) | Temperature(INT) |

+---------+------------------+------------------+

| 1 | 2015-01-01 | 10 |

| 2 | 2015-01-02 | 25 |

| 3 | 2015-01-03 | 20 |

| 4 | 2015-01-04 | 30 |

+---------+------------------+------------------+

例如，根据上述给定的Weather表格，返回如下 Id:

+----+

| Id |

+----+

| 2 |

| 4 |

+----+

解：

select w1.Id as "Id"

from Weather w1,Weather w2

where w1.RecordDate = w2.RecordDate +1

and w1.Temperature > w2.Temperature;

老办法了，日期的连续要用+1来连接，然后在比较量表中气温即可。

不过要注意日期类型的字段，是可以这样计算的。

13. 行程和用户（重点复习，比较复杂）

Trips表中存所有出租车的行程信息。每段行程有唯一键 Id，Client_Id 和 Driver_Id 是Users表中 Users_Id 的外键。Status 是枚举类型，枚举成员为 (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’)。

+----+-----------+-----------+---------+--------------------+----------+

+----+-----------+-----------+---------+--------------------+----------+

| 1 | 1 | 10 | 1 | completed |2013-10-01|

| 2 | 2 | 11 | 1 | cancelled_by_driver|2013-10-01|

| 3 | 3 | 12 | 6 | completed |2013-10-01|

| 4 | 4 | 13 | 6 | cancelled_by_client|2013-10-01|

| 5 | 1 | 10 | 1 | completed |2013-10-02|

| 6 | 2 | 11 | 6 | completed |2013-10-02|

| 7 | 3 | 12 | 6 | completed |2013-10-02|

| 8 | 2 | 12 | 12 | completed |2013-10-03|

| 9 | 3 | 10 | 12 | completed |2013-10-03|

| 10 | 4 | 13 | 12 | cancelled_by_driver|2013-10-03|

+----+-----------+-----------+---------+--------------------+----------+

Users表存所有用户。每个用户有唯一键 Users_Id。Banned 表示这个用户是否被禁止，Role 则是一个表示（‘client’, ‘driver’, ‘partner’）的枚举类型。

+----------+--------+--------+

| Users_Id | Banned | Role |

+----------+--------+--------+

| 1 | No | client |

| 2 | Yes | client |

| 3 | No | client |

| 4 | No | client |

| 10 | No | driver |

| 11 | No | driver |

| 12 | No | driver |

| 13 | No | driver |

+----------+--------+--------+

写一段 SQL 语句查出2013年10月1日至2013年10月3日期间非禁止用户的取消率。基于上表，你的 SQL 语句应返回如下结果，取消率（Cancellation Rate）保留两位小数。

+------------+-------------------+

| Day | Cancellation Rate |

+------------+-------------------+

| 2013-10-01 | 0.33 |

| 2013-10-02 | 0.00 |

| 2013-10-03 | 0.50 |

+------------+-------------------+

解：

整体思路是这样：

①限定日期2013年10月1日至2013年10月3日,要按天进行分组并排序；

②非禁止用户，即Users表Banned字段为NO；

③取消率=取消次数/行程总数，还要进行小数点两位的保留；另外取消可能是用户，也可能是司机

select t2.Request_at as 'Day', ROUND(if_null(cancel,0)/count(t.id),2) as "Cancellation Rate"

from ( select t.Request_at,count(t.Status) as cancel

from trips as t

inner join users c

on c.Users_Id = t.Client_Id

inner join users d

d.Users_Id = t.Driver_Id

where t.status !='completed'

and t.Request_at between '2013-10-01' and '2013-10-03'

and c.Banned='No' and d.Banned='No') as t2

group by t2.Request_at

order by t2.Request_at

————————————————

好晕_(¦3」∠)_ 大致思路如上，

有一些坑：

1、比如2013/10/2这一天实际上没有取消订单的，不套用if_null的话，取消率算不出来

2、看了一些参考答案，最终我选择用内连接，感觉比较好懂

3、总结下来，要一步步拆解调试，嵌套再嵌套

刷了13道题，继续下面的13道题吧，这次对排序函数和LIMIT有了一些新的认识！

力扣SQL刷题记录（1）

友情链接更多精彩内容