MySQL Tips: Handling Duplicate Data

The examples in this article use MySQL 5.7 as the database environment.

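Duplicate data can arise for many reasons: a bug in the system, repeated submissions, or a change in requirements (content that used to allow duplicates no longer does). I won't list every cause here; instead, this article works through a concrete example of how to resolve duplicate data.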

Prepare the following data in the user table from the companion article "MySQL实战":

mysql> select id,username,mobile from t_user;
+-------+----------+-------------+
| id    | username | mobile      |
+-------+----------+-------------+
| 10001 | user1    | 13900000001 |
| 10002 | user2    | NULL        |
| 10003 | user3    | NULL        |
| 10004 | user4    | NULL        |
| 10005 | user5    | NULL        |
| 10006 | user6    | 13900000001 |
+-------+----------+-------------+

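We now need to find the rows in the user table whose phone number (mobile) is duplicated. Grouping by the mobile field with group by and applying the aggregate function count() gives the result we need.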

# Find the phone numbers that appear more than once
mysql> select mobile,count(1) as c from t_user where mobile is not null group by mobile having c > 1;
+-------------+---+
| mobile      | c |
+-------------+---+
| 13900000001 | 2 |
+-------------+---+
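
The grouped query reports only which phone numbers are duplicated, not which users hold them. As a minimal sketch (the alias d is just an illustrative name), the same subquery can be joined back to t_user to list every row involved:

```sql
-- List every row whose mobile appears more than once.
-- The derived table d holds the duplicated phone numbers.
SELECT u.id, u.username, u.mobile
FROM t_user AS u
JOIN (
    SELECT mobile
    FROM t_user
    WHERE mobile IS NOT NULL
    GROUP BY mobile
    HAVING COUNT(*) > 1
) AS d ON u.mobile = d.mobile
ORDER BY u.mobile, u.id;
```

With the sample data above, this should return the rows for user1 and user6.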

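Next, handle the duplicated phone numbers as required, for example by setting the phone number to null on the records with the larger id.

Let's refine the SQL above step by step to meet that requirement. Since the records with the larger id are the ones to be changed, we first need to find the record with the smallest id for each phone number.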

# min(id)
mysql> select mobile,count(1) as c,min(id) as min_id from t_user where mobile is not null group by mobile having c > 1;
+-------------+---+--------+
| mobile      | c | min_id |
+-------------+---+--------+
| 13900000001 | 2 |  10001 |
+-------------+---+--------+

Once the smallest id is known, join t_user with the query result and run the update.

# update ... where id > ...
mysql> update t_user as u
    -> join (
    -> select mobile,count(1) as c,min(id) as min_id from t_user where mobile is not null group by mobile having c > 1
    -> ) as a on u.mobile = a.mobile
    -> set u.mobile = null
    -> where u.id > a.min_id;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
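
Because the update discards data, it can help to preview the affected rows first and to run the change inside a transaction. The following is only a sketch, assuming t_user uses a transactional engine such as InnoDB; it reuses the same join as the update above:

```sql
START TRANSACTION;

-- Preview: the rows whose mobile would be set to NULL.
SELECT u.id, u.username, u.mobile, a.min_id
FROM t_user AS u
JOIN (
    SELECT mobile, MIN(id) AS min_id
    FROM t_user
    WHERE mobile IS NOT NULL
    GROUP BY mobile
    HAVING COUNT(*) > 1
) AS a ON u.mobile = a.mobile
WHERE u.id > a.min_id;

-- Apply the change to exactly those rows.
UPDATE t_user AS u
JOIN (
    SELECT mobile, MIN(id) AS min_id
    FROM t_user
    WHERE mobile IS NOT NULL
    GROUP BY mobile
    HAVING COUNT(*) > 1
) AS a ON u.mobile = a.mobile
SET u.mobile = NULL
WHERE u.id > a.min_id;

-- Run COMMIT to keep the change, or ROLLBACK instead to undo it.
COMMIT;
```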

The statement reports success. Finally, check whether the result is what we expected.

# Check whether any duplicated mobile values remain
mysql> select mobile,count(1) as c from t_user where mobile is not null group by mobile having c > 1;
Empty set (0.00 sec)
# Verify again by looking at the data directly
mysql> select id,username,mobile from t_user;
+-------+----------+-------------+
| id    | username | mobile      |
+-------+----------+-------------+
| 10001 | user1    | 13900000001 |
| 10002 | user2    | NULL        |
| 10003 | user3    | NULL        |
| 10004 | user4    | NULL        |
| 10005 | user5    | NULL        |
| 10006 | user6    | NULL        |
+-------+----------+-------------+
6 rows in set (0.00 sec)
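
Once the existing duplicates are cleared, one possible way to stop new ones from appearing is a unique index on mobile. This is a sketch rather than part of the original walkthrough: the index name uk_mobile is hypothetical, and it relies on MySQL allowing multiple NULL values in a unique index, so users without a phone number are unaffected.

```sql
-- Reject future rows that repeat an existing non-NULL mobile.
-- This statement fails if duplicates are still present in the table.
ALTER TABLE t_user ADD UNIQUE INDEX uk_mobile (mobile);
```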

Last edited: 2017.12.07 05:19:28
