How to Delete Duplicate Rows in a MySQL Database

The sample data below is for illustration only.

Table structure:

mysql> desc test;
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(11) unsigned | NO   | PRI | NULL    | auto_increment |
| site  | varchar(100)     | NO   | MUL |         |                |
+-------+------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

Table contents:

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com   |
|  2 | http://www.hao123.com  |
|  3 | http://www.huwei.com   |
|  4 | http://www.baidu.com   |
|  5 | http://www.huwei.com   |
+----+------------------------+
5 rows in set (0.00 sec)
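
To follow along, the demo table can be recreated with something like the script below. It is a sketch reconstructed from the desc output above: the index name idx_site is an assumption (the output only shows that site carries a non-unique index), and the storage engine and character set are left to the server defaults.

```sql
-- Recreate the demo table; idx_site is a hypothetical index name
CREATE TABLE test (
  id   INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
  site VARCHAR(100)     NOT NULL DEFAULT '',
  PRIMARY KEY (id),
  KEY idx_site (site)
);

-- Insert the five sample rows; ids 1 to 5 are assigned by auto_increment
INSERT INTO test (site) VALUES
  ('http://www.baidu.com'),
  ('http://www.hao123.com'),
  ('http://www.huwei.com'),
  ('http://www.baidu.com'),
  ('http://www.huwei.com');
```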

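If you lack the privileges to create tables or indexes and want to delete the newer duplicate rows, keeping the row with the smallest id for each site, you can use the following statement: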

mysql> delete from a
    -> using test as a, test as b
    -> where (a.id > b.id)
    -> and (a.site = b.site);
Query OK, 2 rows affected (0.12 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com   |
|  2 | http://www.hao123.com  |
|  3 | http://www.huwei.com   |
+----+------------------------+
3 rows in set (0.00 sec)
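
The same delete can also be written with MySQL's multi-table DELETE ... JOIN form, which some readers find easier to follow. This is only an equivalent sketch of the statement above, not a different technique; it removes every row for which another row with the same site and a smaller id exists.

```sql
-- Delete each row that has a same-site row with a smaller id,
-- so the smallest id per site is the one that survives
DELETE a
FROM test AS a
JOIN test AS b
  ON a.site = b.site
 AND a.id > b.id;
```

Flipping the comparison to a.id < b.id keeps the largest id per site instead.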

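To delete the older duplicate rows instead, keeping the row with the largest id for each site, run the following statement against the original five rows: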

mysql> delete from a
    -> using test as a, test as b
    -> where (a.id < b.id)
    -> and (a.site = b.site);
Query OK, 2 rows affected (0.12 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  2 | http://www.hao123.com  |
|  4 | http://www.baidu.com   |
|  5 | http://www.huwei.com   |
+----+------------------------+
3 rows in set (0.00 sec)

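You can preview the duplicate rows that a delete will remove by first running a SELECT with the same conditions. The query below mirrors the a.id < b.id delete and lists the older copies; use a.id > b.id instead to preview the delete that removes the newer copies:

mysql> SELECT a.*
    -> FROM test a, test b
    -> WHERE a.id < b.id
    -> AND (a.site = b.site);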
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com   |
|  3 | http://www.huwei.com   |
+----+------------------------+
2 rows in set (0.00 sec)
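
Before deleting anything, it can also help to see how many copies of each value exist. The query below is a small sketch using GROUP BY and HAVING; the column aliases copies, oldest_id and newest_id are made up for this example.

```sql
-- List each duplicated site with its row count and the ids that bound the group
SELECT site,
       COUNT(*) AS copies,
       MIN(id)  AS oldest_id,
       MAX(id)  AS newest_id
FROM test
GROUP BY site
HAVING COUNT(*) > 1;
```

On the original five rows this should report two sites: http://www.baidu.com (ids 1 and 4) and http://www.huwei.com (ids 3 and 5).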

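If you have the privilege to create indexes, you can add a unique index on the table. With ALTER IGNORE TABLE, MySQL keeps only one row for each duplicated value and drops the rest. Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so this approach only works on older versions: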

mysql> alter ignore table test add unique index ukey (site);
Query OK, 5 rows affected (0.46 sec)
Records: 5 Duplicates: 2 Warnings: 0

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com   |
|  2 | http://www.hao123.com  |
|  3 | http://www.huwei.com   |
+----+------------------------+
3 rows in set (0.00 sec)

Once the duplicate rows have been removed, you can drop the index again if you do not need it:

mysql> alter table test drop index ukey;
Query OK, 3 rows affected (0.37 sec)
Records: 3 Duplicates: 0 Warnings: 0
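
On newer servers that reject the IGNORE clause, a rough equivalent of the unique-index approach is to delete the duplicates explicitly and then add the index with a plain ALTER TABLE. The sketch below reuses the ukey name from above and assumes you want to keep the smallest id for each site.

```sql
-- Step 1: remove every row that has a same-site row with a smaller id
DELETE a
FROM test AS a
JOIN test AS b
  ON a.site = b.site
 AND a.id > b.id;

-- Step 2: add the unique index; this now succeeds because no duplicates remain
ALTER TABLE test ADD UNIQUE INDEX ukey (site);
```

Leaving the unique index in place afterwards prevents new duplicates from being inserted.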

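If you have the privilege to create tables, you can create a new table and fill it with the de-duplicated rows from the original table. This statement selects non-aggregated columns together with GROUP BY, so it is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default since MySQL 5.7.5: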

mysql> create table test_new as select * from test group by site;
Query OK, 3 rows affected (0.19 sec)
Records: 3 Duplicates: 0 Warnings: 0

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
| test_new       |
+----------------+
2 rows in set (0.00 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com   |
|  2 | http://www.hao123.com  |
|  3 | http://www.huwei.com   |
|  4 | http://www.baidu.com   |
|  5 | http://www.huwei.com   |
+----+------------------------+
5 rows in set (0.00 sec)

mysql> select * from test_new order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com   |
|  2 | http://www.hao123.com  |
|  3 | http://www.huwei.com   |
+----+------------------------+
3 rows in set (0.00 sec)
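
A variant that states explicitly which row survives, and that also runs under stricter SQL modes, is to aggregate the id column instead of selecting it bare. This is a sketch that assumes you want the smallest id for each site (use MAX(id) to keep the newest) and that test_new does not exist yet.

```sql
-- Build the de-duplicated copy, keeping the smallest id per site
CREATE TABLE test_new AS
SELECT MIN(id) AS id, site
FROM test
GROUP BY site;
```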

Then keep the original table as a backup and rename the new table to take its place:

mysql> rename table test to test_old, test_new to test;
Query OK, 0 rows affected (0.04 sec)

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
| test_old       |
+----------------+
2 rows in set (0.00 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com   |
|  2 | http://www.hao123.com  |
|  3 | http://www.huwei.com   |
+----+------------------------+
3 rows in set (0.00 sec)

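Note: a table created this way loses the original table's index information, including the primary key and the auto_increment attribute on id, as the structure below shows: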

mysql> desc test;
+-------+------------------+------+-----+---------+-------+
| Field | Type             | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| id    | int(11) unsigned | NO   |     | 0       |       |
| site  | varchar(100)     | NO   |     |         |       |
+-------+------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)
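
One way to avoid losing the indexes is to copy the table definition first with CREATE TABLE ... LIKE, which carries over column attributes and indexes, and then fill the empty copy. The sketch below assumes you start again from the original five-row table named test, that test_new and test_old do not exist yet, and that you want to keep the smallest id for each site.

```sql
-- Copy the structure, including the primary key, auto_increment and indexes
CREATE TABLE test_new LIKE test;

-- Copy one row per site, keeping the smallest id
INSERT INTO test_new (id, site)
SELECT MIN(id), site
FROM test
GROUP BY site;

-- Swap the tables in a single statement, keeping the original as a backup
RENAME TABLE test TO test_old, test_new TO test;
```

Running desc test afterwards should show the same keys and Extra values as the original table.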