谨慎使用MYSQL表分区

test1表是未分区

test2表是hash分区

以下是两个表的表结构:

mysql> show create table test1;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                      |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| test1 | CREATE TABLE `test1` (
  `id` int(11) NOT NULL,
  `tid` int(11) DEFAULT NULL,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `tid` (`tid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

 

mysql> show create table test2;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                       |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| test2 | CREATE TABLE `test2` (
  `id` int(11) NOT NULL,
  `tid` int(11) DEFAULT NULL,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `tid` (`tid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (id)
PARTITIONS 10 */ |

+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

在主键id列上用上hash分区。

 

如果是在主键上查询,那么分区要比不分区性能高,例如:

mysql> explain partitions select * from test1 where id in (86,777,432345);
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | test1 | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |    3 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.06 sec)

mysql> explain partitions select * from test2 where id in (86,777,432345); 
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | test2 | p5,p6,p7   | range | PRIMARY       | PRIMARY | 4       | NULL |    3 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.01 sec)
 

只在分区P5/6/7上就扫描到了记录,减少了过多的磁盘IO。

但如果在索引tid上查询,性能就很低了。

mysql>  explain partitions select * from test1 where tid = 86419804; 
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | test1 | NULL       | ref  | tid           | tid  | 5       | const |    1 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)

mysql>  explain partitions select * from test2 where tid = 86419804; 
+----+-------------+-------+-------------------------------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table | partitions                    | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+-------+-------------------------------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | test2 | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9 | ref  | tid           | tid  | 5       | const |   10 | Using where |
+----+-------------+-------+-------------------------------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)
 

要扫描10个分区,才能得到结果。

mysql> select sql_no_cache * from test1 where tid=86419804; 
+----+----------+------+
| id | tid      | name |
+----+----------+------+
| 11 | 86419804 | abc2 |
+----+----------+------+
1 row in set (0.40 sec)

mysql> select sql_no_cache * from test2 where tid=86419804; 
+----+----------+------+
| id | tid      | name |
+----+----------+------+
| 11 | 86419804 | abc2 |
+----+----------+------+
1 row in set (1.98 sec)

分区后的查询结果要比未分区的慢很多。

结论:分区规则仅限制在主键上使用,性能会有提高,在其他字段上使用,即使有索引,性能也会很差。

 



本文转自 liang3391 51CTO博客,原文链接:http://blog.51cto.com/liang3391/844523

上一篇:《Hadoop MapReduce实战手册》一1.5 安装HDFS


下一篇:Zookeeper、Hadoop、Sqoop、Mahout、HBase整合安装