Every InnoDB table has a special index called the clustered index where the data for the rows is stored. Typically, the clustered index is synonymous with the primary key. To get the best performance from queries, inserts, and other database operations, you must understand how InnoDB uses the clustered index to optimize the most common lookup and DML operations for each table.
每张使用Innodb作为存储引擎的表都有一个特殊的索引,聚簇索引,它保存着每行数据。一般情况,聚簇索引就是主键索引。为了得到最高效的查询,插入,或者其他的数据库操作,你必须理解innodb引擎如何使用聚簇索引来优化大多数查询和dml操作的。
If you define a PRIMARY KEY on your table, InnoDB uses it as the clustered index.
If you do not define a PRIMARY KEY for your table, MySQL picks the first UNIQUE index that has only NOT NULLcolumns as the primary key and InnoDB uses it as the clustered index.
If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.
如果你为表定义了一个主键,innodb就使用它作为聚簇索引。
如果你没有定义主键,mysql选择非空类型的唯一索引来作为作为主键,并且innodb会用它作为聚簇索引。
如果表中既没有主键,又没有合适的唯一索引,innodb内部生成一个隐式聚簇索引,建在由rowid组成的虚拟列上。在这张表中,innodb为每行数据指定一个rowid,数据行根据ID来排序。这些row id是由一些占6字节空间的自增长列组成。当有新数据插入的时候,row id增长,这样,保证row id就是按照数据的物理写入顺序来组织行。
How the Clustered Index Speeds Up Queries
聚簇索引提高了查询效率?
Accessing a row through the clustered index is fast because the row data is on the same page where the index search leads. If a table is large, the clustered index architecture often saves a disk I/O operation when compared to storage organizations that store row data using a different page from the index record. (For example, MyISAMuses one file for data rows and another for index records.)
通过聚簇索引来寻找一行数据是非常快的,这是因为行数据和索引在同一页上。如果表特别大,聚簇索引的这种构造就能节省磁盘I/O资源(索引和数据在不同页上时,根据索引来寻找数据存储页消耗的IO),比如myisam引擎,把索引和数据页分开存放;
How Secondary Indexes Relate to the Clustered Index
二级索引和聚簇索引之间是如何关联的?
All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.
除了聚簇索引之外的其他索引类型都属于二级索引。在Innodb中,二级索引的每行数据都包含这条数据的主键列,还有二级索引指定的列;聚簇索引中,innodb通过主键值来查找数据行。
If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.
如果主键过长,二级索引就需要更大的空间,因此,使用短的主键列是非常有用的。