15.9 InnoDB Table and Page Compression
This section provides information about the InnoDB
table compression and InnoDB
page compression features. The page compression feature is also referred to as transparent page compression.
Using the compression features of InnoDB
, you can create tables where the data is stored in compressed form. Compression can help to improve both raw performance and scalability. The compression means less data is transferred between disk and memory, and takes up less space on disk and in memory. The benefits are amplified for tables with secondary indexes, because index data is compressed also. Compression can be especially important for SSD storage devices, because they tend to have lower capacity than HDD devices.
15.9.1 InnoDB Table Compression
- 15.9.1.1 Overview of Table Compression
- 15.9.1.2 Creating Compressed Tables
- 15.9.1.3 Tuning Compression for InnoDB Tables
- 15.9.1.4 Monitoring InnoDB Table Compression at Runtime
- 15.9.1.5 How Compression Works for InnoDB Tables
- 15.9.1.6 Compression for OLTP Workloads
- 15.9.1.7 SQL Compression Syntax Warnings and Errors
This section describes InnoDB
table compression, which is supported with InnoDB
tables that reside in file_per_table tablespaces or general tablespaces. Table compression is enabled using the ROW_FORMAT=COMPRESSED
attribute with CREATE TABLE
or ALTER TABLE
.
15.9.1.1 Overview of Table Compression
Because processors and cache memories have increased in speed more than disk storage devices, many workloads are disk-bound. Data compression enables smaller database size, reduced I/O, and improved throughput, at the small cost of increased CPU utilization. Compression is especially valuable for read-intensive applications, on systems with enough RAM to keep frequently used data in memory.
An InnoDB
table created with ROW_FORMAT=COMPRESSED
can use a smaller page size on disk than the configured innodb_page_size
value. Smaller pages require less I/O to read from and write to disk, which is especially valuable for SSD devices.
The compressed page size is specified through the CREATE TABLE
or ALTER TABLE
KEY_BLOCK_SIZE
parameter. The different page size requires that the table be placed in a file-per-table tablespace or general tablespace rather than in the system tablespace, as the system tablespace cannot store compressed tables. For more information, see Section 15.6.3.2, “File-Per-Table Tablespaces”, and Section 15.6.3.3, “General Tablespaces”.
The level of compression is the same regardless of the KEY_BLOCK_SIZE
value. As you specify smaller values for KEY_BLOCK_SIZE
, you get the I/O benefits of increasingly smaller pages. But if you specify a value that is too small, there is additional overhead to reorganize the pages when data values cannot be compressed enough to fit multiple rows in each page. There is a hard limit on how small KEY_BLOCK_SIZE
can be for a table, based on the lengths of the key columns for each of its indexes. Specify a value that is too small, and the CREATE TABLE
or ALTER TABLE
statement fails.
In the buffer pool, the compressed data is held in small pages, with a page size based on the KEY_BLOCK_SIZE
value. For extracting or updating the column values, MySQL also creates an uncompressed page in the buffer pool with the uncompressed data. Within the buffer pool, any updates to the uncompressed page are also re-written back to the equivalent compressed page. You might need to size your buffer pool to accommodate the additional data of both compressed and uncompressed pages, although the uncompressed pages are evicted from the buffer pool when space is needed, and then uncompressed again on the next access.
15.9.1.2 Creating Compressed Tables
Compressed tables can be created in file-per-table tablespaces or in general tablespaces. Table compression is not available for the InnoDB system tablespace. The system tablespace (space 0, the .ibdata files) can contain user-created tables, but it also contains internal system data, which is never compressed. Thus, compression applies only to tables (and indexes) stored in file-per-table or general tablespaces.
Creating a Compressed Table in File-Per-Table Tablespace
To create a compressed table in a file-per-table tablespace, innodb_file_per_table
must be enabled (the default). You can set this parameter in the MySQL configuration file (my.cnf
or my.ini
) or dynamically, using a SET
statement.
After the innodb_file_per_table
option is configured, specify the ROW_FORMAT=COMPRESSED
clause or KEY_BLOCK_SIZE
clause, or both, in a CREATE TABLE
or ALTER TABLE
statement to create a compressed table in a file-per-table tablespace.
For example, you might use the following statements:
SET GLOBAL innodb_file_per_table=1; CREATE TABLE t1 (c1 INT PRIMARY KEY) ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
Creating a Compressed Table in a General Tablespace
To create a compressed table in a general tablespace, FILE_BLOCK_SIZE
must be defined for the general tablespace, which is specified when the tablespace is created. The FILE_BLOCK_SIZE
value must be a valid compressed page size in relation to the innodb_page_size
value, and the page size of the compressed table, defined by the CREATE TABLE
or ALTER TABLE
KEY_BLOCK_SIZE
clause, must be equal to FILE_BLOCK_SIZE/1024
. For example, if innodb_page_size=16384
and FILE_BLOCK_SIZE=8192
, the KEY_BLOCK_SIZE
of the table must be 8. For more information, see Section 15.6.3.3, “General Tablespaces”.
The following example demonstrates creating a general tablespace and adding a compressed table. The example assumes a default innodb_page_size
of 16K. The FILE_BLOCK_SIZE
of 8192 requires that the compressed table have a KEY_BLOCK_SIZE
of 8.
mysql>CREATE TABLESPACE `ts2` ADD DATAFILE 'ts2.ibd' FILE_BLOCK_SIZE = 8192 Engine=InnoDB;
mysql>CREATE TABLE t4 (c1 INT PRIMARY KEY) TABLESPACE ts2 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
Notes
-
As of MySQL 8.0, the tablespace file for a compressed table is created using the physical page size instead of the
InnoDB
page size, which makes the initial size of a tablespace file for an empty compressed table smaller than in previous MySQL releases. -
If you specify
ROW_FORMAT=COMPRESSED
, you can omitKEY_BLOCK_SIZE
; theKEY_BLOCK_SIZE
setting defaults to half theinnodb_page_size
value. -
If you specify a valid
KEY_BLOCK_SIZE
value, you can omitROW_FORMAT=COMPRESSED
; compression is enabled automatically. -
To determine the best value for
KEY_BLOCK_SIZE,
typically you create several copies of the same table with different values for this clause, then measure the size of the resulting.ibd
files and see how well each performs with a realistic workload. For general tablespaces, keep in mind that dropping a table does not reduce the size of the general tablespace.ibd
file, nor does it return disk space to the operating system. For more information, see Section 15.6.3.3, “General Tablespaces”. -
The
KEY_BLOCK_SIZE
value is treated as a hint; a different size could be used byInnoDB
if necessary. For file-per-table tablespaces, theKEY_BLOCK_SIZE
can only be less than or equal to theinnodb_page_size
value. If you specify a value greater than theinnodb_page_size
value, the specified value is ignored, a warning is issued, andKEY_BLOCK_SIZE
is set to half of theinnodb_page_size
value. Ifinnodb_strict_mode=ON
, specifying an invalidKEY_BLOCK_SIZE
value returns an error. For general tablespaces, validKEY_BLOCK_SIZE
values depend on theFILE_BLOCK_SIZE
setting of the tablespace. For more information, see Section 15.6.3.3, “General Tablespaces”. -
InnoDB
supports 32KB and 64KB page sizes but these page sizes do not support compression. For more information, refer to theinnodb_page_size
documentation. -
The default uncompressed size of
InnoDB
data pages is 16KB. Depending on the combination of option values, MySQL uses a page size of 1KB, 2KB, 4KB, 8KB, or 16KB for the tablespace data file (.ibd
file). The actual compression algorithm is not affected by theKEY_BLOCK_SIZE
value; the value determines how large each compressed chunk is, which in turn affects how many rows can be packed into each compressed page. -
When creating a compressed table in a file-per-table tablespace, setting
KEY_BLOCK_SIZE
equal to theInnoDB
page size does not typically result in much compression. For example, settingKEY_BLOCK_SIZE=16
typically would not result in much compression, since the normalInnoDB
page size is 16KB. This setting may still be useful for tables with many longBLOB
,VARCHAR
orTEXT
columns, because such values often do compress well, and might therefore require fewer overflow pages as described in Section 15.9.1.5, “How Compression Works for InnoDB Tables”. For general tablespaces, aKEY_BLOCK_SIZE
value equal to theInnoDB
page size is not permitted. For more information, see Section 15.6.3.3, “General Tablespaces”. -
All indexes of a table (including the clustered index) are compressed using the same page size, as specified in the
CREATE TABLE
orALTER TABLE
statement. Table attributes such asROW_FORMAT
andKEY_BLOCK_SIZE
are not part of theCREATE INDEX
syntax forInnoDB
tables, and are ignored if they are specified (although, if specified, they will appear in the output of theSHOW CREATE TABLE
statement). -
For performance-related configuration options, see Section 15.9.1.3, “Tuning Compression for InnoDB Tables”.
Restrictions on Compressed Tables
-
Compressed tables cannot be stored in the
InnoDB
system tablespace. -
General tablespaces can contain multiple tables, but compressed and uncompressed tables cannot coexist within the same general tablespace.
-
Compression applies to an entire table and all its associated indexes, not to individual rows, despite the clause name
ROW_FORMAT
. -
InnoDB
does not support compressed temporary tables. Wheninnodb_strict_mode
is enabled (the default),CREATE TEMPORARY TABLE
returns errors ifROW_FORMAT=COMPRESSED
orKEY_BLOCK_SIZE
is specified. Ifinnodb_strict_mode
is disabled, warnings are issued and the temporary table is created using a non-compressed row format. The same restrictions apply toALTER TABLE
operations on temporary tables.
15.9.1.3 Tuning Compression for InnoDB Tables
Most often, the internal optimizations described in InnoDB Data Storage and Compression ensure that the system runs well with compressed data. However, because the efficiency of compression depends on the nature of your data, you can make decisions that affect the performance of compressed tables:
-
Which tables to compress.
-
What compressed page size to use.
-
Whether to adjust the size of the buffer pool based on run-time performance characteristics, such as the amount of time the system spends compressing and uncompressing data. Whether the workload is more like a data warehouse (primarily queries) or an OLTP system (mix of queries and DML).
-
If the system performs DML operations on compressed tables, and the way the data is distributed leads to expensive compression failures at runtime, you might adjust additional advanced configuration options.
Use the guidelines in this section to help make those architectural and configuration choices. When you are ready to conduct long-term testing and put compressed tables into production, see Section 15.9.1.4, “Monitoring InnoDB Table Compression at Runtime” for ways to verify the effectiveness of those choices under real-world conditions.
When to Use Compression
In general, compression works best on tables that include a reasonable number of character string columns and where the data is read far more often than it is written. Because there are no guaranteed ways to predict whether or not compression benefits a particular situation, always test with a specific workload and data set running on a representative configuration. Consider the following factors when deciding which tables to compress.
Data Characteristics and Compression
A key determinant of the efficiency of compression in reducing the size of data files is the nature of the data itself. Recall that compression works by identifying repeated strings of bytes in a block of data. Completely randomized data is the worst case. Typical data often has repeated values, and so compresses effectively. Character strings often compress well, whether defined in CHAR
, VARCHAR
, TEXT
or BLOB
columns. On the other hand, tables containing mostly binary data (integers or floating point numbers) or data that is previously compressed (for example JPEG or PNG images) may not generally compress well, significantly or at all.
You choose whether to turn on compression for each InnoDB table. A table and all of its indexes use the same (compressed) page size. It might be that the primary key (clustered) index, which contains the data for all columns of a table, compresses more effectively than the secondary indexes. For those cases where there are long rows, the use of compression might result in long column values being stored “off-page”, as discussed in DYNAMIC Row Format. Those overflow pages may compress well. Given these considerations, for many applications, some tables compress more effectively than others, and you might find that your workload performs best only with a subset of tables compressed.
To determine whether or not to compress a particular table, conduct experiments. You can get a rough estimate of how efficiently your data can be compressed by using a utility that implements LZ77 compression (such as gzip
or WinZip) on a copy of the .ibd file for an uncompressed table. You can expect less compression from a MySQL compressed table than from file-based compression tools, because MySQL compresses data in chunks based on the page size, 16KB by default. In addition to user data, the page format includes some internal system data that is not compressed. File-based compression utilities can examine much larger chunks of data, and so might find more repeated strings in a huge file than MySQL can find in an individual page.
Another way to test compression on a specific table is to copy some data from your uncompressed table to a similar, compressed table (having all the same indexes) in a file-per-table tablespace and look at the size of the resulting .ibd
file. For example:
USE test; SET GLOBAL innodb_file_per_table=1; SET GLOBAL autocommit=0; -- Create an uncompressed table with a million or two rows. CREATE TABLE big_table AS SELECT * FROM information_schema.columns; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; INSERT INTO big_table SELECT * FROM big_table; COMMIT; ALTER TABLE big_table ADD id int unsigned NOT NULL PRIMARY KEY auto_increment; SHOW CREATE TABLE big_table\G select count(id) from big_table; -- Check how much space is needed for the uncompressed table. \! ls -l data/test/big_table.ibd CREATE TABLE key_block_size_4 LIKE big_table; ALTER TABLE key_block_size_4 key_block_size=4 row_format=compressed; INSERT INTO key_block_size_4 SELECT * FROM big_table; commit; -- Check how much space is needed for a compressed table -- with particular compression settings. \! ls -l data/test/key_block_size_4.ibd
This experiment produced the following numbers, which of course could vary considerably depending on your table structure and data:
-rw-rw---- 1 cirrus staff 310378496 Jan 9 13:44 data/test/big_table.ibd -rw-rw---- 1 cirrus staff 83886080 Jan 9 15:10 data/test/key_block_size_4.ibd
To see whether compression is efficient for your particular workload:
-
For simple tests, use a MySQL instance with no other compressed tables and run queries against the
INFORMATION_SCHEMA.INNODB_CMP
table. -
For more elaborate tests involving workloads with multiple compressed tables, run queries against the
INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX
table. Because the statistics in theINNODB_CMP_PER_INDEX
table are expensive to collect, you must enable the configuration optioninnodb_cmp_per_index_enabled
before querying that table, and you might restrict such testing to a development server or a non-critical slave server. -
Run some typical SQL statements against the compressed table you are testing.
-
Examine the ratio of successful compression operations to overall compression operations by querying the
INFORMATION_SCHEMA.INNODB_CMP
orINFORMATION_SCHEMA.INNODB_CMP_PER_INDEX
table, and comparingCOMPRESS_OPS
toCOMPRESS_OPS_OK
. -
If a high percentage of compression operations complete successfully, the table might be a good candidate for compression.
-
If you get a high proportion of compression failures, you can adjust
innodb_compression_level
,innodb_compression_failure_threshold_pct
, andinnodb_compression_pad_pct_max
options as described in Section 15.9.1.6, “Compression for OLTP Workloads”, and try further tests.
Database Compression versus Application Compression
Decide whether to compress data in your application or in the table; do not use both types of compression for the same data. When you compress the data in the application and store the results in a compressed table, extra space savings are extremely unlikely, and the double compression just wastes CPU cycles.
Compressing in the Database
When enabled, MySQL table compression is automatic and applies to all columns and index values. The columns can still be tested with operators such as LIKE
, and sort operations can still use indexes even when the index values are compressed. Because indexes are often a significant fraction of the total size of a database, compression could result in significant savings in storage, I/O or processor time. The compression and decompression operations happen on the database server, which likely is a powerful system that is sized to handle the expected load.
Compressing in the Application
If you compress data such as text in your application, before it is inserted into the database, You might save overhead for data that does not compress well by compressing some columns and not others. This approach uses CPU cycles for compression and uncompression on the client machine rather than the database server, which might be appropriate for a distributed application with many clients, or where the client machine has spare CPU cycles.
Hybrid Approach
Of course, it is possible to combine these approaches. For some applications, it may be appropriate to use some compressed tables and some uncompressed tables. It may be best to externally compress some data (and store it in uncompressed tables) and allow MySQL to compress (some of) the other tables in the application. As always, up-front design and real-life testing are valuable in reaching the right decision.
Workload Characteristics and Compression
In addition to choosing which tables to compress (and the page size), the workload is another key determinant of performance. If the application is dominated by reads, rather than updates, fewer pages need to be reorganized and recompressed after the index page runs out of room for the per-page “modification log” that MySQL maintains for compressed data. If the updates predominantly change non-indexed columns or those containing BLOB
s or large strings that happen to be stored “off-page”, the overhead of compression may be acceptable. If the only changes to a table are INSERT
s that use a monotonically increasing primary key, and there are few secondary indexes, there is little need to reorganize and recompress index pages. Since MySQL can “delete-mark” and delete rows on compressed pages “in place” by modifying uncompressed data, DELETE
operations on a table are relatively efficient.
For some environments, the time it takes to load data can be as important as run-time retrieval. Especially in data warehouse environments, many tables may be read-only or read-mostly. In those cases, it might or might not be acceptable to pay the price of compression in terms of increased load time, unless the resulting savings in fewer disk reads or in storage cost is significant.
Fundamentally, compression works best when the CPU time is available for compressing and uncompressing data. Thus, if your workload is I/O bound, rather than CPU-bound, you might find that compression can improve overall performance. When you test your application performance with different compression configurations, test on a platform similar to the planned configuration of the production system.
Configuration Characteristics and Compression
Reading and writing database pages from and to disk is the slowest aspect of system performance. Compression attempts to reduce I/O by using CPU time to compress and uncompress data, and is most effective when I/O is a relatively scarce resource compared to processor cycles.
This is often especially the case when running in a multi-user environment with fast, multi-core CPUs. When a page of a compressed table is in memory, MySQL often uses additional memory, typically 16KB, in the buffer pool for an uncompressed copy of the page. The adaptive LRU algorithm attempts to balance the use of memory between compressed and uncompressed pages to take into account whether the workload is running in an I/O-bound or CPU-bound manner. Still, a configuration with more memory dedicated to the buffer pool tends to run better when using compressed tables than a configuration where memory is highly constrained.
Choosing the Compressed Page Size
The optimal setting of the compressed page size depends on the type and distribution of data that the table and its indexes contain. The compressed page size should always be bigger than the maximum record size, or operations may fail as noted in Compression of B-Tree Pages.
Setting the compressed page size too large wastes some space, but the pages do not have to be compressed as often. If the compressed page size is set too small, inserts or updates may require time-consuming recompression, and the B-tree nodes may have to be split more frequently, leading to bigger data files and less efficient indexing.
Typically, you set the compressed page size to 8K or 4K bytes. Given that the maximum row size for an InnoDB table is around 8K, KEY_BLOCK_SIZE=8
is usually a safe choice.
15.9.1.4 Monitoring InnoDB Table Compression at Runtime
Overall application performance, CPU and I/O utilization and the size of disk files are good indicators of how effective compression is for your application. This section builds on the performance tuning advice from Section 15.9.1.3, “Tuning Compression for InnoDB Tables”, and shows how to find problems that might not turn up during initial testing.
To dig deeper into performance considerations for compressed tables, you can monitor compression performance at runtime using the Information Schema tables described in Example 15.1, “Using the Compression Information Schema Tables”. These tables reflect the internal use of memory and the rates of compression used overall.
The INNODB_CMP
table reports information about compression activity for each compressed page size (KEY_BLOCK_SIZE
) in use. The information in these tables is system-wide: it summarizes the compression statistics across all compressed tables in your database. You can use this data to help decide whether or not to compress a table by examining these tables when no other compressed tables are being accessed. It involves relatively low overhead on the server, so you might query it periodically on a production server to check the overall efficiency of the compression feature.
The INNODB_CMP_PER_INDEX
table reports information about compression activity for individual tables and indexes. This information is more targeted and more useful for evaluating compression efficiency and diagnosing performance issues one table or index at a time. (Because that each InnoDB
table is represented as a clustered index, MySQL does not make a big distinction between tables and indexes in this context.) The INNODB_CMP_PER_INDEX
table does involve substantial overhead, so it is more suitable for development servers, where you can compare the effects of different workloads, data, and compression settings in isolation. To guard against imposing this monitoring overhead by accident, you must enable the innodb_cmp_per_index_enabled
configuration option before you can query the INNODB_CMP_PER_INDEX
table.
The key statistics to consider are the number of, and amount of time spent performing, compression and uncompression operations. Since MySQL splits B-tree nodes when they are too full to contain the compressed data following a modification, compare the number of “successful” compression operations with the number of such operations overall. Based on the information in the INNODB_CMP
and INNODB_CMP_PER_INDEX
tables and overall application performance and hardware resource utilization, you might make changes in your hardware configuration, adjust the size of the buffer pool, choose a different page size, or select a different set of tables to compress.
If the amount of CPU time required for compressing and uncompressing is high, changing to faster or multi-core CPUs can help improve performance with the same data, application workload and set of compressed tables. Increasing the size of the buffer pool might also help performance, so that more uncompressed pages can stay in memory, reducing the need to uncompress pages that exist in memory only in compressed form.
A large number of compression operations overall (compared to the number of INSERT
, UPDATE
and DELETE
operations in your application and the size of the database) could indicate that some of your compressed tables are being updated too heavily for effective compression. If so, choose a larger page size, or be more selective about which tables you compress.
If the number of “successful” compression operations (COMPRESS_OPS_OK
) is a high percentage of the total number of compression operations (COMPRESS_OPS
), then the system is likely performing well. If the ratio is low, then MySQL is reorganizing, recompressing, and splitting B-tree nodes more often than is desirable. In this case, avoid compressing some tables, or increase KEY_BLOCK_SIZE
for some of the compressed tables. You might turn off compression for tables that cause the number of “compression failures” in your application to be more than 1% or 2% of the total. (Such a failure ratio might be acceptable during a temporary operation such as a data load).
15.9.1.5 How Compression Works for InnoDB Tables
This section describes some internal implementation details about compression for InnoDB tables. The information presented here may be helpful in tuning for performance, but is not necessary to know for basic use of compression.
Compression Algorithms
Some operating systems implement compression at the file system level. Files are typically divided into fixed-size blocks that are compressed into variable-size blocks, which easily leads into fragmentation. Every time something inside a block is modified, the whole block is recompressed before it is written to disk. These properties make this compression technique unsuitable for use in an update-intensive database system.
MySQL implements compression with the help of the well-known zlib library, which implements the LZ77 compression algorithm. This compression algorithm is mature, robust, and efficient in both CPU utilization and in reduction of data size. The algorithm is “lossless”, so that the original uncompressed data can always be reconstructed from the compressed form. LZ77 compression works by finding sequences of data that are repeated within the data to be compressed. The patterns of values in your data determine how well it compresses, but typical user data often compresses by 50% or more.
NoteInnoDB
supports the zlib
library up to version 1.2.11, which is the version bundled with MySQL 8.0.
Unlike compression performed by an application, or compression features of some other database management systems, InnoDB compression applies both to user data and to indexes. In many cases, indexes can constitute 40-50% or more of the total database size, so this difference is significant. When compression is working well for a data set, the size of the InnoDB data files (the file-per-table tablespace or general tablespace .ibd
files) is 25% to 50% of the uncompressed size or possibly smaller. Depending on the workload, this smaller database can in turn lead to a reduction in I/O, and an increase in throughput, at a modest cost in terms of increased CPU utilization. You can adjust the balance between compression level and CPU overhead by modifying the innodb_compression_level
configuration option.
InnoDB Data Storage and Compression
All user data in InnoDB tables is stored in pages comprising a B-tree index (the clustered index). In some other database systems, this type of index is called an “index-organized table”. Each row in the index node contains the values of the (user-specified or system-generated) primary key and all the other columns of the table.
Secondary indexes in InnoDB tables are also B-trees, containing pairs of values: the index key and a pointer to a row in the clustered index. The pointer is in fact the value of the primary key of the table, which is used to access the clustered index if columns other than the index key and primary key are required. Secondary index records must always fit on a single B-tree page.
The compression of B-tree nodes (of both clustered and secondary indexes) is handled differently from compression of overflow pages used to store long VARCHAR
, BLOB
, or TEXT
columns, as explained in the following sections.
Compression of B-Tree Pages
Because they are frequently updated, B-tree pages require special treatment. It is important to minimize the number of times B-tree nodes are split, as well as to minimize the need to uncompress and recompress their content.
One technique MySQL uses is to maintain some system information in the B-tree node in uncompressed form, thus facilitating certain in-place updates. For example, this allows rows to be delete-marked and deleted without any compression operation.
In addition, MySQL attempts to avoid unnecessary uncompression and recompression of index pages when they are changed. Within each B-tree page, the system keeps an uncompressed “modification log” to record changes made to the page. Updates and inserts of small records may be written to this modification log without requiring the entire page to be completely reconstructed.
When the space for the modification log runs out, InnoDB uncompresses the page, applies the changes and recompresses the page. If recompression fails (a situation known as a compression failure), the B-tree nodes are split and the process is repeated until the update or insert succeeds.
To avoid frequent compression failures in write-intensive workloads, such as for OLTP applications, MySQL sometimes reserves some empty space (padding) in the page, so that the modification log fills up sooner and the page is recompressed while there is still enough room to avoid splitting it. The amount of padding space left in each page varies as the system keeps track of the frequency of page splits. On a busy server doing frequent writes to compressed tables, you can adjust the innodb_compression_failure_threshold_pct
, and innodb_compression_pad_pct_max
configuration options to fine-tune this mechanism.
Generally, MySQL requires that each B-tree page in an InnoDB table can accommodate at least two records. For compressed tables, this requirement has been relaxed. Leaf pages of B-tree nodes (whether of the primary key or secondary indexes) only need to accommodate one record, but that record must fit, in uncompressed form, in the per-page modification log. If innodb_strict_mode
is ON
, MySQL checks the maximum row size during CREATE TABLE
or CREATE INDEX
. If the row does not fit, the following error message is issued: ERROR HY000: Too big row
.
If you create a table when innodb_strict_mode
is OFF, and a subsequent INSERT
or UPDATE
statement attempts to create an index entry that does not fit in the size of the compressed page, the operation fails with ERROR 42000: Row size too large
. (This error message does not name the index for which the record is too large, or mention the length of the index record or the maximum record size on that particular index page.) To solve this problem, rebuild the table with ALTER TABLE
and select a larger compressed page size (KEY_BLOCK_SIZE
), shorten any column prefix indexes, or disable compression entirely with ROW_FORMAT=DYNAMIC
or ROW_FORMAT=COMPACT
.
innodb_strict_mode
is not applicable to general tablespaces, which also support compressed tables. Tablespace management rules for general tablespaces are strictly enforced independently of innodb_strict_mode
. For more information, see Section 13.1.21, “CREATE TABLESPACE Statement”.
Compressing BLOB, VARCHAR, and TEXT Columns
In an InnoDB table, BLOB
, VARCHAR
, and TEXT
columns that are not part of the primary key may be stored on separately allocated overflow pages. We refer to these columns as off-page columns. Their values are stored on singly-linked lists of overflow pages.
For tables created in ROW_FORMAT=DYNAMIC
or ROW_FORMAT=COMPRESSED
, the values of BLOB
, TEXT
, or VARCHAR
columns may be stored fully off-page, depending on their length and the length of the entire row. For columns that are stored off-page, the clustered index record only contains 20-byte pointers to the overflow pages, one per column. Whether any columns are stored off-page depends on the page size and the total size of the row. When the row is too long to fit entirely within the page of the clustered index, MySQL chooses the longest columns for off-page storage until the row fits on the clustered index page. As noted above, if a row does not fit by itself on a compressed page, an error occurs.
For tables created in ROW_FORMAT=DYNAMIC
or ROW_FORMAT=COMPRESSED
, TEXT
and BLOB
columns that are less than or equal to 40 bytes are always stored in-line.
Tables that use ROW_FORMAT=REDUNDANT
and ROW_FORMAT=COMPACT
store the first 768 bytes of BLOB
, VARCHAR
, and TEXT
columns in the clustered index record along with the primary key. The 768-byte prefix is followed by a 20-byte pointer to the overflow pages that contain the rest of the column value.
When a table is in COMPRESSED
format, all data written to overflow pages is compressed “as is”; that is, MySQL applies the zlib compression algorithm to the entire data item. Other than the data, compressed overflow pages contain an uncompressed header and trailer comprising a page checksum and a link to the next overflow page, among other things. Therefore, very significant storage savings can be obtained for longer BLOB
, TEXT
, or VARCHAR
columns if the data is highly compressible, as is often the case with text data. Image data, such as JPEG
, is typically already compressed and so does not benefit much from being stored in a compressed table; the double compression can waste CPU cycles for little or no space savings.
The overflow pages are of the same size as other pages. A row containing ten columns stored off-page occupies ten overflow pages, even if the total length of the columns is only 8K bytes. In an uncompressed table, ten uncompressed overflow pages occupy 160K bytes. In a compressed table with an 8K page size, they occupy only 80K bytes. Thus, it is often more efficient to use compressed table format for tables with long column values.
For file-per-table tablespaces, using a 16K compressed page size can reduce storage and I/O costs for BLOB
, VARCHAR
, or TEXT
columns, because such data often compress well, and might therefore require fewer overflow pages, even though the B-tree nodes themselves take as many pages as in the uncompressed form. General tablespaces do not support a 16K compressed page size (KEY_BLOCK_SIZE
). For more information, see Section 15.6.3.3, “General Tablespaces”.
Compression and the InnoDB Buffer Pool
In a compressed InnoDB
table, every compressed page (whether 1K, 2K, 4K or 8K) corresponds to an uncompressed page of 16K bytes (or a smaller size if innodb_page_size
is set). To access the data in a page, MySQL reads the compressed page from disk if it is not already in the buffer pool, then uncompresses the page to its original form. This section describes how InnoDB
manages the buffer pool with respect to pages of compressed tables.
To minimize I/O and to reduce the need to uncompress a page, at times the buffer pool contains both the compressed and uncompressed form of a database page. To make room for other required database pages, MySQL can evict from the buffer pool an uncompressed page, while leaving the compressed page in memory. Or, if a page has not been accessed in a while, the compressed form of the page might be written to disk, to free space for other data. Thus, at any given time, the buffer pool might contain both the compressed and uncompressed forms of the page, or only the compressed form of the page, or neither.
MySQL keeps track of which pages to keep in memory and which to evict using a least-recently-used (LRU) list, so that hot (frequently accessed) data tends to stay in memory. When compressed tables are accessed, MySQL uses an adaptive LRU algorithm to achieve an appropriate balance of compressed and uncompressed pages in memory. This adaptive algorithm is sensitive to whether the system is running in an I/O-bound or CPU-bound manner. The goal is to avoid spending too much processing time uncompressing pages when the CPU is busy, and to avoid doing excess I/O when the CPU has spare cycles that can be used for uncompressing compressed pages (that may already be in memory). When the system is I/O-bound, the algorithm prefers to evict the uncompressed copy of a page rather than both copies, to make more room for other disk pages to become memory resident. When the system is CPU-bound, MySQL prefers to evict both the compressed and uncompressed page, so that more memory can be used for “hot” pages and reducing the need to uncompress data in memory only in compressed form.
Compression and the InnoDB Redo Log Files
Before a compressed page is written to a data file, MySQL writes a copy of the page to the redo log (if it has been recompressed since the last time it was written to the database). This is done to ensure that redo logs are usable for crash recovery, even in the unlikely case that the zlib
library is upgraded and that change introduces a compatibility problem with the compressed data. Therefore, some increase in the size of log files, or a need for more frequent checkpoints, can be expected when using compression. The amount of increase in the log file size or checkpoint frequency depends on the number of times compressed pages are modified in a way that requires reorganization and recompression.
To create a compressed table in a file-per-table tablespace, innodb_file_per_table
must be enabled. There is no dependence on the innodb_file_per_table
setting when creating a compressed table in a general tablespace. For more information, see Section 15.6.3.3, “General Tablespaces”.
15.9.1.6 Compression for OLTP Workloads
Traditionally, the InnoDB
compression feature was recommended primarily for read-only or read-mostly workloads, such as in a data warehouse configuration. The rise of SSD storage devices, which are fast but relatively small and expensive, makes compression attractive also for OLTP
workloads: high-traffic, interactive websites can reduce their storage requirements and their I/O operations per second (IOPS) by using compressed tables with applications that do frequent INSERT
, UPDATE
, and DELETE
operations.
These configuration options let you adjust the way compression works for a particular MySQL instance, with an emphasis on performance and scalability for write-intensive operations:
-
innodb_compression_level
lets you turn the degree of compression up or down. A higher value lets you fit more data onto a storage device, at the expense of more CPU overhead during compression. A lower value lets you reduce CPU overhead when storage space is not critical, or you expect the data is not especially compressible. -
innodb_compression_failure_threshold_pct
specifies a cutoff point for compression failures during updates to a compressed table. When this threshold is passed, MySQL begins to leave additional free space within each new compressed page, dynamically adjusting the amount of free space up to the percentage of page size specified byinnodb_compression_pad_pct_max
-
innodb_compression_pad_pct_max
lets you adjust the maximum amount of space reserved within each page to record changes to compressed rows, without needing to compress the entire page again. The higher the value, the more changes can be recorded without recompressing the page. MySQL uses a variable amount of free space for the pages within each compressed table, only when a designated percentage of compression operations “fail” at runtime, requiring an expensive operation to split the compressed page. -
innodb_log_compressed_pages
lets you disable writing of images of re-compressed pages to the redo log. Re-compression may occur when changes are made to compressed data. This option is enabled by default to prevent corruption that could occur if a different version of thezlib
compression algorithm is used during recovery. If you are certain that thezlib
version will not change, disableinnodb_log_compressed_pages
to reduce redo log generation for workloads that modify compressed data.
Because working with compressed data sometimes involves keeping both compressed and uncompressed versions of a page in memory at the same time, when using compression with an OLTP-style workload, be prepared to increase the value of the innodb_buffer_pool_size
configuration option.
15.9.1.7 SQL Compression Syntax Warnings and Errors
This section describes syntax warnings and errors that you may encounter when using the table compression feature with file-per-table tablespaces and general tablespaces.
SQL Compression Syntax Warnings and Errors for File-Per-Table Tablespaces
When innodb_strict_mode
is enabled (the default), specifying ROW_FORMAT=COMPRESSED
or KEY_BLOCK_SIZE
in CREATE TABLE
or ALTER TABLE
statements produces the following error if innodb_file_per_table
is disabled.
ERROR 1031 (HY000): Table storage engine for 't1' doesn't have this optionNote
The table is not created if the current configuration does not permit using compressed tables.
When innodb_strict_mode
is disabled, specifying ROW_FORMAT=COMPRESSED
or KEY_BLOCK_SIZE
in CREATE TABLE
or ALTER TABLE
statements produces the following warnings if innodb_file_per_table
is disabled.
mysql> SHOW WARNINGS;
+---------+------+---------------------------------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------------------------------+
| Warning | 1478 | InnoDB: KEY_BLOCK_SIZE requires innodb_file_per_table. |
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4. |
| Warning | 1478 | InnoDB: ROW_FORMAT=COMPRESSED requires innodb_file_per_table. |
| Warning | 1478 | InnoDB: assuming ROW_FORMAT=DYNAMIC. |
+---------+------+---------------------------------------------------------------+
Note
These messages are only warnings, not errors, and the table is created without compression, as if the options were not specified.
The “non-strict” behavior lets you import a mysqldump
file into a database that does not support compressed tables, even if the source database contained compressed tables. In that case, MySQL creates the table in ROW_FORMAT=DYNAMIC
instead of preventing the operation.
To import the dump file into a new database, and have the tables re-created as they exist in the original database, ensure the server has the proper setting for the innodb_file_per_table
configuration parameter.
The attribute KEY_BLOCK_SIZE
is permitted only when ROW_FORMAT
is specified as COMPRESSED
or is omitted. Specifying a KEY_BLOCK_SIZE
with any other ROW_FORMAT
generates a warning that you can view with SHOW WARNINGS
. However, the table is non-compressed; the specified KEY_BLOCK_SIZE
is ignored).
Level | Code | Message |
---|---|---|
Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE= |
If you are running with innodb_strict_mode
enabled, the combination of a KEY_BLOCK_SIZE
with any ROW_FORMAT
other than COMPRESSED
generates an error, not a warning, and the table is not created.
Table 15.12, “ROW_FORMAT and KEY_BLOCK_SIZE Options” provides an overview the ROW_FORMAT
and KEY_BLOCK_SIZE
options that are used with CREATE TABLE
or ALTER TABLE
.
Table 15.12 ROW_FORMAT and KEY_BLOCK_SIZE Options
Option | Usage Notes | Description |
---|---|---|
ROW_FORMAT=REDUNDANT |
Storage format used prior to MySQL 5.0.3 | Less efficient than ROW_FORMAT=COMPACT ; for backward compatibility |
ROW_FORMAT=COMPACT |
Default storage format since MySQL 5.0.3 | Stores a prefix of 768 bytes of long column values in the clustered index page, with the remaining bytes stored in an overflow page |
ROW_FORMAT=DYNAMIC |
Store values within the clustered index page if they fit; if not, stores only a 20-byte pointer to an overflow page (no prefix) | |
ROW_FORMAT=COMPRESSED |
Compresses the table and indexes using zlib | |
KEY_BLOCK_SIZE= |
Specifies compressed page size of 1, 2, 4, 8 or 16 kilobytes; implies ROW_FORMAT=COMPRESSED . For general tablespaces, a KEY_BLOCK_SIZE value equal to the InnoDB page size is not permitted. |
Table 15.13,
“CREATE/ALTER TABLE Warnings and Errors when InnoDB Strict Mode is OFF”
summarizes error conditions that occur with certain combinations of
configuration parameters and options on the CREATE
TABLE
or ALTER
TABLE
statements, and how the options appear in the output of SHOW TABLE STATUS
.
When innodb_strict_mode
is OFF
,
MySQL creates or alters the table, but ignores certain settings as shown below.
You can see the warning messages in the MySQL error log. When innodb_strict_mode
is ON
,
these specified combinations of options generate errors, and the table is not
created or altered. To see the full description of the error condition, issue
the SHOW ERRORS
statement: example:
mysql>CREATE TABLE x (id INT PRIMARY KEY, c INT)
->ENGINE=INNODB KEY_BLOCK_SIZE=33333;
ERROR 1005 (HY000): Can't create table 'test.x' (errno: 1478) mysql>SHOW ERRORS;
+-------+------+-------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------+ | Error | 1478 | InnoDB: invalid KEY_BLOCK_SIZE=33333. | | Error | 1005 | Can't create table 'test.x' (errno: 1478) | +-------+------+-------------------------------------------+
Table 15.13 CREATE/ALTER TABLE Warnings and Errors when InnoDB Strict Mode is OFF
Syntax | Warning or Error Condition | Resulting ROW_FORMAT , as shown in SHOW TABLE STATUS
|
---|---|---|
ROW_FORMAT=REDUNDANT |
None | REDUNDANT |
ROW_FORMAT=COMPACT |
None | COMPACT |
ROW_FORMAT=COMPRESSED or ROW_FORMAT=DYNAMIC or KEY_BLOCK_SIZE is specified |
Ignored for file-per-table tablespaces unless innodb_file_per_table is enabled. General tablespaces support all row formats. See Section 15.6.3.3, “General Tablespaces”. |
the default row format for file-per-table tablespaces; the specified row format for general tablespaces |
Invalid KEY_BLOCK_SIZE is specified (not 1, 2, 4, 8 or 16) |
KEY_BLOCK_SIZE is ignored |
the specified row format, or the default row format |
ROW_FORMAT=COMPRESSED and valid KEY_BLOCK_SIZE are specified |
None; KEY_BLOCK_SIZE specified is used |
COMPRESSED |
KEY_BLOCK_SIZE is specified with REDUNDANT , COMPACT or DYNAMIC row format |
KEY_BLOCK_SIZE is ignored |
REDUNDANT , COMPACT or DYNAMIC
|
ROW_FORMAT is not one of REDUNDANT , COMPACT , DYNAMIC or COMPRESSED
|
Ignored if recognized by the MySQL parser. Otherwise, an error is issued. | the default row format or N/A |
When innodb_strict_mode
is ON
, MySQL rejects invalid ROW_FORMAT
or KEY_BLOCK_SIZE
parameters and issues errors. Strict mode is ON
by
default. When innodb_strict_mode
is OFF
, MySQL issues warnings instead of errors for ignored
invalid parameters.
It is not possible to see the chosen KEY_BLOCK_SIZE
using SHOW TABLE
STATUS
. The statement SHOW CREATE TABLE
displays the KEY_BLOCK_SIZE
(even if it was ignored
when creating the table). The real compressed page size of the table cannot be
displayed by MySQL.
SQL Compression Syntax Warnings and Errors for General Tablespaces
-
If
FILE_BLOCK_SIZE
was not defined for the general tablespace when the tablespace was created, the tablespace cannot contain compressed tables. If you attempt to add a compressed table, an error is returned, as shown in the following example:mysql>
CREATE TABLESPACE `ts1` ADD DATAFILE 'ts1.ibd' Engine=InnoDB;
mysql>CREATE TABLE t1 (c1 INT PRIMARY KEY) TABLESPACE ts1 ROW_FORMAT=COMPRESSED
KEY_BLOCK_SIZE=8;
ERROR 1478 (HY000): InnoDB: Tablespace `ts1` cannot contain a COMPRESSED table -
Attempting to add a table with an invalid
KEY_BLOCK_SIZE
to a general tablespace returns an error, as shown in the following example:mysql>
CREATE TABLESPACE `ts2` ADD DATAFILE 'ts2.ibd' FILE_BLOCK_SIZE = 8192 Engine=InnoDB;
mysql>CREATE TABLE t2 (c1 INT PRIMARY KEY) TABLESPACE ts2 ROW_FORMAT=COMPRESSED
KEY_BLOCK_SIZE=4;
ERROR 1478 (HY000): InnoDB: Tablespace `ts2` uses block size 8192 and cannot contain a table with physical page size 4096For general tablespaces, the
KEY_BLOCK_SIZE
of the table must be equal to theFILE_BLOCK_SIZE
of the tablespace divided by 1024. For example, if theFILE_BLOCK_SIZE
of the tablespace is 8192, theKEY_BLOCK_SIZE
of the table must be 8. -
Attempting to add a table with an uncompressed row format to a general tablespace configured to store compressed tables returns an error, as shown in the following example:
mysql>
CREATE TABLESPACE `ts3` ADD DATAFILE 'ts3.ibd' FILE_BLOCK_SIZE = 8192 Engine=InnoDB;
mysql>CREATE TABLE t3 (c1 INT PRIMARY KEY) TABLESPACE ts3 ROW_FORMAT=COMPACT;
ERROR 1478 (HY000): InnoDB: Tablespace `ts3` uses block size 8192 and cannot contain a table with physical page size 16384
innodb_strict_mode
is not applicable to general tablespaces. Tablespace management rules for general tablespaces are strictly enforced independently of innodb_strict_mode
. For more information, see Section 13.1.21, “CREATE TABLESPACE Statement”.
For more information about using compressed tables with general tablespaces, see Section 15.6.3.3, “General Tablespaces”.
15.9.2 InnoDB Page Compression
InnoDB
supports page-level compression for tables that reside in file-per-table tablespaces. This feature is referred to as Transparent Page Compression. Page compression is enabled by specifying the COMPRESSION
attribute with CREATE TABLE
or ALTER TABLE
. Supported compression algorithms include Zlib
and LZ4
.
Supported Platforms
Page compression requires sparse file and hole punching support. Page compression is supported on Windows with NTFS, and on the following subset of MySQL-supported Linux platforms where the kernel level provides hole punching support:
-
RHEL 7 and derived distributions that use kernel version 3.10.0-123 or higher
-
OEL 5.10 (UEK2) kernel version 2.6.39 or higher
-
OEL 6.5 (UEK3) kernel version 3.8.13 or higher
-
OEL 7.0 kernel version 3.8.13 or higher
-
SLE11 kernel version 3.0-x
-
SLE12 kernel version 3.12-x
-
OES11 kernel version 3.0-x
-
Ubuntu 14.0.4 LTS kernel version 3.13 or higher
-
Ubuntu 12.0.4 LTS kernel version 3.2 or higher
-
Debian 7 kernel version 3.2 or higher
All of the available file systems for a given Linux distribution may not support hole punching.
How Page Compression Works
When a page is written, it is compressed using the specified compression algorithm. The compressed data is written to disk, where the hole punching mechanism releases empty blocks from the end of the page. If compression fails, data is written out as-is.
Hole Punch Size on Linux
On Linux systems, the file system block size is the unit size used for hole punching. Therefore, page compression only works if page data can be compressed to a size that is less than or equal to the InnoDB
page size minus the file system block size. For example, if innodb_page_size=16K
and the file system block size is 4K, page data must compress to less than or equal to 12K to make hole punching possible.
Hole Punch Size on Windows
On Windows systems, the underlying infrastructure for sparse files is based on NTFS compression. Hole punching size is the NTFS compression unit, which is 16 times the NTFS cluster size. Cluster sizes and their compression units are shown in the following table:
Table 15.14 Windows NTFS Cluster Size and Compression Units
Cluster Size | Compression Unit |
---|---|
512 Bytes | 8 KB |
1 KB | 16 KB |
2 KB | 32 KB |
4 KB | 64 KB |
Page compression on Windows systems only works if page data can be compressed
to a size that is less than or equal to the InnoDB
page size minus the compression unit size.
The default NTFS cluster size is 4KB, for which the compression unit size is
64KB. This means that page compression has no benefit for an out-of-the box
Windows NTFS configuration, as the maximum innodb_page_size
is also 64KB.
For page compression to work on Windows, the file system must be created with
a cluster size smaller than 4K, and the innodb_page_size
must be at least twice the size of the
compression unit. For example, for page compression to work on Windows, you
could build the file system with a cluster size of 512 Bytes (which has a
compression unit of 8KB) and initialize InnoDB
with
an innodb_page_size
value of 16K or greater.
Enabling Page Compression
To enable page compression, specify the COMPRESSION
attribute in the CREATE
TABLE
statement. For example:
CREATE TABLE t1 (c1 INT) COMPRESSION="zlib";
You can also enable page compression in an ALTER TABLE
statement. However, ALTER TABLE ... COMPRESSION
only updates the tablespace compression attribute. Writes to the tablespace that occur after setting the new compression algorithm use the new setting, but to apply the new compression algorithm to existing pages, you must rebuild the table using OPTIMIZE TABLE
.
ALTER TABLE t1 COMPRESSION="zlib"; OPTIMIZE TABLE t1;
Disabling Page Compression
To disable page compression, set COMPRESSION=None
using ALTER TABLE
. Writes to the tablespace that occur after setting COMPRESSION=None
no longer use page compression. To uncompress existing pages, you must rebuild the table using OPTIMIZE TABLE
after setting COMPRESSION=None
.
ALTER TABLE t1 COMPRESSION="None"; OPTIMIZE TABLE t1;
Page Compression Metadata
Page compression metadata is found in the INFORMATION_SCHEMA.INNODB_TABLESPACES
table, in the following columns:
-
FS_BLOCK_SIZE
: The file system block size, which is the unit size used for hole punching. -
FILE_SIZE
: The apparent size of the file, which represents the maximum size of the file, uncompressed. -
ALLOCATED_SIZE
: The actual size of the file, which is the amount of space allocated on disk.
On Unix-like systems, ls -l
shows the apparent file size (equivalent to tablespace_name.ibd
FILE_SIZE
) in bytes. To view the actual amount of space allocated on disk (equivalent to ALLOCATED_SIZE
), use du --block-size=1
. The tablespace_name.ibd
--block-size=1
option prints the allocated space in bytes instead of blocks, so that it can be compared to ls -l
output.
Use SHOW CREATE TABLE
to view the current page compression setting (Zlib
, Lz4
, or None
). A table may contain a mix of pages with different compression settings.
In the following example, page compression metadata for the employees table is retrieved from the INFORMATION_SCHEMA.INNODB_TABLESPACES
table.
# Create the employees table with Zlib page compression CREATE TABLE employees ( emp_no INT NOT NULL, birth_date DATE NOT NULL, first_name VARCHAR(14) NOT NULL, last_name VARCHAR(16) NOT NULL, gender ENUM ('M','F') NOT NULL, hire_date DATE NOT NULL, PRIMARY KEY (emp_no) ) COMPRESSION="zlib"; # Insert data (not shown) # Query page compression metadata in INFORMATION_SCHEMA.INNODB_TABLESPACES mysql>SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM
INFORMATION_SCHEMA.INNODB_TABLESPACES WHERE NAME='employees/employees'\G
*************************** 1. row *************************** SPACE: 45 NAME: employees/employees FS_BLOCK_SIZE: 4096 FILE_SIZE: 23068672 ALLOCATED_SIZE: 19415040
Page compression metadata for the employees table shows that the apparent file size is 23068672 bytes while the actual file size (with page compression) is 19415040 bytes. The file system block size is 4096 bytes, which is the block size used for hole punching.
Identifying Tables Using Page Compression
To identify tables for which page compression is enabled, you can query the INFORMATION_SCHEMA.TABLES
CREATE_OPTIONS
column for tables defined with the COMPRESSION
attribute:
mysql>SELECT TABLE_NAME, TABLE_SCHEMA, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES
WHERE CREATE_OPTIONS LIKE '%COMPRESSION=%';
+------------+--------------+--------------------+ | TABLE_NAME | TABLE_SCHEMA | CREATE_OPTIONS | +------------+--------------+--------------------+ | employees | test | COMPRESSION="zlib" | +------------+--------------+--------------------+
SHOW CREATE TABLE
also shows the COMPRESSION
attribute, if used.
Page Compression Limitations and Usage Notes
-
Page compression is disabled if the file system block size (or compression unit size on Windows) * 2 >
innodb_page_size
. -
Page compression is not supported for tables that reside in shared tablespaces, which include the system tablespace, temporary tablespaces, and general tablespaces.
-
Page compression is not supported for undo log tablespaces.
-
Page compression is not supported for redo log pages.
-
R-tree pages, which are used for spatial indexes, are not compressed.
-
Pages that belong to compressed tables (
ROW_FORMAT=COMPRESSED
) are left as-is. -
During recovery, updated pages are written out in an uncompressed form.
-
Loading a page-compressed tablespace on a server that does not support the compression algorithm that was used causes an I/O error.
-
Before downgrading to an earlier version of MySQL that does not support page compression, uncompress the tables that use the page compression feature. To uncompress a table, run
ALTER TABLE ... COMPRESSION=None
andOPTIMIZE TABLE
. -
Page-compressed tablespaces can be copied between Linux and Windows servers if the compression algorithm that was used is available on both servers.
-
Preserving page compression when moving a page-compressed tablespace file from one host to another requires a utility that preserves sparse files.
-
Better page compression may be achieved on Fusion-io hardware with NVMFS than on other platforms, as NVMFS is designed to take advantage of punch hole functionality.
-
Using the page compression feature with a large
InnoDB
page size and relatively small file system block size could result in write amplification. For example, a maximumInnoDB
page size of 64KB with a 4KB file system block size may improve compression but may also increase demand on the buffer pool, leading to increased I/O and potential write amplification.