Why are duplicate indexes bad?
- They slow down insert, update, and delete queries (DML queries).
- The storage engine needs to maintain more index statistics.
- More disk space is needed to store the redundant indexes.
- The optimizer phase gets slower because more query plans are examined.
- The buffer pool for InnoDB might be used for useless index data.
- Duplicate indexes are deprecated (as of MySQL 5.6) and will be removed in a future version of MySQL. Version 5.6 issues a note if an index is created that duplicates an existing index, version 5.7 issues a warning.
Percona Toolkit to the rescue
The Percona Toolkit provides a tool called pt-duplicate-key-checker. This tool examines MySQL tables for duplicate or redundant indexes and foreign keys.
Note: The MySQL utilities provided by Oracle include a tool called mysqlindexcheck to identify potentially redundant table indexes. It is similar to Percona’s tool, but it does not detect all cases.
Installing Percona Toolkit
Installing the Percona Toolkit on a Mac using Homebrew is a simple task, but you might need to install a Perl module from the CPAN repository first:
sudo cpan -i DBD::mysql
brew install percona-toolkit
Using Percona Toolkit
Imagine a database called “cms”. To run the duplicate key checker on your localhost, try the following command. (If you do not use a password for your local database, you can omit the --ask-pass
parameter.)
pt-duplicate-key-checker --user root --ask-pass --databases cms
Running the command, you will get something like this:
# ########################################################################
# cms.user_organization
# ########################################################################
# user_id is a left-prefix of PRIMARY
# Key definitions:
# KEY `user_id` (`user_id`),
# PRIMARY KEY (`user_id`,`organization_id`),
# Column types:
# `user_id` int(10) unsigned not null
# `organization_id` int(10) unsigned not null
# To remove this duplicate index, execute:
ALTER TABLE `cms`.`user_organization` DROP INDEX `user_id`;
# ########################################################################
# Summary of indexes
# ########################################################################
# Size Duplicate Indexes 59044
# Total Duplicate Indexes 1
# Total Indexes 543
If a table has a multiple-column index, any leftmost prefix of the index can be used by the MySQL optimizer to find rows. Because the column user_id
is a leftmost prefix of the primary key in our example, we do not need to add a separate index and we can drop the duplicate index.
Basically, if you have a three-column index on (A, B, C), you have indexed search capabilities on (A), (A, B), and (A, B, C) covered.
If you are not familiar with multiple-column indexes, you should have a look at the chapter “Multiple-Column Indexes” in the MySQL reference manual.
Do manual checks
Always do a manual check before removing indexes. For example, you might have set the index to the wrong column, but actually need the index:
# ########################################################################
# energy.performancedata
# ########################################################################
# speed is a duplicate of height
# Key definitions:
# KEY `speed` (`height`),
# KEY `height` (`height`),
# Column types:
# `height` float default null
# To remove this duplicate index, execute:
ALTER TABLE `energy`.`performancedata` DROP INDEX `speed`;
In the example above, there are two indexes set to the column height
, but the name of the index speed
already suggests that it should be an index over the column speed
. So do not blindlessly remove indexes as suggested by the toolkit, but always do a manual check first!