Create an index on a huge MySQL production table without table locking

I need to create an index on a MySQL table with ~5M rows. It is a production table, and I fear everything will block completely if I run a CREATE INDEX statement...

Is there a way to create that index without blocking inserts and selects?

I'm just wondering whether I can avoid having to stop my system, create the index, and restart it!

Recommended answer

[2017] Update: MySQL 5.6 has support for online index updates

https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html#online-ddl-index-syntax-notes

In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.
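
As a minimal sketch of what this looks like in practice, the statement below explicitly asks InnoDB for an in-place, non-locking index build. The `ALGORITHM=INPLACE, LOCK=NONE` clauses are available in MySQL 5.6 and later; with them, the server should reject the statement with an error instead of quietly falling back to a blocking table copy. The helper name `online_index_sql` and the index name `idx_users_email` are made up for illustration, and the database and table names are borrowed from the test script further down.

```shell
#!/usr/bin/env bash
# Sketch: build an ALTER TABLE statement that requests an online index build.
# online_index_sql is a hypothetical helper, not part of any MySQL tooling.
online_index_sql() {
  local table="$1" index="$2" columns="$3"
  printf 'ALTER TABLE %s ADD INDEX %s (%s), ALGORITHM=INPLACE, LOCK=NONE;\n' \
    "$table" "$index" "$columns"
}

# Print the statement; pipe it to the mysql client to actually run it, e.g.
#   online_index_sql users idx_users_email email | mysql -uroot website_development
online_index_sql users idx_users_email email
```

Stating the lock level explicitly means a version or storage engine that cannot honor it fails fast rather than stalling writers.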

[2015] Updating table indices blocks writes in MySQL 5.5

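From the answer above:

"If you are using a version greater than 5.1, indices are created while the database is online. So not to worry, you won't interrupt production system use."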

This is FALSE (at least for MyISAM / InnoDB tables, which is what 99.999% of people out there use. Clustered Edition is different.)

Doing UPDATE operations on a table will BLOCK while the index is being created. MySQL is really, really stupid about this (and a few other things).

Test Script:

# Issue 50 timed single-row statements in the background, numbering each result.
(
  for n in {1..50}; do
    # Read test: swap the comment between this line and the UPDATE below.
    #(time mysql -uroot -e 'select  * from website_development.users where id = 41225\G'>/dev/null) 2>&1 | grep real;
    (time mysql -uroot -e 'update website_development.users set bio="" where id = 41225\G'>/dev/null) 2>&1 | grep real;
  done
) | cat -n &
PID=$!
sleep 0.05
# Build the index while the loop above is still issuing statements.
echo "Index Update - START"
mysql -uroot website_development -e 'alter table users add index ddopsonfu (last_name, email, first_name, confirmation_token, current_sign_in_ip);'
echo "Index Update - FINISH"
sleep 0.05
kill $PID
# Drop the index again so the test can be rerun.
time mysql -uroot website_development -e 'drop index ddopsonfu on users;'

My Server (InnoDB):

Server version: 5.5.25a Source distribution

Output (notice how the 6th operation blocks for the ~400ms it takes to finish the index update):

 1  real    0m0.009s
 2  real    0m0.009s
 3  real    0m0.009s
 4  real    0m0.012s
 5  real    0m0.009s
Index Update - START
Index Update - FINISH
 6  real    0m0.388s
 7  real    0m0.009s
 8  real    0m0.009s
 9  real    0m0.009s
10  real    0m0.009s
11  real    0m0.009s

Vs read operations which don't block (swap the line comment in the script):

 1  real    0m0.010s
 2  real    0m0.009s
 3  real    0m0.009s
 4  real    0m0.010s
 5  real    0m0.009s
Index Update - START
 6  real    0m0.010s
 7  real    0m0.010s
 8  real    0m0.011s
 9  real    0m0.010s
...
41  real    0m0.009s
42  real    0m0.010s
43  real    0m0.009s
Index Update - FINISH
44  real    0m0.012s
45  real    0m0.009s
46  real    0m0.009s
47  real    0m0.010s
48  real    0m0.009s

Updating MySQL's Schema without downtime

Thus far, there's only one method I know of to update a MySQL schema and not suffer an availability outage. Circular masters:

  • Master A has your MySQL database running on it.
  • Bring Master B into service and have it replicate writes from Master A (B is a slave of A).
  • Perform the schema update on Master B. It will fall behind during the upgrade.
  • Let Master B catch up. Invariant: your schema change MUST be capable of processing commands replicated from a downversion schema. Indexing changes qualify. Simple column additions usually qualify. Removing a column? Probably not.
  • ATOMICALLY swap all clients from Master A to Master B. If you want to be safe (trust me, you do), you should ensure that the last write to A is replicated to B BEFORE B takes its first write. If you allow concurrent writes to 2+ masters, you had better understand MySQL replication at a DEEP level or you are headed for a world of pain. Extreme pain. Like, do you have a column that is AUTO_INCREMENT? You are screwed (unless you use even numbers on one master and odds on the other). Do NOT trust MySQL replication to "do the right thing". It is NOT smart and will not save you. It's just slightly less safe than copying binary transaction logs from the command line and replaying them by hand. Still, disconnecting all clients from the old master and flipping them to the new master can be done in a matter of seconds, vastly faster than waiting for a multi-hour schema upgrade.
  • Now Master B is your new master. You have the new schema. Life is good. Have a beer; the worst is over.
  • Repeat the process with Master A, upgrading its schema so that it becomes your new secondary master, ready to take over in the event that your primary master (Master B now) loses power or just up and dies on you.
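
The "let Master B catch up" step can be checked mechanically before the swap. The sketch below reads the text of `SHOW SLAVE STATUS\G` and succeeds only when `Seconds_Behind_Master` is exactly 0. The helper name `replica_caught_up` is hypothetical, and the sample status text is made up for the demo. Treat the result as a coarse signal rather than proof: it does not by itself guarantee that the last write to A has reached B, so writes to A should already be stopped when you check it.

```shell
#!/usr/bin/env bash
# Sketch: decide whether Master B has caught up, from SHOW SLAVE STATUS\G output.
# replica_caught_up is a hypothetical helper; it reads the status text on stdin.
replica_caught_up() {
  local lag
  lag="$(awk -F': *' '/Seconds_Behind_Master/ {print $2}')"
  # NULL (replication not running) or a missing value counts as "not caught up".
  [ "$lag" = "0" ]
}

# Demo with made-up status text; against a live Master B you would run:
#   mysql -uroot -e 'SHOW SLAVE STATUS\G' | replica_caught_up && echo "caught up"
printf '  Slave_IO_Running: Yes\n  Seconds_Behind_Master: 0\n' \
  | replica_caught_up && echo "caught up"
```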

An easy way to update a schema this isn't. Workable in a serious production environment? Yes, it is. Please, please, please, if there is an easier way to add an index to a MySQL table without blocking writes, let me know.

Googling led me to this article, which describes a similar technique. Even better, they advise drinking at the same point in the procedure (note that I wrote my answer before reading the article)!

The article I linked above talks about a tool, pt-online-schema-change, that works as follows:

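  • Create a new table with the same structure as the original.
  • Update the schema on the new table.
  • Add a trigger on the original table so that changes are kept in sync with the copy.
  • Copy rows in batches from the original table.
  • Move the original table out of the way and replace it with the new table.
  • Drop the old table.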
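
For orientation, a hedged sketch of how such a run is usually started from the command line. It only assembles and prints the command; `pt_osc_command` is a hypothetical wrapper and `idx_users_email` is a made-up index name, while `--alter`, `--dry-run`, `--execute`, and the `D=...,t=...` DSN are the tool's own options. The database and table names come from the test script earlier in this answer.

```shell
#!/usr/bin/env bash
# Sketch: assemble a pt-online-schema-change invocation for adding an index.
# pt_osc_command is a hypothetical wrapper; mode is "--dry-run" or "--execute".
pt_osc_command() {
  local mode="$1" db="$2" table="$3" alter="$4"
  printf 'pt-online-schema-change %s --alter "%s" D=%s,t=%s\n' \
    "$mode" "$alter" "$db" "$table"
}

# Print the dry-run command first; a dry run creates and alters the new table
# but does not add triggers, copy rows, or swap the tables.
pt_osc_command --dry-run website_development users 'ADD INDEX idx_users_email (email)'
```

Once the dry run looks sane, the same call with `--execute` performs the full sequence listed above.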

I've never tried the tool myself. YMMV

I'm currently using MySQL through Amazon's RDS. It's a really nifty service that wraps up and manages MySQL, letting you add new read replicas with a single button and transparently upgrade the database across hardware SKUs. It's really convenient. You don't get SUPER access to the database, so you can't screw with replication directly (is this a blessing or a curse?). However, you can use Read Replica Promotion to make your schema changes on a read-only slave, then promote that slave to become your new master. It's exactly the same trick as I described above, just vastly easier to execute. They still don't do much to help you with the cut-over: you have to reconfigure and restart your app.
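
If you drive RDS from the AWS CLI rather than the console (an assumption; the text above only mentions the button), the promotion flow might be laid out roughly as below. The sketch just prints the calls in order. `rds_promotion_plan` and both instance identifiers are hypothetical, and how the replica is made writable for the DDL is deliberately left out.

```shell
#!/usr/bin/env bash
# Sketch: print the AWS CLI calls for the replica-promotion approach.
# rds_promotion_plan and the instance identifiers are hypothetical.
rds_promotion_plan() {
  local source="$1" replica="$2"
  echo "aws rds create-db-instance-read-replica --db-instance-identifier $replica --source-db-instance-identifier $source"
  echo "# ...apply the schema change on $replica and let it catch up..."
  echo "aws rds promote-read-replica --db-instance-identifier $replica"
}

rds_promotion_plan prod-db prod-db-replica
```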
