UUID performance in MySQL?

2021-11-20 00:00:00 performance sequence uuid mysql innodb

We're considering using UUID values as primary keys for our MySQL database. The data is generated by dozens, hundreds, or even thousands of remote computers and inserted at a rate of 100-40,000 inserts per second, and we'll never do any updates.

The database itself will typically get to around 50M records before we start to cull data, so not a massive database, but not tiny either. We're also planning to run on InnoDB, though we are open to changing that if there is a better engine for what we're doing.

We were ready to go with Java's Type 4 UUID, but in testing we have been seeing some strange behavior. For one, we're storing the values as varchar(36), and I now realize we'd be better off using binary(16), though how much better off I'm not sure.

The bigger question is: how badly does this random data screw up the index when we have 50M records? Would we be better off if we used, for example, a type-1 UUID where the leftmost bits were timestamped? Or maybe we should ditch UUIDs entirely and consider auto_increment primary keys?

I'm looking for general thoughts/tips on the performance of different types of UUIDs when they are stored as an index/primary key in MySQL. Thanks!

Recommended answer

A UUID is a Universally Unique ID. It's the "universally" part that you should be considering here.

Do you really need the IDs to be universally unique? If so, then UUIDs may be your only choice.

I would strongly suggest that if you do use UUIDs, you store them as a number (for example, in a 16-byte binary column) and not as a string. If you have 50M+ records, then the saving in storage space will improve your performance (although I couldn't say by how much).
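As a rough illustration of the "number, not string" suggestion, here is a minimal Java sketch that packs a `java.util.UUID` into the 16 bytes a binary(16) column would hold, and unpacks it again. The class name `UuidBytes` and its two helper methods are hypothetical names made up for this example; only `UUID` and `ByteBuffer` come from the standard library.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper for illustration; not part of any library.
class UuidBytes {

    /** Packs a UUID into 16 big-endian bytes (most significant half first). */
    static byte[] toBytes(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return buf.array();
    }

    /** Rebuilds a UUID from the 16 bytes produced by toBytes. */
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long most = buf.getLong();
        long least = buf.getLong();
        return new UUID(most, least);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID(); // Java's type 4 (random) UUID
        byte[] packed = toBytes(id);
        // The text form is 36 characters; the packed form is 16 bytes.
        System.out.println("text length:   " + id.toString().length());
        System.out.println("packed length: " + packed.length);
        System.out.println("round trip ok: " + fromBytes(packed).equals(id));
    }
}
```

With JDBC, the packed array could then be bound to the key column via `PreparedStatement.setBytes`, assuming the column is declared as binary(16). This only addresses key size; it does not change the random ordering of type 4 values.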

If your IDs do not need to be universally unique, then I don't think you can do much better than just using auto_increment, which guarantees that IDs will be unique within a table (since the value increments on each insert).
