Spring Data save vs saveAll performance

I'm trying to understand why saveAll performs better than save in Spring Data repositories. I'm using CrudRepository, which can be seen here.

To test this, I created 10k entities, each with just an id and a random string (for the benchmark I kept the string constant), and added them to a list. Iterating over the list and calling .save on each element took 40 seconds. Calling .saveAll on the same list completed in 2 seconds. Calling .saveAll with 30k elements took 4 seconds. I truncated the table before each test. Even batching the .saveAll calls into sublists of 50 took 10 seconds with 30k elements.

A single .saveAll call with the entire list seems to be the fastest.

I tried browsing the Spring Data source code, but this is the only thing of value I found. Here it seems .saveAll simply iterates over the entire Iterable and calls .save on each element, just as I was doing. So how is it that much faster? Is it doing some transactional batching internally?

Recommended answer

Without seeing your code I have to guess, but I believe it comes down to the overhead of creating a new transaction for each object saved with save, versus opening a single transaction with saveAll.

Notice the definitions of save and saveAll: both are annotated with @Transactional. If your project is configured properly, which seems to be the case since entities are being saved to the database, a transaction is created whenever one of these methods is called. If you call save in a loop, a new transaction is created on every call; with saveAll there is one call and therefore one transaction, regardless of the number of entities being saved.
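To make the counting argument concrete, here is a minimal sketch in plain Java. It is a toy model, not Spring Data's implementation: ToyRepository, insert and transactionsOpened are hypothetical names invented for illustration, and the "transaction" is just a counter standing in for the begin/commit round trip a real @Transactional method would trigger when no transaction is already active.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Spring Data's code.
// Each public method stands in for an @Transactional repository method
// that is called while no surrounding transaction is active.
class ToyRepository {
    int transactionsOpened = 0;                  // begin/commit cycles so far
    final List<String> rows = new ArrayList<>(); // stand-in for the table

    // Plain insert with no transaction handling of its own.
    private void insert(String entity) {
        rows.add(entity);
    }

    // One transaction per call, so N calls cost N transactions.
    void save(String entity) {
        transactionsOpened++;
        insert(entity);
    }

    // One transaction around the whole loop, however long the list is.
    void saveAll(List<String> entities) {
        transactionsOpened++;
        for (String entity : entities) {
            insert(entity);
        }
    }
}

class TransactionCountDemo {
    public static void main(String[] args) {
        List<String> entities = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            entities.add("row-" + i);
        }

        ToyRepository looped = new ToyRepository();
        for (String entity : entities) {
            looped.save(entity);
        }

        ToyRepository bulk = new ToyRepository();
        bulk.saveAll(entities);

        System.out.println("save in a loop opened " + looped.transactionsOpened + " transactions");
        System.out.println("saveAll opened " + bulk.transactionsOpened + " transaction(s)");
    }
}
```

Both repositories end up with the same rows; only the number of begin/commit cycles differs, which is the overhead I suspect you are measuring.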

I'm assuming the test itself is not run within a transaction. If it were, all calls to save would run within that transaction, because the default transaction propagation is Propagation.REQUIRED: if a transaction is already open, the calls run within it. If you're planning to use Spring Data, I strongly recommend reading about transaction management in Spring.
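The join-if-open behaviour can be sketched the same way. Again this is a toy model under stated assumptions, not Spring's transaction manager: ToyTransactionManager, required and JoiningRepository are hypothetical names, and a boolean flag plays the role of "a transaction is already open on this thread".

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of REQUIRED-style propagation; NOT Spring's transaction manager.
class ToyTransactionManager {
    private boolean active = false; // is a transaction currently open?
    int commits = 0;                // how many transactions were committed

    // Join the open transaction if there is one; otherwise open a new one,
    // run the work, and commit it.
    void required(Runnable work) {
        if (active) {
            work.run();             // joined: no extra begin/commit
            return;
        }
        active = true;
        try {
            work.run();
            commits++;              // counted only when the work succeeded
        } finally {
            active = false;
        }
    }
}

// Hypothetical repository whose save always asks for REQUIRED semantics.
class JoiningRepository {
    private final ToyTransactionManager tx;
    final List<String> rows = new ArrayList<>();

    JoiningRepository(ToyTransactionManager tx) {
        this.tx = tx;
    }

    void save(String entity) {
        tx.required(() -> rows.add(entity));
    }
}

class PropagationDemo {
    public static void main(String[] args) {
        // Case 1: the caller has no transaction, so every save commits alone.
        ToyTransactionManager standalone = new ToyTransactionManager();
        JoiningRepository repoA = new JoiningRepository(standalone);
        for (int i = 0; i < 1_000; i++) {
            repoA.save("row-" + i);
        }

        // Case 2: the caller opens a transaction first, so every save joins it.
        ToyTransactionManager wrapped = new ToyTransactionManager();
        JoiningRepository repoB = new JoiningRepository(wrapped);
        wrapped.required(() -> {
            for (int i = 0; i < 1_000; i++) {
                repoB.save("row-" + i);
            }
        });

        System.out.println("no outer transaction: " + standalone.commits + " commits");
        System.out.println("outer transaction:    " + wrapped.commits + " commit(s)");
    }
}
```

In a real Spring application the outer transaction would come from something like an @Transactional method on the caller or a TransactionTemplate rather than a hand-rolled flag. If wrapping your save loop that way brings its timing close to saveAll, that would support the guess above.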
