Is mysqli::multi_query more efficient than multiple single queries?

2021-12-25 00:00:00 php mysqli

Someone brought up the MySQLi multi_query function in an answer, claiming that it would be better than looping through 3 separate queries. I tried to Google an answer but didn't find anything that satisfied my curiosity, so I hope you may have better insight into the reasons for using it beyond saving a few lines of code.

So here's what I am wondering:

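  1. What is multi_query doing under the hood?
  2. Does multi_query simply hit the server x number of times and aggregate the results?
  3. Is there a case where single queries may be more efficient than multiple queries?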

I know that hitting the database 3 times for a million items each and smashing the results into one huge object isn't good for memory usage, but there has to be a reason for its existence, and I am also sure there are times when it should be avoided. I hope to understand it better so that I can add it to my bag of tricks for when the need arises.

Thanks for your time!

Recommended answer

  • What is multi_query doing under the hood? - It just sends all the queries to the server at once instead of one at a time, and retrieves all the results in one go. Nothing more complicated than that.

  • Does multi_query simply hit the server x number of times and aggregate the results? - No. It "hits" the server twice: once to send the queries and once to retrieve the results.

  • Is there a case where single queries may be more efficient than multiple queries? - It depends on how you define "efficient". multi_query() is light on the network but heavy on memory; running query() in a loop is the reverse.
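To make the comparison concrete, here is a minimal sketch of the two approaches side by side. The helper names `fetch_all_with_multi_query` and `fetch_all_with_loop` are hypothetical, and the sketch assumes an already-connected `mysqli` object and a PHP build where `mysqli_result::fetch_all()` is available (mysqlnd, or PHP 8.1 and later). It is an illustration of the call pattern, not a benchmark.

```php
<?php
// Hypothetical helper: send every SELECT in one multi_query() call,
// then walk the result sets one after another.
function fetch_all_with_multi_query(mysqli $db, array $queries): array
{
    $sets = [];
    // All statements travel to the server in a single request.
    if (!$db->multi_query(implode(';', $queries))) {
        throw new RuntimeException($db->error);
    }
    do {
        // store_result() buffers the current result set on the client.
        $result = $db->store_result();
        if ($result instanceof mysqli_result) {
            $sets[] = $result->fetch_all(MYSQLI_ASSOC);
            $result->free();
        }
        // Advance to the next result set while the server reports more.
    } while ($db->more_results() && $db->next_result());

    // A failure in a later statement surfaces here, not in multi_query().
    if ($db->errno) {
        throw new RuntimeException($db->error);
    }
    return $sets;
}

// Hypothetical helper: one request per statement.
function fetch_all_with_loop(mysqli $db, array $queries): array
{
    $sets = [];
    foreach ($queries as $sql) {
        $result = $db->query($sql);
        if ($result === false) {
            throw new RuntimeException($db->error);
        }
        $sets[] = $result->fetch_all(MYSQLI_ASSOC);
        $result->free();
    }
    return $sets;
}
```

Both helpers return the same array of result sets. The loop version lets you process and discard each set before issuing the next query, which is where the memory difference can come from.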

For many SELECT statements that return large result sets, the cost in memory consumption is likely to vastly outweigh the gain on the network side, and most of the time you would be better off issuing the queries and processing the result sets one at a time - although this depends on exactly what you are doing with the data. But if you need to run many UPDATE statements, multi_query() is likely to be the better choice, since the return values are just success/fail and memory consumption will be light.
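A sketch of the UPDATE case, under the assumption that the statements are built only from trusted values: multi_query() cannot bind parameters, so concatenating user input into the batch would be an injection risk. The function name `run_updates_in_one_request` is hypothetical.

```php
<?php
// Hypothetical helper: run a batch of UPDATE statements in one request
// and return the total number of affected rows.
function run_updates_in_one_request(mysqli $db, array $statements): int
{
    $affected = 0;
    if (!$db->multi_query(implode(';', $statements))) {
        throw new RuntimeException($db->error);
    }
    do {
        // An UPDATE yields no result set, only an affected-row count,
        // so there is nothing large to buffer on the client.
        if ($db->affected_rows > 0) {
            $affected += $db->affected_rows;
        }
    } while ($db->more_results() && $db->next_result());

    // Errors in statements after the first are only visible here.
    if ($db->errno) {
        throw new RuntimeException($db->error);
    }
    return $affected;
}
```

Note that every pending result has to be consumed with next_result() before the connection can run another query, even when the results themselves are of no interest.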

You would have to weigh up all the factors, such as what you are doing, how long you expect it to take, the network latency between the database server and the client, and the resources (mostly memory) available on both the server and the client, and decide on a case-by-case basis.

I found this record of some performance testing done a while ago, where the conclusion is that there is an overall efficiency gain to be had from using multi_query(). However, the test case simply ran 4 queries, each one SELECTing a single result, and the definition of "more efficient" was simply "faster". There was no testing with larger numbers of queries or larger result sets and, while speed is important, it is not the be-all and end-all - I can make anything run incredibly fast if I give it an unlimited amount of memory, but any attempt to do something else concurrently will fail miserably. It is also not a real-world test, since the end result could have been achieved with a single JOINed query. It does make for interesting reading, though.

Personally, I feel this is somewhat academic, because if you are running a large set of statements at once, 90% of the time they will vary only in the data being passed while the query structure stays the same - which makes them an obvious candidate for prepared statements.
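As a rough illustration of the prepared-statement alternative: the statement is prepared once and executed repeatedly with different data. The `products` table, its `price` and `id` columns, and the function name `update_prices` are assumptions made up for this sketch.

```php
<?php
// Hypothetical helper: apply many price changes with one prepared statement.
// $priceById maps a product id (int) to its new price (float).
function update_prices(mysqli $db, array $priceById): int
{
    // The query structure is fixed; only the bound data changes.
    $stmt = $db->prepare('UPDATE products SET price = ? WHERE id = ?');
    if ($stmt === false) {
        throw new RuntimeException($db->error);
    }

    $price = 0.0;
    $id = 0;
    // bind_param() binds by reference, so later assignments to
    // $price and $id are picked up by each execute() call.
    $stmt->bind_param('di', $price, $id);

    $affected = 0;
    foreach ($priceById as $productId => $newPrice) {
        $id = (int) $productId;
        $price = (float) $newPrice;
        if (!$stmt->execute()) {
            throw new RuntimeException($stmt->error);
        }
        $affected += $stmt->affected_rows;
    }
    $stmt->close();
    return $affected;
}
```

This still costs one request per execution, but the values are bound rather than concatenated into SQL, which is something multi_query() cannot offer.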
