MySQL: retrieving a large SELECT in chunks

2021-12-19 00:00:00 select save mysql

I have a SELECT with more than 70 million rows.

I'd like to save the selected data into one large CSV file on Win2012 R2.


Q: How do I retrieve the data from MySQL in chunks for better performance?


because when I try to save the whole large SELECT in one go, I get an out-of-memory error.

Recommended answer


You could try using the LIMIT feature. If you do this:

SELECT * FROM MyTable ORDER BY whatever LIMIT 0,1000


You'll get the first 1,000 rows. The first LIMIT value (0) defines the starting row in the result set. It's zero-indexed, so 0 means "the first row". The second LIMIT value is the maximum number of rows to retrieve. To get the next few sets of 1,000, do this:

SELECT * FROM MyTable ORDER BY whatever LIMIT 1000,1000 -- rows 1,001 - 2,000
SELECT * FROM MyTable ORDER BY whatever LIMIT 2000,1000 -- rows 2,001 - 3,000


And so on. When the SELECT returns no rows, you're done.
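The loop this describes can be sketched in a few lines of Python. As an assumption for the sake of a runnable example, the stdlib sqlite3 module stands in for a MySQL driver (SQLite accepts the same MySQL-style `LIMIT offset, count` syntax; with MySQL you would open the connection through a driver such as mysql-connector or PyMySQL instead), and `id` stands in for the answer's `whatever` ordering column:

```python
import csv
import sqlite3

CHUNK_SIZE = 1000

def export_in_chunks(conn, csv_path):
    """Pull rows CHUNK_SIZE at a time and append them to one CSV file.

    Stops when a SELECT comes back empty, exactly as described above.
    """
    offset = 0
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        while True:
            rows = conn.execute(
                "SELECT * FROM MyTable ORDER BY id LIMIT ?, ?",
                (offset, CHUNK_SIZE),
            ).fetchall()
            if not rows:          # empty result set: we're done
                break
            writer.writerows(rows)
            offset += CHUNK_SIZE  # advance the starting row for the next chunk
```

Because only one chunk is held in memory at a time, this avoids the out-of-memory error regardless of how many rows the table has.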


This isn't enough on its own, though, because any changes made to the table while you're processing your 1K rows at a time will throw off the order. To freeze the results in time, start by querying them into a temporary table:

CREATE TEMPORARY TABLE MyChunkedResult AS (
  SELECT *
  FROM MyTable
  ORDER BY whatever
);


Side note: it's a good idea to make sure the temporary table doesn't exist beforehand:

DROP TEMPORARY TABLE IF EXISTS MyChunkedResult;


At any rate, once the temporary table is in place, pull the row chunks from there:

SELECT * FROM MyChunkedResult LIMIT 0, 1000;
SELECT * FROM MyChunkedResult LIMIT 1000,1000;
SELECT * FROM MyChunkedResult LIMIT 2000,1000;
.. and so on.


I'll leave it to you to create the logic that will calculate the limit value after each chunk and check for the end of results. I'd also recommend much larger chunks than 1,000 records; it's just a number I picked out of the air.


Finally, it's good form to drop the temporary table when you're done:

DROP TEMPORARY TABLE MyChunkedResult;
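Putting the whole temporary-table workflow together, a minimal sketch of the logic left to the reader might look like this. Again sqlite3 stands in for a MySQL driver so the example runs anywhere (note SQLite's cleanup statement is plain `DROP TABLE`; on MySQL you would use `DROP TEMPORARY TABLE` as shown above), and `id` is a hypothetical ordering column:

```python
import csv
import sqlite3

def export_via_temp_table(conn, csv_path, chunk_size=1000):
    """Snapshot MyTable into a temp table, then stream it to CSV in chunks."""
    conn.execute("DROP TABLE IF EXISTS MyChunkedResult")
    # Freeze the result set so concurrent writes to MyTable can't
    # shift rows between chunks.
    conn.execute(
        "CREATE TEMPORARY TABLE MyChunkedResult AS "
        "SELECT * FROM MyTable ORDER BY id"
    )
    offset = 0
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        while True:
            rows = conn.execute(
                "SELECT * FROM MyChunkedResult LIMIT ?, ?",
                (offset, chunk_size),
            ).fetchall()
            if not rows:          # end of results reached
                break
            writer.writerows(rows)
            offset += chunk_size  # next chunk starts where this one ended
    conn.execute("DROP TABLE MyChunkedResult")  # good form: clean up
```

For 70 million rows, a chunk size in the tens or hundreds of thousands would likely perform better than 1,000, as the answer suggests; the right value depends on row width and available memory.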
