Fast way to write a pandas DataFrame to Postgres

2022-01-20 00:00:00 python dataframe copy postgresql

Question

I am looking for the fastest way to write data from a pandas DataFrame to a table in a Postgres DB.

1) I've tried pandas.to_sql, but for some reason it takes an eternity to copy the data.

2) Besides that, I've tried the following:

import io
f = io.StringIO()
pd.DataFrame({'a':[1,2], 'b':[3,4]}).to_csv(f)
cursor = conn.cursor()
cursor.execute('create table bbbb (a int, b int);COMMIT; ')
cursor.copy_from(f, 'bbbb', columns=('a', 'b'), sep=',')
cursor.execute("select * from bbbb;")
a = cursor.fetchall()
print(a)
cursor.close()

But it returns an empty list [].

So I have two questions: what is the fastest way to copy data from Python code (a DataFrame) to a Postgres DB? And what was incorrect in the second approach that I tried?


Solution

Your second approach should be very fast.

There are two problems with your code:

  1. After writing the csv to f, you are positioned at the end of the file. You need to move the position back to the beginning before starting to read; otherwise copy_from finds nothing to read and the table stays empty.
  2. When writing the csv, you need to omit the header and the index, so that every line contains only the values for columns a and b.
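
Both problems can be seen without a database. The snippet below is only an illustration (the variable names `df`, `buf`, `with_defaults`, `read_without_seek`, `read_after_seek` and `for_copy` are made up here): it inspects what `to_csv` writes into a `StringIO` buffer and what a reader gets before and after `seek(0)`.

```python
import io

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Default to_csv output: a header row plus the index as the first column.
buf = io.StringIO()
df.to_csv(buf)
with_defaults = buf.getvalue()

# Writing left the position at the end of the buffer,
# so anything that reads from it now (such as copy_from) gets nothing.
read_without_seek = buf.read()

# After rewinding, the same buffer yields its full contents.
buf.seek(0)
read_after_seek = buf.read()

# Output that fits a two-column table: data rows only.
buf = io.StringIO()
df.to_csv(buf, index=False, header=False)
for_copy = buf.getvalue()

print(repr(with_defaults))
print(repr(read_without_seek))
print(repr(read_after_seek))
print(repr(for_copy))
```

With the defaults, the first line is a header (`,a,b`) and every data row starts with the index value, which does not match a table that only has columns a and b. The read without a seek returns an empty string, which is why the table ended up empty.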

Here is what your final code should look like:

import io
import pandas as pd

# conn is assumed to be an already-open psycopg2 connection
f = io.StringIO()
pd.DataFrame({'a':[1,2], 'b':[3,4]}).to_csv(f, index=False, header=False)  # omit index and header
f.seek(0)  # move position to beginning of file before reading
cursor = conn.cursor()
cursor.execute('create table bbbb (a int, b int);COMMIT; ')
cursor.copy_from(f, 'bbbb', columns=('a', 'b'), sep=',')
cursor.execute("select * from bbbb;")
a = cursor.fetchall()
print(a)
cursor.close()
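
If the data can contain commas or quotes, `to_csv` will quote those fields, and the plain text format read by `copy_from` does not understand CSV quoting. One possible variation, sketched here under assumptions, is to use psycopg2's `cursor.copy_expert` with `FORMAT csv`. The helper name `copy_dataframe` is hypothetical, the cursor is assumed to come from psycopg2, and the table and column names are interpolated unquoted, so they must be trusted, simple identifiers.

```python
import io

import pandas as pd


def copy_dataframe(cursor, df, table):
    """Bulk-load df into an existing table via COPY ... FROM STDIN.

    Hypothetical helper: `cursor` is assumed to be a psycopg2 cursor, and
    `table` plus the column names are placed into the SQL text as-is,
    so they must be trusted identifiers.
    """
    buf = io.StringIO()
    # Data rows only: no index column, no header line.
    df.to_csv(buf, index=False, header=False)
    # Rewind so COPY reads from the start of the buffer.
    buf.seek(0)
    columns = ', '.join(df.columns)
    sql = f"COPY {table} ({columns}) FROM STDIN WITH (FORMAT csv)"
    cursor.copy_expert(sql, buf)


# Usage with a real connection (not executed here):
#   cursor = conn.cursor()
#   copy_dataframe(cursor, pd.DataFrame({'a': [1, 2], 'b': [3, 4]}), 'bbbb')
#   conn.commit()
```

The table must already exist with matching columns, and the changes still need to be committed on the connection afterwards.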
