Using IBM_DB with Pandas

2022-01-14 00:00:00 python pandas db2

Problem description

I am trying to use Pandas, the data analysis tool for Python, to read data from an IBM DB2 database using the ibm_db package. According to the Pandas documentation, we need to provide at least two arguments: the SQL to execute and the database connection object. But when I do that, I get an error saying that the connection object has no cursor() method. I figured that maybe this is not how this particular DB package works. I tried to find a workaround but was not successful.

Code:

import ibm_db as db
import pandas as pd

print("hello PyDev")
con = db.connect("DATABASE=db;HOSTNAME=localhost;PORT=50000;PROTOCOL=TCPIP;UID=admin;PWD=admin;", "", "")
sql = "select * from Maximo.PLUSPCUSTOMER"
stmt = db.exec_immediate(con, sql)
pd.read_sql(sql, con)  # raises AttributeError: con has no cursor() method
print("done here")

Error:

hello PyDev
Traceback (most recent call last):
  File "C:\Users\ay\workspace\Firstproject\pack\test.py", line 15, in <module>
    pd.read_sql(sql, con)
  File "D:\etl\lib\site-packages\pandas\io\sql.py", line 478, in read_sql
    chunksize=chunksize)
  File "D:\etl\lib\site-packages\pandas\io\sql.py", line 1504, in read_query
    cursor = self.execute(*args)
  File "D:\etl\lib\site-packages\pandas\io\sql.py", line 1467, in execute
    cur = self.con.cursor()
AttributeError: 'ibm_db.IBM_DBConnection' object has no attribute 'cursor'

I can retrieve the data if I fetch it directly from the database with ibm_db, but I need to read it into a DataFrame and write it back to the database after processing.

Code to fetch from the database:

stmt = db.exec_immediate(con, sql)
tpl = db.fetch_tuple(stmt)
while tpl:
    print(tpl)
    tpl = db.fetch_tuple(stmt)


Solution

After studying the package further, I found that I need to wrap the ibm_db connection object in an ibm_db_dbi connection object. ibm_db_dbi is part of the ibm-db package (https://pypi.org/project/ibm-db/).
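For background, the traceback in the question shows pandas calling `self.con.cursor()`, which is the DB-API 2.0 way of obtaining a cursor. The raw ibm_db handle has no such method, which is why the wrapper is needed. The sketch below illustrates that check using only the standard library; `looks_like_dbapi_connection` and `RawHandle` are hypothetical names made up for this illustration, and sqlite3 merely stands in for any DB-API driver.

```python
import sqlite3


def looks_like_dbapi_connection(obj):
    """Return True if obj exposes a callable cursor(), as DB-API 2.0 connections do."""
    return callable(getattr(obj, "cursor", None))


class RawHandle:
    """Hypothetical stand-in for a low-level driver handle without cursor()."""


sqlite_con = sqlite3.connect(":memory:")
print(looks_like_dbapi_connection(sqlite_con))   # True
print(looks_like_dbapi_connection(RawHandle()))  # False
sqlite_con.close()
```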

So:

import ibm_db_dbi

conn = ibm_db_dbi.Connection(con)  # DB-API wrapper around the raw ibm_db handle
df = pd.read_sql(sql, conn)

The above code works, and pandas successfully fetches the data into a DataFrame.
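Putting the pieces together, an end-to-end version might look like the sketch below. The helper name `read_db2_table` is an assumption made for illustration, not part of ibm_db or pandas. The imports sit inside the function only so that the module still loads on machines where the DB2 driver is not installed. Newer pandas releases may emit a warning for DB-API connections other than sqlite3; the read itself is expected to behave as described above.

```python
def read_db2_table(dsn, sql):
    """Read the result of `sql` into a pandas DataFrame over an ibm_db connection."""
    # Imported lazily so this module still loads where the DB2 driver is absent.
    import ibm_db
    import ibm_db_dbi
    import pandas as pd

    raw_con = ibm_db.connect(dsn, "", "")      # low-level handle, has no cursor()
    try:
        conn = ibm_db_dbi.Connection(raw_con)  # DB-API wrapper that pandas can use
        return pd.read_sql(sql, conn)          # rows are fully read before returning
    finally:
        ibm_db.close(raw_con)                  # release the handle even on errors
```

It would be called with the connection string and query from the question, for example `df = read_db2_table("DATABASE=db;HOSTNAME=localhost;PORT=50000;PROTOCOL=TCPIP;UID=admin;PWD=admin;", "select * from Maximo.PLUSPCUSTOMER")`.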
