Stop Pandas from converting int to float
Problem description
I have a DataFrame. Two relevant columns are the following: one is a column of int and another is a column of str.
I understand that if I insert NaN into the int column, Pandas will convert all the int values into float because there is no NaN value for an int.
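The upcast described above can be reproduced in isolation. The following is a minimal sketch, not taken from the question; the variable names (s, s_with_nan) are illustrative, and reindex is used here only as a convenient way to introduce a missing value.

```python
import numpy as np
import pandas as pd

# An integer Series with no missing values keeps its integer dtype.
s = pd.Series([0, 1, 2], dtype="int64")
print(s.dtype)

# Reindexing with a label that does not exist (3) introduces a missing
# value. NumPy integers cannot represent NaN, so pandas upcasts the
# whole Series to float64.
s_with_nan = s.reindex([0, 1, 2, 3])
print(s_with_nan.dtype)
print(s_with_nan.tolist())
```

The existing values are unchanged numerically (0 becomes 0.0 and so on); only the storage type changes.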
However, when I insert None into the str column, Pandas converts all my int values to float as well. This doesn't make sense to me: why does the value I put in column 2 affect column 1?
Here's a simple working example (Python 2):
import pandas as pd
df = pd.DataFrame()
df["int"] = pd.Series([], dtype=int)
df["str"] = pd.Series([], dtype=str)
df.loc[0] = [0, "zero"]
print df
print
df.loc[1] = [1, None]
print df
The output is:
   int   str
0    0  zero

   int   str
0  0.0  zero
1  1.0   NaN
Is there any way to make the output the following:
   int   str
0    0  zero

   int   str
0    0  zero
1    1   NaN
without recasting the first column to int?
I prefer using int instead of float because the actual data in that column are integers. If there's no workaround, I'll just use float though.
I prefer not having to recast because in my actual code, I don't store the actual dtype.
I also need the data inserted row-by-row.
Solution
If you set dtype=object, your series will be able to contain arbitrary data types:
df["int"] = pd.Series([], dtype=object)
df["str"] = pd.Series([], dtype=str)
df.loc[0] = [0, "zero"]
print(df)
print()
df.loc[1] = [1, None]
print(df)
   int   str
0    0  zero
1  NaN   NaN

  int   str
0   0  zero
1   1  None
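To see what dtype=object changes, it may help to compare it with default dtype inference on a single Series. This is a small sketch under that assumption, separate from the row-by-row code above; the names inferred and as_object are illustrative only.

```python
import pandas as pd

# Default inference: None in an otherwise numeric list is treated as a
# missing value, so the Series becomes float64 and 1 is stored as 1.0.
inferred = pd.Series([1, None])
print(inferred.dtype)

# With dtype=object the Series stores the Python objects themselves,
# so 1 stays an int and None stays None.
as_object = pd.Series([1, None], dtype=object)
print(as_object.dtype)
print(type(as_object[0]))
print(as_object[1] is None)
```

The trade-off is that an object column holds generic Python objects, so numeric operations on it are generally slower than on a native int or float column.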