Slicing a Pandas DataFrame into a new DataFrame

2022-01-20 00:00:00 python pandas copy slice

Question

I would like to slice a DataFrame with a Boolean index to obtain a copy, and then do stuff on that copy independently of the original DataFrame.

Judging from this answer, selecting with .loc using a Boolean array will hand me back a copy, but then, if I try to change the copy, SettingWithCopyWarning gets in the way. Would this then be the correct way:

import numpy as np
import pandas as pd
d1 = pd.DataFrame(np.random.randn(10, 5), columns=['a', 'b', 'c', 'd', 'e'])
# create a new dataframe from the sliced copy
d2 = pd.DataFrame(d1.loc[d1.a > 1, :])
# do stuff with d2, keep d1 unchanged


Solution

You need copy with boolean indexing; a new DataFrame constructor is not necessary:

d2 = d1[d1.a > 1].copy()

Explanation of the warning:

Without the explicit .copy(), if you modify values in d2 later you will find that the modifications do not propagate back to the original data (d1), and Pandas emits a SettingWithCopyWarning.
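
As a quick check of the copy-based approach, here is a minimal sketch. It uses a small hand-written frame instead of random data so the result is predictable; the values and the d1_before snapshot name are assumptions made only for illustration.

```python
import pandas as pd

# Small deterministic frame (values are made up for this illustration)
d1 = pd.DataFrame({'a': [0.5, 1.5, 2.0, -1.0],
                   'b': [1.0, 2.0, 3.0, 4.0]})

# Snapshot of d1, used only to verify later that d1 was not modified
d1_before = d1.copy()

# Boolean indexing followed by an explicit copy
d2 = d1[d1.a > 1].copy()

# Modify the copy; d2 is an independent object, so d1 is not affected
d2['b'] = 0.0

print(d1.equals(d1_before))  # True
```

Because d2 was created with an explicit .copy(), assigning to it neither touches d1 nor leaves any ambiguity about whether it is a view or a copy.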
