Plotting error bars on grouped bars in pandas
Question
I can plot error bars on a single-series bar plot like so:
import pandas as pd
df = pd.DataFrame([[4,6,1,3], [5,7,5,2]], columns = ['mean1', 'mean2', 'std1', 'std2'], index=['A', 'B'])
print(df)
   mean1  mean2  std1  std2
A      4      6     1     3
B      5      7     5     2
df['mean1'].plot(kind='bar', yerr=df['std1'], alpha = 0.5,error_kw=dict(ecolor='k'))
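As expected, the mean for index A is paired with the standard deviation for the same index, and the error bar shows the +/- of that value.
However, when I try to plot both 'mean1' and 'mean2' in the same plot, I cannot use the standard deviations in the same way: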
df[['mean1', 'mean2']].plot(kind='bar', yerr=df[['std1', 'std2']], alpha = 0.5,error_kw=dict(ecolor='k'))
Traceback (most recent call last):
  File "<ipython-input-587-23614d88a3c5>", line 1, in <module>
    df[['mean1', 'mean2']].plot(kind='bar', yerr=df[['std1', 'std2']], alpha = 0.5,error_kw=dict(ecolor='k'))
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 1705, in plot_frame
    plot_obj.generate()
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 878, in generate
    self._make_plot()
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 1534, in _make_plot
    start=start, label=label, **kwds)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\tools\plotting.py", line 1481, in f
    return ax.bar(x, y, w, bottom=start,log=self.log, **kwds)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\matplotlib\axes.py", line 5075, in bar
    fmt=None, **error_kw)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\matplotlib\axes.py", line 5749, in errorbar
    iterable(yerr[0]) and iterable(yerr[1])):
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\frame.py", line 1635, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\frame.py", line 1642, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\generic.py", line 983, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\internals.py", line 2754, in get
    _, block = self._find_block(item)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\internals.py", line 3065, in _find_block
    self._check_have(item)
  File "C:\Users\name\Dropbox\Tools\WinPython-64bit-2.7.6.2\python-2.7.6.amd64\lib\site-packages\pandas\core\internals.py", line 3072, in _check_have
    raise KeyError('no item named %s' % com.pprint_thing(item))
KeyError: u'no item named 0'
The closest I have come to the output I want is this:
df[['mean1', 'mean2']].plot(kind='bar', yerr=df[['std1', 'std2']].values.T, alpha = 0.5,error_kw=dict(ecolor='k'))
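But now the error bars are not plotted symmetrically. Instead, the green and blue bars in each series use the same positive and negative error, and this is where I am stuck. How can I get the error bars of my multi-series bar plot to look like they do when I have only one series?
Update: This seems to be fixed in pandas 0.14; I was reading the docs for 0.13 earlier. I am not able to upgrade my pandas right now, though. I will do so later and see how it turns out.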
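A possible explanation for the asymmetry, offered as a hedged sketch rather than a confirmed diagnosis: matplotlib documents that a yerr array of shape (2, N) holds separate lower errors (row 0) and upper errors (row 1) for N bars. If the whole 2x2 array reaches ax.bar for each series, which is an assumption about how pandas 0.13 forwards it, the result would match the behaviour described above. The snippet below calls matplotlib directly with the mean1 values and the same 2x2 array; the variable names are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

heights = [4, 5]                    # mean1 for index A and B
yerr = np.array([[1, 5],            # row 0: read by matplotlib as LOWER errors
                 [3, 2]])           # row 1: read by matplotlib as UPPER errors

bars = ax.bar([0, 1], heights, yerr=yerr, alpha=0.5, ecolor='k')

# Each segment is a 2x2 array of (x, y) end points for one error bar.
segments = bars.errorbar.lines[2][0].get_segments()
for label, seg in zip(['A', 'B'], segments):
    # A spans 4-1 .. 4+3, B spans 5-5 .. 5+2, so neither bar is symmetric.
    print(label, seg[:, 1])
```

Under this reading, std1 ends up as the lower error and std2 as the upper error of every bar, instead of each std column belonging to its own series.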
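Solution
- yerr=df[['std1', 'std2']] in the question doesn't work because its column names differ from those in df[['mean1', 'mean2']].
  - When passing values to yerr as a DataFrame, the column names must be the same as the data columns (e.g. mean1 and mean2). See "Adding error bars to grouped bar plot in pandas".
- Using df[['std1', 'std2']].to_numpy().T bypasses the issue by passing an array of errors without named columns.
- Tested with python 3.8.11, pandas 1.3.3, matplotlib 3.4.3.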
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame([[4,6,1,3], [5,7,5,2]], columns = ['mean1', 'mean2', 'std1', 'std2'], index=['A', 'B'])
# print(df)
#    mean1  mean2  std1  std2
# A      4      6     1     3
# B      5      7     5     2
# convert the std columns to an array
yerr = df[['std1', 'std2']].to_numpy().T
# yerr is now one row of errors per plotted column:
# array([[1, 5],
#        [3, 2]], dtype=int64)
df[['mean1', 'mean2']].plot(kind='bar', yerr=yerr, alpha=0.5, error_kw=dict(ecolor='k'))
plt.show()
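The first bullet above suggests an alternative: keep yerr as a DataFrame, but give it the same column names as the plotted data. A minimal sketch of that idea follows, assuming a recent pandas and matplotlib. The names means and errors are illustrative and do not come from the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([[4, 6, 1, 3], [5, 7, 5, 2]],
                  columns=['mean1', 'mean2', 'std1', 'std2'],
                  index=['A', 'B'])

means = df[['mean1', 'mean2']]

# Rename the std columns so each one carries the label of the data column
# whose error it describes; pandas matches yerr to the data by these labels.
errors = df[['std1', 'std2']].rename(columns={'std1': 'mean1', 'std2': 'mean2'})

ax = means.plot(kind='bar', yerr=errors, alpha=0.5, error_kw=dict(ecolor='k'))
```

Call plt.show() afterwards to display the figure. Compared with the to_numpy().T approach, this keeps the pairing between each mean column and its error column explicit, at the cost of one rename step.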