Highlight matplotlib points that go over or under a threshold, in colors based on how far the boundary is crossed
Problem description
I have a graph that looks like this:
And the code I'm running to get this graph (one of a sequence of 8 graphs) is below:
date_list = list(df_testing_set['date'].unique())
random_date_list = list(np.random.choice(date_list,8))
df_new = df_testing_set[df_testing_set['date'].isin(random_date_list)]
for date1 in random_date_list:
    df_new = df_testing_set[df_testing_set['date'] == date1]
    title = date1
    if df_new.iloc[0]['day'] in ['Saturday', 'Sunday']:
        df_shader = df_result_weekend.copy()
        title += " - Weekend"
    else:
        df_shader = df_result_weekday.copy()
        title += " - Weekday"
    y = df_new[row_index].tolist()
    x = range(0, len(y))
    x_axis = buckets
    y_axis = df_shader.loc[df_shader.index.isin([row_index]) & df_shader['Bucket'].between(1, 144), data_field].tolist()
    del y_axis[-1]
    plt.title(title)
    plt.xlabel("Time of Day (10m Intervals)")
    plt.ylabel(data_field + " values for " + row_index)
    standevs = df_shader.loc[df_shader.index.isin([row_index]) & df_shader['Bucket'].between(1, 144), 'StanDev'].tolist()
    del standevs[-1]
    lower_bound = np.array(y_axis) - np.array(standevs)
    upper_bound = np.array(y_axis) + np.array(standevs)
    plt.fill_between(x_axis, lower_bound, upper_bound, facecolor='lightblue')
    #highlighting anomalies
    # if (y > upper_bound | y < lower_bound):
    #     plt.plot(x,y, 'rx')
    # else:
    #     plt.plot(x, y)
    plt.plot(x,y)
    plt.show()
    del df_shader, title, date1, df_new
I'm trying to create a condition (like the commented-out if statement) so that when a plotted point goes above upper_bound or below lower_bound, it is marked with an 'x' in a different color. In the end, I want a point that exceeds the threshold by 1 standard deviation to be marked in orange, and a point that exceeds it by 2 or more standard deviations to be marked in red. All the standard deviations are in the StanDev column of the df_shader data frame. Whenever I try to run some variation of the if block, I get variable errors and name errors.
Solution
You can use boolean masks to select points that fulfill certain conditions, and plot them:
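import matplotlib.pyplot as plt
import numpy as np

std = 0.1
N = 100
x = np.linspace(0, 1, N)
expected_y = np.sin(2 * np.pi * x)
y = expected_y + np.random.normal(0, std, N)

# Distance of each point from its expected value, in standard deviations
dist = np.abs(y - expected_y) / std
mask1 = (1 < dist) & (dist <= 2)  # between 1 and 2 standard deviations away
mask2 = dist > 2                  # more than 2 standard deviations away

# Shade the 1- and 2-standard-deviation bands around the expected curve
plt.fill_between(x, expected_y - std, expected_y + std, alpha=0.1)
plt.fill_between(x, expected_y - 2 * std, expected_y + 2 * std, alpha=0.1)
plt.plot(x, y)
# Overlay only the selected points as 'x' markers
plt.plot(x[mask1], y[mask1], 'x', color='orange')
plt.plot(x[mask2], y[mask2], 'x', color='red')
plt.tight_layout()
plt.savefig('mp_points.png', dpi=300)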
Result:
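Applied to the arrays from the question, the same masking idea might look like the sketch below. It assumes that y, the expected values (y_axis in the question) and standevs are equal-length sequences, and it follows the interpretation used above: orange for points outside the shaded band but within 2 standard deviations of the expected value, red for anything farther out. The helper name classify_anomalies is made up for illustration. The commented-out if in the question is likely to fail for two reasons: | binds more tightly than > and <, so each comparison needs its own parentheses, and an if on a multi-element array raises a ValueError because the array has no single truth value. Selecting points with masks avoids the if entirely.

```python
import numpy as np


def classify_anomalies(y, expected, standevs):
    """Return boolean masks (orange, red) for points outside the 1-std band.

    Hypothetical helper for illustration; the thresholds follow the
    interpretation described in the text above.
    """
    # Convert to float arrays so element-wise comparisons work
    # (y in the question is a plain Python list).
    y = np.asarray(y, dtype=float)
    expected = np.asarray(expected, dtype=float)
    standevs = np.asarray(standevs, dtype=float)

    # Each comparison is parenthesised because | has higher precedence
    # than > and <.
    outside_1 = (y > expected + standevs) | (y < expected - standevs)
    outside_2 = (y > expected + 2 * standevs) | (y < expected - 2 * standevs)

    orange = outside_1 & ~outside_2  # outside the band, within 2 std
    red = outside_2                  # more than 2 std from expected
    return orange, red


# Possible use inside the question's loop (names taken from the question):
#   orange, red = classify_anomalies(y, y_axis, standevs)
#   x_arr, y_arr = np.asarray(x), np.asarray(y)
#   plt.plot(x_arr, y_arr)
#   plt.plot(x_arr[orange], y_arr[orange], 'x', color='orange')
#   plt.plot(x_arr[red], y_arr[red], 'x', color='red')
```

Comparing against the bounds directly, rather than dividing by the standard deviation, keeps the sketch free of division-by-zero warnings if any StanDev value happens to be 0.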