How to label the line from transform_regression in Altair?

2022-03-24 00:00:00 python altair legend linear-regression

Problem description

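The code below creates a regression line; however, the legend labels the line "undefined" by default. How can this regression line be labelled "reg-line" in the legend?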

import altair as alt
from vega_datasets import data
import pandas as pd

source = data.anscombe().copy()
source['line-label'] = 'x=y'
source = pd.concat([source,source.groupby('Series').agg(x_diff=('X','diff'), y_diff=('Y','diff'))],axis=1)
source['rate'] = source.y_diff/source.x_diff
source['rate-label'] = 'line y=x'

scatter = alt.Chart(source).mark_circle(size=60, opacity=0.60).encode(
    x='X:Q',
    y='Y:Q',
    color='Series:N',
    tooltip=['X','Y','rate']
)

scatter = scatter + scatter.transform_regression('X', 'Y').mark_line(opacity=0.50, shape='mark')

chart = scatter.facet(
    columns=2
    , facet=alt.Facet('Series',header=alt.Header(labelFontSize=25))
).resolve_scale(
    x='independent',
    y='independent'
)

chart.display()


Solution

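Just add .transform_fold(["reg-line"], as_=["Regression", "y"]).encode(alt.Color("Regression:N")) after mark_line. The fold adds a key field named Regression whose value is the string "reg-line", and encoding color on that field gives the line its own legend entry.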

The code should look like this:

import altair as alt
from vega_datasets import data
import pandas as pd

source = data.anscombe().copy()
source['line-label'] = 'x=y'
source = pd.concat([source,source.groupby('Series').agg(x_diff=('X','diff'), y_diff=('Y','diff'))],axis=1)
source['rate'] = source.y_diff/source.x_diff
source['rate-label'] = 'line y=x'

scatter = alt.Chart(source).mark_circle(size=60, opacity=0.60).encode(
    x='X:Q',
    y='Y:Q',
    color='Series:N',
    tooltip=['X','Y','rate']
)

scatter = scatter + scatter.transform_regression('X', 'Y').mark_line(
    opacity=0.50,
    shape='mark'
).transform_fold(
    ["reg-line"],
    as_=["Regression", "y"]
).encode(alt.Color("Regression:N"))

chart = scatter.facet(
    columns=2
    , facet=alt.Facet('Series',header=alt.Header(labelFontSize=25))
).resolve_scale(
    x='independent',
    y='independent'
)

chart.display()
