plotly Sankey diagram data format
Problem description
The plotly library has some nice Sankey diagrams (https://plotly.com/python/sankey-diagram/),
but the data format requires you to pass the indices of the source/target pairs.
link = dict(
    source = [0, 1, 0, 2, 3, 3], # indices correspond to labels, eg A1, A2, A1, B1, ...
    target = [2, 3, 3, 4, 4, 5],
    # ...
)
I was wondering if there's an API to simply pass a named list of these pairs?
links = [
    {'source': 'start', 'target': 'A', 'value': 2},
    {'source': 'A', 'target': 'B', 'value': 2},
    ...
]
This is more in line with how bokeh/holoviews expect data (though their Sankey doesn't work with self-loops),
and also with this pysankey widget.
That way I could map more closely to my DataFrame without preprocessing everything.
Or is there a nice Pythonic way to do this conversion in a one-liner? :D
Solution
- The structure is clearly in pandas DataFrame constructor format.
- Create a DataFrame from it, plus a Series that maps each node name to an integer index.
- From these it is simple to construct a Sankey plot.
import pandas as pd
import numpy as np
import plotly.graph_objects as go
links = [
    {'source': 'start', 'target': 'A', 'value': 2},
    {'source': 'A', 'target': 'B', 'value': 1},
    {'source': 'A', 'target': 'C', 'value': .5},
]

df = pd.DataFrame(links)
# unique node names across both columns (sorted), then a name -> index lookup
nodes = np.unique(df[["source", "target"]], axis=None)
nodes = pd.Series(index=nodes, data=range(len(nodes)))

go.Figure(
    go.Sankey(
        node={"label": nodes.index},
        link={
            # translate each node name into its integer index
            "source": nodes.loc[df["source"]],
            "target": nodes.loc[df["target"]],
            "value": df["value"],
        },
    )
)
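If you would rather not go through pandas at all, the same name-to-index mapping can be sketched in plain Python. This is only an illustration: `links_to_indices` is a hypothetical helper name, not part of plotly. It numbers nodes in first-seen order, whereas the `np.unique` approach above numbers them in sorted order.

```python
def links_to_indices(links):
    """Convert named links into index-based lists (hypothetical helper, not a plotly API)."""
    labels = []  # node labels in first-seen order
    index = {}   # label -> position in `labels`
    for link in links:
        for name in (link["source"], link["target"]):
            if name not in index:
                index[name] = len(labels)
                labels.append(name)
    return {
        "labels": labels,
        "source": [index[link["source"]] for link in links],
        "target": [index[link["target"]] for link in links],
        "value": [link["value"] for link in links],
    }


links = [
    {"source": "start", "target": "A", "value": 2},
    {"source": "A", "target": "B", "value": 1},
    {"source": "A", "target": "C", "value": 0.5},
]
data = links_to_indices(links)
```

The result should then slot into the same call as above: `data["labels"]` as the node `label`, and `data["source"]`, `data["target"]`, `data["value"]` as the corresponding `link` entries of `go.Sankey`.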