Python: Sankey plot chart with complex data

Question


I have the following dataset:

import pandas as pd

data = {
    '1': ['A', 'B', 'C', 'NAN', 'A', 'C', 'NAN', 'C', 'B', 'A'],
    '2': ['B', 'NAN', 'A', 'B', 'C', 'A', 'B', 'NAN', 'A', 'C'],
    '3': ['NAN', 'A', 'B', 'C', 'NAN', 'B', 'A', 'B', 'C', 'A'],
    '4': ['C', 'B', 'NAN', 'A', 'B', 'NAN', 'C', 'A', 'NAN', 'B']
}
df = pd.DataFrame(data)

I want to draw a simple Sankey plot of this data structure, but I don't even know where to start...

Answer 1

Score: 1


<details>
<summary>English:</summary>

Getting the data into the right shape is the tricky part. There may be a more efficient way than what I have come up with, but hopefully this gets the job done.

    import plotly.graph_objects as go
    import pandas as pd
    import numpy as np

    data = {
        '1': ['A', 'B', 'C', 'NAN', 'A', 'C', 'NAN', 'C', 'B', 'A'],
        '2': ['B', 'NAN', 'A', 'B', 'C', 'A', 'B', 'NAN', 'A', 'C'],
        '3': ['NAN', 'A', 'B', 'C', 'NAN', 'B', 'A', 'B', 'C', 'A'],
        '4': ['C', 'B', 'NAN', 'A', 'B', 'NAN', 'C', 'A', 'NAN', 'B']
    }
    df = pd.DataFrame(data)
    df = df.replace('NAN', np.nan)

    # Get a list of labels by appending the column name to the cell values and
    # taking the distinct combinations (e.g. 'A1', 'B2', ...)
    label = sorted(df.apply(lambda x: x + x.name).melt().dropna()['value'].unique())

    # Iterate over two adjacent columns at a time to map out the relationships.
    # value_counts() skips rows that contain NaN by default.
    output = []
    for i in range(1, df.shape[1]):
        left, right = str(i), str(i + 1)
        for (src, tgt), count in df[[left, right]].value_counts().items():
            output.append((src + left, tgt + right, count))

    # Convert the relationships to indices into the label list
    mapped = []
    for x in output:
        mapped.append((label.index(x[0]), label.index(x[1]), x[2]))

    # Split the values into their corresponding buckets
    source, target, value = np.array(mapped).T

    # Build your chart (a minimal go.Sankey trace, assuming default styling)
    fig = go.Figure(data=[go.Sankey(
        node=dict(label=label),
        link=dict(source=source, target=target, value=value),
    )])

    fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
    fig.show()
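
To make the "two columns at a time" step more concrete, here is a small sketch that looks only at columns `1` and `2` of the sample data. The variable names `pair_counts` and `triples` are made up for this illustration; they are not part of the answer's code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "1": ["A", "B", "C", "NAN", "A", "C", "NAN", "C", "B", "A"],
    "2": ["B", "NAN", "A", "B", "C", "A", "B", "NAN", "A", "C"],
}).replace("NAN", np.nan)

# Count how often each (column 1, column 2) combination occurs.
# Rows containing NaN are skipped because dropna defaults to True.
pair_counts = df[["1", "2"]].value_counts()

# Turn the MultiIndex Series into plain (source, target, count) triples,
# tagging each value with its column name so that "A" in column 1 and
# "A" in column 2 become separate nodes.
triples = sorted(
    (src + "1", tgt + "2", int(n)) for (src, tgt), n in pair_counts.items()
)
print(triples)
# [('A1', 'B2', 1), ('A1', 'C2', 2), ('B1', 'A2', 1), ('C1', 'A2', 2)]
```

Each triple becomes one link in the diagram once the two node names are replaced by their positions in the label list.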

Output

[![Sankey diagram produced by the code above][1]][1]


  [1]: https://i.stack.imgur.com/LBYzN.png
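
As for a possibly more efficient way: one option is to collect all adjacent-column pairs into a single frame and count them with one `groupby`. The sketch below is only an illustration of that idea. The function name `build_sankey_links` and its `missing` parameter are hypothetical, and unlike the code above it only lists nodes that take part in at least one link.

```python
import numpy as np
import pandas as pd


def build_sankey_links(df, missing="NAN"):
    """Return (labels, source, target, value) for adjacent column pairs.

    Hypothetical helper for illustration; `missing` is the placeholder
    string that marks an absent value.
    """
    cols = list(df.columns)
    clean = df.replace(missing, np.nan)

    frames = []
    for left, right in zip(cols[:-1], cols[1:]):
        # Keep only rows where both neighbouring columns have a value
        pair = clean[[left, right]].dropna()
        frames.append(pd.DataFrame({
            "source": pair[left] + str(left),    # e.g. "A" -> "A1"
            "target": pair[right] + str(right),  # e.g. "B" -> "B2"
        }))

    # One row per distinct (source, target) link with its frequency
    counts = (
        pd.concat(frames, ignore_index=True)
        .groupby(["source", "target"])
        .size()
        .reset_index(name="value")
    )

    labels = sorted(set(counts["source"]) | set(counts["target"]))
    position = {name: i for i, name in enumerate(labels)}
    source = counts["source"].map(position).tolist()
    target = counts["target"].map(position).tolist()
    value = counts["value"].tolist()
    return labels, source, target, value


df = pd.DataFrame({
    "1": ["A", "B", "C", "NAN", "A", "C", "NAN", "C", "B", "A"],
    "2": ["B", "NAN", "A", "B", "C", "A", "B", "NAN", "A", "C"],
    "3": ["NAN", "A", "B", "C", "NAN", "B", "A", "B", "C", "A"],
    "4": ["C", "B", "NAN", "A", "B", "NAN", "C", "A", "NAN", "B"],
})
labels, source, target, value = build_sankey_links(df)
print(len(labels), len(value), sum(value))
```

The four return values can be passed to `go.Sankey` as `node=dict(label=labels)` and `link=dict(source=source, target=target, value=value)`.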

</details>



huangapple
  • Posted on March 10, 2023, 00:16:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75687278.html