Python: Sankey plot chart with complex data

Question

I have the following dataset:

    import pandas as pd

    data = {
        '1': ['A', 'B', 'C', 'NAN', 'A', 'C', 'NAN', 'C', 'B', 'A'],
        '2': ['B', 'NAN', 'A', 'B', 'C', 'A', 'B', 'NAN', 'A', 'C'],
        '3': ['NAN', 'A', 'B', 'C', 'NAN', 'B', 'A', 'B', 'C', 'A'],
        '4': ['C', 'B', 'NAN', 'A', 'B', 'NAN', 'C', 'A', 'NAN', 'B']
    }
    df = pd.DataFrame(data)

I want to draw a simple Sankey plot of this data structure. I don't even know where to start...

Answer 1

Score: 1

It is tricky to get the data into the correct shape. Perhaps there is a more efficient way than what I have come up with, but hopefully this gets the job done.

    import plotly.graph_objects as go
    import pandas as pd
    import numpy as np

    data = {
        '1': ['A', 'B', 'C', 'NAN', 'A', 'C', 'NAN', 'C', 'B', 'A'],
        '2': ['B', 'NAN', 'A', 'B', 'C', 'A', 'B', 'NAN', 'A', 'C'],
        '3': ['NAN', 'A', 'B', 'C', 'NAN', 'B', 'A', 'B', 'C', 'A'],
        '4': ['C', 'B', 'NAN', 'A', 'B', 'NAN', 'C', 'A', 'NAN', 'B']
    }
    df = pd.DataFrame(data)
    df = df.replace('NAN', np.nan)

    # Build the node labels by appending the column name to each cell value
    # (e.g. 'A1'), then keep the distinct combinations
    label = sorted(df.apply(lambda x: x + x.name).melt().dropna()['value'].unique())

    # Iterate over two adjacent columns at a time to map out the relationships
    output = []
    for i in range(1, df.shape[1]):
        left, right = str(i), str(i + 1)
        # value_counts() skips rows that contain NaN by default; iterating the
        # counts directly avoids relying on the name of the count column,
        # which differs between pandas versions
        for (src, tgt), count in df[[left, right]].value_counts().items():
            output.append((src + left, tgt + right, count))

    # Convert the relationships to indices into the label list
    mapped = []
    for x in output:
        mapped.append((label.index(x[0]), label.index(x[1]), x[2]))

    # Split the values into their corresponding buckets
    source, target, value = np.array(mapped).T

    # Build your chart. The arguments to go.Figure were lost from this copy of
    # the answer; a minimal go.Sankey trace is assumed here.
    fig = go.Figure(data=[go.Sankey(
        node=dict(label=label),
        link=dict(source=source, target=target, value=value),
    )])
    fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
    fig.show()

Output: https://i.stack.imgur.com/LBYzN.png
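To sanity-check the reshaping step without pandas or Plotly, the same counting can be sketched with only the standard library. This is an illustrative sketch, not the answer's original code: the helper name `sankey_links` and the `NAN` constant are made up here, and it assumes missing cells are the literal string 'NAN' as in the question.

```python
from collections import Counter

NAN = "NAN"  # assumption: missing cells are marked with this literal string

data = {
    '1': ['A', 'B', 'C', 'NAN', 'A', 'C', 'NAN', 'C', 'B', 'A'],
    '2': ['B', 'NAN', 'A', 'B', 'C', 'A', 'B', 'NAN', 'A', 'C'],
    '3': ['NAN', 'A', 'B', 'C', 'NAN', 'B', 'A', 'B', 'C', 'A'],
    '4': ['C', 'B', 'NAN', 'A', 'B', 'NAN', 'C', 'A', 'NAN', 'B'],
}


def sankey_links(columns):
    """Return (labels, source, target, value) for adjacent column pairs.

    Hypothetical helper: `columns` maps a column name to a list of cell values.
    """
    names = list(columns)
    # Node labels are cell value + column name (e.g. 'A1'); missing cells are skipped
    labels = sorted({v + name for name in names for v in columns[name] if v != NAN})
    index = {lab: i for i, lab in enumerate(labels)}

    # Count how many rows go from each node in one column to each node in the next
    counts = Counter()
    for left, right in zip(names, names[1:]):
        for a, b in zip(columns[left], columns[right]):
            if a != NAN and b != NAN:
                counts[(index[a + left], index[b + right])] += 1

    # Three parallel lists: one entry per distinct link
    source = [s for s, _ in counts]
    target = [t for _, t in counts]
    value = list(counts.values())
    return labels, source, target, value


labels, source, target, value = sankey_links(data)
```

The three lists line up position by position, so `labels[source[k]]` flows into `labels[target[k]]` with weight `value[k]`. Rows where either side of a pair is missing contribute nothing, which mirrors what `value_counts()` does by default in the pandas version.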

huangapple
  • Published on March 10, 2023, 00:16:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75687278.html