Combine Binned barplot with lineplot

Question

I'd like to represent two datasets on the same plot, one as a line and one as a binned barplot. I can do each individually:

    import numpy as np
    import pandas as pd
    import seaborn as sns

    tobar = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))
    tobar["bins"] = pd.qcut(tobar.index, 20)
    bp = sns.barplot(data=tobar, x="bins", y="value")

    toline = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))
    lp = sns.lineplot(data=toline, x=toline.index, y="value")

But when I try to combine them, of course the x axis gets messed up:

    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax2 = ax.twinx()
    bp = sns.barplot(data=tobar, x="bins", y="value", ax=ax)
    lp = sns.lineplot(data=toline, x=toline.index, y="value", ax=ax2)
    bp.set(xlabel=None)

I also can't seem to get rid of the bin labels.

How can I show both pieces of information on one plot?

Answer 1

Score: 2

  • This answer explains why it's better to plot the bars with matplotlib.axes.Axes.bar instead of sns.barplot or pandas.DataFrame.bar.
    • In short, with Axes.bar the xtick locations correspond to the actual numeric values of the labels, whereas seaborn and pandas place the bars at 0-indexed positions that don't correspond to the numeric values.
  • This answer shows how to add bar labels.
  • ax2 = ax.twinx() can be used for the line plot if a separate y-axis is needed.
  • This works the same if the line plot uses different data.
  • Tested in python 3.11, pandas 1.5.2, matplotlib 3.6.2, seaborn 0.12.1
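
To make the point about tick positions concrete, here is a minimal sketch that compares where the bars actually land. It uses pandas' own bar plot as the 0-indexed example, and the small `demo` frame with its `mid` and `value` columns is made-up data, not the data from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# hypothetical bin midpoints and bin means, standing in for real data
demo = pd.DataFrame({"mid": [25, 75, 125, 175], "value": [1.0, -2.0, 0.5, 3.0]})

fig, (ax_pd, ax_mpl) = plt.subplots(ncols=2)

# pandas draws the bars at positions 0, 1, 2, ... and uses 'mid' only as tick text
demo.plot.bar(x="mid", y="value", ax=ax_pd, legend=False)
pandas_centers = [p.get_x() + p.get_width() / 2 for p in ax_pd.patches]

# matplotlib draws each bar at the numeric value passed as x
ax_mpl.bar(demo["mid"], demo["value"], width=40)
mpl_centers = [p.get_x() + p.get_width() / 2 for p in ax_mpl.patches]

print("pandas bar centers:    ", pandas_centers)
print("matplotlib bar centers:", mpl_centers)
```

The pandas bars are centered on 0 through 3 regardless of the `mid` values, which is why a line plotted against an index running from 0 to 999 does not line up with them. The `Axes.bar` bars sit at the midpoints themselves, so they share the line's x scale.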

Imports and DataFrame

    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    import numpy as np

    # test data
    np.random.seed(2022)
    df = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))

    # create the bins
    df["bins"] = pd.qcut(df.index, 20)

    # add a column for the mid point of the interval
    df['mid'] = df.bins.apply(lambda row: row.mid.round().astype(int))

    # pivot the dataframe to calculate the mean of each interval
    pt = df.pivot_table(index='mid', values='value', aggfunc='mean').reset_index()

Plot 1

    # create the figure
    fig, ax = plt.subplots(figsize=(30, 7))

    # add a horizontal line at y=0
    ax.axhline(0, color='black')

    # add the bar plot
    ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)

    # set the labels on the xticks - if desired
    ax.set_xticks(ticks=pt.mid, labels=pt.mid)

    # add the intervals as labels on the bars - if desired
    ax.bar_label(ax.containers[0], labels=df.bins.unique(), weight='bold')

    # add the line plot
    _ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
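
If the bars and the line are on very different scales, the `ax.twinx()` variant mentioned in the notes might look like the sketch below. It uses plain matplotlib only. The arrays `line_y`, `mids` and `bar_heights` are invented for illustration, and the factor of 100 just forces the two scales apart.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2022)
line_y = rng.standard_normal(1000).cumsum()   # hypothetical line data, x = 0..999
mids = np.arange(25, 1000, 50)                # hypothetical midpoints of 20 bins
# hypothetical bar data on a much larger scale than the line
bar_heights = line_y.reshape(20, 50).mean(axis=1) * 100

fig, ax = plt.subplots(figsize=(12, 4))
ax.bar(mids, bar_heights, width=40, alpha=0.5)
ax.set_ylabel("bar values")

# the twin Axes shares the x-axis but gets its own y-axis on the right
ax2 = ax.twinx()
ax2.plot(np.arange(1000), line_y, color="tab:orange")
ax2.set_ylabel("line values")
```

Because the bars are placed at numeric x positions, the shared x-axis stays meaningful for both artists. Only the y scales differ.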

Plot 2

    fig, ax = plt.subplots(figsize=(30, 7))
    ax.axhline(0, color='black')
    ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)
    ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)
    ax.bar_label(ax.containers[0], weight='bold')
    _ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')

Plot 3

  • The bar width is the width of the interval

    fig, ax = plt.subplots(figsize=(30, 7))
    ax.axhline(0, color='black')
    ax.bar(data=pt, x='mid', height='value', width=50, alpha=0.5, ec='k')
    ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)
    ax.bar_label(ax.containers[0], weight='bold')
    _ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
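
The hard-coded `width=50` matches this data because 1000 rows fall into 20 equal bins. As a hedged alternative, the widths can be read from the intervals themselves. This sketch is not part of the tested answer: it swaps `pivot_table` for `groupby` and `sns.lineplot` for `ax.plot`, and the names `bin_means`, `lefts` and `widths` are introduced only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(2022)
df = pd.DataFrame({"value": np.random.randn(1000).cumsum()})
df["bins"] = pd.qcut(df.index, 20)

# mean per bin; observed=True keeps only the bins that contain rows
bin_means = df.groupby("bins", observed=True)["value"].mean()

# each index entry is a pandas Interval, which exposes .left and .length
lefts = [iv.left for iv in bin_means.index]
widths = [iv.length for iv in bin_means.index]

fig, ax = plt.subplots(figsize=(12, 4))
ax.axhline(0, color="black")
# align="edge" anchors each bar at the left edge of its interval
ax.bar(lefts, bin_means.to_numpy(), width=widths, align="edge", alpha=0.5, ec="k")
ax.plot(df.index, df["value"], color="tab:orange")
```

With `align="edge"` each bar spans exactly its interval, so unequal bins (for example from `pd.cut` with custom edges) would also be drawn at their true widths.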

huangapple
  • Posted on February 6, 2023 at 20:52:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75361554.html