Combine binned barplot with lineplot

Question

I'd like to represent two datasets on the same plot, one as a line and one as a binned barplot. I can do each individually:

# assumes numpy (np), pandas (pd), seaborn (sns) and matplotlib.pyplot (plt) are imported
tobar = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))
tobar["bins"] = pd.qcut(tobar.index, 20)

bp = sns.barplot(data=tobar, x="bins", y="value")

toline = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))

lp = sns.lineplot(data=toline, x=toline.index, y="value")

But when I try to combine them, of course the x axis gets messed up:

fig, ax = plt.subplots()
ax2 = ax.twinx()
bp = sns.barplot(data=tobar, x="bins", y="value", ax=ax)
lp = sns.lineplot(data=toline, x=toline.index, y="value", ax=ax2)
bp.set(xlabel=None)

I also can't seem to get rid of the bin labels.

How can I show both pieces of information on one plot?

Answer 1

Score: 2

  • This answer explains why it's better to plot the bars with matplotlib.axes.Axes.bar instead of sns.barplot or pandas.DataFrame.bar.
    • In short, with matplotlib the xtick locations correspond to the actual numeric value of the label, whereas the xticks for seaborn and pandas bar plots are 0-indexed and don't correspond to the numeric value.
  • This answer shows how to add bar labels.
  • ax2 = ax.twinx() can be used for the line plot if needed.
  • It works the same if the line plot uses different data.
  • Tested in Python 3.11, pandas 1.5.2, matplotlib 3.6.2, seaborn 0.12.1.
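
To see the tick-position point from the bullets above in isolation, here is a minimal sketch. It uses matplotlib's own categorical axis (string x values) as a stand-in for how seaborn positions bars, so it is an illustration of the idea rather than a trace of seaborn's internals. The sample values `mids` and `heights` are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical bin midpoints and bar heights, for illustration only
mids = [25, 75, 125]
heights = [1.0, 2.0, 1.5]

fig, (ax_cat, ax_num) = plt.subplots(1, 2)

# Categorical x (strings): bars are placed at positions 0, 1, 2
cat_bars = ax_cat.bar([str(m) for m in mids], heights)

# Numeric x: bars are placed at the actual values 25, 75, 125
num_bars = ax_num.bar(mids, heights, width=50)

# Recover the bar centers from the drawn rectangles
cat_centers = [b.get_x() + b.get_width() / 2 for b in cat_bars]
num_centers = [b.get_x() + b.get_width() / 2 for b in num_bars]
print("categorical centers:", cat_centers)
print("numeric centers:", num_centers)
```

A line drawn against an index running from 0 to 999 only lines up with the bars in the second case, which is why the answer draws the bars with ax.bar on numeric midpoints.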

Imports and DataFrame

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# test data
np.random.seed(2022)
df = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))

# create the bins
df["bins"] = pd.qcut(df.index, 20)

# add a column for the mid point of the interval
df['mid'] = df.bins.apply(lambda row: row.mid.round().astype(int))

# pivot the dataframe to calculate the mean of each interval
pt = df.pivot_table(index='mid', values='value', aggfunc='mean').reset_index()
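
One possible alternative to the apply and pivot_table steps above, offered as a sketch rather than the answer's method: ask pd.qcut for the bin edges with retbins=True, then derive the midpoints and widths from those edges. The names `demo`, `codes`, `edges`, `bin_means` and `summary` are illustrative, and the random data only mimics the shape of the answer's test data.

```python
import numpy as np
import pandas as pd

# Illustrative data shaped like the answer's test data (different random generator)
rng = np.random.default_rng(2022)
demo = pd.DataFrame({"value": rng.standard_normal(1000).cumsum()})

# labels=False returns integer bin codes; retbins=True also returns the bin edges
codes, edges = pd.qcut(demo.index.to_numpy(), 20, labels=False, retbins=True)
codes = np.asarray(codes)

# Mean of "value" within each bin, ordered by bin code 0..19
bin_means = demo["value"].groupby(codes).mean()

# One row per bin: numeric midpoint, bin width, and mean value
summary = pd.DataFrame({
    "mid": (edges[:-1] + edges[1:]) / 2,
    "width": np.diff(edges),
    "value": bin_means.to_numpy(),
})
print(summary.head())
```

Keeping the midpoints as plain floats avoids rounding them to integers, and the `width` column can be passed to ax.bar instead of a hard-coded number.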

Plot 1

# create the figure
fig, ax = plt.subplots(figsize=(30, 7))

# add a horizontal line at y=0 
ax.axhline(0, color='black')

# add the bar plot
ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)

# set the labels on the xticks - if desired
ax.set_xticks(ticks=pt.mid, labels=pt.mid)

# add the intervals as labels on the bars - if desired
ax.bar_label(ax.containers[0], labels=df.bins.unique(), weight='bold')

# add the line plot
_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
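
The bullet list at the top mentions ax.twinx() for the line plot. A minimal sketch of that variant follows, assuming the line lives on a very different y scale from the bars. It uses plain matplotlib calls and made-up data; `line_values`, `bar_values` and `mids` are hypothetical names.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 20 bars on numeric midpoints, and a line on a larger scale
mids = np.arange(25, 1000, 50)
bar_values = rng.standard_normal(20)
x = np.arange(1000)
line_values = rng.standard_normal(1000).cumsum() * 100

fig, ax = plt.subplots(figsize=(12, 4))
ax.bar(mids, bar_values, width=50, alpha=0.5, ec="k")
ax.set_ylabel("bar value")

# twinx() creates a second y-axis that shares the same x-axis
ax2 = ax.twinx()
ax2.plot(x, line_values, color="tab:orange")
ax2.set_ylabel("line value")
```

Because both axes share one numeric x-axis, the bars and the line stay aligned while each keeps its own y scale.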

Plot 2

fig, ax = plt.subplots(figsize=(30, 7))

ax.axhline(0, color='black')

ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)

ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)

ax.bar_label(ax.containers[0], weight='bold')

_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')

Plot 3

  • The bar width is set to the width of the interval (about 50 index positions per bin).
fig, ax = plt.subplots(figsize=(30, 7))

ax.axhline(0, color='black')

ax.bar(data=pt, x='mid', height='value', width=50, alpha=0.5, ec='k')

ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)

ax.bar_label(ax.containers[0], weight='bold')

_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
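
If the bins are not all the same width, a hard-coded width no longer matches the intervals. A hedged sketch of one way to handle that: anchor each bar at its left edge with align="edge" and pass the per-bin widths. The `edges` and `heights` arrays here are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical, deliberately uneven bin edges and one height per bin
edges = np.array([0.0, 100.0, 150.0, 400.0, 1000.0])
heights = np.array([1.0, -0.5, 2.0, 0.8])

fig, ax = plt.subplots(figsize=(12, 4))
ax.axhline(0, color="black")

# align="edge" puts each bar's left side at x, so width spans the whole bin
bars = ax.bar(edges[:-1], heights, width=np.diff(edges),
              align="edge", alpha=0.5, ec="k")
```

Each bar then covers exactly its own interval on the x-axis, with no gaps or overlaps between neighbours.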

Posted by huangapple on 2023-02-06 20:52:20
Original link: https://go.coder-hub.com/75361554.html