Plotting two DataFrame.value_counts() in a single histogram

Question

I want to plot two different DataFrames (one column from each) in a single histogram.

import pandas as pd
import matplotlib.pyplot as plt

d1 = {'Size': ['Big', 'Big', 'Normal', 'Big']}
df1 = pd.DataFrame(data=d1)

d2 = {'Size': ['Small', 'Normal', 'Normal', 'Normal', 'Small', 'Big', 'Big', 'Normal', 'Big']}
df2 = pd.DataFrame(data=d2)

# Plot both count series on the same axes
df1['Size'].value_counts().plot.bar(label="df1")
df2['Size'].value_counts().plot.bar(label="df2", alpha=0.2, color='purple')

plt.legend(loc='upper right')
plt.show()

The problem is that the x-axis labels are only correct for df2. For df1 there should be 3 'Big' values and 1 'Normal' value:

[Image: histogram of df1 and df2]

I have tried multiple ways of generating the plot and this is the closest I got to what I want, which is both dataframes in the same histogram, with different colors.

Ideally the bars would be side by side, but I couldn't find out how to do that, and 'stacked=False' doesn't work here.

Any help is welcome. Thanks!

Answer 1

Score: 0

You can reindex on explicit X-values:

x = ['Small', 'Normal', 'Big']
df1['Size'].value_counts().reindex(x).plot.bar(label = "df1")
df2['Size'].value_counts().reindex(x).plot.bar(label = "df2", alpha = 0.2,color='purple')
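
Why the reindex helps, as far as I can tell: value_counts() sorts by frequency, so the two series come back with their categories in different orders, and each plot.bar() call draws its bars at positions 0, 1, 2, ... in that order. The sketch below only inspects the data and draws nothing. The fill_value=0 argument is my addition and is not part of the answer above; it turns a category that is missing from one DataFrame into a zero count instead of NaN.

```python
import pandas as pd

df1 = pd.DataFrame({'Size': ['Big', 'Big', 'Normal', 'Big']})
df2 = pd.DataFrame({'Size': ['Small', 'Normal', 'Normal', 'Normal', 'Small',
                             'Big', 'Big', 'Normal', 'Big']})

# value_counts() orders categories by frequency, so the two indexes differ
order1 = list(df1['Size'].value_counts().index)
order2 = list(df2['Size'].value_counts().index)

# Reindexing both series to one explicit category list aligns them;
# fill_value=0 (an assumption added here) replaces missing categories with 0
x = ['Small', 'Normal', 'Big']
c1 = df1['Size'].value_counts().reindex(x, fill_value=0)
c2 = df2['Size'].value_counts().reindex(x, fill_value=0)

print(order1, order2)
print(c1.tolist(), c2.tolist())
```

With both series on the same index, the overlaid bars refer to the same category at each x position.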

Another option:

(pd.concat({'df1': df1, 'df2': df2})['Size']
   .groupby(level=0).value_counts()
   .unstack(0)
   .plot.bar()
)
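
For the side-by-side layout the question asks about, the key is to get one table with a column per DataFrame, since plot.bar() on a DataFrame draws one bar per column within each category. Here is a minimal sketch of a different way to build such a table, from a dict of count series rather than the concat/groupby chain above. The `order` list and the `plot_counts` helper are hypothetical names introduced for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'Size': ['Big', 'Big', 'Normal', 'Big']})
df2 = pd.DataFrame({'Size': ['Small', 'Normal', 'Normal', 'Normal', 'Small',
                             'Big', 'Big', 'Normal', 'Big']})

order = ['Small', 'Normal', 'Big']  # assumed display order for the x-axis

# One column per DataFrame; pandas aligns the two count series on their labels
counts = (
    pd.DataFrame({
        'df1': df1['Size'].value_counts(),
        'df2': df2['Size'].value_counts(),
    })
    .reindex(order)   # fix the category order
    .fillna(0)        # a category absent from one DataFrame counts as 0
    .astype(int)
)
print(counts)


def plot_counts(table):
    """Draw one group of bars per category (hypothetical helper, needs matplotlib)."""
    import matplotlib.pyplot as plt
    ax = table.plot.bar(rot=0)
    ax.set_ylabel('count')
    plt.show()
```

Calling plot_counts(counts) should then show the df1 and df2 bars next to each other for each size.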

Answer 2

Score: 0

You can also try plotly, which produces interactive graphs: you can hover over the bars to see the exact data values and other information.

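import plotly.graph_objects as go

classes = ['Small', 'Normal', 'Big']
# classes = df2['Size'].unique()  # alternative: derive the categories from the data

# Reindex each count series to `classes` so the y-values line up with the x-labels
y1 = df1['Size'].value_counts().reindex(classes, fill_value=0)
y2 = df2['Size'].value_counts().reindex(classes, fill_value=0)

fig = go.Figure(data=[
    go.Bar(name='df1', x=classes, y=y1),
    go.Bar(name='df2', x=classes, y=y2)
])
# Draw the two traces side by side within each category
fig.update_layout(barmode='group')
fig.show()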

Posted by huangapple on 2023-02-07 03:05:35
Link: https://go.coder-hub.com/75365547.html