Plotting two DataFrame.value_counts() in a single histogram
Question
I want to plot two different DataFrames (only one column from each) in a single histogram.
import pandas as pd
import matplotlib.pyplot as plt

d1 = {'Size': ['Big', 'Big', 'Normal', 'Big']}
df1 = pd.DataFrame(data=d1)
d2 = {'Size': ['Small', 'Normal', 'Normal', 'Normal', 'Small', 'Big', 'Big', 'Normal', 'Big']}
df2 = pd.DataFrame(data=d2)

# Plotting both in one histogram
df1['Size'].value_counts().plot.bar(label="df1")
df2['Size'].value_counts().plot.bar(label="df2", alpha=0.2, color='purple')
plt.legend(loc='upper right')
plt.show()
The issue is that the x-axis of the histogram is only correct for df2. For df1 there should be 3 values of 'Big' and 1 value of 'Normal'.
I have tried several ways of generating the plot, and this is the closest I got to what I want: both DataFrames in the same histogram, in different colors.
Ideally the bars would be side by side, but I couldn't find out how, and 'stacked=False' doesn't work here.
Any help is welcome. Thanks!
Answer 1
Score: 0
You can reindex both counts on explicit x-values:
x = ['Small', 'Normal', 'Big']
df1['Size'].value_counts().reindex(x).plot.bar(label="df1")
df2['Size'].value_counts().reindex(x).plot.bar(label="df2", alpha=0.2, color='purple')
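To see why the reindex is needed: value_counts() sorts each result by frequency, so the two Series list their categories in different orders, and the bars are drawn by position rather than by label. Below is a minimal sketch using the question's data. The fill_value=0 argument and the names aligned1 and aligned2 are additions for illustration, not part of the answer above; fill_value=0 makes a category that is missing from one frame show up as 0 instead of NaN.

```python
import pandas as pd

df1 = pd.DataFrame({'Size': ['Big', 'Big', 'Normal', 'Big']})
df2 = pd.DataFrame({'Size': ['Small', 'Normal', 'Normal', 'Normal', 'Small',
                             'Big', 'Big', 'Normal', 'Big']})

# value_counts() orders by descending count, so each frame gets its own order
print(df1['Size'].value_counts().index.tolist())  # ['Big', 'Normal']
print(df2['Size'].value_counts().index.tolist())  # ['Normal', 'Big', 'Small']

# Reindexing on one shared list puts both results in the same order
x = ['Small', 'Normal', 'Big']
aligned1 = df1['Size'].value_counts().reindex(x, fill_value=0)
aligned2 = df2['Size'].value_counts().reindex(x, fill_value=0)
print(aligned1.tolist())  # [0, 1, 3]
print(aligned2.tolist())  # [2, 4, 3]
```

Without the shared order, the first bar of df1 ('Big', 3) is drawn at the position that df2 labels 'Normal', which is the mismatch described in the question.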
Another option, which draws the two sets of bars side by side:
(pd.concat({'df1': df1, 'df2': df2})['Size']
   .groupby(level=0).value_counts()
   .unstack(0)
   .plot.bar()
)
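If the unstacked table from this second approach is kept in a variable, the missing combination (there is no 'Small' in df1) can be filled with 0 and the categories put in a chosen order before plotting. This is a sketch under assumptions: the names counts and order, and the fill_value and reindex steps, are illustrative additions, and the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import pandas as pd

df1 = pd.DataFrame({'Size': ['Big', 'Big', 'Normal', 'Big']})
df2 = pd.DataFrame({'Size': ['Small', 'Normal', 'Normal', 'Normal', 'Small',
                             'Big', 'Big', 'Normal', 'Big']})

order = ['Small', 'Normal', 'Big']  # assumed display order of the categories

# Rows: size category, columns: source frame, values: number of occurrences
counts = (pd.concat({'df1': df1, 'df2': df2})['Size']
            .groupby(level=0).value_counts()
            .unstack(0, fill_value=0)   # 0 where a frame lacks a category
            .reindex(order))

# A DataFrame bar plot draws one bar per column next to each other by default
ax = counts.plot.bar(rot=0)
ax.set_ylabel('count')
# In an interactive session, call matplotlib.pyplot.show() to display the figure
```

Because each column of counts becomes its own bar series, no 'stacked' argument is needed to get the side-by-side layout.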
Answer 2
Score: 0
You can also try plotly, which produces interactive graphs: you can hover over the bars to see the exact data values and other information.
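import plotly.graph_objects as go

# Category labels must match the data ('Big', not 'Large')
classes = ['Small', 'Normal', 'Big']
# classes = df2['Size'].unique()  # alternative: derive the categories from the data

# Reindex so each count lines up with its label in classes (0 if absent)
fig = go.Figure(data=[
    go.Bar(name='df1', x=classes, y=df1['Size'].value_counts().reindex(classes, fill_value=0)),
    go.Bar(name='df2', x=classes, y=df2['Size'].value_counts().reindex(classes, fill_value=0)),
])
# Change the bar mode
fig.update_layout(barmode='group')
fig.show()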