How to create a stacked bar chart from a dataframe of a given shape?
Question
I have a dataframe and would like to create a stacked bar chart by having date on the x-axis and quantity on the y-axis. This is the current dataframe:
date | product_group | quantity
2021-10-01 | A | 10
2021-10-01 | C | 10
2021-10-01 | Z | 80
2021-11-11 | A | 13
2021-12-12 | B | 5
...
I am trying to get to this output using either matplotlib or seaborn where I have:
- quantity on the x-axis (% stack)
- date on the y-axis
- quantity stacked for each unique date and product group. I.e., for date 10-01, we have one stack with A, C, Z and their respective quantities (relative to each other, i.e. A=0.1, C=0.1, Z=0.8)
What is the best approach here? Any advice is appreciated. Thanks.
Answer 1
Score: 0
It's a one-liner with histplot:
sns.histplot(df, y='date', weights='quantity', hue='product_group', multiple='stack')
Edit: if you want all bars to have the same length, set multiple to 'fill':
sns.histplot(df, y='date', weights='quantity', hue='product_group', multiple='fill')
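As a sanity check on what the 'fill' variant should display, the per-date shares can be computed directly with pandas. This is a minimal sketch that assumes the dataframe from the question; the `share` column and the `totals` variable are names introduced here for illustration only.

```python
import pandas as pd

# Data as given in the question
df = pd.DataFrame({
    "date": ["2021-10-01"] * 3 + ["2021-11-11", "2021-12-12"],
    "product_group": ["A", "C", "Z", "A", "B"],
    "quantity": [10, 10, 80, 13, 5],
})

# Total quantity per date, broadcast back onto every row of that date
totals = df.groupby("date")["quantity"].transform("sum")

# Share of each product group within its date ('share' is a new, illustrative column)
df["share"] = df["quantity"] / totals

print(df)
```

For 2021-10-01 this gives A=0.1, C=0.1, Z=0.8, which matches the relative quantities asked for in the question; dates with a single product group get a share of 1.0.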
Answer 2
Score: 0
If you don't want to use seaborn but rather matplotlib, you could do:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'date': ['2021-10-01'] * 3 + ['2021-11-11', '2021-12-12'],
    'product_group': ['A', 'C', 'Z', 'A', 'B'],
    'quantity': [10, 10, 80, 13, 5],
})

# pivot product groups into columns (one row per date) for the stacked plot
wide = df.pivot(index='date', columns='product_group', values='quantity')

# divide each row by its total to get shares, then draw horizontal stacked bars
wide.div(wide.sum(axis=1), axis=0).plot.barh(stacked=True)
plt.show()
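One caveat with the pivot approach: `DataFrame.pivot` raises a `ValueError` when the same (date, product_group) pair appears more than once. If the real data can contain such repeats, a possible variant is to use `pivot_table` with `aggfunc="sum"`. The sketch below assumes the question's data plus one repeated pair that is made up here purely to illustrate the aggregation; `wide` and `shares` are illustrative names.

```python
import pandas as pd

# Question's data, with group A on 2021-10-01 split into two rows
# (a hypothetical repeat, added only to show the aggregation step)
df = pd.DataFrame({
    "date": ["2021-10-01"] * 4 + ["2021-11-11", "2021-12-12"],
    "product_group": ["A", "A", "C", "Z", "A", "B"],
    "quantity": [4, 6, 10, 80, 13, 5],
})

# Sum repeated (date, product_group) pairs; fill missing combinations with 0
wide = df.pivot_table(index="date", columns="product_group",
                      values="quantity", aggfunc="sum", fill_value=0)

# Divide each row by its total so every bar adds up to 1
shares = wide.div(wide.sum(axis=1), axis=0)

# Horizontal stacked bars: dates on the y-axis, shares on the x-axis
ax = shares.plot.barh(stacked=True)
ax.set_xlabel("share of quantity")
```

Call `plt.show()` (after `import matplotlib.pyplot as plt`) or `ax.figure.savefig(...)` to view the result. Using `fill_value=0` also keeps NaN out of the table for dates where a product group is absent.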