To count the occurrences of each year for each 'option' and 'Type',
Question
I have the following data frame:
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5]})
I am trying to count the occurrences of each year for each 'option' according to the values in "Type".
I used the following code:
table = d_f.pivot_table(index=['Type'], columns='option', aggfunc='count').fillna(0)
table
and this as well:
table = d_f.groupby(['option', 'Type'])[2022, 2023, 2024, 2025, 2026].count()
table = table.unstack(level=0).fillna(0)
Unfortunately, neither of them returned the correct result. Any suggestions would be greatly appreciated.
The answer should be something like:
Answer 1
Score: 4
IIUC, you want something like:
(d_f.drop(columns='option')
    .melt(['Type', 'count'], var_name='year', value_name='option')
    .groupby(['Type', 'year', 'option'])['option'].count()
    .unstack('year', fill_value=0).unstack('option', fill_value=0)
)
Or:
df2 = (d_f.drop(columns='option')
          .melt(['Type', 'count'], var_name='year', value_name='option')
       )
out = pd.crosstab(df2['Type'], [df2['year'], df2['option']])
Output:
year   2022           2023           2024           2025           2026
option  0.0  0.5  1.0  0.0  0.5  1.0  0.0  0.5  1.0  0.0  0.5  1.0  0.0  0.5  1.0
Type
bar       0    0    3    0    0    3    2    0    1    0    1    2    1    1    1
foo       2    1    0    1    2    0    0    2    1    1    1    1    1    1    1
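For readers who want to run this end to end, here is a minimal self-contained sketch of the melt + crosstab approach. The names `years`, `long_df`, `counts` and the column name `value` are illustrative choices, not part of the original answer. It selects the year columns explicitly with `value_vars` instead of dropping `option` first.

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

years = [2022, 2023, 2024, 2025, 2026]  # assumed list of year columns

# Long format: one row per (Type, year, value) observation.
long_df = d_f.melt(id_vars=['Type'], value_vars=years,
                   var_name='year', value_name='value')

# Count how often each value occurs for every Type and year.
counts = pd.crosstab(long_df['Type'], [long_df['year'], long_df['value']])

# Individual cells can be looked up by (row label, (year, value)).
bar_2022_ones = counts.loc['bar', (2022, 1.0)]
foo_2023_halves = counts.loc['foo', (2023, 0.5)]
print(counts)
```

Reshaping to long format first lets a single crosstab call do all the counting, so no per-year loop is needed.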
Answer 2
Score: 1
The provided dataframe d_f:
  Type count  2022  2023  2024  2025  2026  option
0  foo   one   0.0   0.0   0.5   1.0   0.0     0.0
1  foo   one   0.0   0.5   0.5   0.0   0.5     1.0
2  foo   two   0.5   0.5   1.0   0.5   1.0     0.0
3  bar   two   1.0   1.0   1.0   0.5   1.0     0.5
4  bar   one   1.0   1.0   0.0   1.0   0.0     1.0
5  bar   one   1.0   1.0   0.0   1.0   0.5     0.5
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5]})

# Number of distinct values per (option, Type) group, without the 'count' column
table = d_f.groupby(['option', 'Type']).nunique().drop('count', axis=1)
# Move 'option' into the columns and blank out missing combinations
table = table.unstack(level=0).fillna('')
print(d_f)
table
2022 2023 2024 ... 2025 2026
option 0.0 0.5 1.0 0.0 0.5 1.0 0.0 ... 1.0 0.0 0.5 1.0 0.0 0.5 1.0
Type ...
bar 1.0 1.0 1.0 1.0 ... 1.0 2.0 1.0 2.0 1.0
foo 2.0 1.0 2.0 1.0 2.0 ... 1.0 2.0 1.0 2.0 1.0
[2 rows x 15 columns]
For readability, I used fillna('') so that missing combinations are shown as blanks.
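One point worth checking before relying on this approach: nunique reports how many distinct values each group contains, which is not the same as how many entries it has. The sketch below contrasts the two on the same data. The names `years`, `grouped`, `distinct` and `non_null` are illustrative and not from the answer.

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

years = [2022, 2023, 2024, 2025, 2026]  # assumed list of year columns

# Select the year columns with a list (not a bare tuple) after grouping.
grouped = d_f.groupby(['option', 'Type'])[years]

distinct = grouped.nunique()  # number of different values in each group
non_null = grouped.count()    # number of non-missing entries in each group

# The group (option=0.5, Type='bar') holds two rows, both equal to 1 in 2022:
print(distinct.loc[(0.5, 'bar'), 2022])  # one distinct value
print(non_null.loc[(0.5, 'bar'), 2022])  # two entries
```

If the goal is the number of occurrences of each value, count or value_counts is likely the closer fit; nunique answers a different question.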
Answer 3
Score: 1
Use concat and a simple groupby + value_counts:
>>> pd.concat([d_f.groupby('Type')[year].value_counts()
...            for year in [2022, 2023, 2024, 2025, 2026]], axis=1).fillna(0)
           2022  2023  2024  2025  2026
Type
bar  0.0    0.0   0.0   2.0   0.0     1
     0.5    0.0   0.0   0.0   1.0     1
     1.0    3.0   3.0   1.0   2.0     1
foo  0.0    2.0   1.0   0.0   1.0     1
     0.5    1.0   2.0   2.0   1.0     1
     1.0    0.0   0.0   1.0   1.0     1
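A small variation on the same idea, offered as a sketch rather than a correction: in some pandas versions the Series returned by value_counts is not named after the year column, so the concatenated columns may not be labelled by year. Passing a dict to concat makes the year labels explicit. The names `years`, `per_year` and `table` are illustrative.

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

years = [2022, 2023, 2024, 2025, 2026]  # assumed list of year columns

# One value_counts Series per year, keyed by the year itself.
per_year = {year: d_f.groupby('Type')[year].value_counts() for year in years}

# The dict keys become the column labels; combinations that never occur
# are filled with 0 and the counts are cast back to integers.
table = pd.concat(per_year, axis=1).fillna(0).astype(int).sort_index()
print(table)
```

Casting back to int after fillna(0) avoids the mixed float and integer columns visible in the output above.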