Count the occurrences of each year for each 'option' and 'Type'

Question

I have the following data frame:

    import pandas as pd

    d_f = pd.DataFrame({
        'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
        'count': ['one', 'one', 'two', 'two', 'one', 'one'],
        2022: [0, 0, 0.5, 1, 1, 1],
        2023: [0, 0.5, 0.5, 1, 1, 1],
        2024: [0.5, 0.5, 1, 1, 0, 0],
        2025: [1, 0, 0.5, 0.5, 1, 1],
        2026: [0, 0.5, 1, 1, 0, 0.5],
        'option': [0, 1, 0, 0.5, 1, 0.5]
    })

I am trying to count the occurrences of each 'option' in each year, grouped by the values in 'Type'.

I used the following code:

    table = d_f.pivot_table(index=['Type'], columns='option',
                            aggfunc='count').fillna(0)
    table

and this as well:

    # several columns must be selected with a list (double brackets)
    table = d_f.groupby(['option', 'Type'])[[2022, 2023, 2024, 2025, 2026]].count()
    table = table.unstack(level=0).fillna(0)

Unfortunately, neither of them returned the correct answer. Any suggestions would be much appreciated.

The answer should be something like:

[Image: table of the expected result]

Answer 1

Score: 4

IIUC, you want to count how often each value (0, 0.5, 1) appears in each year column, per 'Type'. Something like:

    # drop the original 'option' column, then reshape the year columns to
    # long format, naming the cell values 'option'
    (d_f.drop(columns='option')
        .melt(['Type', 'count'], var_name='year', value_name='option')
        .groupby(['Type', 'year', 'option'])['option'].count()
        .unstack('year', fill_value=0).unstack('option', fill_value=0)
    )

Or:

    df2 = (d_f.drop(columns='option')
              .melt(['Type', 'count'], var_name='year', value_name='option')
          )
    out = pd.crosstab(df2['Type'], [df2['year'], df2['option']])

Output:

    year   2022        2023        2024        2025        2026
    option 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0
    Type
    bar      0   0   3   0   0   3   2   0   1   0   1   2   1   1   1
    foo      2   1   0   1   2   0   0   2   1   1   1   1   1   1   1
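
To make the reshape step easier to follow, here is a minimal, self-contained sketch of the crosstab variant. The names `long_df` and `wide` are mine, not the answer's. `melt` puts every year cell on its own row, so the counting becomes an ordinary cross-tabulation, and individual cells can be checked against the table above.

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

# Long format: one row per (Type, count, year) cell; the cell value is named 'option'.
long_df = (d_f.drop(columns='option')
              .melt(['Type', 'count'], var_name='year', value_name='option'))

# Rows are Type, columns are (year, option) pairs, cells are occurrence counts.
wide = pd.crosstab(long_df['Type'], [long_df['year'], long_df['option']])

print(long_df.head())
print(wide.loc['bar', (2022, 1.0)])  # how many 1.0 values 'bar' has in 2022
```

The long frame has 6 rows x 5 year columns = 30 rows, which is why a plain count per (Type, year, option) group gives the wanted numbers.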

Answer 2

Score: 1

Provided dataframe:

    >>> d_f
      Type count  2022  2023  2024  2025  2026  option
    0  foo   one   0.0   0.0   0.5   1.0   0.0     0.0
    1  foo   one   0.0   0.5   0.5   0.0   0.5     1.0
    2  foo   two   0.5   0.5   1.0   0.5   1.0     0.0
    3  bar   two   1.0   1.0   1.0   0.5   1.0     0.5
    4  bar   one   1.0   1.0   0.0   1.0   0.0     1.0
    5  bar   one   1.0   1.0   0.0   1.0   0.5     0.5

Code:

    import pandas as pd

    d_f = pd.DataFrame({
        'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
        'count': ['one', 'one', 'two', 'two', 'one', 'one'],
        2022: [0, 0, 0.5, 1, 1, 1],
        2023: [0, 0.5, 0.5, 1, 1, 1],
        2024: [0.5, 0.5, 1, 1, 0, 0],
        2025: [1, 0, 0.5, 0.5, 1, 1],
        2026: [0, 0.5, 1, 1, 0, 0.5],
        'option': [0, 1, 0, 0.5, 1, 0.5]})
    # number of distinct values per (option, Type) group, without the 'count' column
    table = d_f.groupby(['option', 'Type']).nunique().drop('count', axis=1)
    # move 'option' into the columns; blank out missing combinations
    table = table.unstack(level=0).fillna('')
    print(d_f)

Result:

    >>> table
           2022           2023           2024  ...      2025           2026
    option  0.0  0.5  1.0  0.0  0.5  1.0  0.0  ...  1.0  0.0  0.5  1.0  0.0  0.5  1.0
    Type                                       ...
    bar          1.0  1.0       1.0  1.0       ...  1.0       2.0  1.0       2.0  1.0
    foo     2.0       1.0  2.0       1.0  2.0  ...  1.0  2.0       1.0  2.0       1.0

    [2 rows x 15 columns]

I used fillna('') to make the table easier to read.
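
One thing that may help when reading this table: `nunique()` reports how many distinct values a year column takes within each (option, Type) group, which is not necessarily how many rows the group has. A small sketch on the same data, with `distinct` and `rows` as my own illustrative names, compares the two for a single group:

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

years = [2022, 2023, 2024, 2025, 2026]
grouped = d_f.groupby(['option', 'Type'])[years]

distinct = grouped.nunique()  # distinct values per group and year
rows = grouped.count()        # non-null rows per group and year

# The group (option=0.5, Type='bar') consists of rows 3 and 5,
# whose 2022 values are [1, 1].
print(distinct.loc[(0.5, 'bar'), 2022])  # 1 distinct value
print(rows.loc[(0.5, 'bar'), 2022])      # 2 rows
```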

Answer 3

Score: 1

Use concat and a simple groupby + value_counts:

    >>> pd.concat([d_f.groupby('Type')[year].value_counts()
    ...            for year in [2022, 2023, 2024, 2025, 2026]], axis=1).fillna(0)

              2022  2023  2024  2025  2026
    Type
    bar  0.0   0.0   0.0   2.0   0.0     1
         0.5   0.0   0.0   0.0   1.0     1
         1.0   3.0   3.0   1.0   2.0     1
    foo  0.0   2.0   1.0   0.0   1.0     1
         0.5   1.0   2.0   2.0   1.0     1
         1.0   0.0   0.0   1.0   1.0     1
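
If one column per (year, value) pair is preferred, this long result can be reshaped with `unstack`. The sketch below is only one way to do it and makes a few choices of its own: `keys=years` labels each column by its year explicitly (so the result does not depend on how `value_counts` names its output in a given pandas version), and `'option'` and `'year'` are labels I picked for the otherwise unnamed levels.

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

years = [2022, 2023, 2024, 2025, 2026]

# One value_counts Series per year, all indexed by (Type, option),
# placed side by side with the year as the column label.
counts = pd.concat(
    [d_f.groupby('Type')[year].value_counts().rename_axis(['Type', 'option'])
     for year in years],
    axis=1, keys=years,
).fillna(0).astype(int)
counts.columns.name = 'year'

# Move the value level into the columns: rows are Type, columns are (year, option).
wide = counts.unstack('option', fill_value=0)
print(wide)
```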

huangapple
  • Posted on February 18, 2023, 01:04:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75487246.html