2023-02-18 01:04:35

Count the occurrences of each year for each 'option' and 'Type'

Question

I have the following data frame:

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5]
})

For each value in 'Type', I am trying to count how many times each 'option' value (0, 0.5 or 1) occurs in each year column.

I used the following code:

table = d_f.pivot_table(index=['Type'], columns='option', aggfunc='count').fillna(0)
table

and this as well:

table = d_f.groupby(['option', 'Type'])[2022, 2023, 2024, 2025, 2026].count()
table = table.unstack(level=0).fillna(0)

Unfortunately, neither of them returned the correct answer. Any suggestions would be greatly appreciated.

The answer should be something like:

Answer 1

Score: 4

IIUC, you want something like:

(d_f.drop(columns='option')
    .melt(['Type', 'count'], var_name='year', value_name='option')
    .groupby(['Type', 'year', 'option'])['option'].count()
    .unstack('year', fill_value=0).unstack('option', fill_value=0)
)

Or:

df2 = (d_f.drop(columns='option')
          .melt(['Type', 'count'], var_name='year', value_name='option')
      )
out = pd.crosstab(df2['Type'], [df2['year'], df2['option']])

Output:

year   2022         2023         2024         2025         2026        
option  0.0 0.5 1.0  0.0 0.5 1.0  0.0 0.5 1.0  0.0 0.5 1.0  0.0 0.5 1.0
Type                                                                   
bar       0   0   3    0   0   3    2   0   1    0   1   2    1   1   1
foo       2   1   0    1   2   0    0   2   1    1   1   1    1   1   1
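
To make the reshaping step easier to follow, here is a minimal sketch that splits the melt-then-crosstab approach into the intermediate long table and the final cross-tabulation. The names long_df and counts, and the column name 'value', are illustrative choices and not part of the original answer.

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

# Long form: one row per (original row, year column).
# The cell value is stored in a column called 'value' (illustrative name).
long_df = d_f.drop(columns='option').melt(
    id_vars=['Type', 'count'], var_name='year', value_name='value')

# Count how often each value occurs per Type and year.
counts = pd.crosstab(long_df['Type'], [long_df['year'], long_df['value']])

print(long_df.shape)                    # 6 rows x 5 year columns -> 30 long rows
print(counts.loc['bar', (2022, 1.0)])   # how many 1.0 values 'bar' has in 2022
```

Every cell of the original year columns becomes exactly one row of long_df, so the entries of counts add up to the number of rows in long_df.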

Answer 2

Score: 1

Provided dataframe

d_f
  Type count  2022  2023  2024  2025  2026  option
0  foo   one   0.0   0.0   0.5   1.0   0.0     0.0
1  foo   one   0.0   0.5   0.5   0.0   0.5     1.0
2  foo   two   0.5   0.5   1.0   0.5   1.0     0.0
3  bar   two   1.0   1.0   1.0   0.5   1.0     0.5
4  bar   one   1.0   1.0   0.0   1.0   0.0     1.0
5  bar   one   1.0   1.0   0.0   1.0   0.5     0.5

import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5]})
table = d_f.groupby(['option', 'Type']).nunique().drop('count', axis=1)
table = table.unstack(level=0).fillna('')
print(d_f)

table
       2022           2023           2024  ...      2025           2026          
option  0.0  0.5  1.0  0.0  0.5  1.0  0.0  ...  1.0  0.0  0.5  1.0  0.0  0.5  1.0
Type                                       ...                                   
bar          1.0  1.0       1.0  1.0       ...  1.0       2.0  1.0       2.0  1.0
foo     2.0       1.0  2.0       1.0  2.0  ...  1.0  2.0       1.0  2.0       1.0
[2 rows x 15 columns]

For readability, I used fillna('') so that missing combinations show up as blanks.
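
Note that nunique() reports the number of distinct values in each group, which is not the same as the number of times each value occurs. A minimal sketch of the difference, using only the three 'bar' rows of the 2022 column from the question (the names bar_2022 and grouped are illustrative):

```python
import pandas as pd

# The three 'bar' rows of the 2022 column from the question's data.
bar_2022 = pd.DataFrame({'Type': ['bar', 'bar', 'bar'],
                         2022: [1.0, 1.0, 1.0]})

grouped = bar_2022.groupby('Type')[2022]

distinct = grouped.nunique()          # number of distinct values per Type
occurrences = grouped.value_counts()  # occurrences of each value per Type

print(distinct.loc['bar'])
print(occurrences.loc[('bar', 1.0)])
```

Here the group holds a single distinct value that occurs three times, so the two results differ.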

Answer 3

Score: 1

Use concat and a simple groupby + value_counts:

>>> pd.concat([d_f.groupby('Type')[year].value_counts()
...            for year in [2022, 2023, 2024, 2025, 2026]], axis=1).fillna(0)

          2022  2023  2024  2025  2026
Type                                  
bar  0.0   0.0   0.0   2.0   0.0     1
     0.5   0.0   0.0   0.0   1.0     1
     1.0   3.0   3.0   1.0   2.0     1
foo  0.0   2.0   1.0   0.0   1.0     1
     0.5   1.0   2.0   2.0   1.0     1
     1.0   0.0   0.0   1.0   1.0     1
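
A possible variant of the same idea, sketched under the assumption that explicit column labels and integer counts are wanted: passing keys= to pd.concat names each column after its year regardless of how the individual value_counts results are named, and astype(int) restores integer counts after fillna. The names years, per_year and out, and the index level name 'value', are illustrative.

```python
import pandas as pd

d_f = pd.DataFrame({
    'Type': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'count': ['one', 'one', 'two', 'two', 'one', 'one'],
    2022: [0, 0, 0.5, 1, 1, 1],
    2023: [0, 0.5, 0.5, 1, 1, 1],
    2024: [0.5, 0.5, 1, 1, 0, 0],
    2025: [1, 0, 0.5, 0.5, 1, 1],
    2026: [0, 0.5, 1, 1, 0, 0.5],
    'option': [0, 1, 0, 0.5, 1, 0.5],
})

years = [2022, 2023, 2024, 2025, 2026]

# One Series of counts per year, indexed by (Type, value).
# rename_axis gives every Series the same index level names so they align.
per_year = [
    d_f.groupby('Type')[year].value_counts().rename_axis(['Type', 'value'])
    for year in years
]

# keys= labels each column with its year; missing combinations become 0.
out = (pd.concat(per_year, axis=1, keys=years)
         .fillna(0)
         .astype(int)
         .sort_index())
print(out)
```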
