Efficient way to divide a pandas column by a groupby dataframe

Question

This is easier to explain with an example. Say I have a dataframe with the columns year, cc_rating and number_x:

df = pd.DataFrame({"year":{"0":2005,"1":2005,"2":2005,"3":2006,"4":2006,"5":2006,"6":2007,"7":2007,"8":2007},"cc_rating":{"0":"2","1":"2a","2":"2b","3":"2","4":"2a","5":"2b","6":"2","7":"2a","8":"2b"},"number_x":{"0":9368,"1":21643,"2":107577,"3":10069,"4":21486,"5":110326,"6":10834,"7":21566,"8":111082}})

df

   year cc_rating  number_x
0  2005         2      9368
1  2005        2a     21643
2  2005        2b    107577
3  2006         2     10069
4  2006        2a     21486
5  2006        2b    110326
6  2007         2     10834
7  2007        2a     21566
8  2007        2b    111082

Problem

How can I get the % of number_x per year, i.e. each row's number_x as a percentage of the total number_x for its year?

Straight division won't work because year can't be set as the index of the original df: it is not unique.

Right now I'm doing the following, but it's quite inefficient and I'm sure there's a better way.

# Per-year totals, renamed to number_y so the merged column does not clash with number_x
totals = df.groupby('year')['number_x'].sum().rename('number_y')
df = pd.merge(df, totals, left_on='year', right_index=True)
df['%'] = round((df['number_x'] / df['number_y']) * 100, 2)
df = df.drop('number_y', axis=1)

Thanks!

Answer 1

Score: 0

A possible solution:

(df.assign(
    perc = (100*df.number_x.div(df.groupby('year').number_x.transform('sum')))
    .round(2)))

Output:

   year cc_rating  number_x   perc
0  2005         2      9368   6.76
1  2005        2a     21643  15.62
2  2005        2b    107577  77.62
3  2006         2     10069   7.10
4  2006        2a     21486  15.14
5  2006        2b    110326  77.76
6  2007         2     10834   7.55
7  2007        2a     21566  15.03
8  2007        2b    111082  77.42
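
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the question's data with a default integer index (a simplification, not the asker's exact constructor), and the names year_total, totals_by_year and perc_via_map are only illustrative. The key point is that transform('sum') returns one value per original row, so no merge is needed. The second variant, which looks up per-year totals with map, is an alternative I am adding for comparison, not part of the answer above.

```python
import pandas as pd

# Same data as in the question, built with a default integer index.
df = pd.DataFrame({
    "year": [2005, 2005, 2005, 2006, 2006, 2006, 2007, 2007, 2007],
    "cc_rating": ["2", "2a", "2b"] * 3,
    "number_x": [9368, 21643, 107577, 10069, 21486, 110326, 10834, 21566, 111082],
})

# transform('sum') broadcasts each year's total back onto every row of that
# year, so the result has the same index as df and divides element-wise.
year_total = df.groupby("year")["number_x"].transform("sum")
result = df.assign(perc=(100 * df["number_x"] / year_total).round(2))

# Alternative (illustrative): one total per year, looked up row by row.
totals_by_year = df.groupby("year")["number_x"].sum()
perc_via_map = (100 * df["number_x"] / df["year"].map(totals_by_year)).round(2)

print(result)
```

Both variants leave the original df untouched and avoid creating and dropping a temporary number_y column.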
