January 4, 2020 00:05:38

Creating new columns from an existing column in Python

Question

I have a dataframe that looks something like this:

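import numpy as np
import pandas as pd

data = [['A', 1, 100], ['A', 3, 100], ['A', 2, 100], ['A', 3, 100], ['A', 5, 100]]
df = pd.DataFrame(data, columns=['?', 'Rating', 'Amount'])

   ?  Rating  Amount
0  A       1     100
1  A       3     100
2  A       2     100
3  A       3     100
4  A       5     100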

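I need to create one new column per Rating value and put the Amount in the column that matches each row's Rating, with 0 everywhere else. The result should look something like this: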

   ?  Rating  Amount    1    2    3    5
0  A       1     100  100    0    0    0
1  A       3     100    0    0  100    0
2  A       2     100    0  100    0    0
3  A       3     100    0    0  100    0
4  A       5     100    0    0    0  100

Right now I have this:

ratingnames = np.unique(list(df['Rating']))
ratingnames.sort()
d = pd.DataFrame(0, index=np.arange(len(df['Rating'])), columns=ratingnames)
for i in range(len(df['Rating'])):
    ratingvalue = df.loc[i, 'Rating']
    d.loc[i, ratingvalue] = df.loc[i, 'Amount']
df = pd.concat([df, d], axis=1)

but I feel like it could be improved upon. Any suggestions? Thanks!

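Here is an improved version:

# Pivot on the existing row index so that every original row keeps its own line
df_pivot = df.pivot(columns='Rating', values='Amount')
# Ratings that a row does not have come out as NaN; turn them into integer zeros
df_pivot = df_pivot.fillna(0).astype(int)
# Drop the 'Rating' label that pivot leaves on the column axis
df_pivot.columns.name = None
# Attach the new columns to the original dataframe
result_df = pd.concat([df, df_pivot], axis=1)

This is more concise than the loop: pivot spreads Amount into one column per Rating value, and concat attaches those columns to the original dataframe by row index.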

Answer 1

Score: 2

IIUC, use get_dummies and multiply with df['Amount'], then concat on axis=1:

output = pd.concat((df, pd.get_dummies(df['Rating']).mul(df['Amount'], axis=0)), axis=1)

   ?  Rating  Amount    1    2    3    5
0  A       1     100  100    0    0    0
1  A       3     100    0    0  100    0
2  A       2     100    0  100    0    0
3  A       3     100    0    0  100    0
4  A       5     100    0    0    0  100
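
For readers who want to run this end to end, here is a minimal self-contained sketch of the same get_dummies idea. The helper name spread_amount_by_rating and its key and value parameters are made up for illustration and are not part of the original answer. The explicit astype(int) is an added assumption: recent pandas versions return boolean indicator columns, and the cast keeps the result as integers either way.

```python
import pandas as pd

data = [['A', 1, 100], ['A', 3, 100], ['A', 2, 100], ['A', 3, 100], ['A', 5, 100]]
df = pd.DataFrame(data, columns=['?', 'Rating', 'Amount'])


def spread_amount_by_rating(frame, key='Rating', value='Amount'):
    """Hypothetical helper: append one column per distinct `key`,
    holding `value` on rows where the key matches and 0 elsewhere."""
    # One indicator column per distinct rating, cast to 0/1 integers.
    indicators = pd.get_dummies(frame[key]).astype(int)
    # Row-wise multiply: the matching column receives the amount.
    amounts = indicators.mul(frame[value], axis=0)
    # Put the new columns next to the original ones, aligned on the row index.
    return pd.concat([frame, amounts], axis=1)


output = spread_amount_by_rating(df)
print(output)
```

Because everything is done with whole-column operations, there is no Python-level loop over the rows.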

Answer 2

Score: 1

This will do the trick:

df=pd.concat([df, df.apply(lambda x: pd.Series({x["Rating"]: x["Amount"]}), axis=1).fillna(0).astype("int")], axis=1)

Output:

   ?  Rating  Amount    1    2    3    5
0  A       1     100  100    0    0    0
1  A       3     100    0    0  100    0
2  A       2     100    0  100    0    0
3  A       3     100    0    0  100    0
4  A       5     100    0    0    0  100
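
As a quick sanity check, the sketch below builds the new columns both ways on the sample data and compares them. The variable names by_apply, by_dummies and same are illustrative only. Note that apply with axis=1 calls the lambda once per row, so on large dataframes it is usually slower than a column-wise approach such as get_dummies; that is a general expectation and was not measured here.

```python
import pandas as pd

data = [['A', 1, 100], ['A', 3, 100], ['A', 2, 100], ['A', 3, 100], ['A', 5, 100]]
df = pd.DataFrame(data, columns=['?', 'Rating', 'Amount'])

# Row-wise: each row yields a one-element Series keyed by its rating.
by_apply = (
    df.apply(lambda row: pd.Series({row["Rating"]: row["Amount"]}), axis=1)
    .fillna(0)
    .astype("int")
    .sort_index(axis=1)  # make the column order explicit before comparing
)

# Column-wise: indicator columns multiplied by the amount.
by_dummies = pd.get_dummies(df["Rating"]).astype(int).mul(df["Amount"], axis=0)

# Compare the raw values so dtype or label details do not matter.
same = bool((by_apply.to_numpy() == by_dummies.to_numpy()).all())
print(same)
```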
