Python pivot on dataframe and return values as strings
Question
I have a DataFrame with the columns userID, Item and Score. I would like to pivot on ItemID and have the top-scored items in the values field. Is it possible to get the items themselves in the pivot result? I can get scores with max, mean and similar methods, but I couldn't figure out how to get string values.
Here is my DataFrame:
This is what I'm trying to achieve:
Answer 1
Score: 0
This code creates one row per userid and one column per preference rank, with each user's items ordered by descending score:
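import pandas as pd

# Sample data: one row per (userid, item) pair with its score.
df = pd.DataFrame({'userid': ['A', 'A', 'A', 'B', 'B'],
                   'item': ['ford', 'renault', 'fiat', 'ford', 'jaguar'],
                   'score': [1, 5, 1, 4, 2]})
# Rank items within each user, highest score first; method="first" breaks ties by row order.
df['preference'] = df.groupby(['userid'])['score'].rank(method="first", ascending=False)
# One row per user, one column per rank, with the item names as values.
df_pivot = df.pivot(index='userid', columns="preference", values="item")
print(df_pivot)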
If the users have different numbers of rows, NaN fills the positions where a user has no item for that rank.
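To make the NaN behavior concrete, here is a minimal sketch using the same sample data. The cast to integer column labels and the `fillna("")` step are presentation choices added for illustration, not part of the original answer, and the name `df_filled` is hypothetical.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    "userid": ["A", "A", "A", "B", "B"],
    "item": ["ford", "renault", "fiat", "ford", "jaguar"],
    "score": [1, 5, 1, 4, 2],
})

# Rank within each user; method="first" yields unique ranks 1..n per user.
df["preference"] = (
    df.groupby("userid")["score"]
      .rank(method="first", ascending=False)
      .astype(int)  # rank() returns floats; ints give cleaner column labels
)

df_pivot = df.pivot(index="userid", columns="preference", values="item")

# User B has only two items, so B's third preference is NaN.
# Replacing NaN with an empty string is optional (assumption: display use only).
df_filled = df_pivot.fillna("")
print(df_filled)
```

Note that the integer cast assumes the score column has no missing values; a NaN score would produce a NaN rank, which cannot be cast to int.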
One open question is what should happen when there is a tie in the score column. The code above uses rank(method="first"), which breaks ties by the order in which the rows appear in the DataFrame. As a result, the preference values within each userid group are always 1, 2, 3, 4, ..., which gives cleaner preference columns.
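The tie handling matters for more than presentation. As a rough sketch, the snippet below shows that a rank method that lets tied rows share a rank (here `method="min"`) produces duplicate (userid, preference) pairs, which `pivot` rejects. It then shows one possible alternative tie-break, alphabetical by item name, which is an assumption chosen for illustration and not something the question specifies.

```python
import pandas as pd

df = pd.DataFrame({
    "userid": ["A", "A", "A", "B", "B"],
    "item": ["ford", "renault", "fiat", "ford", "jaguar"],
    "score": [1, 5, 1, 4, 2],
})

# With method="min", user A's two items with score 1 share the same rank,
# so (userid, preference) is no longer unique and pivot cannot reshape.
tied = df.assign(
    preference=df.groupby("userid")["score"].rank(method="min", ascending=False)
)
try:
    tied.pivot(index="userid", columns="preference", values="item")
    pivot_failed = False
except ValueError:
    pivot_failed = True
print("pivot failed on tied ranks:", pivot_failed)

# Hypothetical alternative tie-break: sort by score (descending), then by
# item name (ascending), and number the rows within each user.
ordered = df.sort_values(["userid", "score", "item"], ascending=[True, False, True])
ordered["preference"] = ordered.groupby("userid").cumcount() + 1
alt_pivot = ordered.pivot(index="userid", columns="preference", values="item")
print(alt_pivot)
```

Whichever rule is used, the key requirement is that each user ends up with unique preference values, since `pivot` needs one value per index and column combination.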