How do I group the rows of a DataFrame by a certain value?
Question
I want to group my DataFrame by hour. Each row looks like 2019-01-01 00:00:00.134721167,50,100, where the first field is the timestamp, 50 is the cost, and 100 is the percentage. For all rows whose timestamps fall in the same hour, I want the sum and the average of the cost, and the same for the percentage.
More specifically, I need 48 rows for 2 days of data, one per hour, whereas I currently have more than 500 rows. How do I do that?
Answer 1
Score: 1
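Here's one way to do it:
# sample data (numpy is imported directly, since pd.np is not available in recent pandas releases)
import numpy as np
import pandas as pd
df = pd.DataFrame({'date': pd.date_range("2019-01-01", freq='h', periods=10),
                   'cost': np.random.randint(10, 100, 10)})
# note: older pandas versions also spell the hourly alias as 'H'
Method 1: make the timestamp column the index and resample by hour:
df.set_index('date').resample('h').sum()
Method 2: group on the timestamp column with pd.Grouper:
df.groupby(pd.Grouper(key='date', freq='h'))['cost'].sum().reset_index()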
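The question also asks for the average and for the percentage column, which the snippets above leave out. Below is a minimal sketch of how Method 2 might be extended. It assumes the columns are named timestamp, cost, and percentage (hypothetical names, since the question does not give them) and uses made-up data at 5-minute spacing over two days.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 576 rows at 5-minute spacing, covering two full days.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "timestamp": pd.date_range("2019-01-01", periods=576, freq="5min"),
    "cost": rng.integers(10, 100, 576),
    "percentage": rng.integers(0, 101, 576),
})

# Group rows into hourly bins and compute sum and mean for each column.
# Named aggregation gives flat output column names directly.
hourly = (
    df.groupby(pd.Grouper(key="timestamp", freq="h"))
      .agg(
          cost_sum=("cost", "sum"),
          cost_mean=("cost", "mean"),
          percentage_sum=("percentage", "sum"),
          percentage_mean=("percentage", "mean"),
      )
      .reset_index()
)

print(hourly.shape)  # (48, 5): one row per hour over two days
```

If the timestamp column was read from a CSV as strings, it would likely need pd.to_datetime applied first. Note also that pd.Grouper emits a row for every hour between the first and last timestamp, so an hour with no data would show a sum of 0 and a mean of NaN.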