How do I group the rows of a DataFrame by a certain value?

Question

I want to group my DataFrame so that rows whose timestamp falls in the same hour are combined, with their cost and percentage each summed and averaged. A row looks like 2019-01-01 00:00:00.134721167,50,100, where the first field is the timestamp, 50 is the cost, and 100 is the percentage.

More specifically, I need 48 rows for 2 days of data, one per hour, but I currently have more than 500 rows. How do I do that?

Answer 1

Score: 1

Here is one way to do it:

import numpy as np
import pandas as pd

# sample data: 10 hourly timestamps with random costs
# (pd.np was removed in pandas 2.0, so NumPy is imported directly)
# 'H' is the hourly alias; pandas 2.2 and later prefer the lowercase 'h'
df = pd.DataFrame({'date': pd.date_range("2019-01-01", freq='H', periods=10),
                   'cost': np.random.randint(10, 100, 10)})

Method 1: make the timestamp column the index and resample into hourly bins.

df.set_index('date').resample('H').sum()

Method 2: group with pd.Grouper, which keeps the timestamp as a regular column.

df.groupby(pd.Grouper(key='date', freq='H'))['cost'].sum().reset_index()
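
Both methods above only sum the cost column, while the question asks for the sum and the average of both cost and percentage. A minimal sketch of that, assuming the columns are named timestamp, cost and percentage (hypothetical names, as is the generated sample data), could look like the following. The bin width is passed as a pd.Timedelta so the example does not depend on which spelling of the hourly alias the installed pandas version prefers.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row every 5 minutes for two days (576 rows).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "timestamp": pd.date_range("2019-01-01", periods=576,
                               freq=pd.Timedelta(minutes=5)),
    "cost": rng.integers(10, 100, size=576),
    "percentage": rng.integers(0, 101, size=576),
})

# Bin rows into one-hour windows and compute sum and mean per column.
hourly = (
    df.set_index("timestamp")
      .resample(pd.Timedelta(hours=1))
      .agg({"cost": ["sum", "mean"], "percentage": ["sum", "mean"]})
)

# Flatten the two-level column index into names like "cost_sum".
hourly.columns = ["_".join(col) for col in hourly.columns]
hourly = hourly.reset_index()

print(hourly.shape)  # (48, 5)
```

Note that resample also emits a row for any hour with no data between the first and last timestamp (sum 0, mean NaN), which is what keeps the output at one row per hour.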

huangapple
  • Published on January 6, 2020, 02:46:19
  • Please keep this link when reposting: https://go.coder-hub.com/59603093.html