How to divide a dataframe by the number of days in a month?

Question

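As I understand it, you want a new data frame in which the values of the original data frame are divided by the number of days in each month. Your code appears to contain some HTML encoding and undefined variables, but here is a simplified version that performs the task:

    import pandas as pd

    # Assumes the original data frame df already exists and has a "Time" column
    # Convert the "Time" column to datetime
    df['Time'] = pd.to_datetime(df['Time'])
    # Compute the number of days in each row's month
    df['Days_in_Month'] = df['Time'].dt.daysinmonth
    # Divide var1 and var2 by the number of days in the month
    df['var1_new'] = df['var1'] / df['Days_in_Month']
    df['var2_new'] = df['var2'] / df['Days_in_Month']
    # Drop the helper column "Days_in_Month"
    df = df.drop('Days_in_Month', axis=1)

This adds the columns "var1_new" and "var2_new" to df; they hold the original values divided by the number of days in each month.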


I have a pandas data frame that looks like this:

                var1  var2
    Time
    2000-01-31   100   200
    2000-02-28   340   210
    2000-03-31   590   220
    ...
    2001-10-31  1290   101
    2001-11-30  1188   100
    2001-12-31  1000   100

I would like to create a new data frame that divides the first data frame by the number of days in each month. For instance:
2000-01-31, divide by 31.
2000-02-28, divide by 28.
2001-11-30, divide by 30.
2001-12-31, divide by 31.
And so on. The new data frame would look something like this:

                var1_new  var2_new
    Time
    2000-01-31      3.22      6.45
    2000-02-28     12.14      7.5
    2000-03-31     19.03      7.09
    ...
    2001-10-31     41.61      3.26
    2001-11-30     39.6       3.33
    2001-12-31     32.26      3.23

I wrote a short piece of code in which I first used a timedelta to index the data frame, then took the daily and monthly sums. I then created another data frame that records the number of days in each month. Finally, I tried to divide the first data frame by the number of days in each month to get the results above. However, all I get is a bunch of NaN values, and I have no idea why.

    df.index = pd.to_timedelta(df["Time"], unit='s')
    df.index += datetime.strptime(initial_datetime, "%Y-%m-%d")
    df_D = df.resample("D").sum()
    df_M = df_D.resample("M").sum()
    df_M["Months"] = df_M.index
    df2["DiM"] = pd.DataFrame(pd.to_datetime(df_M["Months"]).dt.daysinmonth)
    df_M = df_M.drop(['Months'], axis=1)
    df_New = df_M / df2["DiM"]
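
One plausible explanation for the NaN values, offered as a sketch rather than a diagnosis of the exact code above: when a DataFrame is divided by a Series with `/`, pandas aligns the Series index against the DataFrame's column labels, not its rows. A Series indexed by dates shares no labels with the columns `var1` and `var2`, so every cell comes out as NaN. The small frame below is made up for illustration (three month-end rows, with `days` standing in for `df2["DiM"]`); `DataFrame.div(..., axis=0)` aligns on the row index instead.

```python
import pandas as pd

# Hypothetical stand-in for df_M: monthly sums indexed by month-end dates
idx = pd.to_datetime(["2000-01-31", "2000-02-29", "2000-03-31"])
df_M = pd.DataFrame({"var1": [100, 340, 590], "var2": [200, 210, 220]}, index=idx)

# Hypothetical stand-in for df2["DiM"]: days in each month, indexed by date
days = pd.Series(df_M.index.days_in_month, index=df_M.index)

# "/" matches the Series index (dates) against the column labels
# ("var1", "var2"); nothing matches, so the result is all NaN
bad = df_M / days

# axis=0 matches the Series index against the row index instead
good = df_M.div(days, axis=0)
```

Here `bad` contains only NaN (and gains extra date-labelled columns), while `good` holds the per-day values.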

Answer 1

Score: 1

Assuming your index (Time) already holds pd.Timestamp values (that is, it is a DatetimeIndex):

    df.apply(lambda col: col / df.index.days_in_month)
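
As a self-contained sketch of the answer's approach, the snippet below rebuilds the first three rows of the question's data (the construction of `df` is an assumption; only the `apply` line comes from the answer). `DataFrame.div` with `axis=0` is shown as an alternative that should give the same values, and `add_suffix` is used here only to produce the `_new` column names. Note that 2000 is a leap year, so `days_in_month` reports 29 for 2000-02-28, and that row will differ from the 12.14 in the question, which divides by 28.

```python
import pandas as pd

# First three rows of the question's data, rebuilt for illustration
df = pd.DataFrame(
    {"var1": [100, 340, 590], "var2": [200, 210, 220]},
    index=pd.DatetimeIndex(["2000-01-31", "2000-02-28", "2000-03-31"], name="Time"),
)

# The answer's one-liner: divide each column by the days in its row's month
per_day = df.apply(lambda col: col / df.index.days_in_month)

# Alternative: divide row-wise in one call (axis=0 broadcasts down the rows)
per_day_alt = df.div(df.index.days_in_month.to_numpy(), axis=0)

# Rename the columns to var1_new / var2_new and round for display
df_new = per_day.add_suffix("_new").round(2)
print(df_new)
```

If the dates are still in a column rather than the index, `df = df.set_index(pd.to_datetime(df["Time"]))` is one way to get to the same starting point.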

huangapple
  • Posted on June 9, 2023, 00:54:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76434126.html