How to divide a dataframe by the number of days in a month?

Question

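As I understand it, you want a new data frame in which the values of the original data frame are divided by the number of days in each month. Your code appears to contain some HTML-encoded characters and undefined variables, so here is a simplified version that performs the task:

import pandas as pd

# Assumes the original data frame df already exists and has a "Time" column

# Convert the "Time" column to a datetime type
df['Time'] = pd.to_datetime(df['Time'])

# Number of days in the month of each date
df['Days_in_Month'] = df['Time'].dt.daysinmonth

# Divide the var1 and var2 columns by the number of days in the month
df['var1_new'] = df['var1'] / df['Days_in_Month']
df['var2_new'] = df['var2'] / df['Days_in_Month']

# Drop the helper column "Days_in_Month"
df = df.drop(columns='Days_in_Month')

This code adds "var1_new" and "var2_new" columns to df, holding the original values divided by the number of days in the corresponding month.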
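To try the snippet above end to end, here is a minimal self-contained sketch. It assumes "Time" is an ordinary column (if it is the index, `df.reset_index()` would turn it into one first). The three sample rows are copied from the question's table, and the names `days` and `result` are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; "Time" is assumed to be a regular column.
df = pd.DataFrame({
    "Time": ["2000-01-31", "2000-02-28", "2000-03-31"],
    "var1": [100, 340, 590],
    "var2": [200, 210, 220],
})
df["Time"] = pd.to_datetime(df["Time"])

# Number of days in the calendar month of each timestamp.
days = df["Time"].dt.days_in_month

# Build a separate frame so the original columns stay untouched.
result = pd.DataFrame({
    "Time": df["Time"],
    "var1_new": df["var1"] / days,
    "var2_new": df["var2"] / days,
}).set_index("Time")

print(result.round(2))
```

One thing to watch: 2000 is a leap year, so pandas reports 29 days for February 2000, and 340 / 29 gives about 11.72 rather than the 12.14 obtained by dividing by 28. If the intent is to divide by the day number of the date itself, `df["Time"].dt.day` could be used instead.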


I have a Pandas data frame that looks like this below:

            var1    var2
Time
2000-01-31   100     200
2000-02-28   340     210
2000-03-31   590     220
...
2001-10-31  1290     101
2001-11-30  1188     100
2001-12-31  1000     100

I would like to create a new data frame that divides the first data frame by the number of days in each month. For instance:
2000-01-31, divide by 31.
2000-02-28, divide by 28.
2001-11-30, divide by 30.
2001-12-31, divide by 31.
And so on. The new data frame would look something like this:

            var1_new  var2_new
Time
2000-01-31      3.22      6.45
2000-02-28     12.14      7.5
2000-03-31     19.03      7.09
...
2001-10-31     41.61      3.26
2001-11-30     39.6       3.33
2001-12-31     32.26      3.23

I wrote a short piece of code in which I first used a timedelta to index the data frame, then took the daily and monthly sums. I then created another data frame that records the number of days in each month. Finally, I tried to divide the first data frame by the number of days in each month to get the results above. However, all I get is NaN values, and I have no idea why.

df.index = pd.to_timedelta(df["Time"], unit='s')
df.index += datetime.strptime(initial_datetime, "%Y-%m-%d")

df_D=df.resample("D").sum()
df_M=df_D.resample("M").sum()

df_M["Months"] = df_M.index 

df2["DiM"] = pd.DataFrame(pd.to_datetime(df_M["Months"]).dt.daysinmonth)
df_M = df_M.drop(['Months'],axis=1)

df_New = df_M / df2["DiM"]

Answer 1

Score: 1

Assuming your index (Time) is already a DatetimeIndex (that is, made of pd.Timestamp values):

df.apply(lambda col: col / df.index.days_in_month)
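As a runnable sketch of this answer, the same division can also be written with `DataFrame.div` and `axis=0`, which divides each row by its own per-row value. The sample rows are copied from the question; the names `days`, `df_new`, `same`, and `misaligned` are illustrative. The last lines show one likely reason the question's attempt returned NaN, offered as an assumption since the original data is not available: dividing a DataFrame by a Series with `/` aligns the Series index against the DataFrame's columns, not its rows.

```python
import pandas as pd

# Sample rows copied from the question, with Time as a DatetimeIndex.
df = pd.DataFrame(
    {"var1": [1290, 1188, 1000], "var2": [101, 100, 100]},
    index=pd.to_datetime(["2001-10-31", "2001-11-30", "2001-12-31"]),
)
df.index.name = "Time"

# Days in each row's month (31, 30, 31), kept as a Series on the same index
# so that row alignment is explicit.
days = pd.Series(df.index.days_in_month, index=df.index)

# axis=0 matches the divisor to the rows (the index) instead of the columns.
df_new = df.div(days, axis=0).add_suffix("_new")
print(df_new.round(2))

# The answer's column-by-column form produces the same values.
same = df.apply(lambda col: col / days).add_suffix("_new")

# Plain "/" aligns the Series index (dates) with the column labels
# ("var1", "var2"); nothing matches, so every cell becomes NaN.
# pandas may emit a RuntimeWarning about unsortable labels here.
misaligned = df / days
print(misaligned.isna().all().all())
```

If the divisor is kept as a separate Series, as in the question, passing it through `div(..., axis=0)` is one way to avoid the column alignment.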

huangapple
  • Posted on June 9, 2023 at 00:54:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76434126.html