Resampling of a DataFrame by 1 Hour in pandas gives unexpected NaN values
Question
I have a DataFrame with 3 columns. The 1st column contains the date (like 2020-07-01, 2020-07-01, ...); the 2nd column contains the time (like 00:00:00, 01:00:00, ...) at hourly intervals for one month; and the 3rd column contains the corresponding values of a variable. Some rows are missing from the DataFrame (i.e., missing data). Also, some values in the 2nd column (time) are like 15:06:55, 16:00:01, etc.
I want to resample the DataFrame by 1 hour and get NaN values only in place of the missing data. In my case, resampling gives NaN values in place of the missing data as well as where the time is like 15:06:55, 16:00:01, etc. Please help me solve this issue.
Thanks in advance.
df['Date-Time'] = pd.to_datetime(df[0] + df[1],format='%Y-%m-%d%H:%M:%S')
df = df.set_index('Date-Time')
df = df.resample('1H').fillna(method=None)
This code gives NaN values in place of the missing data as well as where the time is like 15:06:55, 16:00:01, 17:00:01, etc. I have uploaded an image of the DataFrame before resampling.
Answer 1
Score: 1
You use the fillna(method=None) method to fill the missing data, so you are explicitly filling it with NaN values. See the pandas documentation.
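To make the cause concrete, here is a minimal sketch on made-up data (the timestamps and the value column are hypothetical, not taken from the question). It assumes there is at most one reading per hour and uses pd.Timedelta(hours=1) in place of the '1H' string so it does not depend on which frequency aliases a given pandas version accepts. Reindexing onto the hourly grid keeps only rows whose timestamp falls exactly on the hour, while an aggregation such as first() keeps an off-hour reading in its hourly bin and leaves NaN only where no row exists.

```python
import pandas as pd

# Hypothetical sample: the 02:00 row is missing and two timestamps are off the hour.
idx = pd.to_datetime([
    "2020-07-01 00:00:00",
    "2020-07-01 01:00:00",
    "2020-07-01 03:06:55",
    "2020-07-01 04:00:01",
])
df = pd.DataFrame({"value": [1.0, 2.0, 4.0, 5.0]}, index=idx)

one_hour = pd.Timedelta(hours=1)  # same bin width as '1H'

# asfreq() reindexes onto the exact hourly grid, so 03:06:55 and 04:00:01
# match no grid point and their hours come out as NaN.
exact = df.resample(one_hour).asfreq()

# first() aggregates per hourly bin: 03:06:55 lands in the 03:00 bin and
# 04:00:01 in the 04:00 bin; only the empty 02:00 bin stays NaN.
binned = df.resample(one_hour).first()

print(exact)
print(binned)
```

If several readings can fall within the same hour, mean() or last() may be a better fit than first(); which one is appropriate depends on the data.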
You can use an interpolation or fill method to fill the missing data.
e.g.:
df = df.resample('1H').ffill()
or
df = df.resample('1H').bfill()
or you can fill it with the fillna() method, if you pass backfill, bfill or ffill as the method argument.
-> See the interpolate() documentation.
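As a rough illustration of how the fill and interpolation options differ, here is a small sketch on a hypothetical hourly series with one missing row (the data is invented for the example). Note that all three put a value into the gap instead of leaving NaN there, so they only apply if that is acceptable.

```python
import pandas as pd

# Hypothetical hourly data with the 02:00 reading missing.
idx = pd.to_datetime([
    "2020-07-01 00:00:00",
    "2020-07-01 01:00:00",
    "2020-07-01 03:00:00",
])
df = pd.DataFrame({"value": [1.0, 2.0, 4.0]}, index=idx)

one_hour = pd.Timedelta(hours=1)  # same bin width as '1H'

# Forward fill: the gap takes the previous reading.
forward = df.resample(one_hour).ffill()

# Backward fill: the gap takes the next reading.
backward = df.resample(one_hour).bfill()

# Linear interpolation: the gap is estimated from its neighbours.
linear = df.resample(one_hour).interpolate(method="linear")

print(forward)
print(backward)
print(linear)
```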