Replace high values with forward filling

Question

I have a DataFrame in which some features contain very high outliers. I would like to get rid of those sudden, very high values.

ax.plot(df['Temperature'])

To lessen this effect I used clip with quantile bounds, but it does not work as well as I would like.

ax.plot(df['Temperature'].clip(lower=df['Temperature'].quantile(0.05), upper=df['Temperature'].quantile(0.95)))

How can I replace these (very high) values with the value that precedes them, using forward filling? For example, if the Temperature jumps at df['Temperature'][100] and stays high until df['Temperature'][120], those values should be replaced with df['Temperature'][99].

Answer 1

Score: 1

Maybe set the invalid entries to NaN, then backfill them?

>>> import numpy as np
>>> import pandas as pd
>>> seq = np.arange(0, 10)
>>> seq[4:7] *= 100  # turn positions 4-6 into artificial spikes
>>> df = pd.DataFrame(seq, columns=['temp'])
>>> df
   temp
0     0
1     1
2     2
3     3
4   400
5   500
6   600
7     7
8     8
9     9
>>> df[df.temp >= 300] = np.nan  # adjust the condition accordingly
>>> df
   temp
0   0.0
1   1.0
2   2.0
3   3.0
4   NaN
5   NaN
6   NaN
7   7.0
8   8.0
9   9.0
>>> df.bfill()  # same result as fillna(method='backfill'), which is deprecated since pandas 2.1
   temp
0   0.0
1   1.0
2   2.0
3   3.0
4   7.0
5   7.0
6   7.0
7   7.0
8   8.0
9   9.0
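
Note that the transcript above backfills, so the spikes take the next valid value (7.0), while the question asks for the previous one. A minimal sketch of the forward-fill variant on the same toy data follows. The `threshold` value and the `temp_ffill` column name are assumptions made for illustration; the cutoff has to be chosen for the real Temperature series.

```python
import numpy as np
import pandas as pd

# Same toy data as in the answer: three spiked readings at positions 4-6.
seq = np.arange(0, 10)
seq[4:7] *= 100
df = pd.DataFrame(seq, columns=['temp'])

# Hypothetical cutoff for this toy data; adjust it to your own series.
threshold = 300

# mask() replaces values where the condition is True with NaN,
# then ffill() carries the last valid value forward over the gap.
df['temp_ffill'] = df['temp'].mask(df['temp'] >= threshold).ffill()

print(df['temp_ffill'].tolist())
# [0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 3.0, 7.0, 8.0, 9.0]
```

If the series starts with a spike there is no earlier value to carry forward, so those leading entries stay NaN and need separate handling.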

huangapple
  • Posted on January 6, 2020 at 21:55:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59613376.html