Pad middle values based on previous and next values

Question

Let's say I have a DataFrame like this:

   col     col2
0    0  Repeat1
1    3  Repeat2
2    5  Repeat3
3    7  Repeat4
4    9  Repeat5

Reproducible example:

import pandas as pd

L = [0, 3, 5, 7, 9]
L2 = ['Repeat1', 'Repeat2', 'Repeat3', 'Repeat4', 'Repeat5']

df = pd.DataFrame({'col': L})
df['col2'] = L2
print(df)

How can I fill in the missing intermediate values so that my df looks like this?

   col     col2
0    0  Repeat1
1    1  Repeat1
2    2  Repeat1
3    3  Repeat2
4    4  Repeat2
5    5  Repeat3
6    6  Repeat3
7    7  Repeat4
8    8  Repeat4
9    9  Repeat5

Similar threads I've tried:

https://stackoverflow.com/questions/37821653/filling-missing-middle-values-in-pandas-dataframe (fills the intermediate values with NaN, but I don't want NaN)

https://stackoverflow.com/questions/28798076/fill-pandas-dataframe-with-values-in-between (a very long approach; I'm looking for a more functional approach)

Both threads helped to some extent, but I was wondering whether there is a simpler way to do this.

Answer 1

Score: 3

You can reindex and ffill, using "col" as a temporary index:

out = (df.set_index('col')
         .reindex(range(df['col'].max()+1))
         .ffill()
         .reset_index()
      )

Output:

   col     col2
0    0  Repeat1
1    1  Repeat1
2    2  Repeat1
3    3  Repeat2
4    4  Repeat2
5    5  Repeat3
6    6  Repeat3
7    7  Repeat4
8    8  Repeat4
9    9  Repeat5
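
The range in the snippet above starts at 0, which matches the sample data. As a minimal sketch, assuming "col" holds unique integers that may start above 0, the same idea could be wrapped in a small helper. The name `fill_gaps` and the sample frame below are hypothetical, chosen only for illustration:

```python
import pandas as pd


def fill_gaps(df, key="col"):
    """Add one row per missing integer in `key` and forward-fill the other columns.

    Hypothetical helper for illustration; assumes `key` holds unique integers.
    """
    full_range = range(df[key].min(), df[key].max() + 1)
    return (
        df.set_index(key)
          .reindex(full_range)   # keys absent from df become all-NaN rows
          .ffill()               # copy the previous known row downward
          .rename_axis(key)      # make sure the index carries the key name
          .reset_index()
    )


# Sample data that does not start at 0 (an assumption, not from the question)
df = pd.DataFrame({"col": [2, 5, 7], "col2": ["a", "b", "c"]})
out = fill_gaps(df)
print(out)
```

Starting the range at the column minimum avoids leading rows that have nothing to forward-fill from.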

Answer 2

Score: 1

You can also merge and ffill:

(df.merge(pd.DataFrame({'col': range(df['col'].max()+1)}), how='right')
   .ffill()
)

Output:

   col     col2
0    0  Repeat1
1    1  Repeat1
2    2  Repeat1
3    3  Repeat2
4    4  Repeat2
5    5  Repeat3
6    6  Repeat3
7    7  Repeat4
8    8  Repeat4
9    9  Repeat5
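
A related option, not part of the answer above, is `pd.merge_asof`, which does the lookup and the fill in one step. This is a sketch under the assumption that "col" is sorted and has the same integer dtype in both frames; the name `full` is introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "col": [0, 3, 5, 7, 9],
    "col2": ["Repeat1", "Repeat2", "Repeat3", "Repeat4", "Repeat5"],
})

# Every integer from 0 up to the largest value in "col"
full = pd.DataFrame({"col": range(df["col"].max() + 1)})

# direction="backward" (the default) pairs each row of `full` with the
# last row of `df` whose "col" is less than or equal to it
out = pd.merge_asof(full, df, on="col", direction="backward")
print(out)
```

Because the match is made on the nearest earlier key, no separate ffill call is needed here.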

Answer 3

Score: 1

Another possible solution, which is based on pandas.concat:

pd.concat([pd.DataFrame({'col': range(df['col'].max()+1)}),
            df.set_index('col')], axis=1).ffill()

Or, alternatively:

(pd.concat([df, pd.DataFrame(
    {'col': list(set(range(1, df.col.max()+1)).difference(df.col))})])
 .sort_values('col').ffill().reset_index(drop=True))

Output:

   col     col2
0    0  Repeat1
1    1  Repeat1
2    2  Repeat1
3    3  Repeat2
4    4  Repeat2
5    5  Repeat3
6    6  Repeat3
7    7  Repeat4
8    8  Repeat4
9    9  Repeat5
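
As a quick sanity check, the two concat variants can be compared against a reindex-and-ffill result on the sample data. This is only a sketch: the names `expected`, `v1`, `v2`, and `missing` are introduced here for illustration, and the second variant is slightly adapted (its range starts at 0 and the missing values are sorted explicitly):

```python
import pandas as pd

df = pd.DataFrame({
    "col": [0, 3, 5, 7, 9],
    "col2": ["Repeat1", "Repeat2", "Repeat3", "Repeat4", "Repeat5"],
})
n = df["col"].max() + 1

# Reference result: reindex on "col", then forward-fill
expected = df.set_index("col").reindex(range(n)).ffill().reset_index()

# Variant 1: align a full integer column with df on the index
v1 = pd.concat([pd.DataFrame({"col": range(n)}),
                df.set_index("col")], axis=1).ffill()

# Variant 2: append the missing integers as new rows, sort, forward-fill
missing = sorted(set(range(n)).difference(df["col"]))
v2 = (pd.concat([df, pd.DataFrame({"col": missing})])
        .sort_values("col").ffill().reset_index(drop=True))

# Compare cell values only, ignoring index names and dtypes
for name, result in {"concat axis=1": v1, "concat rows": v2}.items():
    same = result.to_numpy().tolist() == expected.to_numpy().tolist()
    print(name, "matches reindex:", same)
```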

Posted on 2023-02-14 22:37:47