Ignore NaN elements in a list using loc in pandas

Question

I have 2 different dataframes: df1, df2

df1:
index a
0     10    
1     2    
2     3
3     1
4     7
5     6

df2:
index a
0     1    
1     2
2     4
3     3
4     20
5     5

I want to find the index of maximum values with a specific lookback in df1 (let's consider lookback=3 in this example). To do this, I use the following code:

tdf['a'] = df1.rolling(lookback).apply(lambda x: x.idxmax())

And the result would be:

id    a
0     nan    
1     nan
2     0
3     2
4     4
5     4
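For anyone who wants to reproduce this, here is a minimal sketch of the setup. It assumes both frames have a single column a and a default integer index, and that tdf starts out as an empty frame sharing df1's index (the post does not show how tdf is created).

```python
import pandas as pd

# Sample data copied from the tables above
df1 = pd.DataFrame({"a": [10, 2, 3, 1, 7, 6]})
df2 = pd.DataFrame({"a": [1, 2, 4, 3, 20, 5]})

lookback = 3

# Assumption: tdf is an empty frame that shares df1's index
tdf = pd.DataFrame(index=df1.index)

# With the default raw=False, each window is passed as a Series that keeps
# the original index labels, so idxmax() returns a label of df1.
# The first lookback - 1 rows have no full window and therefore stay NaN.
tdf["a"] = df1["a"].rolling(lookback).apply(lambda x: x.idxmax())

print(tdf)
```

Because of the leading NaN rows, the column ends up with a float dtype, which is why the labels cannot be passed to .loc directly.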

Now, for each index found by idxmax(), I need to look up the value at that index in df2 and store it in tdf['b'].

So if tdf['a'].iloc[3] == 2, I want tdf['b'].iloc[3] == df2['a'].iloc[2]. I expect the final result to be like this:

id    b
0     nan    
1     nan
2     1
3     4
4     20
5     20

I'm guessing that I can do this using .loc() function like this:

tdf['b'] = df2.loc[tdf['a']]

But it throws an exception because there are nan values in tdf['a']. If I use dropna() before passing tdf['a'] to the .loc() function, then the indices get messed up (for example in tdf['b'], index 0 has to be nan but it'll have a value after dropna()).

Is there any way to get what I want?

Answer 1

Score: 1

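Simply use Series.map. It looks up each value of s in the index of df2['a'] and leaves NaN wherever there is no match, so the original row alignment is preserved: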

lookback = 3
s = df1['a'].rolling(lookback).apply(lambda x: x.idxmax())

s.map(df2['a'])

Output:

0     NaN
1     NaN
2     1.0
3     4.0
4    20.0
5    20.0
Name: a, dtype: float64
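A self-contained sketch of the above, in case it helps to run it end to end. The sample frames are rebuilt from the question, and the tdf construction is an assumption. The second half is not part of the answer: it is one possible way to get the same result with .loc, as the question attempted, by indexing only the non-NaN rows and writing the values back through a boolean mask.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"a": [10, 2, 3, 1, 7, 6]})
df2 = pd.DataFrame({"a": [1, 2, 4, 3, 20, 5]})
lookback = 3

# Label of the rolling maximum (NaN for the first lookback - 1 rows)
s = df1["a"].rolling(lookback).apply(lambda x: x.idxmax())

# Assumption: tdf simply collects the two result columns
tdf = pd.DataFrame({"a": s})

# map() treats df2["a"] as a lookup table keyed by its index;
# NaN keys have no match and stay NaN, so the rows stay aligned
tdf["b"] = s.map(df2["a"])

# Alternative sketch with .loc: look up only the valid labels, then
# assign them back by mask so rows 0 and 1 remain NaN
mask = s.notna()
b_loc = pd.Series(float("nan"), index=s.index)
b_loc.loc[mask] = df2["a"].loc[s[mask].astype(int)].to_numpy()

print(tdf)
print(b_loc)
```

The map version is shorter and never needs the cast back to int, so it is probably the better choice unless .loc is required for some other reason.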

huangapple
  • Posted on February 8, 2023, 19:47:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75385365.html