Error while trying to shift values in dataframe and get the difference

Question

I wrote a user-defined function to perform a specific task on a DataFrame. It checks a condition on one column and uses the values of a second column to compute the result for a third column.

This is the function I wrote:

import numpy as np


def strk_inter(x):
    if x['SYMBOL'] == x['SYMBOL'].shift(1):
        a = x['STRIKE_PR'].shift(1) - x['STRIKE_PR']
    else:
        a = np.nan
    return a

optt_df['STRIKE_INTERVAL'] = optt_df.apply(strk_inter, axis=1)

It raises the following error:

> AttributeError: 'str' object has no attribute 'shift'

Dataset: [image not preserved]

Expectation: [image not preserved]

Answer 1

Score: 1

With axis=1, pandas apply passes each row to your function as a Series, so x['SYMBOL'] is the single value of SYMBOL in that row (a str here), not the whole column. A str has no shift method, hence the error message.
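To see this for yourself, here is a small sketch that records the type of what the function receives. The frame `demo` and the names `row_types` and `cell_types` are made up for illustration and are not part of your code:

```python
import pandas as pd

# Hypothetical two-row frame, used only to inspect what apply hands over.
demo = pd.DataFrame({"SYMBOL": ["A", "B"], "STRIKE_PR": [1000, 950]})

# Type of the object passed to the function for each row.
row_types = demo.apply(lambda row: type(row).__name__, axis=1)

# Type of the SYMBOL entry inside each of those rows.
cell_types = demo.apply(lambda row: type(row["SYMBOL"]).__name__, axis=1)

print(row_types.tolist())   # the row objects
print(cell_types.tolist())  # the individual SYMBOL values

# The column itself is a Series, which is where shift lives.
print(hasattr(demo["SYMBOL"], "shift"))
```

The row is a Series, but indexing it with a column name gives back one scalar, so shift has to be called on the column (or on the whole frame) outside of a row-wise apply.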

With the dataframe you provided:

import pandas as pd

df = pd.DataFrame({"SYMBOL": ["A", "A", "B", "B"], "STRIKE_PR": [1000, 1100, 950, 960]})

print(df)
# Output

  SYMBOL  STRIKE_PR
0      A       1000
1      A       1100
2      B        950
3      B        960

Here is one way to get the expected result:

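def strk_inter(df_):
    # Loop over the rows whose SYMBOL equals the SYMBOL of the next row.
    # Note: this modifies df_ in place and relies on the default 0..n-1
    # integer index, since i + 1 is used as the label of the next row.
    for i in df_[df_["SYMBOL"] == df_.shift(-1)["SYMBOL"]].index:
        df_.at[i, "STRIKE_INTERVAL"] = (
            df_.at[i + 1, "STRIKE_PR"] - df_.at[i, "STRIKE_PR"]
        )
    return df_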


print(strk_inter(df))
# Output

  SYMBOL  STRIKE_PR  STRIKE_INTERVAL
0      A       1000            100.0
1      A       1100              NaN
2      B        950             10.0
3      B        960              NaN
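If you would rather avoid the explicit loop, the same adjacent-row comparison can probably be written with column-level shift and where. This is only a sketch under the assumption that "interval" means the next row's STRIKE_PR minus the current one, as in the output above; the names `same_as_next` and `diff_to_next` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"SYMBOL": ["A", "A", "B", "B"], "STRIKE_PR": [1000, 1100, 950, 960]})

# True where the next row has the same SYMBOL as the current row.
same_as_next = df["SYMBOL"] == df["SYMBOL"].shift(-1)

# Next row's strike minus this row's strike, computed for all rows at once.
diff_to_next = df["STRIKE_PR"].shift(-1) - df["STRIKE_PR"]

# Keep the difference only where the symbols match; other rows become NaN.
df["STRIKE_INTERVAL"] = diff_to_next.where(same_as_next)

print(df)
```

Because shift works by position, this version does not depend on the index labels being consecutive integers. If you want the previous row instead, as in your original function, use shift(1) in both places.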

huangapple
  • Posted on June 29, 2023 at 18:01:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76580008.html