How to calculate a value using the previous row's values from other columns in Python

Question

I'd like to calculate a rate of return from the DataFrame below, using the previous row's values from other columns, with rows grouped by id.
Specifically, starting from:

    >>> import pandas as pd
    >>> df = pd.DataFrame({'id': ['Blue', 'Blue', 'Blue', 'Red', 'Red'],
    ...                    'a': [100, 200, 300, 1, 2],
    ...                    'b': [10, 20, 15, 3, 2],
    ...                    'c': [1, 2, 3, 4, 5]})
    >>> df
         id    a   b  c
    0  Blue  100  10  1
    1  Blue  200  20  2
    2  Blue  300  15  3
    3   Red    1   3  4
    4   Red    2   2  5

I want to compute the following:

    df['new_col'] = a / (a(previous row value) + b(previous row value) - c(previous row value))

I don't think pct_change() helps here, since it only works within a single column. The result I expect:

    >>> df
         id    a   b  c  new_col
    0  Blue  100  10  1  -
    1  Blue  200  20  2  = 200 / (100 + 10 - 1)
    2  Blue  300  15  3  = 300 / (200 + 20 - 2)
    3   Red    1   3  4  -
    4   Red    2   2  5  = 2 / (1 + 3 - 4)
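
To illustrate why the grouping matters for the table above, here is a minimal sketch, assuming the same `df` as in the question (the names `plain_prev` and `grouped_prev` are only illustrative). A plain `shift()` carries the last Blue row into the first Red row, while a grouped shift restarts at each id.

```python
import pandas as pd

df = pd.DataFrame({'id': ['Blue', 'Blue', 'Blue', 'Red', 'Red'],
                   'a': [100, 200, 300, 1, 2],
                   'b': [10, 20, 15, 3, 2],
                   'c': [1, 2, 3, 4, 5]})

# Plain shift ignores id: row 3 (first Red) receives a=300 from row 2 (last Blue)
plain_prev = df['a'].shift()

# Grouped shift: the first row of every id gets NaN instead
grouped_prev = df.groupby('id')['a'].shift()

print(pd.DataFrame({'id': df['id'], 'plain': plain_prev, 'grouped': grouped_prev}))
```

Rows 0 and 3 are the ones marked "-" in the expected output, and they are exactly the rows where the grouped shift has no previous value.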

Answer 1

Score: 0

Try:

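    def fn(g):
        # Divide a by the previous row's (a + b - c) within this id group;
        # shift() leaves NaN on the group's first row
        out = g['a'] / (g['a'].shift() + g['b'].shift() - g['c'].shift())
        g['new_col'] = out
        return g

    # group_keys=False keeps the original index instead of adding an id level
    df = df.groupby('id', group_keys=False).apply(fn)
    print(df)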

Prints:

         id    a   b  c   new_col
    0  Blue  100  10  1       NaN
    1  Blue  200  20  2  1.834862
    2  Blue  300  15  3  1.376147
    3   Red    1   3  4       NaN
    4   Red    2   2  5       inf
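
As a possible alternative to `apply`, the same result can probably be reached without a helper function by shifting the three columns per group in one call. This is a sketch rather than the answerer's method. It assumes the `df` from the question, and the `new_col_clean` column is a hypothetical extra for anyone who would rather treat the zero denominator in the last Red row as missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': ['Blue', 'Blue', 'Blue', 'Red', 'Red'],
                   'a': [100, 200, 300, 1, 2],
                   'b': [10, 20, 15, 3, 2],
                   'c': [1, 2, 3, 4, 5]})

# Previous row's a, b and c within each id group (NaN on each group's first row)
prev = df.groupby('id')[['a', 'b', 'c']].shift()

# Same formula as in the answer: a / (a_prev + b_prev - c_prev)
df['new_col'] = df['a'] / (prev['a'] + prev['b'] - prev['c'])

# Optional, and an assumption about what is wanted: turn +/-inf into NaN
df['new_col_clean'] = df['new_col'].replace([np.inf, -np.inf], np.nan)

print(df)
```

Because nothing is mutated inside `groupby(...).apply`, this form does not depend on how `apply` treats the grouping column.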

huangapple
  • Published on July 27, 2023 at 23:18:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76781222.html