使用特定列对的列表创建for循环

huangapple go评论85阅读模式
英文:

Creating a for loop using list of specific pairs of columns

问题

I have a large data frame with many columns. I am trying to write a for loop that will do a couple of simple calculations between columns, but the columns must be specific, and I am identifying them based on location in the data frame. For example, I want to do the calculation between Column 8 and Column 1, between Column 8 and Column 7, etc.

What is the best way to create a list of the operations to be done, and call upon that list in a for loop?

I have this so far (just doing the operation manually, repeating a lot of code):

import numpy as np
import pandas as pd

data = [[99,3,12,4,63,55,67,32,15,102,87,34,82,102,99,30,99,1]]

cols_m = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. FY24','5. FY25','6. FY26','7. FY27','8. FY28','9. FY29'],['Values','Sites']])

df = pd.DataFrame(data, columns = cols_m)

cols = df.columns.get_level_values(0).unique()

first_col = df.xs(cols[1], level=0, axis=1)
second_col = df.xs(cols[8], level=0, axis=1)
d = second_col - first_col
e = (second_col/first_col - 1) * 100
d = pd.concat({"{}-{}".format(cols[8], cols[1]): d}, axis=1)
e = pd.concat({"{}-{} %Change".format(cols[8], cols[1]): e}, axis=1)
df = pd.concat([df, d, e], axis=1)

del first_col, second_col, d, e
first_col = df.xs(cols[7], level=0, axis=1)
second_col = df.xs(cols[8], level=0, axis=1)
d = second_col - first_col
e = (second_col/first_col - 1) * 100
d = pd.concat({"{}-{}".format(cols[8], cols[7]): d}, axis=1)
e = pd.concat({"{}-{} %Change".format(cols[8], cols[7]): e}, axis=1)
df = pd.concat([df, d, e], axis=1)

and on and on, with different columns inserted...

I would ideally like to have something like below (same output), but I am not sure how to create the list:

my_list = [(8, 1), (8, 7)]   #etc. etc. 

all_dfs = []
for i, j in my_list: 
     first_col = df.xs(cols[i], level=0, axis=1)
     second_col = df.xs(cols[j], level=0, axis=1)
     d = second_col - first_col
     e = (second_col/first_col - 1) * 100
     d = pd.concat({"{}-{}".format(cols[j], cols[i]): d}, axis=1)
     e = pd.concat({"{}-{} %Change".format(cols[j], cols[i]): e}, axis=1)
     df = pd.concat([df, d, e], axis=1)

使用特定列对的列表创建for循环

英文:

I have a large data frame with many columns. I am trying to write a for loop that will do a couple of simple calculations between columns, but the columns must be specific, and I am identifying them based on location in the data frame. For example, I want to do the calculation between Column 8 and Column 1, between Column 8 and Column 7, etc.

What is the best way to create a list of the operations to be done, and call upon that list in a for loop?

I have this so far (just doing the operation manually, repeating a lot of code):



import numpy as np
import pandas as pd

data = [[99,3,12,4,63,55,67,32,15,102,87,34,82,102,99,30,99,1]]

cols_m = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. FY24','5. FY25','6. FY26','7. FY27','8. FY28','9. FY29'],['Values','Sites']])

df = pd.DataFrame(data, columns = cols_m)

cols = df.columns.get_level_values(0).unique()

first_col = df.xs(cols[1], level=0, axis=1)
second_col = df.xs(cols[8], level=0, axis=1)
d = second_col - first_col
e = (second_col/first_col - 1) * 100
d = pd.concat({f"{cols[8]}-{cols[1]}": d}, axis=1)
e = pd.concat({f"{cols[8]}-{cols[1]} %Change": e}, axis=1)
df = pd.concat([df, d, e], axis=1)



del first_col, second_col, d, e
first_col = df.xs(cols[7], level=0, axis=1)
second_col = df.xs(cols[8], level=0, axis=1)
d = second_col - first_col
e = (second_col/first_col - 1) * 100
d = pd.concat({f"{cols[8]}-{cols[7]}": d}, axis=1)
e = pd.concat({f"{cols[8]}-{cols[7]} %Change": e}, axis=1)
df = pd.concat([df, d, e], axis=1)

and on and on, with different columns inserted...

使用特定列对的列表创建for循环

I would ideally like to have something like below (same output), but I am not sure how to create the list:

list = {col[8] - col[1], col[8] - col[7]}   #etc. etc. 


all_dfs = []
for i, j in list: 
     first_col = df.xs(cols[i], level=0, axis=1)
     second_col = df.xs(cols[j], level=0, axis=1)
     d = second_col - first_col
     e = (second_col/first_col - 1) * 100
     d = pd.concat({f"{cols[j]}-{cols[i]}": d}, axis=1)
     e = pd.concat({f"{cols[j]}-{cols[i]} %Change": e}, axis=1)
     df = pd.concat([df, d, e], axis=1)

答案1

得分: 1

以下是翻译好的部分:

可以使用元组列表
```python
pairs = [(8, 1), (8, 7)]

l = [df]
for i, j in pairs:
    first_col = df.xs(cols[j], level=0, axis=1)
    second_col = df.xs(cols[i], level=0, axis=1)
    d = second_col - first_col
    e = (second_col/first_col - 1) * 100
    l.append(pd.concat({f"{cols[i]}-{cols[j]}": d,
                        f"{cols[i]}-{cols[j]} %Change": e},
                      axis=1)
             )

out = pd.concat(l, axis=1)

输出:

  1. FY21       2. FY22       3. FY23       4. FY24       5. FY25       6. FY26       7. FY27       8. FY28       9. FY29       9. FY29-2. FY22       9. FY29-2. FY22 %Change       9. FY29-2. FY22       9. FY29-2. FY22 %Change       9. FY29-8. FY28       9. FY29-8. FY28 %Change           
   Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites          Values Sites                  Values Sites          Values Sites                  Values Sites          Values Sites                  Values      Sites
0      99     3      12     4      63    55      67    32      15   102      87    34      82   102      99    30      99     1              87    -3                   725.0 -75.0              87    -3                   725.0 -75.0               0   -29                     0.0 -96.666667

<details>
<summary>英文:</summary>

You can use a list of tuples:

pairs = [(8, 1), (8, 7)]

l = [df]
for i, j in pairs:
first_col = df.xs(cols[j], level=0, axis=1)
second_col = df.xs(cols[i], level=0, axis=1)
d = second_col - first_col
e = (second_col/first_col - 1) * 100
l.append(pd.concat({f"{cols[i]}-{cols[j]}": d,
f"{cols[i]}-{cols[j]} %Change": e},
axis=1)
)

out = pd.concat(l, axis=1)

Output:
  1. FY21 2. FY22 3. FY23 4. FY24 5. FY25 6. FY26 7. FY27 8. FY28 9. FY29 9. FY29-2. FY22 9. FY29-2. FY22 %Change 9. FY29-2. FY22 9. FY29-2. FY22 %Change 9. FY29-8. FY28 9. FY29-8. FY28 %Change
    Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites Values Sites
    0 99 3 12 4 63 55 67 32 15 102 87 34 82 102 99 30 99 1 87 -3 725.0 -75.0 87 -3 725.0 -75.0 0 -29 0.0 -96.666667

</details>



huangapple
  • 本文由 发表于 2023年8月10日 10:36:59
  • 转载请务必保留本文链接:https://go.coder-hub.com/76872327.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定