2023-08-10 10:36:59

Creating a for loop using list of specific pairs of columns

Question

I have a large data frame with many columns. I am trying to write a for loop that will do a couple of simple calculations between columns, but the columns must be specific, and I am identifying them based on location in the data frame. For example, I want to do the calculation between Column 8 and Column 1, between Column 8 and Column 7, etc.

What is the best way to create a list of the operations to be done, and call upon that list in a for loop?

I have this so far (just doing the operation manually, repeating a lot of code):

import numpy as np
import pandas as pd
data = [[99,3,12,4,63,55,67,32,15,102,87,34,82,102,99,30,99,1]]
cols_m = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. FY24','5. FY25','6. FY26','7. FY27','8. FY28','9. FY29'],['Values','Sites']])
df = pd.DataFrame(data, columns = cols_m)
cols = df.columns.get_level_values(0).unique()
first_col = df.xs(cols[1], level=0, axis=1)
second_col = df.xs(cols[8], level=0, axis=1)
d = second_col - first_col
e = (second_col/first_col - 1) * 100
d = pd.concat({"{}-{}".format(cols[8], cols[1]): d}, axis=1)
e = pd.concat({"{}-{} %Change".format(cols[8], cols[1]): e}, axis=1)
df = pd.concat([df, d, e], axis=1)
del first_col, second_col, d, e
first_col = df.xs(cols[7], level=0, axis=1)
second_col = df.xs(cols[8], level=0, axis=1)
d = second_col - first_col
e = (second_col/first_col - 1) * 100
d = pd.concat({"{}-{}".format(cols[8], cols[7]): d}, axis=1)
e = pd.concat({"{}-{} %Change".format(cols[8], cols[7]): e}, axis=1)
df = pd.concat([df, d, e], axis=1)
and on and on, with different columns inserted...

I would ideally like to have something like below (same output), but I am not sure how to create the list:

my_list = [(8, 1), (8, 7)]   # (later column, earlier column), etc.
for i, j in my_list:
    # cols[i] is the later period and cols[j] the earlier one,
    # matching the manual code above (second_col - first_col)
    first_col = df.xs(cols[j], level=0, axis=1)
    second_col = df.xs(cols[i], level=0, axis=1)
    d = second_col - first_col
    e = (second_col/first_col - 1) * 100
    d = pd.concat({"{}-{}".format(cols[i], cols[j]): d}, axis=1)
    e = pd.concat({"{}-{} %Change".format(cols[i], cols[j]): e}, axis=1)
    df = pd.concat([df, d, e], axis=1)

Answer 1

Score: 1

You can use a list of tuples:
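```python
# Each tuple is (later column position, earlier column position) in `cols`
pairs = [(8, 1), (8, 7)]
pieces = [df]  # start with the original frame, then collect the new blocks
for i, j in pairs:
    first_col = df.xs(cols[j], level=0, axis=1)
    second_col = df.xs(cols[i], level=0, axis=1)
    d = second_col - first_col             # absolute difference
    e = (second_col/first_col - 1) * 100   # percent change
    pieces.append(pd.concat({f"{cols[i]}-{cols[j]}": d,
                             f"{cols[i]}-{cols[j]} %Change": e},
                            axis=1)
                  )
# Concatenate once at the end instead of growing `df` inside the loop
out = pd.concat(pieces, axis=1)
```

Output:

```
  1. FY21       2. FY22       3. FY23       4. FY24       5. FY25       6. FY26       7. FY27       8. FY28       9. FY29       9. FY29-2. FY22       9. FY29-2. FY22 %Change       9. FY29-2. FY22       9. FY29-2. FY22 %Change       9. FY29-8. FY28       9. FY29-8. FY28 %Change
   Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites  Values Sites          Values Sites                  Values Sites          Values Sites                  Values Sites          Values Sites                  Values      Sites
0      99     3      12     4      63    55      67    32      15   102      87    34      82   102      99    30      99     1              87    -3                   725.0 -75.0              87    -3                   725.0 -75.0               0   -29                     0.0 -96.666667
```

Note that the `9. FY29-2. FY22` columns appear twice in this output. This suggests the loop was run on a `df` that already held those columns from the manual step in the question; starting from a freshly built `df`, each pair should appear only once.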
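If the same calculation is needed in several places, the loop above could be wrapped in a small helper. The sketch below is only one possible arrangement: the function name `add_pair_changes` and its `level` parameter are made up for illustration, and the pairs are built with a list comprehension on the assumption that the last period is compared against a hand-picked set of earlier ones.

```python
import pandas as pd


def add_pair_changes(df, pairs, level=0):
    """Append difference and %-change blocks for each (later, earlier) pair.

    `pairs` holds positions into the unique top-level column labels,
    so (8, 1) means cols[8] compared against cols[1].
    (Hypothetical helper, not part of pandas.)
    """
    cols = df.columns.get_level_values(level).unique()
    pieces = [df]
    for later, earlier in pairs:
        new = df.xs(cols[later], level=level, axis=1)
        old = df.xs(cols[earlier], level=level, axis=1)
        label = f"{cols[later]}-{cols[earlier]}"
        pieces.append(pd.concat({label: new - old,
                                 f"{label} %Change": (new / old - 1) * 100},
                                axis=1))
    # Single concat at the end; the input frame is left unchanged
    return pd.concat(pieces, axis=1)


# Same sample data as in the question
data = [[99, 3, 12, 4, 63, 55, 67, 32, 15, 102, 87, 34, 82, 102, 99, 30, 99, 1]]
years = ['1. FY21', '2. FY22', '3. FY23', '4. FY24', '5. FY25',
         '6. FY26', '7. FY27', '8. FY28', '9. FY29']
cols_m = pd.MultiIndex.from_product([years, ['Values', 'Sites']])
df = pd.DataFrame(data, columns=cols_m)

# Build the pairs programmatically: position 8 against positions 1 and 7
pairs = [(8, earlier) for earlier in (1, 7)]
out = add_pair_changes(df, pairs)
```

Because the helper reads from the original `df` and returns a new frame, calling it twice does not duplicate the derived columns the way repeated in-place concatenation can.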


