Sort pandas dataframe columns on second row order

Question

I need to sort a dataframe based on the order of the second row. For example:

    import pandas as pd
    data = {'1a': ['C', 3, 1], '2b': ['B', 2, 3], '3c': ['A', 5, 2]}
    df = pd.DataFrame(data)
    df

Output:

      1a 2b 3c
    0  C  B  A
    1  3  2  5
    2  1  3  2

Desired output:

      3c 2b 1a
    0  A  B  C
    1  5  2  3
    2  2  3  1

So the columns have been ordered based on the row at index 0, that is, on A, B, C.

I have tried many sorting options without success.

Having a quick way to accomplish this would be beneficial, but having granular control to both order the elements and move a specific column to the first position would be even better. For example move "C" to the first column.

Something like: make a list, sort it, move one element to the front, and reorder the columns on that list.

    mylist = ['B', 'A', 'C']
    mylist.sort()
    mylist.insert(0, mylist.pop(mylist.index('C')))

Then sorting the dataframe on ['C', 'A', 'B'] would output:

      1a 3c 2b
    0  C  A  B
    1  3  5  2
    2  1  2  3

Answer 1

Score: 2

You can try with:

    df = df[df.iloc[0].sort_values().index]

Returning:

      3c 2b 1a
    0  A  B  C
    1  5  2  3
    2  2  3  1

In this case you take the first row, sort it, and use the index of the sorted result (the column labels in sorted order) to reorder the columns. This gives you a lot of flexibility: you can even sort by multiple columns or rows this way.
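
A minimal sketch of how the one-liner above could be combined with the question's request to move one column to the front. The names row0, order, first, and result are illustrative and not part of the answer, and the sketch assumes the value "C" occurs exactly once in the first row.

```python
import pandas as pd

data = {"1a": ["C", 3, 1], "2b": ["B", 2, 3], "3c": ["A", 5, 2]}
df = pd.DataFrame(data)

# First row as a Series: index = column labels, values = 'C', 'B', 'A'.
row0 = df.iloc[0]

# Column labels ordered by the first-row values.
order = row0.sort_values().index.tolist()

# Label of the column whose first-row value is "C" (assumed to exist once).
first = row0[row0 == "C"].index[0]

# Move that label to the front, as in the question's list manipulation.
order.insert(0, order.pop(order.index(first)))

result = df[order]
print(result)
```

Selecting by label keeps the step readable; the reordering itself is the same list pop/insert idiom the question proposes.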

Answer 2

Score: 1

If you want to move a specific column to the first position, you can modify the code as in this example:

    import pandas as pd

    data = {"1a": ["C", 3, 1],
            "2b": ["B", 2, 3],
            "3c": ["A", 5, 2]}
    df = pd.DataFrame(data)
    print(df)
    #   1a 2b 3c
    # 0  C  B  A
    # 1  3  2  5
    # 2  1  3  2

    # Get the second row and convert it to a list
    second_row = df.iloc[1, :].tolist()
    print(second_row)
    # [3, 2, 5]

    # Define the column you want to move to the first position
    target_column = "1a"

    # Sort the column positions by the values in the second row
    sorted_columns = sorted(range(len(second_row)), key=lambda k: second_row[k])
    print(sorted_columns)
    # [1, 0, 2]

    # Move the target column to the first position
    sorted_columns.remove(df.columns.get_loc(target_column))
    sorted_columns.insert(0, df.columns.get_loc(target_column))

    # Reorder the columns of the DataFrame based on the sorted list
    df = df.iloc[:, sorted_columns]
    print(df)
    #   1a 2b 3c
    # 0  C  B  A
    # 1  3  2  5
    # 2  1  3  2
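
The example above sorts on the numeric row at position 1, so its final column order happens to match the input. As a hedged sketch, the same positional technique applied to the row at position 0 (the letters the question sorts on) might look like this; key_row, positions, target, and result are illustrative names, and the sketch assumes "C" appears in that row.

```python
import pandas as pd

data = {"1a": ["C", 3, 1], "2b": ["B", 2, 3], "3c": ["A", 5, 2]}
df = pd.DataFrame(data)

# Values of the row at position 0 as a plain list: ['C', 'B', 'A'].
key_row = df.iloc[0, :].tolist()

# Column positions ordered by those values.
positions = sorted(range(len(key_row)), key=lambda k: key_row[k])

# Position of the column holding "C" (list.index raises ValueError if absent).
target = key_row.index("C")

# Move that position to the front.
positions.remove(target)
positions.insert(0, target)

result = df.iloc[:, positions]
print(result)
```

Working with integer positions avoids any dependence on the column labels, which is useful when the labels are not unique or not meaningful.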

Answer 3

Score: 0

With the help of pyjedy and Stingher, I was able to resolve this issue. One of the problems was due to my input: it consisted of lists instead of a dictionary, so I needed to transpose it. As a result, I had integer indexes both for the rows and across the top for the columns, so selecting a column by a value in the list required looking up its index first.

    import pandas as pd

    def search_list_for_pattern(lst, pattern):
        """Return the index of the first item that contains pattern."""
        for idx, item in enumerate(lst):
            if pattern in item:
                return idx
        # Fail loudly instead of silently returning the last index
        raise ValueError(f"{pattern!r} not found")

    data = [['1a', 'B', 2, 3], ['2b', 'C', 3, 1], ['3c', 'A', 5, 2]]
    df = pd.DataFrame(data).transpose()
    print(df)
    #     0   1   2
    # 0  1a  2b  3c
    # 1   B   C   A
    # 2   2   3   5
    # 3   3   1   2

    # Get the second row and convert it to a list
    second_row = df.iloc[1, :].tolist()
    print(second_row)
    # ['B', 'C', 'A']

    # Find the index of the column you want to move to the first position
    target_column = search_list_for_pattern(second_row, "C")
    print(target_column)
    # 1

    # Sort the column positions by the values in the second row
    sorted_columns = sorted(range(len(second_row)), key=lambda k: second_row[k])
    print(sorted_columns)
    # [2, 0, 1]

    # Move the target column to the first position
    sorted_columns.remove(df.columns.get_loc(target_column))
    sorted_columns.insert(0, df.columns.get_loc(target_column))

    # Reorder the columns of the DataFrame based on the sorted list
    df = df.iloc[:, sorted_columns]
    print(df)
    #     1   2   0
    # 0  2b  3c  1a
    # 1   C   A   B
    # 2   3   5   2
    # 3   1   2   3

    df.to_excel('ordered.xlsx', sheet_name='Sheet1', index=False, header=False)

[![enter image description here][1]][1]

[1]: https://i.stack.imgur.com/kcgzy.png

huangapple
  • Published on June 2, 2023 at 02:16:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76384653.html