Sort pandas dataframe columns on second row order
Question
I need to reorder the columns of a dataframe based on the values in one of its rows (the row at index 0, which is the second line of the printed frame). For example:
import pandas as pd
data = {'1a': ['C', 3, 1], '2b': ['B', 2, 3], '3c': ['A', 5, 2]}
df = pd.DataFrame(data)
df
Output:
  1a 2b 3c
0  C  B  A
1  3  2  5
2  1  3  2
Desired output:
  3c 2b 1a
0  A  B  C
1  5  2  3
2  2  3  1
So the columns have been ordered based on the row at index 0, that is, on A, B, C.
I have tried many sorting options without success.
A quick way to accomplish this would be helpful, but granular control would be even better: ordering the elements and then moving a specific column to the first position, for example moving "C" to the first column.
Something like: make a list, sort it, move one element to the front, and reorder the dataframe on that list.
mylist = ['B', 'A', 'C']
mylist.sort()
mylist.insert(0, mylist.pop(mylist.index('C')))
Then sorting the dataframe on ['C', 'A', 'B'] would output:
  1a 3c 2b
0  C  A  B
1  3  5  2
2  1  2  3
Answer 1
Score: 2
You can try:
df = df[df.iloc[0].sort_values().index]
Returning:
  3c 2b 1a
0  A  B  C
1  5  2  3
2  2  3  1
Here you take the first row (`df.iloc[0]`), sort its values, and use the index of the sorted result, which holds the column names, to select the columns in that order. This gives you a lot of flexibility; you can even sort by multiple columns/rows this way.
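As a rough sketch of the "multiple rows" remark, the snippet below first repeats the single-row ordering using `DataFrame.sort_values` with `axis=1`, then breaks a tie in row 0 with row 1 by sorting the transposed frame. The `tied` frame and the variable names are made up for illustration; they are not from the original post.

```python
import pandas as pd

data = {'1a': ['C', 3, 1], '2b': ['B', 2, 3], '3c': ['A', 5, 2]}
df = pd.DataFrame(data)

# Same ordering as the one-liner above, written with axis=1:
# sort the columns by the values in the row labelled 0.
by_row0 = df.sort_values(by=0, axis=1)
print(list(by_row0.columns))

# Hypothetical data with a tie in row 0, so row 1 has to break it.
tied = pd.DataFrame({'1a': ['B', 3], '2b': ['A', 2], '3c': ['A', 1]})

# Transpose so each original column becomes a row, sort on rows 0 then 1
# (now columns 0 and 1), and reuse the resulting index as the column order.
order = tied.T.sort_values(by=[0, 1]).index
by_two_rows = tied[order]
print(list(by_two_rows.columns))
```

Both approaches assume the values within each sorting row are mutually comparable (all strings or all numbers); a row that mixes strings and integers would likely raise a `TypeError` when sorted.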
Answer 2
Score: 1
If you want to move a specific column to the first position, you can adapt the code as in this example, which sorts on the row at index 1 (`df.iloc[1]`):
import pandas as pd
data = {"1a": ["C", 3, 1],
"2b": ["B", 2, 3],
"3c": ["A", 5, 2]}
df = pd.DataFrame(data)
print(df)
# 1a 2b 3c
# 0 C B A
# 1 3 2 5
# 2 1 3 2
# Get the second row and convert it to a list
second_row = df.iloc[1, :].tolist()
print(second_row)
# [3, 2, 5]
# Define the column you want to move to the first position
target_column = "1a"
# Sort the list
sorted_columns = sorted(range(len(second_row)), key=lambda k: second_row[k])
print(sorted_columns)
# [1, 0, 2]
# Move the target column to the first position
sorted_columns.remove(df.columns.get_loc(target_column))
sorted_columns.insert(0, df.columns.get_loc(target_column))
# sorted_columns is now [0, 1, 2], which happens to match the original order
# Reorder the columns of the DataFrame based on the sorted list
df = df.iloc[:, sorted_columns]
print(df)
# 1a 2b 3c
# 0 C B A
# 1 3 2 5
# 2 1 3 2
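For comparison, here is a minimal sketch of the same "sort, then move one column to the front" idea applied to the row at index 0 and working with column labels rather than positions, which is what the question's desired output is based on. The names `row0`, `ordered`, and `first` are illustrative only.

```python
import pandas as pd

data = {"1a": ["C", 3, 1],
        "2b": ["B", 2, 3],
        "3c": ["A", 5, 2]}
df = pd.DataFrame(data)

# Row 0 holds the letters to sort on; its index holds the column names.
row0 = df.iloc[0]

# Column names ordered by the letters in row 0.
ordered = row0.sort_values().index.tolist()

# Find the column whose row-0 value is "C" and move it to the front.
first = row0[row0 == "C"].index[0]
ordered.insert(0, ordered.pop(ordered.index(first)))

result = df[ordered]
print(result)
```

Selecting by label (`df[ordered]`) avoids the `get_loc` round trip, at the cost of assuming the column names are unique.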
Answer 3
Score: 0
With the help of pyjedy and Stingher, I was able to resolve this issue. Part of the problem was my input: it consisted of lists rather than a dictionary, so I had to transpose it. As a result, both the rows and the columns are labelled with integer indexes, and picking the target column means looking up its position in the row.
import pandas as pd
def search_list_for_pattern(lst, pattern):
    # Return the position of the first item that contains `pattern`
    for idx, item in enumerate(lst):
        if pattern in item:
            return idx
    raise ValueError(f"{pattern!r} not found")
data = [['1a', 'B', 2, 3], ['2b', 'C', 3, 1], ['3c', 'A', 5, 2]]
df = pd.DataFrame(data).transpose()
print(df)
#     0   1   2
# 0  1a  2b  3c
# 1   B   C   A
# 2   2   3   5
# 3   3   1   2
# Get the second row and convert it to a list
second_row = df.iloc[1, :].tolist()
print(second_row)
# ['B', 'C', 'A']
# Find the index of the column you want to move to the first position
target_column = search_list_for_pattern(second_row, "C")
print(target_column)
# 1
# Sort the column positions by the values in the second row
sorted_columns = sorted(range(len(second_row)), key=lambda k: second_row[k])
print(sorted_columns)
# [2, 0, 1]
# Move the target column to the first position
sorted_columns.remove(df.columns.get_loc(target_column))
sorted_columns.insert(0, df.columns.get_loc(target_column))
# Reorder the columns of the DataFrame based on the sorted list
df = df.iloc[:, sorted_columns]
print(df)
#     1   2   0
# 0  2b  3c  1a
# 1   C   A   B
# 2   3   5   2
# 3   1   2   3
df.to_excel('ordered.xlsx', sheet_name='Sheet1', index=False, header=False)
[![enter image description here][1]][1]
[1]: https://i.stack.imgur.com/kcgzy.png
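One possible alternative for list-of-lists input, sketched under the assumption that the first element of each inner list is meant to be the column name: build the frame from a dictionary comprehension instead of transposing, so the columns keep their labels and no position lookup is needed. The variable names are illustrative.

```python
import pandas as pd

data = [['1a', 'B', 2, 3], ['2b', 'C', 3, 1], ['3c', 'A', 5, 2]]

# Assumption: the first element of each inner list is the column name
# and the remaining elements are that column's values.
df = pd.DataFrame({row[0]: row[1:] for row in data})

# Order the column names by the letters in row 0.
row0 = df.iloc[0]
ordered = row0.sort_values().index.tolist()

# Move the column whose row-0 value is 'C' to the front.
first = row0[row0 == 'C'].index[0]
ordered.insert(0, ordered.pop(ordered.index(first)))

df = df[ordered]
print(df)
```

With named columns, an export would normally keep the header row rather than pass `header=False`, since the names are no longer stored in the first data row.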