2023-02-06 14:07:19

Pandas: move selected columns to the beginning and keep the rest in their original order

Question

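For example, I have a DataFrame with many columns. The exact number of columns is not fixed, e.g. somewhere between 10 and 20.

The column names are as follows:

RecordID, price, company, date, feature1, return, some_inf, feature2, feature3, ...

Sample data:

import pandas as pd

column_names = ["RecordID", "price", "company", "date", "feature1", "return", "some_inf", "feature2", "feature3"]
values = [1, 9.99, "ABC", 20230101, 888, 0.666, "happy_everyday", "helloworld", "test"]
df = pd.DataFrame(values).T
df.columns = column_names

Among all these columns, I would like to pick out some columns (if they exist) and put them at the beginning, with the remaining columns following in their original order. For example, if I want to select date, volume, price, return, then the output (with re-ordered columns) will be:

date, price, return, RecordID, company, feature1, some_inf, feature2, feature3, ...

The volume column does not exist in the original DataFrame, so it should not appear in the final output either. That is, the output DataFrame should start with the columns from the selection list (those that also exist in the original DataFrame), followed by the columns not in the list, in their original order.

Is there a fast way to implement this?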

Answer 1

Score: 3

Use Index.intersection to get the selected columns that exist in the DataFrame (in the order of the selection list), then use Index.append to add the remaining columns obtained from Index.difference:

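cols = ['date', 'volume', 'price', 'return']
# Selected columns present in df (in the order of cols), followed by
# the remaining columns of df in their original order.
new = (pd.Index(cols).intersection(df.columns, sort=False)
         .append(df.columns.difference(cols, sort=False)))
df = df[new]
print(df)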
       date price return RecordID company feature1        some_inf  \
0  20230101  9.99  0.666        1     ABC      888  happy_everyday   
     feature2 feature3  
0  helloworld     test
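
As an alternative, the same reordering can be sketched with plain list comprehensions instead of Index set operations. This is only a minimal sketch: the helper name move_columns_first is hypothetical (it is not part of pandas), it assumes the column labels are unique and hashable, and it builds the sample frame with pd.DataFrame([values], columns=...) rather than the transpose used in the question.

```python
import pandas as pd


def move_columns_first(df, first):
    """Return df with the columns listed in `first` (if present) moved to the front.

    Hypothetical helper for illustration; assumes unique column labels.
    """
    present = set(df.columns)
    wanted = set(first)
    # Keep the order of the selection list, skip names missing from df,
    # and drop repeated names in the selection list.
    front = [c for c in dict.fromkeys(first) if c in present]
    # Everything else keeps its original position relative to the other columns.
    rest = [c for c in df.columns if c not in wanted]
    return df[front + rest]


column_names = ["RecordID", "price", "company", "date", "feature1",
                "return", "some_inf", "feature2", "feature3"]
values = [1, 9.99, "ABC", 20230101, 888, 0.666,
          "happy_everyday", "helloworld", "test"]
df = pd.DataFrame([values], columns=column_names)

result = move_columns_first(df, ["date", "volume", "price", "return"])
print(list(result.columns))
# ['date', 'price', 'return', 'RecordID', 'company', 'feature1', 'some_inf', 'feature2', 'feature3']
```

The list-based version does not depend on how sort=False is handled by the Index methods, which may make the ordering easier to reason about. For a frame with 10 to 20 columns, either approach should be fast enough.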
