How to reorder columns on a subclassed pandas DataFrame
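Question
I want to reorder the columns of a subclassed pandas DataFrame.
I understand from this question [1] that there may be better approaches than subclassing DataFrame, but I would still like to know how to handle this.
Without subclassing, I would do it the usual way: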
import pandas as pd
data = {'Description':['mydesc'], 'Name':['myname'], 'Symbol':['mysymbol']}
df = pd.DataFrame(data)
df = df[['Symbol', 'Name', 'Description']]
With subclassing, however, the same approach does not reorder the columns:
import pandas as pd

class SubDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self = self._reorder_columns()

    def _reorder_columns(self):
        first_columns = ['Symbol', 'Name', 'Description']
        return self[first_columns + [c for c in self.columns if c not in first_columns]]

data = {'Description':['mydesc'], 'Name':['myname'], 'Symbol':['mysymbol']}
df = SubDataFrame(data)
I believe my mistake is reassigning self, which has no effect.
How can I reorder the columns of the subclassed DataFrame?
1: https://stackoverflow.com/a/35619846/3010217
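To illustrate why assigning to self inside __init__ changes nothing, here is a minimal sketch in plain Python. The Box class is a hypothetical stand-in and has nothing to do with pandas; it only shows that the assignment rebinds a local name while the caller still receives the object that was originally constructed.

```python
class Box:
    def __init__(self, items):
        self.items = list(items)
        # This rebinds only the local name `self` to a brand-new object.
        # The instance that the caller receives is left untouched.
        self = Box.__new__(Box)
        self.items = ["replaced"]


box = Box(["original"])
print(box.items)  # ['original']
```

Any change has to be applied to the existing object, or the reordering has to happen before the object is built.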
Answer 1
Score: 1
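Pandas methods that accept an inplace parameter use the private method _update_inplace. You could do the same, but keep an eye on future pandas development in case this method changes: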
import pandas as pd

class SubDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # _update_inplace is private pandas API: it replaces this object's data in place
        self._update_inplace(self._reorder_columns())

    def _reorder_columns(self):
        # Preferred columns first, then any remaining columns in their original order
        first_columns = ['Symbol', 'Name', 'Description']
        return self[first_columns + [c for c in self.columns if c not in first_columns]]

data = {'Description':['mydesc'], 'Name':['myname'], 'Symbol':['mysymbol']}
df = SubDataFrame(data)
Output:
     Symbol    Name Description
0  mysymbol  myname      mydesc
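If relying on a private method is a concern, one possible alternative is to reorder the columns before the subclass instance is created, using only public pandas calls. The sketch below is not part of the answer above; the from_data classmethod and the FIRST_COLUMNS attribute are hypothetical names chosen for illustration, and it assumes all three preferred columns are present in the input.

```python
import pandas as pd


class SubDataFrame(pd.DataFrame):
    # Hypothetical class attribute: columns that should come first.
    FIRST_COLUMNS = ['Symbol', 'Name', 'Description']

    @classmethod
    def from_data(cls, *args, **kwargs):
        """Hypothetical factory: build a plain DataFrame, reorder it, then wrap it."""
        frame = pd.DataFrame(*args, **kwargs)
        ordered = cls.FIRST_COLUMNS + [
            c for c in frame.columns if c not in cls.FIRST_COLUMNS
        ]
        # Passing an existing DataFrame to the constructor keeps its column order.
        return cls(frame[ordered])


data = {'Description': ['mydesc'], 'Name': ['myname'], 'Symbol': ['mysymbol']}
df = SubDataFrame.from_data(data)
print(list(df.columns))  # ['Symbol', 'Name', 'Description']
```

The trade-off is that callers must use SubDataFrame.from_data(...) rather than SubDataFrame(...) to get the reordered columns, while __init__ stays untouched and no private pandas method is involved.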