Add column names in an incremental manner using pandas
Question
I have a DataFrame whose columns are week, year, value, created date, plus other columns that have numbers as their column names (see the attached image). Except for the category, week, created_date and year columns, I need all columns to have a random column name in this manner: col1, col2, col3, using pandas.
Answer 1
Score: 4
Assuming this example:
22 57 64 category week created_date year X
0 x x x x x x x x
You can use a custom function to rename your columns, with the help of itertools.count:
from itertools import count
keep = {'category', 'week', 'created_date', 'year'}
c = count(1)
out = df.rename(columns=lambda x: x if x in keep else f'col{next(c)}')
Output:
col1 col2 col3 category week created_date year col4
0 x x x x x x x x
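For anyone who wants to try this end to end, here is a minimal self-contained sketch. The one-row DataFrame is a hypothetical stand-in for the example above, and the names `counter` and `name` are only illustrative. It relies on rename applying the function to each column label in order, so the counter advances only for labels outside `keep`.

```python
from itertools import count

import pandas as pd

# Hypothetical one-row frame mirroring the example above
# (three integer labels, four protected labels, one extra string label).
df = pd.DataFrame(
    [["x"] * 8],
    columns=[22, 57, 64, "category", "week", "created_date", "year", "X"],
)

keep = {"category", "week", "created_date", "year"}
counter = count(1)  # yields 1, 2, 3, ... on each next() call

# Labels in `keep` pass through unchanged; every other label
# consumes the next number from the counter.
out = df.rename(columns=lambda name: name if name in keep else f"col{next(counter)}")

print(list(out.columns))
```

Because the counter is consumed as it goes, create a fresh `count(1)` before each rename call, otherwise numbering continues where the previous call stopped.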
Alternatively, if you want to rename the integers only:
from itertools import count
c = count(1)
out = df.rename(columns=lambda x: f'col{next(c)}' if isinstance(x, int) else x)
Output:
col1 col2 col3 category week created_date year X
0 x x x x x x x x
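One caveat with the isinstance check: if the frame was read from a file, numeric labels may arrive as strings such as "22" rather than integers. The sketch below is an assumption about that situation, not part of the original answer; `is_numeric_label` is a hypothetical helper that accepts both integer labels and digit-only strings.

```python
from itertools import count
from numbers import Integral

import pandas as pd

# Assumed scenario: some numeric labels are strings, one is a real integer.
df = pd.DataFrame([["x"] * 5], columns=["22", "57", 64, "week", "X"])


def is_numeric_label(label):
    """Return True for integer labels and for strings made only of digits."""
    return isinstance(label, Integral) or (isinstance(label, str) and label.isdigit())


counter = count(1)
out = df.rename(
    columns=lambda label: f"col{next(counter)}" if is_numeric_label(label) else label
)

print(list(out.columns))
```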
Random names
You can also use UUIDs (here short uuids for clarity):
# pip install shortuuid
import shortuuid
df.rename(columns=lambda x: shortuuid.uuid() if isinstance(x, int) else x)
Example output:
QCU9M2rxT4L2U8DGUAixkW fsxdN3aLaVDVjq3HAGCyYW HoGLKTRctpubfwqUp6rmEF
0 x x x \
category week created_date year X
0 x x x x x
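If installing a third-party package is not an option, a similar effect can probably be had with the standard library's uuid module. This is a swapped-in variant rather than the approach shown above, and the small DataFrame is a hypothetical stand-in.

```python
import uuid

import pandas as pd

# Hypothetical frame: three integer labels and two string labels.
df = pd.DataFrame([["x"] * 5], columns=[22, 57, 64, "week", "X"])

# uuid.uuid4().hex is a random 32-character hexadecimal string,
# so each integer label gets its own random name.
out = df.rename(
    columns=lambda label: uuid.uuid4().hex if isinstance(label, int) else label
)

print(list(out.columns))
```

The names differ on every run, so keep a mapping of old to new labels if the original names are needed later.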
Answer 2
Score: 1
Another possible solution:
to_exclude = ["category", "week", "created_date", "year"]
to_rename = df.columns.difference(to_exclude) #[22, 57, 64]
df.rename(columns={c:f"col{n}" for (c,n) in zip(to_rename, range(1, to_rename.size+1))})
Output (columns):
#['col1', 'col2', 'col3', 'category', 'week', 'created_date', 'year']
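Note that Index.difference tries to sort its result, so with this approach the numbering follows the sorted labels rather than their position in the frame. If positional order matters, a list comprehension is one way to keep it. The sketch below assumes a hypothetical frame whose numeric labels are deliberately out of order; `mapping` is an illustrative name.

```python
import pandas as pd

# Hypothetical frame with numeric labels in non-sorted order.
df = pd.DataFrame(
    [["x"] * 7],
    columns=[64, 22, 57, "category", "week", "created_date", "year"],
)
to_exclude = {"category", "week", "created_date", "year"}

# Collect the labels to rename in their original left-to-right order.
to_rename = [label for label in df.columns if label not in to_exclude]

# Map each label to col1, col2, ... by position.
mapping = {label: f"col{n}" for n, label in enumerate(to_rename, start=1)}
out = df.rename(columns=mapping)

print(mapping)
print(list(out.columns))
```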