How to reshape (pivot_wider and pivot_longer) a pandas DataFrame
Question
I'm struggling to reshape a (slightly complicated) DataFrame using pandas.

I have tried `pd.melt` and `.pivot(index=, columns=, values=)`, but neither gives quite the result I intended.

As is:

| date | location | char1 | char2 |
|---|---|---|---|
| 22-01 | A | a | x |
| 22-01 | B | b | y |
| 22-01 | C | c | z |

To be:

| date | char | location.A | location.B | location.C |
|---|---|---|---|---|
| 22-01 | char1 | a | b | c |
| 22-01 | char2 | x | y | z |
Answer 1
Score: 2
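Use `DataFrame.melt`, then `DataFrame.pivot`:

```python
# `df` is the DataFrame from the question.
df1 = (df.melt(['date', 'location'], var_name='char')  # long format: date, location, char, value
         # a list-valued `index` requires pandas 1.1 or later
         .pivot(index=['date', 'char'], columns='location', values='value')
         .add_prefix('location.')      # A -> location.A, B -> location.B, ...
         .reset_index()                # turn date and char back into columns
         .rename_axis(None, axis=1))   # drop the leftover 'location' columns name
print(df1)
```

```
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
```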
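To see why the melt-then-pivot chain works, it can help to run the two steps separately and inspect the intermediate long-format frame. The sketch below is illustrative only: the `df` construction is an assumption reconstructed from the question's "as is" table, and the names `long_df` and `wide_df` are not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question's "as is" table (an assumption).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# Step 1: melt to long format, giving one row per (date, location, char).
# The former column names char1/char2 land in the new "char" column,
# and their cell contents land in the default "value" column.
long_df = df.melt(id_vars=["date", "location"], var_name="char")
print(long_df)

# Step 2: pivot so that each location becomes its own column,
# keeping (date, char) as the row index.
wide_df = long_df.pivot(index=["date", "char"], columns="location", values="value")
print(wide_df)
```

The remaining calls in the answer (`add_prefix`, `reset_index`, `rename_axis`) only tidy up the labels; the reshaping itself is done by these two steps.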
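Or use `DataFrame.set_index` with `DataFrame.stack` and `Series.unstack`:

```python
# `df` is the DataFrame from the question.
df1 = (df.set_index(['date', 'location'])
         .rename_axis('char', axis=1)  # name the columns axis so the stacked level is called 'char'
         .stack()                      # char1/char2 move into the row index
         .unstack(level=1)             # level 1 ('location') moves out to the columns
         .add_prefix('location.')
         .reset_index()
         .rename_axis(None, axis=1))
print(df1)
```

```
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
```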
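The stack/unstack route can be followed the same way by stopping after `stack()` and checking which index levels exist at that point. This is a minimal sketch under the same assumption as before: `df` is rebuilt from the question's table, and `stacked` and `wide_df` are illustrative names.

```python
import pandas as pd

# Sample data reconstructed from the question's "as is" table (an assumption).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# After set_index only char1/char2 remain as columns; stack() moves them
# into the row index, producing a Series with three index levels.
stacked = (df.set_index(["date", "location"])
             .rename_axis("char", axis=1)
             .stack())
print(list(stacked.index.names))

# unstack(level=1) moves the "location" level back out to the columns,
# leaving (date, char) as the row index.
wide_df = stacked.unstack(level=1)
print(wide_df)
```

Because the levels are named, `unstack(level="location")` should be an equivalent and arguably more readable way to pick the level.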
Answer 2
Score: 1
You can use janitor (the pyjanitor package) with `pivot_wider` and `pivot_longer`:

```python
# pip install pyjanitor
import janitor  # registers pivot_wider / pivot_longer as DataFrame methods

# `df` is the DataFrame from the question.
(df.pivot_wider(index='date', names_from='location',
                names_glue="{_value}_location.{location}")  # e.g. char1_location.A
   .pivot_longer(index='date', names_to=('char', '.value'), names_sep='_')
)
```

Output:

```
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
```
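With pure pandas, you can use a transpose (`T`) between `set_index` and `stack`:

```python
# `df` is the DataFrame from the question.
out = (df.set_index(['location', 'date']).T  # rows: char1/char2, columns: (location, date)
         .rename_axis('char')
         .stack()                            # the innermost column level (date) moves to the rows
         .add_prefix('location.')
         .reset_index()
         .rename_axis(columns=None))
```

Output:

```
    char   date location.A location.B location.C
0  char1  22-01          a          b          c
1  char2  22-01          x          y          z
```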
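Note that the transpose approach puts `char` before `date`, whereas the question's target table has `date` first. If the column order matters, one possible adjustment (a sketch, not part of the original answer) is to swap the two index levels before resetting the index. The `df` construction is again an assumption based on the question's table.

```python
import pandas as pd

# Sample data reconstructed from the question's "as is" table (an assumption).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

out = (df.set_index(["location", "date"]).T
         .rename_axis("char")
         .stack()                 # row index is now (char, date)
         .swaplevel()             # swap the two levels: (date, char)
         .add_prefix("location.")
         .reset_index()           # date comes out first, then char
         .rename_axis(columns=None))
print(out)
```

Reordering after the fact with plain column selection, for example `out[["date", "char", ...]]`, would work as well; `swaplevel` just keeps everything inside the method chain.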