How to reshape (pivot_wider and pivot_longer) a pandas DataFrame
Question
I'm struggling to reshape a (slightly complicated) DataFrame using pandas.
I have tried pd.melt and .pivot(index=, columns=, values=), but neither gives exactly the result I intended.
Current data:
| date | location | char1 | char2 |
|---|---|---|---|
| 22-01 | A | a | x |
| 22-01 | B | b | y |
| 22-01 | C | c | z |
Desired result:
| date | char | location.A | location.B | location.C |
|---|---|---|---|---|
| 22-01 | char1 | a | b | c |
| 22-01 | char2 | x | y | z |
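To try the answers locally, the two tables above can be rebuilt as DataFrames. This is a minimal sketch, assuming every value is a plain string and that date is left unparsed; the name `expected` is introduced here only for illustration and is not part of the original question.

```python
import pandas as pd

# Input, reconstructed from the "current data" table in the question.
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# Target, reconstructed from the "desired result" table.
# (`expected` is a hypothetical helper name, handy for checking a candidate answer.)
expected = pd.DataFrame({
    "date": ["22-01", "22-01"],
    "char": ["char1", "char2"],
    "location.A": ["a", "x"],
    "location.B": ["b", "y"],
    "location.C": ["c", "z"],
})
```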
Answer 1
Score: 2
Use DataFrame.melt followed by DataFrame.pivot:
df1 = (df.melt(['date','location'], var_name='char')
         .pivot(index=['date','char'], columns='location', values='value')
         .add_prefix('location.')
         .reset_index()
         .rename_axis(None, axis=1))
print(df1)
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
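It may help to look at the melt/pivot chain one step at a time. The sketch below splits it into intermediate frames; the names `long_df`, `wide_df` and `result` are illustrative, and the sample data is rebuilt from the question on the assumption that all values are strings.

```python
import pandas as pd

# Sample input rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# Step 1: wide -> long. char1/char2 become values of a new "char" column,
# and their contents land in the default "value" column.
long_df = df.melt(id_vars=["date", "location"], var_name="char")

# Step 2: long -> wide. Each distinct location becomes its own column.
# Passing a list to index= needs pandas 1.1 or later.
wide_df = long_df.pivot(index=["date", "char"], columns="location", values="value")

# Step 3: cosmetic cleanup: prefix the new columns, restore date/char as
# ordinary columns, and drop the leftover "location" columns-axis name.
result = wide_df.add_prefix("location.").reset_index().rename_axis(None, axis=1)
```

Note that pivot raises a ValueError if a (date, char, location) combination occurs more than once, so this chain assumes each combination is unique, as it is in the sample data.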
Or use DataFrame.set_index with DataFrame.stack and Series.unstack:
df1 = (df.set_index(['date','location'])
         .rename_axis('char', axis=1)
         .stack()
         .unstack(level=1)
         .add_prefix('location.')
         .reset_index()
         .rename_axis(None, axis=1)
      )
print(df1)
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
Answer 2
Score: 1
You can use janitor's pivot_wider and pivot_longer:
# pip install pyjanitor
import janitor
(df.pivot_wider(index='date', names_from='location',
                names_glue="{_value}_location.{location}")
   .pivot_longer(index='date', names_to=('char', '.value'), names_sep='_')
)
Output:
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
With pure pandas, you can use a transpose (T) between set_index and stack:
out = (df.set_index(['location', 'date']).T.rename_axis('char').stack()
         .add_prefix('location.').reset_index().rename_axis(columns=None)
      )
Output:
    char   date location.A location.B location.C
0  char1  22-01          a          b          c
1  char2  22-01          x          y          z
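The transpose variant returns char before date, whereas the question asks for date first. One way to get the requested order is to reselect the columns afterwards. This is a sketch only: `id_cols` is an illustrative name, and the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample input rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# The pure-pandas chain from the answer; stack() moves the innermost
# column level (date) into the row index, after char.
out = (df.set_index(["location", "date"]).T.rename_axis("char").stack()
         .add_prefix("location.").reset_index().rename_axis(columns=None))

# Put the identifier columns first, in the order the question asks for,
# then keep whatever location.* columns were produced.
id_cols = ["date", "char"]
out = out[id_cols + [c for c in out.columns if c not in id_cols]]
```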