How to reshape (pivot_wider and pivot_longer) a pandas DataFrame

Question

I'm struggling to reshape a (slightly complicated) DataFrame with pandas.

I have tried pd.melt and .pivot(index=, columns=, values=), but neither gives exactly the result I want.

Current data:

date   location  char1  char2
22-01  A         a      x
22-01  B         b      y
22-01  C         c      z

Desired result:

date   char   location.A  location.B  location.C
22-01  char1  a           b           c
22-01  char2  x           y           z
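
To try the answers on this page, the sample data can be rebuilt as a DataFrame. This is a minimal sketch: the variable name `df` matches what the answers use, and storing `date` as a plain string is an assumption, since the question does not state the column dtypes.

```python
import pandas as pd

# Sample data copied from the question.
# Assumption: 'date' is kept as a string, not parsed into a datetime.
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})
print(df)
```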

Answer 1

Score: 2

Use DataFrame.melt followed by DataFrame.pivot:

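# Melt char1/char2 into long form, then pivot the locations back out as columns.
# Passing a list as the pivot index requires pandas 1.1 or later.
df1 = (df.melt(['date','location'], var_name='char')
        .pivot(index=['date','char'], columns='location', values='value')
        .add_prefix('location.')
        .reset_index()
        .rename_axis(None, axis=1))
print(df1)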
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
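
It can help to look at the intermediate long-format frame that melt produces before pivot widens it again. A small sketch, assuming the question's sample data rebuilt by hand; the name `long_df` is only illustrative.

```python
import pandas as pd

# Sample data from the question (rebuilt by hand for this sketch).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# melt keeps 'date' and 'location' as identifiers and moves char1/char2
# into two columns: 'char' (the old column name) and 'value' (the cell).
long_df = df.melt(["date", "location"], var_name="char")
print(long_df)
```

Each (date, char) pair now appears once per location, which is what lets pivot use date and char as the index and location as the new column headers.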

Or use DataFrame.set_index with DataFrame.stack and Series.unstack:

# Stack char1/char2 into the index, then unstack the location level into columns.
df1 = (df.set_index(['date','location'])
         .rename_axis('char', axis=1)
         .stack()
         .unstack(level=1)
         .add_prefix('location.')
         .reset_index()
         .rename_axis(None, axis=1)
       )
print(df1)
    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
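
As a quick sanity check, both approaches can be run side by side on the sample data. This sketch assumes the question's data rebuilt by hand; `via_pivot` and `via_stack` are illustrative names, and the comparison is on column labels and cell values only, so dtype details are ignored.

```python
import pandas as pd

# Sample data from the question (rebuilt by hand for this sketch).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# Approach 1: melt, then pivot.
via_pivot = (df.melt(["date", "location"], var_name="char")
               .pivot(index=["date", "char"], columns="location", values="value")
               .add_prefix("location.")
               .reset_index()
               .rename_axis(None, axis=1))

# Approach 2: set_index, stack, then unstack the location level.
via_stack = (df.set_index(["date", "location"])
               .rename_axis("char", axis=1)
               .stack()
               .unstack(level=1)
               .add_prefix("location.")
               .reset_index()
               .rename_axis(None, axis=1))

# Compare labels and values rather than dtypes.
same_columns = list(via_pivot.columns) == list(via_stack.columns)
same_values = via_pivot.values.tolist() == via_stack.values.tolist()
print(same_columns and same_values)
```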

Answer 2

Score: 1

You can use janitor with pivot_wider and pivot_longer:

# pip install pyjanitor
import janitor

(df.pivot_wider(index='date', names_from='location',
                names_glue="{_value}_location.{location}")
   .pivot_longer(index='date', names_to=('char', '.value'), names_sep='_')
)

Output:

    date   char location.A location.B location.C
0  22-01  char1          a          b          c
1  22-01  char2          x          y          z
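
The two janitor calls meet in the middle through column names: names_glue builds one name per (value column, location) pair, and names_sep='_' later splits that name apart again. The sketch below reproduces only that naming logic with plain Python string formatting. It is an illustration under that assumption, not pyjanitor's implementation, and the order of the generated names is not meant to match the library's column order.

```python
# Template copied from the names_glue argument in the answer.
glue = "{_value}_location.{location}"

value_columns = ["char1", "char2"]   # columns being spread
locations = ["A", "B", "C"]          # unique values of 'location'

# Names the wide intermediate frame is expected to carry (assumption).
wide_names = [glue.format(_value=v, location=loc)
              for v in value_columns for loc in locations]

# Splitting on '_' yields the two parts that names_to=('char', '.value')
# refers to: the first goes into the 'char' column, the second stays a header.
pairs = [tuple(name.split("_")) for name in wide_names]

for name, pair in zip(wide_names, pairs):
    print(name, pair)
```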

With pure pandas, you can use a transpose (T) between set_index and stack:

out = (df.set_index(['location', 'date']).T.rename_axis('char').stack()
         .add_prefix('location.').reset_index().rename_axis(columns=None)
      )

Output (note that char comes before date here):

    char   date location.A location.B location.C
0  char1  22-01          a          b          c
1  char2  22-01          x          y          z
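
If the column order from the question matters, one option is to move date to the front after the transposed reshape. A sketch assuming the question's sample data rebuilt by hand; the reordering step is an addition for illustration and is not part of the original answer.

```python
import pandas as pd

# Sample data from the question (rebuilt by hand for this sketch).
df = pd.DataFrame({
    "date": ["22-01", "22-01", "22-01"],
    "location": ["A", "B", "C"],
    "char1": ["a", "b", "c"],
    "char2": ["x", "y", "z"],
})

# Transposed reshape from the answer: yields columns char, date, location.*
out = (df.set_index(["location", "date"]).T.rename_axis("char").stack()
         .add_prefix("location.").reset_index().rename_axis(columns=None))

# Put 'date' first and keep every other column in its existing order.
out = out[["date"] + [c for c in out.columns if c != "date"]]
print(out)
```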

huangapple
  • Posted on February 10, 2023 at 15:50:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75408246.html