Manipulating a given DataFrame in order to recreate it in a different structure, Pandas Python

huangapple go评论52阅读模式
英文:

Manipulating a given DataFrame in order to recreate it in a different structure, Pandas Python

问题

Left DataFrame is the given one, and i want to recreate it to the right DataFrame.

嗨,
假设我有一个给定的DataFrame(左侧),我想创建一个新的DataFrame(右侧)。
我使用右侧DataFrame的索引和列创建了新的DataFrame,现在我想要“填充”单元格。
有什么想法可以以最简单的方式,优先使用向量化方法来完成吗?
在此先行致谢!

我用循环而不是“传统”的方式做了。我想用apply方法或其他智能解决方案来编写它。

编辑:
这是我尝试过的一种方法:

这是填充数据的原始DataFrame1

这是我想要达到的DataFrame2

这是我试图采取的一步,以达到解决方案3

输出应该如下所示(也与Scot的回答相关):
4

提前致谢!

英文:

Left DataFrame is the given one, and i want to recreate it to the right DataFrame.

Hi,
So suppose i have a given DataFrame (the left one), and i want to create a new dataframe (the right one).
I created the new DataFrame with the indexes and columns of the right one, and now i want to "fill" the cells.
Any ideas how can i do it simplest, with priority of vectorize way?
Thank you in advance!

I did it with loops not "classic" way. I would like to write it with apply method or another smart solution.

edit:
this is an approcah that i tried:

This is the original DataFrame with filled data 1

This is the DataFrame i want to reach 2

This is a step i tried to do in order to reach the solution3

the output should be (also in context for Scot's answer):
in 4

Thank you in advance!

答案1

得分: 0

以下是您提供的代码部分的翻译:

我对您的问题有点困惑但我会尝试回答

import pandas as pd
import numpy as np

df = pd.DataFrame(data=np.arange(1, 76).reshape(-1, 5, order='F'),
                  index=pd.MultiIndex.from_product([[1, 2, 3, 4, 5], [*'XYZ']]),
                  columns='Shirt Bottomware Shoes Sunglasses Earrings'.split())
df = df.rename_axis(['Number', 'Variable'])
df

输入数据框

                 Shirt  Bottomware  Shoes  Sunglasses  Earrings
Number Variable                                                
1      X             1          16     31          46        61
       Y             2          17     32          47        62
       Z             3          18     33          48        63
2      X             4          19     34          49        64
       Y             5          20     35          50        65
       Z             6          21     36          51        66
3      X             7          22     37          52        67
       Y             8          23     38          53        68
       Z             9          24     39          54        69
4      X            10          25     40          55        70
       Y            11          26     41          56        71
       Z            12          27     42          57        72
5      X            13          28     43          58        73
       Y            14          29     44          59        74
       Z            15          30     45          60        75

重塑和筛选

df_out = df.unstack().stack(0, dropna=False).loc[[(1, 'Shirt'), (3, 'Shoes'), (5, 'Earrings')]]
df_out

输出数据框

Variable          X   Y   Z
Number                     
1      Shirt      1   2   3
3      Shoes     37  38  39
5      Earrings  73  74  75

希望这有帮助!

英文:

I am little confused by your question, but I will attempt an answer:

import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.arange(1, 76).reshape(-1,5, order='F'), 
index=pd.MultiIndex.from_product([[1,2,3,4,5],[*'XYZ']]), 
columns='Shirt Bottomware Shoes Sunglasses Earrings'.split())    
df = df.rename_axis(['Number', 'Variable'])
df

Input Dataframe:

                 Shirt  Bottomware  Shoes  Sunglasses  Earrings
Number Variable                                                
1      X             1          16     31          46        61
Y             2          17     32          47        62
Z             3          18     33          48        63
2      X             4          19     34          49        64
Y             5          20     35          50        65
Z             6          21     36          51        66
3      X             7          22     37          52        67
Y             8          23     38          53        68
Z             9          24     39          54        69
4      X            10          25     40          55        70
Y            11          26     41          56        71
Z            12          27     42          57        72
5      X            13          28     43          58        73
Y            14          29     44          59        74
Z            15          30     45          60        75

Reshape and filter:

df_out = df.unstack().stack(0, dropna=False).loc[[(1,'Shirt'),(3,'Shoes'),(5,'Earrings')]]    
df_out

Output dataframe:

Variable          X   Y   Z
Number                     
1      Shirt      1   2   3
3      Shoes     37  38  39
5      Earrings  73  74  75

huangapple
  • 本文由 发表于 2023年5月11日 18:33:36
  • 转载请务必保留本文链接:https://go.coder-hub.com/76226676.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定