2023-05-11 18:33:36

Manipulating a given DataFrame in order to recreate it in a different structure, Pandas Python

Question

The left DataFrame is the given one, and I want to recreate it as the right DataFrame.

Hi,
Suppose I have a given DataFrame (the left one), and I want to create a new DataFrame (the right one).
I created the new DataFrame with the index and columns of the right one, and now I want to fill in the cells.
Any ideas on the simplest way to do this, preferably vectorized?
Thank you in advance!

I did it with loops rather than in the "classic" way. I would like to write it with the apply method or another smart solution.

Edit:
This is an approach that I tried:

This is the original DataFrame with filled data: 1

This is the DataFrame I want to reach: 2

This is a step I tried in order to reach the solution: 3

The output should be as in 4 (also in the context of Scot's answer).

Thank you in advance!

Answer 1

Score: 0

I am a little confused by your question, but I will attempt an answer:

import pandas as pd
import numpy as np

# Build a 15x5 sample frame; order='F' fills the values column by column
df = pd.DataFrame(data=np.arange(1, 76).reshape(-1, 5, order='F'),
                  index=pd.MultiIndex.from_product([[1, 2, 3, 4, 5], [*'XYZ']]),
                  columns='Shirt Bottomware Shoes Sunglasses Earrings'.split())
# Name the two index levels
df = df.rename_axis(['Number', 'Variable'])
df

Input Dataframe:

                 Shirt  Bottomware  Shoes  Sunglasses  Earrings
Number Variable                                                
1      X             1          16     31          46        61
       Y             2          17     32          47        62
       Z             3          18     33          48        63
2      X             4          19     34          49        64
       Y             5          20     35          50        65
       Z             6          21     36          51        66
3      X             7          22     37          52        67
       Y             8          23     38          53        68
       Z             9          24     39          54        69
4      X            10          25     40          55        70
       Y            11          26     41          56        71
       Z            12          27     42          57        72
5      X            13          28     43          58        73
       Y            14          29     44          59        74
       Z            15          30     45          60        75
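
The sample frame relies on `order='F'` so that the numbers run down each column (Shirt holds 1 to 15, Bottomware 16 to 30, and so on). As a small illustration of that one argument, here is a sketch on a tiny array; the names `col_major` and `row_major` are only illustrative and do not come from the answer.

```python
import numpy as np

# Column-major ("Fortran") fill: consecutive values run down each column.
col_major = np.arange(1, 7).reshape(-1, 2, order='F')

# Default row-major ("C") fill: consecutive values run across each row.
row_major = np.arange(1, 7).reshape(-1, 2)

print(col_major)  # [[1 4] [2 5] [3 6]]
print(row_major)  # [[1 2] [3 4] [5 6]]
```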

Reshape and filter:

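# Pivot 'Variable' into the columns, move the item names into the row index,
# then keep only the three requested (Number, item) rows
df_out = df.unstack().stack(0, dropna=False).loc[[(1, 'Shirt'), (3, 'Shoes'), (5, 'Earrings')]]
df_out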

Output dataframe:

Variable          X   Y   Z
Number                     
1      Shirt      1   2   3
3      Shoes     37  38  39
5      Earrings  73  74  75
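
To make it easier to see what each part of the reshape does, here is a sketch that rebuilds the same sample frame and performs the reshape in named steps. Note that it swaps the chain for `stack()` followed by `unstack('Variable')`, which avoids passing `dropna`; that substitution is mine, not the answerer's, and the names `long`, `wide` and `wanted` are hypothetical. On this sample data it should select the same three rows.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer: 5 numbers x 3 variables, 5 item columns.
df = pd.DataFrame(
    data=np.arange(1, 76).reshape(-1, 5, order='F'),
    index=pd.MultiIndex.from_product(
        [[1, 2, 3, 4, 5], list('XYZ')], names=['Number', 'Variable']
    ),
    columns='Shirt Bottomware Shoes Sunglasses Earrings'.split(),
)

# Step 1: move the item names from the columns into the row index.
# The result is a Series indexed by (Number, Variable, item).
long = df.stack()

# Step 2: move 'Variable' from the row index into the columns.
# The result is a DataFrame indexed by (Number, item) with columns X, Y, Z.
wide = long.unstack('Variable')

# Step 3: keep only the (Number, item) pairs of interest (assumed list).
wanted = [(1, 'Shirt'), (3, 'Shoes'), (5, 'Earrings')]
df_out = wide.loc[wanted]

print(df_out)
```

Splitting the chain this way also makes it easy to inspect `wide` before filtering, which helps when choosing the pairs to put in `wanted`.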
