问题

I have translated the code portion as requested. Here is the translated code:

import pandas as pd

data = {
    'GenoType': [1, 2, 4],
    'Parameter1': [2, 2, 787],
    'First.Second.Third.Car': ['Honda', 'Tesla', 'Jeep'],
    'First.Second.Third.Food': ['Pizza', 'Fries', 'Grapes'],
    'Red.Orange.Yellow.Car': ['Acura', 'BMW', 'Toyota'],
    'Red.Orange.Yellow.Food': ['Burger', 'Potatoes', 'Wings']
}

df = pd.DataFrame(data)

melted_df = df.melt(id_vars=['GenoType', 'Parameter1'], var_name='Trait', value_name='Value')
melted_df[['Trait', 'Column']] = melted_df['Trait'].str.rsplit('.', n=1, expand=True)

pivoted_df = melted_df.pivot_table(index=['GenoType', 'Parameter1', 'Trait'], columns='Column', values='Value', aggfunc='first')
pivoted_df.reset_index(inplace=True)
pivoted_df.columns.name = ''

# Melt the pivoted_df to obtain the final output
final_df = pivoted_df.melt(id_vars=['GenoType', 'Parameter1', 'Trait'], var_name='Column', value_name='Value')
final_df.sort_values(by=['GenoType', 'Parameter1', 'Trait', 'Column'], inplace=True)
final_df.reset_index(drop=True, inplace=True)

expanded_df = final_df.pivot_table(index=['GenoType', 'Parameter1', 'Trait'], columns='Column', values='Value', aggfunc='first')
expanded_df.reset_index(inplace=True)
expanded_df.columns.name = ''
expanded_df = expanded_df.rename_axis(None, axis=1)

print(expanded_df)

Please note that this code is a translation of the provided code and doesn't include any additional information or explanations.

英文:

I currently have this table

and want to get this output:

The expected logic is i want to strip any columns that have periods in the column name, after the third period. In this case, there are 4, and after stripping the column names we will have 2 unique column names. (Cars and Food)

The additional column "trait" is just the value before the third period. and will be in the expected output.

The expected output will have more rows as it is pivoted. Also assume this case for x column names that should have periods in them.

I am not able to get it yet, but my current code is pasted below.

import pandas as pd

data = {
    &#39;GenoType&#39;: [1, 2, 4],
    &#39;Parameter1&#39;: [2, 2, 787],
    &#39;First.Second.Third.Car&#39;: [&#39;Honda&#39;, &#39;Tesla&#39;, &#39;Jeep&#39;],
    &#39;First.Second.Third.Food&#39;: [&#39;Pizza&#39;, &#39;Fries&#39;, &#39;Grapes&#39;],
    &#39;Red.Orange.Yellow.Car&#39;: [&#39;Acura&#39;, &#39;BMW&#39;, &#39;Toyota&#39;],
    &#39;Red.Orange.Yellow.Food&#39;: [&#39;Burger&#39;, &#39;Potatoes&#39;, &#39;Wings&#39;]
}

df = pd.DataFrame(data)

melted_df = df.melt(id_vars=[&#39;GenoType&#39;, &#39;Parameter1&#39;], var_name=&#39;Trait&#39;, value_name=&#39;Value&#39;)
melted_df[[&#39;Trait&#39;, &#39;Column&#39;]] = melted_df[&#39;Trait&#39;].str.rsplit(&#39;.&#39;, n=1, expand=True)

pivoted_df = melted_df.pivot_table(index=[&#39;GenoType&#39;, &#39;Parameter1&#39;, &#39;Trait&#39;], columns=&#39;Column&#39;, values=&#39;Value&#39;, aggfunc=&#39;first&#39;)
pivoted_df.reset_index(inplace=True)
pivoted_df.columns.name = &#39;&#39;

# Melt the pivoted_df to obtain the final output
final_df = pivoted_df.melt(id_vars=[&#39;GenoType&#39;, &#39;Parameter1&#39;, &#39;Trait&#39;], var_name=&#39;Column&#39;, value_name=&#39;Value&#39;)
final_df.sort_values(by=[&#39;GenoType&#39;, &#39;Parameter1&#39;, &#39;Trait&#39;, &#39;Column&#39;], inplace=True)
final_df.reset_index(drop=True, inplace=True)

expanded_df = final_df.pivot_table(index=[&#39;GenoType&#39;, &#39;Parameter1&#39;, &#39;Trait&#39;], columns=&#39;Column&#39;, values=&#39;Value&#39;, aggfunc=&#39;first&#39;)
expanded_df.reset_index(inplace=True)
expanded_df.columns.name = &#39;&#39;
expanded_df = expanded_df.rename_axis(None, axis=1)

print(expanded_df)



</details>


# 答案1
**得分**: 1

# 创建多级索引，以前两列为基础
df1 = df.set_index(['GenoType', 'Parameter1'])

# 过滤食品列
cols = df1.filter(regex=r'\.Food').columns

# 连接并重命名食品列名称
df1 = pd.concat([df1, df1.rename(columns=dict(zip(cols, cols[::-1])))]

# 创建列的多级索引
df1.columns = df1.columns.str.rsplit('.', n=1, expand=True)

# 重塑
df1 = df1.stack(0).sort_index(level=2).reset_index().rename(columns={'level_2':'Trait'})
print (df1)

    GenoType  Parameter1               Trait     Car      Food
0          1           2  First.Second.Third   Honda     Pizza
1          1           2  First.Second.Third   Honda    Burger
2          2           2  First.Second.Third   Tesla     Fries
3          2           2  First.Second.Third   Tesla  Potatoes
4          4         787  First.Second.Third    Jeep    Grapes
5          4         787  First.Second.Third    Jeep     Wings
6          1           2   Red.Orange.Yellow   Acura    Burger
7          1           2   Red.Orange.Yellow   Acura     Pizza
8          2           2   Red.Orange.Yellow     BMW  Potatoes
9          2           2   Red.Orange.Yellow     BMW     Fries
10         4         787   Red.Orange.Yellow  Toyota     Wings
11         4         787   Red.Orange.Yellow  Toyota    Grapes

<details>
<summary>英文:</summary>

Use:

    #create MulitIndex by first 2 columns
    df1 = df.set_index([&#39;GenoType&#39;, &#39;Parameter1&#39;])
    
    #filter Food columns
    cols = df1.filter(regex=r&#39;\.Food&#39;).columns
    
    #join with rename Food columns names
    df1 = pd.concat([df1, df1.rename(columns=dict(zip(cols, cols[::-1])))])
    
    #create MulitIndex in columns
    df1.columns = df1.columns.str.rsplit(&#39;.&#39;, n=1, expand=True)
    
    #reshape
    df1 = df1.stack(0).sort_index(level=2).reset_index().rename(columns={&#39;level_2&#39;:&#39;Trait&#39;})
    print (df1)
    
        GenoType  Parameter1               Trait     Car      Food
    0          1           2  First.Second.Third   Honda     Pizza
    1          1           2  First.Second.Third   Honda    Burger
    2          2           2  First.Second.Third   Tesla     Fries
    3          2           2  First.Second.Third   Tesla  Potatoes
    4          4         787  First.Second.Third    Jeep    Grapes
    5          4         787  First.Second.Third    Jeep     Wings
    6          1           2   Red.Orange.Yellow   Acura    Burger
    7          1           2   Red.Orange.Yellow   Acura     Pizza
    8          2           2   Red.Orange.Yellow     BMW  Potatoes
    9          2           2   Red.Orange.Yellow     BMW     Fries
    10         4         787   Red.Orange.Yellow  Toyota     Wings
    11         4         787   Red.Orange.Yellow  Toyota    Grapes

</details>



# 答案2
**得分**: 1

这是代码部分，不需要翻译:

```python
data = {
    'GenoType': [1, 2, 4],
    'Parameter1': [2, 2, 787],
    'First.Second.Third.Car': ['Honda', 'Tesla', 'Jeep'],
    'First.Second.Third.Food': ['Pizza', 'Fries', 'Grapes'],
    'Red.Orange.Yellow.Car': ['Acura', 'BMW', 'Toyota'],
    'Red.Orange.Yellow.Food': ['Burger', 'Potatoes', 'Wings']
}

df = pd.DataFrame(data)

traits = df.columns.str.extract(r'^((?:[^.]*\.){2}[^.]*)').dropna().drop_duplicates()[0].tolist()
features = df.columns.str.extract(r'^(?:[^.]*\.){3}(.*)$').dropna().drop_duplicates()[0].tolist()

out = df.melt(value_vars=[f'{t}.{features[0]}' for t in traits], value_name=features[0], id_vars=['GenoType', 'Parameter1'], var_name='Trait')
out['Trait'] = out['Trait'].str.extract(r'^((?:[^.]*\.){2}[^.]*)')
for f in features[1:]:
    dff = df.melt(value_vars=[f'{t}.{f}' for t in traits], value_name=f, id_vars=['GenoType', 'Parameter1'], var_name='Trait')
    out = out.merge(dff, on=['GenoType', 'Parameter1'], suffixes=['', '_y'])
    out.drop('Trait_y', axis=1, inplace=True)
out = out.sort_values('Trait', kind='mergesort').reset_index(drop=True)

希望对你有所帮助！如果有其他需要，请随时告诉我。

英文:

You could solve this using an iterative melt and merge. This will work for an arbitrary number of columns:

data = {
    &#39;GenoType&#39;: [1, 2, 4],
    &#39;Parameter1&#39;: [2, 2, 787],
    &#39;First.Second.Third.Car&#39;: [&#39;Honda&#39;, &#39;Tesla&#39;, &#39;Jeep&#39;],
    &#39;First.Second.Third.Food&#39;: [&#39;Pizza&#39;, &#39;Fries&#39;, &#39;Grapes&#39;],
    &#39;Red.Orange.Yellow.Car&#39;: [&#39;Acura&#39;, &#39;BMW&#39;, &#39;Toyota&#39;],
    &#39;Red.Orange.Yellow.Food&#39;: [&#39;Burger&#39;, &#39;Potatoes&#39;, &#39;Wings&#39;]
}

df = pd.DataFrame(data)

traits = df.columns.str.extract(r&#39;^((?:[^.]*\.){2}[^.]*)&#39;).dropna().drop_duplicates()[0].tolist()
features = df.columns.str.extract(r&#39;^(?:[^.]*\.){3}(.*)$&#39;).dropna().drop_duplicates()[0].tolist()

out = df.melt(value_vars=[f&#39;{t}.{features[0]}&#39; for t in traits], value_name=features[0], id_vars=[&#39;GenoType&#39;, &#39;Parameter1&#39;], var_name=&#39;Trait&#39;)
out[&#39;Trait&#39;] = out[&#39;Trait&#39;].str.extract(r&#39;^((?:[^.]*\.){2}[^.]*)&#39;)
for f in features[1:]:
    dff = df.melt(value_vars=[f&#39;{t}.{f}&#39; for t in traits], value_name=f, id_vars=[&#39;GenoType&#39;, &#39;Parameter1&#39;], var_name=&#39;Trait&#39;)
    out = out.merge(dff, on=[&#39;GenoType&#39;, &#39;Parameter1&#39;], suffixes=[&#39;&#39;, &#39;_y&#39;])
    out.drop(&#39;Trait_y&#39;, axis=1, inplace=True)
out = out.sort_values(&#39;Trait&#39;, kind=&#39;mergesort&#39;).reset_index(drop=True)

Output:

    GenoType  Parameter1               Trait     Car      Food
0          1           2  First.Second.Third   Honda     Pizza
1          1           2  First.Second.Third   Honda    Burger
2          2           2  First.Second.Third   Tesla     Fries
3          2           2  First.Second.Third   Tesla  Potatoes
4          4         787  First.Second.Third    Jeep    Grapes
5          4         787  First.Second.Third    Jeep     Wings
6          1           2   Red.Orange.Yellow   Acura     Pizza
7          1           2   Red.Orange.Yellow   Acura    Burger
8          2           2   Red.Orange.Yellow     BMW     Fries
9          2           2   Red.Orange.Yellow     BMW  Potatoes
10         4         787   Red.Orange.Yellow  Toyota    Grapes
11         4         787   Red.Orange.Yellow  Toyota     Wings

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Pandas – 表格透视

问题

在Django中使用枚举时出现错误ModuleNotFoundError: No module named ‘models’

如何在同一类中的Python中调用另一个函数？

如何使用Django的ImageField将文件上传到Google Cloud Storage而不是本地存储。

机器人在事件上没有移除角色。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论