June 1, 2023 09:41:33

Merging multiple dataframes in loop based on same suffix in variable names

Question

I want to merge each DataFrame in demand_dataframe_list with the DataFrame in supply_dataframe_list whose variable name has the same suffix.

demand_dataframe_list = [data_Market1, data_Market2] 
supply_dataframe_list = [df_supply2_Market1, df_supply2_Market2]

For example, data_Market1 should be merged with df_supply2_Market1 and data_Market2 should be merged with df_supply2_Market2.

The Market1 and Market2 suffixes should determine which DataFrames are paired, and each pair should be merged on the columns they have in common, 'Col1' and 'Col2'.

Below is my attempt, but it gives me an empty DataFrame. I'd appreciate your help!

merged_dataframes = []
for demand_df, supply_df in zip(demand_dataframe_list, supply_dataframe_list):
    print(demand_df)
    demand_suffix = demand_df.name.split('_')[-1]  # Extract the suffix from the demand dataframe name
    supply_suffix = supply_df.name.split('_')[-1]  # Extract the suffix from the supply dataframe name
    merged_df = pd.merge(demand_df, supply_df, how="inner", on=['Col1', 'Col2'])
    merged_dataframes.append(merged_df)

Answer 1

Score: 1

Unless a name attribute has previously been set on each DataFrame, accessing it will raise an exception (an AttributeError).
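To make that concrete, here is a minimal sketch with a made-up one-row DataFrame (the data is hypothetical, not from the question). It also shows one possible workaround, which is an assumption on my part rather than something the question uses: storing the label in the DataFrame's attrs dictionary so the suffix can be read back later.

```python
import pandas as pd

# Hypothetical sample data; the question does not show its DataFrames.
df = pd.DataFrame({"Col1": [1], "Col2": ["a"]})

# A plain DataFrame has no `name` attribute (and no column called "name"),
# so the lookup falls through to AttributeError.
try:
    df.name
    has_name = True
except AttributeError:
    has_name = False

# Possible workaround (assumption): keep the label in `attrs`,
# pandas' dictionary for user-defined metadata.
df.attrs["name"] = "data_Market1"
suffix = df.attrs["name"].split("_")[-1]
```

Note that the variable name data_Market1 is never attached to the object automatically, which is why a label has to be stored explicitly if you want to read it from the DataFrame itself.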

The following helper function recovers the suffix from the variable name instead. It looks the DataFrame up in globals(), so it only works for DataFrames bound to module-level variables:

def get_suffix(df):
    return [x for x in globals() if globals()[x] is df][0].split("_")[-1]

Then, you can compare every demand DataFrame against every supply DataFrame by replacing zip with product from the itertools module of the Python standard library, and make the code more readable with a list comprehension:

from itertools import product

merged_dataframes = [
    pd.merge(demand_df, supply_df, how="inner", on=["Col1", "Col2"])
    for demand_df, supply_df in product(demand_dataframe_list, supply_dataframe_list)
    if get_suffix(demand_df) == get_suffix(supply_df)
]
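
If you control how the DataFrames are created, an alternative that avoids inspecting variable names altogether is to key them by market in dictionaries. This is only a sketch under that assumption: the dictionary names, the demand and supply columns, and the sample values are all hypothetical.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's DataFrames,
# stored under their market suffix instead of in separate variables.
demand_by_market = {
    "Market1": pd.DataFrame({"Col1": [1, 2], "Col2": ["a", "b"], "demand": [10, 20]}),
    "Market2": pd.DataFrame({"Col1": [3], "Col2": ["c"], "demand": [30]}),
}
supply_by_market = {
    "Market1": pd.DataFrame({"Col1": [1, 2], "Col2": ["a", "x"], "supply": [5, 6]}),
    "Market2": pd.DataFrame({"Col1": [3], "Col2": ["c"], "supply": [7]}),
}

# Merge only the markets that appear in both dictionaries.
merged_by_market = {
    market: pd.merge(
        demand_by_market[market],
        supply_by_market[market],
        how="inner",
        on=["Col1", "Col2"],
    )
    for market in demand_by_market
    if market in supply_by_market
}
```

Here the pairing is explicit in the keys, so it does not depend on list order or on globals(). An inner merge still returns an empty DataFrame for a market whose 'Col1' and 'Col2' values never match on both sides.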
