根据变量名称中相同的后缀,在循环中合并多个数据框。

huangapple go评论98阅读模式
英文:

Merging multiple dataframes in loop based on same suffix in variable names

问题

我想要将demand_dataframe_list中的DataFrame与supply_dataframe_list中的DataFrame合并,当后缀相同时。
例如,data_Market1应该与df_supply2_Market1合并,data_Market2应该与df_supply2_Market2合并。

在这里,应该使用Market1和Market2后缀来基于每个DataFrame中共有的列('Col1'和'Col2')来获取合并后的数据。

以下是我的尝试,但我得到了空的DataFrame。感谢您的帮助!

  1. merged_dataframes = []
  2. for demand_df, supply_df in zip(demand_dataframe_list, supply_dataframe_list):
  3. print(demand_df)
  4. demand_suffix = demand_df.name.split('_')[-1] # 从demand DataFrame名称中提取后缀
  5. supply_suffix = supply_df.name.split('_')[-1] # 从supply DataFrame名称中提取后缀
  6. merged_df = pd.merge(demand_df, supply_df, how="inner", on=['Col1', 'Col2'])
  7. merged_dataframes.append(merged_df)
英文:

I want to merge dataframes from demand_dataframe_list with supply_dataframe_list when the suffix is identical.

  1. demand_dataframe_list = [data_Market1, data_Market2]
  2. supply_dataframe_list = [df_supply2_Market1, df_supply2_Market2]

For example, data_Market1 should be merged with df_supply2_Market1 and data_Market2 should be merged with df_supply2_Market2.

Here Market1 and Market2 suffix should be used to get the merged data based on common columns present in each dataframes which is 'Col1' and 'Col2'.

Below is my try
I am getting the empty dataframe using the code help. Appreciate your help !

  1. merged_dataframes = []
  2. for demand_df, supply_df in zip(demand_dataframe_list, supply_dataframe_list):
  3. print(demand_df)
  4. demand_suffix = demand_df.name.split('_')[-1] # Extract the suffix from the demand dataframe name
  5. supply_suffix = supply_df.name.split('_')[-1] # Extract the suffix from the supply dataframe name
  6. merged_df = pd.merge(demand_df, supply_df, how="inner", on=['Col1', 'Col2'])
  7. merged_dataframes.append(merged_df)

答案1

得分: 1

除非在不同的数据框上先前已经设置了 name 属性,否则获取它将引发异常。

以下辅助函数提供了一种更稳健的方法来获取变量名称的后缀:

  1. def get_suffix(df):
  2. return [x for x in globals() if globals()[x] is df][0].split("_")[-1]

然后,您可以通过将 zip 替换为 Python 标准库的 itertools 模块中的 product 来对两个列表进行更广泛的比较,并通过 list comprehension 使您的代码更易读:

  1. merged_dataframes = [
  2. pd.merge(demand_df, supply_df, how="inner", on=["Col1", "Col2"])
  3. for demand_df, supply_df in product(demand_dataframe_list, supply_dataframe_list)
  4. if get_suffix(demand_df) == get_suffix(supply_df)
  5. ]
英文:

Unless name attribute has previously been set on the different dataframes, getting it will raise an exception.

The following helper function provides a more robust way to get the suffix of the variable names:

  1. def get_suffix(df):
  2. return [x for x in globals() if globals()[x] is df][0].split("_")[-1]

Then, you can do a more extensive comparison of both lists by replacing zip with product from Python standard library's itertools module and make your code more readable with a list comprehension:

  1. merged_dataframes = [
  2. pd.merge(demand_df, supply_df, how="inner", on=["Col1", "Col2"])
  3. for demand_df, supply_df in product(demand_dataframe_list, supply_dataframe_list)
  4. if get_suffix(demand_df) == get_suffix(supply_df)
  5. ]

huangapple
  • 本文由 发表于 2023年6月1日 09:41:33
  • 转载请务必保留本文链接:https://go.coder-hub.com/76378181.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定