更新数据框的一个切片,根据项目编号、因子和周数,使用另一个数据框。

huangapple go评论63阅读模式
英文:

updating a slice of dataframe from another dataframe accoding to the item number, Factor, and week number

问题

你可以使用merge函数来合并这两个数据框,并且指定哪些列用于合并。然后,你可以使用fillna函数来填充缺失的值为0。下面是一个示例代码:

# 合并两个数据框,指定合并列
merged_df = df1.merge(df2, on=['Item Number', 'Model', 'Factor'], how='left')

# 填充缺失的值为0
merged_df.fillna(0, inplace=True)

# 选择需要的列,即日期列和原始df1的列
date_columns = df1.columns[3:]
result_df = merged_df[['Item Number', 'Model', 'Factor'] + list(date_columns)]

# 打印结果
print(result_df)

这将给你一个合并后的数据框,其中包含了更新后的数值,缺失的值被填充为0。希望这对你有所帮助!

英文:

Suppose you have dataframes look like that:

df1:

Item Number Model Factor 2/4/22 2/11/22 2/18/22 2/25/22 3/4/22 3/11/22 3/18/22 3/25/22 4/1/22 4/8/22 4/15/22 4/22/22 4/29/22
123 A Sell out forecast 0 0 0 0 0 0 0 0 0 0 0 0 0
123 A Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
123 A Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0
345 B Sell out forecast 0 0 0 0 0 0 0 0 0 0 0 0 0
345 B Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
345 B Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0
678 C Sell out forecast 0 0 0 0 0 0 0 0 0 0 0 0 0
678 C Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
678 C Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
y-2 Z Sell out forecast 0 0 0 0 0 0 0 0 0 0 0 0 0
y-1 Z Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
y Z Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0

df2:

Item Number Model Factor 1/7/22 1/14/22 1/21/22 1/28/22 2/4/22 2/11/22 2/18/22 2/25/22 3/4/22 3/11/22 3/18/22 3/25/22 4/1/22
123 A Sells Forecast 1 2 3 4 5 6 7 8 9 10 11 12 13
345 B Sells Forecast 2 2 2 2 2 2 2 2 2 2 2 2 2

How can I updating the df1 data with df2 data, under the right factor row, model name and only apply to their common week (without changing other week's and model's number)?

My expected output is like:

Item Number Model Factor 2/4/22 2/11/22 2/18/22 2/25/22 3/4/22 3/11/22 3/18/22 3/25/22 4/1/22 4/8/22 4/15/22 4/22/22 4/29/22
123 A Sell out forecast 5 6 7 8 9 10 11 12 13 0 0 0 0
123 A Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
123 A Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0
345 B Sell out forecast 2 2 2 2 2 2 2 2 2 0 0 0 0
345 B Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
345 B Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0
678 C Sell out forecast 0 0 0 0 0 0 0 0 0 0 0 0 0
678 C Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
678 C Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
y-2 Z Sell out forecast 0 0 0 0 0 0 0 0 0 0 0 0 0
y-1 Z Inventory 0 0 0 0 0 0 0 0 0 0 0 0 0
y Z Incoming 0 0 0 0 0 0 0 0 0 0 0 0 0

I tried to use iterrows to get each row in df1 and sliced the comon week cells out to match df2, but it doesn't work because the chained indexing won't let me change the orignial dataframe.

If anyone can help me out? I would really appricate it!

答案1

得分: 0

你可以使用 updateMultiIndex

cols = ["Item Number", "Model", "Factor"]
df1 = df1.set_index(cols)

df1.update(df2.set_index(cols)) # 返回 `None`

df1 = df1.reset_index()

输出:

print(df1)

   Item Number Model             Factor  2/4/22  2/11/22  2/18/22  2/25/22  3/4/22  3/11/22  3/18/22  3/25/22  4/1/22  4/8/22  4/15/22  4/22/22  4/29/22
0          123     A  Sell out forecast       5        6        7        8       9       10       11       12      13       0        0        0        0
1          123     A          Inventory       0        0        0        0       0        0        0        0       0       0        0        0        0
2          123     A           Incoming       0        0        0        0       0        0        0        0       0       0        0        0        0
3          345     B  Sell out forecast       2        2        2        2       2        2        2        2       2       0        0        0        0
4          345     B          Inventory       0        0        0        0       0        0        0        0       0       0        0        0        0
5          345     B           Incoming       0        0        0        0       0        0        0        0       0       0        0        0        0
6          678     C  Sell out forecast       0        0        0        0       0        0        0        0       0       0        0        0        0
7          678     C          Inventory       0        0        0        0       0        0        0        0       0       0        0        0        0
8          678     C           Incoming       0        0        0        0       0        0        0        0       0       0        0        0        0
英文:

You can use update with a MultiIndex :

cols = ["Item Number", "Model", "Factor"]
df1 = df1.set_index(cols)

df1.update(df2.set_index(cols)) # returns `None`

df1 = df1.reset_index()

Output :

print(df1)

   Item Number Model             Factor  2/4/22  2/11/22  2/18/22  2/25/22  3/4/22  3/11/22  3/18/22  3/25/22  4/1/22  4/8/22  4/15/22  4/22/22  4/29/22
0          123     A  Sell out forecast       5        6        7        8       9       10       11       12      13       0        0        0        0
1          123     A          Inventory       0        0        0        0       0        0        0        0       0       0        0        0        0
2          123     A           Incoming       0        0        0        0       0        0        0        0       0       0        0        0        0
3          345     B  Sell out forecast       2        2        2        2       2        2        2        2       2       0        0        0        0
4          345     B          Inventory       0        0        0        0       0        0        0        0       0       0        0        0        0
5          345     B           Incoming       0        0        0        0       0        0        0        0       0       0        0        0        0
6          678     C  Sell out forecast       0        0        0        0       0        0        0        0       0       0        0        0        0
7          678     C          Inventory       0        0        0        0       0        0        0        0       0       0        0        0        0
8          678     C           Incoming       0        0        0        0       0        0        0        0       0       0        0        0        0

答案2

得分: 0

I end up with a sudo-code like this:

for index, row in df1.iterrows():
    # 更新销售预测
    for idx, r in df2.iterrows():
        if row["Factor"] == "Sell out Forecast" and row["Item Number"] == r["Item Number"]:
            df1.iloc[index, common_week_start:common_week_end+1] = r[common_week]
        break
    # 更新库存
    # 更新入库

But some pre-workouts need to do:

  1. 你需要循环遍历 df1.columndf2.column 以获取公共周列表。
  2. 设置两个变量,Common_week_startCommon_week_end
  3. 确保因子名称相同(懒人方式,哈哈。也可以使用正则表达式)。

失败的原因是一开始我使用了 df.iloc[index][common_week],链式切片会创建数据副本,每次循环只更改副本的值。现在我使用.iloc[index, column_start:column_end],它工作得很好。

英文:

I end up with a sudo-code like this:

for index, row in df1.iterrows():

    # Update Sell thru forecast

    for idx, r in df2.iterrows():

        if row["Factor"] == "Sell out Forecast" and row["Item Number"] == r["Item Number"]:

            df1.iloc[index,common_week_start:common_week_end+1]=r[common_week]


        break



    # Update Inventory

    # Update Incoming



But some pre-workouts need to do:

  1. You need to for loop the df1.column and df2.column to get the common week list.
  2. Set two variables, Common_week_start and Common_week_end
  3. Make sure the Factor name is identical (Lazy way, lol. Also, could use the regular expression)

Reason I was failed at the very beginning is that I used the df.iloc[index][common_week] the chain slicing creates a copy of the data, each loop only changes the copy values. Now I used.iloc[index,column_start:column_end], it works very well.

huangapple
  • 本文由 发表于 2023年5月18日 04:49:56
  • 转载请务必保留本文链接:https://go.coder-hub.com/76276105.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定