Fill values based on reference table

Question

    import pandas as pd

    # Your reference dataframe
    reference_df = pd.DataFrame({
        'Product': ['Shirt', 'Sneakers', 'Pants', 'Tennis ball', 'Football ball', 'Football boots'],
        'Brand': ['Zara', 'Nike', 'Zara', 'Wilson', 'Adidas', 'Adidas']
    })

    # Your dataframe to be filled
    df = pd.DataFrame({
        'Product': ['Shirt', 'Shirt', 'Pants', 'Tennis ball', 'Shirt', 'Football boots', 'Football boots', 'Football boots', 'Pants', 'Sneakers', 'Football ball', 'Football boots'],
        'Brand': [None, None, None, None, None, None, None, None, None, None, None, None]
    })

    # Use map to fill the 'Brand' column based on the 'Product' from the reference dataframe
    df['Brand'] = df['Product'].map(reference_df.set_index('Product')['Brand'])

    # Print the resulting dataframe
    df

This code will fill the 'Brand' column in your dataframe based on the 'Product' using the reference dataframe, giving you the desired output.
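
The same lookup can also be written as a left merge. The following is only a minimal sketch, assuming each product appears exactly once in the reference table; the shortened sample data is made up for illustration.

```python
import pandas as pd

# Shortened, illustrative versions of the two dataframes
reference_df = pd.DataFrame({
    'Product': ['Shirt', 'Sneakers', 'Pants', 'Tennis ball'],
    'Brand': ['Zara', 'Nike', 'Zara', 'Wilson'],
})
df = pd.DataFrame({
    'Product': ['Shirt', 'Pants', 'Tennis ball', 'Shirt', 'Sneakers'],
    'Brand': [None] * 5,
})

# Drop the empty 'Brand' column first; otherwise the merge would produce
# two columns, 'Brand_x' and 'Brand_y'.
merged = df.drop(columns='Brand').merge(reference_df, on='Product', how='left')
print(merged)
```

A left merge keeps every row of df in its original order, so the result has the same length as df as long as the reference table has no repeated products.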

I have two DataFrames of different lengths: one (a reference/dictionary table) lists various types of products and the brand that makes each, and the other has only the products. I need to fill in the brand based on the reference table.

The reference df looks like this:

Product         Brand
Shirt           Zara
Sneakers        Nike
Pants           Zara
Tennis ball     Wilson
Football ball   Adidas
Football boots  Adidas

The df to be filled looks something like this:

Product         Brand
Shirt           NaN
Shirt           NaN
Pants           NaN
Tennis ball     NaN
Shirt           NaN
Football boots  NaN
Football boots  NaN
Football boots  NaN
Pants           NaN
Sneakers        NaN
Football ball   NaN
Football boots  NaN
(+100k rows)    NaN

I've tried the following code:

    df['Brand'] = df['Brand'].map(referencedf.set_index('Product')['Brand'])
    df

Unfortunately, the output is not what I want: it always returns the same brand for all products.

Product         Brand
Shirt           Zara
Shirt           Zara
Pants           Zara
Tennis ball     Zara
Shirt           Zara
Football boots  Zara
Football boots  Zara
Football boots  Zara
Pants           Zara
Sneakers        Zara
Football ball   Zara
Football boots  Zara
(+100k rows)    Zara

Any ideas on how I can get the right output?

Answer 1

Score: 0

Try:

    df['Brand'] = df['Product'].map(reference_df.set_index('Product')['Brand'])
    print(df)

Prints:

               Product   Brand
    0            Shirt    Zara
    1            Shirt    Zara
    2            Pants    Zara
    3      Tennis ball  Wilson
    4            Shirt    Zara
    5   Football boots  Adidas
    6   Football boots  Adidas
    7   Football boots  Adidas
    8            Pants    Zara
    9         Sneakers    Nike
    10   Football ball  Adidas
    11  Football boots  Adidas
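
The key difference from the attempt in the question is that the lookup is applied to df['Product'] (the keys) rather than to df['Brand'] (the column being filled). With 100k+ rows, two situations may need guarding against: a product repeated in the reference table, and a product missing from it. The sketch below is only an illustration; the 'Socks' row, the repeated 'Shirt' row, and the 'Unknown' label are assumptions that do not appear in the original data.

```python
import pandas as pd

reference_df = pd.DataFrame({
    'Product': ['Shirt', 'Sneakers', 'Pants', 'Shirt'],  # 'Shirt' repeated on purpose (hypothetical)
    'Brand': ['Zara', 'Nike', 'Zara', 'Zara'],
})
df = pd.DataFrame({'Product': ['Shirt', 'Pants', 'Socks', 'Sneakers']})  # 'Socks' is hypothetical

# Keep the first row per product so the lookup index is unique
lookup = reference_df.drop_duplicates(subset='Product').set_index('Product')['Brand']

# Map on the key column; products absent from the lookup become NaN
df['Brand'] = df['Product'].map(lookup)

# Collect the products that found no match, then label them explicitly
missing = df.loc[df['Brand'].isna(), 'Product'].unique()
df['Brand'] = df['Brand'].fillna('Unknown')

print(df)
print(list(missing))
```

Checking the unmatched products before filling them makes it easier to spot typos or inconsistent capitalization in the product names.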

huangapple
  • Posted on July 28, 2023, 02:05:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76782373.html