Fill values based on reference table

Question

import pandas as pd

# Your reference dataframe
reference_df = pd.DataFrame({
    'Product': ['Shirt', 'Sneakers', 'Pants', 'Tennis ball', 'Football ball', 'Football boots'],
    'Brand': ['Zara', 'Nike', 'Zara', 'Wilson', 'Adidas', 'Adidas']
})

# Your dataframe to be filled
df = pd.DataFrame({
    'Product': ['Shirt', 'Shirt', 'Pants', 'Tennis ball', 'Shirt', 'Football boots', 'Football boots', 'Football boots', 'Pants', 'Sneakers', 'Football ball', 'Football boots'],
    'Brand': [None, None, None, None, None, None, None, None, None, None, None, None]
})

# Use map to fill the 'Brand' column based on the 'Product' from the reference dataframe
df['Brand'] = df['Product'].map(reference_df.set_index('Product')['Brand'])

# Print the resulting dataframe
print(df)

This code will fill the 'Brand' column in your dataframe based on the 'Product' using the reference dataframe, giving you the desired output.
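The same fill can also be written as a left join. The following is a minimal sketch, assuming small stand-in frames rather than the full data; `filled` is a name introduced here for illustration. `validate='many_to_one'` is included on the assumption that each product appears only once in the reference table, so pandas raises an error if that does not hold.

```python
import pandas as pd

# Small stand-in for the reference table (assumed: one row per product)
reference_df = pd.DataFrame({
    'Product': ['Shirt', 'Sneakers', 'Pants'],
    'Brand': ['Zara', 'Nike', 'Zara'],
})

# Small stand-in for the dataframe to be filled
df = pd.DataFrame({
    'Product': ['Shirt', 'Pants', 'Shirt', 'Sneakers'],
    'Brand': [None, None, None, None],
})

# Drop the empty Brand column, then left-join the reference table on Product.
# how='left' keeps every row of df in its original order.
filled = df.drop(columns='Brand').merge(
    reference_df, on='Product', how='left', validate='many_to_one'
)
print(filled)
```

For a single lookup column, `map` is usually the shorter option; a merge may be more convenient when several columns have to be pulled from the reference table at once.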

I have two DataFrames of different lengths: a reference (dictionary) table listing various products and the brand that makes each one, and another DataFrame that contains only the products. I need to fill in the brand based on the reference table.

Reference df is like

Product Brand
Shirt Zara
Sneakers Nike
Pants Zara
Tennis ball Wilson
Football ball Adidas
Football boots Adidas

The df to be filled is something like

Product Brand
Shirt NaN
Shirt NaN
Pants NaN
Tennis ball NaN
Shirt NaN
Football boots NaN
Football boots NaN
Football boots NaN
Pants NaN
Sneakers NaN
Football ball NaN
Football boots NaN
(+100k rows) NaN

I've tried the following code:

df['Brand'] = df['Brand'].map(referencedf.set_index('Product')['Brand'])
df

Unfortunately, the output is not what I want: it always returns the same brand for every product.

Product Brand
Shirt Zara
Shirt Zara
Pants Zara
Tennis ball Zara
Shirt Zara
Football boots Zara
Football boots Zara
Football boots Zara
Pants Zara
Sneakers Zara
Football ball Zara
Football boots Zara
(+100k rows) Zara

Any ideas on how I can get the right output?

Answer 1

Score: 0

Try:

df['Brand'] = df['Product'].map(reference_df.set_index('Product')['Brand'])
print(df)

Prints:

           Product   Brand
0            Shirt    Zara
1            Shirt    Zara
2            Pants    Zara
3      Tennis ball  Wilson
4            Shirt    Zara
5   Football boots  Adidas
6   Football boots  Adidas
7   Football boots  Adidas
8            Pants    Zara
9         Sneakers    Nike
10   Football ball  Adidas
11  Football boots  Adidas
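The difference from the attempt in the question is the column being mapped: the lookup keys have to come from `df['Product']`, not from the still-empty `df['Brand']`. Below is a minimal sketch of that, assuming small stand-in frames; the `'Socks'` row and the names `lookup` and `unmatched` are hypothetical, added only to show what happens to a product missing from the reference table.

```python
import pandas as pd

# Small stand-in for the reference table (assumed: one row per product)
reference_df = pd.DataFrame({
    'Product': ['Shirt', 'Sneakers', 'Pants'],
    'Brand': ['Zara', 'Nike', 'Zara'],
})

# 'Socks' is a hypothetical product with no entry in the reference table
df = pd.DataFrame({'Product': ['Shirt', 'Pants', 'Socks', 'Sneakers']})

# Build a Product -> Brand lookup Series
lookup = reference_df.set_index('Product')['Brand']

# Map from the Product column (the keys), not from the empty Brand column
df['Brand'] = df['Product'].map(lookup)

# Products that are missing from the reference table are left as NaN
unmatched = df.loc[df['Brand'].isna(), 'Product'].unique().tolist()
print(df)
print(unmatched)
```

Checking `unmatched` after the fill is a cheap way to spot spelling differences between the two tables before relying on the result for 100k+ rows.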

huangapple
  • Published on July 28, 2023 at 02:05:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76782373.html