英文:
Matching column values with elements in a list of lists of lists and adding corresponding values from another list
问题
assets = [["Ferrari", "BMW", "Suzuki"], ["Ducati", "Honda"], ["Apple", "Samsung", "Oppo"]]
price = [[853600, 462300, 118900], [96500, 16700], [1260, 750, 340]]
# Convert the data into a dictionary for easy access
data = {}
for i in range(len(assets)):
for j in range(len(assets[i])):
data[assets[i][j]] = price[i][j]
# Create a DataFrame from the data
import pandas as pd
df = pd.DataFrame(data.items(), columns=["Item", "Price"])
# Calculate the Total Cost
total_cost = df["Price"].sum()
# Add the Total Cost to the DataFrame
df["Total Cost"] = total_cost
# If you want to display the DataFrame
print(df)
英文:
assets = [[['Ferrari', 'BMW', 'Suzuki'], ['Ducati', 'Honda']], [['Apple', 'Samsung', 'Oppo']]]
price = [[[853600, 462300, 118900], [96500, 16700]], [[1260, 750, 340]]]
I have a dataframe as follows :
| Car | Bike | Phone |
|---|---|---|
| BMW | Ducati | Apple |
| Ferrari | Honda | Oppo |
Looking for code to get the Total_Cost , i.e 462300 + 96500 + 1260 = 560060
| Car | Bike | Phone | Total Cost |
|---|---|---|---|
| BMW | Ducati | Apple | 560060 |
| Ferrari | Honda | Oppo | 870640 |
I tried the for loop and succeeded, I want the advanced code if any.
答案1
得分: 1
这是一种可能的解决方案:
df = pd.DataFrame({'Car': ['宝马', '法拉利'], 'Bike': ['杜卡迪', '本田'], 'Phone': ['苹果', 'Oppo']})
asset_price = {asset: price[a][b][c]
for a, asset_list in enumerate(assets)
for b, asset_sub_list in enumerate(asset_list)
for c, asset in enumerate(asset_sub_list)
}
df['总成本'] = df.apply(lambda row: sum([asset_price[asset] for asset in row]), axis=1)
print(df)
你也可以根据你的用例使用numpy方法import numpy as np。但我建议使用第一种方法,因为它更简单易懂。
df = pd.DataFrame({'Car': ['宝马', '法拉利'], 'Bike': ['杜卡迪', '本田'], 'Phone': ['苹果', 'Oppo']})
flat_assets = np.concatenate([np.concatenate(row) for row in assets])
flat_price = np.concatenate([np.concatenate(row) for row in price])
asset_dict = dict(zip(flat_assets, flat_price))
asset_prices = np.array([asset_dict[row] for row in df.values.flatten()
if row in asset_dict])
df['总成本'] = np.sum(asset_prices.reshape(-1, 3), axis=1)
print(df)
英文:
Here is a possible solution:
df = pd.DataFrame({'Car': ['BMW', 'Ferrari'], 'Bike': ['Ducati', 'Honda'], 'Phone': ['Apple', 'Oppo']})
asset_price = {asset: price[a][b][c]
for a, asset_list in enumerate(assets)
for b, asset_sub_list in enumerate(asset_list)
for c, asset in enumerate(asset_sub_list)
}
df['Total_Cost'] = df.apply(lambda row: sum([asset_price[asset] for asset in row]), axis=1)
print(df)
Car Bike Phone Total_Cost
0 BMW Ducati Apple 560060
1 Ferrari Honda Oppo 870640
You can also use numpy approach import numpy as np depending on your use-case. But I will suggest the first approach which is more simple and easy to understand.
df = pd.DataFrame({'Car': ['BMW', 'Ferrari'], 'Bike': ['Ducati', 'Honda'], 'Phone': ['Apple', 'Oppo']})
flat_assets = np.concatenate([np.concatenate(row) for row in assets])
flat_price = np.concatenate([np.concatenate(row) for row in price])
asset_dict = dict(zip(flat_assets, flat_price))
asset_prices = np.array([asset_dict[row] for row in df.values.flatten()
if row in asset_dict])
df['Total Cost'] = np.sum(asset_prices.reshape(-1, 3), axis=1)
print(df)
Car Bike Phone Total Cost
0 BMW Ducati Apple 560060
1 Ferrari Honda Oppo 870640
答案2
得分: 0
一个替代方法:
首先构建一个数据框df_price,将price映射到assets和分类(Car,Bike和Phone):
df_price = (
pd.DataFrame({"assets": assets, "price": price}).explode(["assets", "price"])
.assign(cols=["Car", "Bike", "Phone"]).explode(["assets", "price"])
)
结果:
assets price cols
0 Ferrari 853600 Car
0 BMW 462300 Car
0 Suzuki 118900 Car
0 Ducati 96500 Bike
0 Honda 16700 Bike
1 Apple 1260 Phone
1 Samsung 750 Phone
1 Oppo 340 Phone
(我在这里插入了分类,因为在其他答案的评论中有这样的说法:“...但是如果资产的嵌套列表有共同的名称(比如:在Suzuki的地方是Honda),那么Honda汽车和Honda摩托车将共享一个价格”。)
然后将价格加入到.melt后的主数据框df,使用辅助列idx进行.pivot,在行中总结价格,并整理结果。
res = (
df.melt(var_name="cols", value_name="assets", ignore_index=False)
.merge(df_price, on=["cols", "assets"])
.assign(idx=lambda df: df.groupby("cols").cumcount())
.pivot(index="idx", columns="cols")
.assign(total=lambda df: df.loc[:, "price"].sum(axis=1))
.loc[:, ["assets", "total"]]
.droplevel(0, axis=1).rename(columns={"" : "Total_Costs"})
)
结果:
cols Bike Car Phone Total_Costs
idx
0 Ducati BMW Apple 560060.0
1 Honda Ferrari Oppo 870640.0
英文:
An alternative approach:
First build a dataframe df_price which maps prices onto the assets and the classification (Car, Bike, and Phone):
df_price = (
pd.DataFrame({"assets": assets, "price": price}).explode(["assets", "price"])
.assign(cols=["Car", "Bike", "Phone"]).explode(["assets", "price"])
)
Result:
assets price cols
0 Ferrari 853600 Car
0 BMW 462300 Car
0 Suzuki 118900 Car
0 Ducati 96500 Bike
0 Honda 16700 Bike
1 Apple 1260 Phone
1 Samsung 750 Phone
1 Oppo 340 Phone
(I have inserted the classification here due to the comment on the other answer: "... But if the nested lists of asset is having common name (say : Honda in place if Suzuki ) then Honda car and Honda Bike will take one price".
Then join the prices onto the .melted main dataframe df, .pivot (using the auxilliary column idx), sum up the prices in the rows, and bring the result in shape.
res = (
df.melt(var_name="cols", value_name="assets", ignore_index=False)
.merge(df_price, on=["cols", "assets"])
.assign(idx=lambda df: df.groupby("cols").cumcount())
.pivot(index="idx", columns="cols")
.assign(total=lambda df: df.loc[:, "price"].sum(axis=1))
.loc[:, ["assets", "total"]]
.droplevel(0, axis=1).rename(columns={"": "Total_Costs"})
)
Result:
cols Bike Car Phone Total_Costs
idx
0 Ducati BMW Apple 560060.0
1 Honda Ferrari Oppo 870640.0
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。


评论