英文:
Matching column values with elements in a list of lists of lists and adding corresponding values from another list
问题
assets = [["Ferrari", "BMW", "Suzuki"], ["Ducati", "Honda"], ["Apple", "Samsung", "Oppo"]]
price = [[853600, 462300, 118900], [96500, 16700], [1260, 750, 340]]
# Convert the data into a dictionary for easy access
data = {}
for i in range(len(assets)):
for j in range(len(assets[i])):
data[assets[i][j]] = price[i][j]
# Create a DataFrame from the data
import pandas as pd
df = pd.DataFrame(data.items(), columns=["Item", "Price"])
# Calculate the Total Cost
total_cost = df["Price"].sum()
# Add the Total Cost to the DataFrame
df["Total Cost"] = total_cost
# If you want to display the DataFrame
print(df)
英文:
assets = [[['Ferrari', 'BMW', 'Suzuki'], ['Ducati', 'Honda']], [['Apple', 'Samsung', 'Oppo']]]
price = [[[853600, 462300, 118900], [96500, 16700]], [[1260, 750, 340]]]
I have a dataframe as follows :
Car | Bike | Phone |
---|---|---|
BMW | Ducati | Apple |
Ferrari | Honda | Oppo |
Looking for code to get the Total_Cost , i.e 462300 + 96500 + 1260 = 560060
Car | Bike | Phone | Total Cost |
---|---|---|---|
BMW | Ducati | Apple | 560060 |
Ferrari | Honda | Oppo | 870640 |
I tried the for loop and succeeded, I want the advanced code if any.
答案1
得分: 1
这是一种可能的解决方案:
df = pd.DataFrame({'Car': ['宝马', '法拉利'], 'Bike': ['杜卡迪', '本田'], 'Phone': ['苹果', 'Oppo']})
asset_price = {asset: price[a][b][c]
for a, asset_list in enumerate(assets)
for b, asset_sub_list in enumerate(asset_list)
for c, asset in enumerate(asset_sub_list)
}
df['总成本'] = df.apply(lambda row: sum([asset_price[asset] for asset in row]), axis=1)
print(df)
你也可以根据你的用例使用numpy方法import numpy as np
。但我建议使用第一种方法,因为它更简单易懂。
df = pd.DataFrame({'Car': ['宝马', '法拉利'], 'Bike': ['杜卡迪', '本田'], 'Phone': ['苹果', 'Oppo']})
flat_assets = np.concatenate([np.concatenate(row) for row in assets])
flat_price = np.concatenate([np.concatenate(row) for row in price])
asset_dict = dict(zip(flat_assets, flat_price))
asset_prices = np.array([asset_dict[row] for row in df.values.flatten()
if row in asset_dict])
df['总成本'] = np.sum(asset_prices.reshape(-1, 3), axis=1)
print(df)
英文:
Here is a possible solution:
df = pd.DataFrame({'Car': ['BMW', 'Ferrari'], 'Bike': ['Ducati', 'Honda'], 'Phone': ['Apple', 'Oppo']})
asset_price = {asset: price[a][b][c]
for a, asset_list in enumerate(assets)
for b, asset_sub_list in enumerate(asset_list)
for c, asset in enumerate(asset_sub_list)
}
df['Total_Cost'] = df.apply(lambda row: sum([asset_price[asset] for asset in row]), axis=1)
print(df)
Car Bike Phone Total_Cost
0 BMW Ducati Apple 560060
1 Ferrari Honda Oppo 870640
You can also use numpy approach import numpy as np
depending on your use-case. But I will suggest the first approach which is more simple and easy to understand.
df = pd.DataFrame({'Car': ['BMW', 'Ferrari'], 'Bike': ['Ducati', 'Honda'], 'Phone': ['Apple', 'Oppo']})
flat_assets = np.concatenate([np.concatenate(row) for row in assets])
flat_price = np.concatenate([np.concatenate(row) for row in price])
asset_dict = dict(zip(flat_assets, flat_price))
asset_prices = np.array([asset_dict[row] for row in df.values.flatten()
if row in asset_dict])
df['Total Cost'] = np.sum(asset_prices.reshape(-1, 3), axis=1)
print(df)
Car Bike Phone Total Cost
0 BMW Ducati Apple 560060
1 Ferrari Honda Oppo 870640
答案2
得分: 0
一个替代方法:
首先构建一个数据框df_price
,将price
映射到assets
和分类(Car
,Bike
和Phone
):
df_price = (
pd.DataFrame({"assets": assets, "price": price}).explode(["assets", "price"])
.assign(cols=["Car", "Bike", "Phone"]).explode(["assets", "price"])
)
结果:
assets price cols
0 Ferrari 853600 Car
0 BMW 462300 Car
0 Suzuki 118900 Car
0 Ducati 96500 Bike
0 Honda 16700 Bike
1 Apple 1260 Phone
1 Samsung 750 Phone
1 Oppo 340 Phone
(我在这里插入了分类,因为在其他答案的评论中有这样的说法:“...但是如果资产的嵌套列表有共同的名称(比如:在Suzuki的地方是Honda),那么Honda汽车和Honda摩托车将共享一个价格”。)
然后将价格加入到.melt
后的主数据框df
,使用辅助列idx
进行.pivot
,在行中总结价格,并整理结果。
res = (
df.melt(var_name="cols", value_name="assets", ignore_index=False)
.merge(df_price, on=["cols", "assets"])
.assign(idx=lambda df: df.groupby("cols").cumcount())
.pivot(index="idx", columns="cols")
.assign(total=lambda df: df.loc[:, "price"].sum(axis=1))
.loc[:, ["assets", "total"]]
.droplevel(0, axis=1).rename(columns={"" : "Total_Costs"})
)
结果:
cols Bike Car Phone Total_Costs
idx
0 Ducati BMW Apple 560060.0
1 Honda Ferrari Oppo 870640.0
英文:
An alternative approach:
First build a dataframe df_price
which maps price
s onto the assets
and the classification (Car
, Bike
, and Phone
):
df_price = (
pd.DataFrame({"assets": assets, "price": price}).explode(["assets", "price"])
.assign(cols=["Car", "Bike", "Phone"]).explode(["assets", "price"])
)
Result:
assets price cols
0 Ferrari 853600 Car
0 BMW 462300 Car
0 Suzuki 118900 Car
0 Ducati 96500 Bike
0 Honda 16700 Bike
1 Apple 1260 Phone
1 Samsung 750 Phone
1 Oppo 340 Phone
(I have inserted the classification here due to the comment on the other answer: "... But if the nested lists of asset is having common name (say : Honda in place if Suzuki ) then Honda car and Honda Bike will take one price".
Then join the prices onto the .melt
ed main dataframe df
, .pivot
(using the auxilliary column idx
), sum up the prices in the rows, and bring the result in shape.
res = (
df.melt(var_name="cols", value_name="assets", ignore_index=False)
.merge(df_price, on=["cols", "assets"])
.assign(idx=lambda df: df.groupby("cols").cumcount())
.pivot(index="idx", columns="cols")
.assign(total=lambda df: df.loc[:, "price"].sum(axis=1))
.loc[:, ["assets", "total"]]
.droplevel(0, axis=1).rename(columns={"": "Total_Costs"})
)
Result:
cols Bike Car Phone Total_Costs
idx
0 Ducati BMW Apple 560060.0
1 Honda Ferrari Oppo 870640.0
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论