英文:
save a pandas groupby object into a csv file
问题
你想要将每个供应商分组的数据保存到单独的CSV文件中,对吗?以下是您提供的代码的翻译部分:
import pandas as pd
# 创建一个DataFrame
dff = pd.DataFrame({'SKU': ['001', '002', '003'],
'revenue_contribution_in_percentage': [0.2, 0.5, 0.3],
'BuyPrice': [2, 3, 4],
'SellPrice': [5, 6, 6],
'margin': [3, 3, 2],
'Avg_per_week': [3, 2, 5],
'StockOnHand': [4, 10, 20],
'StockOnOrder': [0, 0, 0],
'Supplier': ['ABC', 'ABC', 'ABZ'],
'SupplierLeadTime': [5, 5, 5],
'cumul_value': [0.4, 0.6, 1],
'class_mention': ['A', 'A', 'B'],
'std_week': [1, 2, 1],
'review_time': [2, 2, 2],
'holding_cost': [0.35, 0.35, 0.35],
'aggregate_order_placement_cost': [200, 230, 210]})
# 根据'Supplier'列进行分组
groups = [group.reset_index().set_index(['SKU'])[['revenue_contribution_in_percentage',
'BuyPrice',
'SellPrice',
'margin',
'Avg_per_week',
'StockOnHand',
'StockOnOrder',
'Supplier',
'SupplierLeadTime',
'cumul_value',
'class_mention',
'std_week',
'review_time',
'holding_cost',
'aggregate_order_placement_cost',
'periods']] for _, group in dff.groupby('Supplier')]
# 创建一个新的DataFrame并将分组数据合并
df_group = pd.concat(groups)
# 将合并后的数据保存为CSV文件
df_group.to_csv('results.csv', index=False)
希望这可以帮助您将数据按供应商分组并保存到单独的CSV文件中。如果您需要更多帮助,请随时提出问题。
英文:
I have an issue that I cannot figure out although I read similar posts such as https://stackoverflow.com/questions/47602097/pandas-groupby-to-to-csv. It does not work for me. I am trying to write code to separate each group from a groupby object and save each group into its own excel spreadsheet.
I attached a toy example of the code that I did to get my groupby object on pandas with some columns.
Now, I need to save each group from this object into a separate csv file, or at least in a separate worksheet in excel.
dff = pd.DataFrame({'SKU': ['001', '002', '003'],
'revenue_contribution_in_percentage': [0.2, 0.5, 0.3],
'BuyPrice' : [2,3,4],
'SellPrice' : [5,6,6],
'margin' : [3,3,2],
'Avg_per_week' : [3,2,5],
'StockOnHand' : [4,10,20],
'StockOnOrder': [0,0,0],
'Supplier' : ['ABC', 'ABC', 'ABZ' ],
'SupplierLeadTime': [5,5,5],
'cumul_value':[0.4,0.6,1],
'class_mention':['A','A','B'],
'std_week':[1,2,1],
'review_time' : [2,2,2],
'holding_cost': [0.35, 0.35, 0.35],
'aggregate_order_placement_cost': [200, 230,210]
})
I have done the following to get a groupby supplier object
groups = [group.reset_index().set_index(['SKU'])[[
'revenue_contribution_in_percentage',
'BuyPrice',
'SellPrice',
'margin',
'Avg_per_week',
'StockOnHand',
'StockOnOrder',
'Supplier',
'SupplierLeadTime',
'cumul_value',
'class_mention',
'std_week',
'review_time',
'holding_cost',
'aggregate_order_placement_cost',
'periods']] for _, group in dff.groupby('Supplier')]
df_group = pd.DataFrame(groups).sum()
group_to_excel = df_group.to_csv('results.csv')
and the output that I would like to get is the folowing: two distinct datasets that can saved in csv format and look like this:
SKU revenue_contribution_in_percentage BuyPrice SellPrice margin \
0 001 0.2 2 5 3
1 002 0.5 3 6 3
Avg_per_week StockOnHand StockOnOrder Supplier SupplierLeadTime \
0 3 4 0 ABC 5
1 2 10 0 ABC 5
cumul_value class_mention std_week review_time holding_cost \
0 0.4 A 1 2 0.35
1 0.6 A 2 2 0.35
aggregate_order_placement_cost
0 200
1 230
and
SKU revenue_contribution_in_percentage BuyPrice SellPrice margin \
0 003 0.3 4 6 2
Avg_per_week StockOnHand StockOnOrder Supplier SupplierLeadTime \
0 5 20 0 ABZ 5
cumul_value class_mention std_week review_time holding_cost \
0 1 B 1 2 0.35
aggregate_order_placement_cost
0 210
At this point my code give one and only worksheet (horrendous worksheet) with pretty much nothing on it. I am not sure what is wrong at this point.
I would greatly appreciate some help on this one! thanks a lot!
答案1
得分: 1
不需要使用groupby
,因为您没有在进行聚合操作。您真正想要的是按每个唯一的供应商切片dff
,然后将它们导出到各自的文件中。请尝试以下代码:
cols = [
'SKU',
'revenue_contribution_in_percentage',
'BuyPrice',
'SellPrice',
'margin',
'Avg_per_week',
'StockOnHand',
'StockOnOrder',
'Supplier',
'SupplierLeadTime',
'cumul_value',
'class_mention',
'std_week',
'review_time',
'holding_cost',
'aggregate_order_placement_cost'
]
for supplier in dff['Supplier'].unique():
sub_dff = dff[dff['Supplier'] == supplier][cols]
sub_dff.to_csv(f'{supplier}_data.csv')
英文:
You don't need groupby
as you are not aggregating anything. What you really want is to slice dff
by each unique supplier and export them to their own files. Try this:
cols = [
'SKU',
'revenue_contribution_in_percentage',
'BuyPrice',
'SellPrice',
'margin',
'Avg_per_week',
'StockOnHand',
'StockOnOrder',
'Supplier',
'SupplierLeadTime',
'cumul_value',
'class_mention',
'std_week',
'review_time',
'holding_cost',
'aggregate_order_placement_cost'
]
for supplier in dff['Supplier'].unique():
sub_dff = dff[dff['Supplier'] == supplier][cols]
sub_dff.to_csv(f'{supplier}_data.csv')
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论