January 6, 2020, 21:36:53

Save a pandas groupby object into a CSV file

Question


I have an issue that I cannot figure out, even though I have read similar posts such as https://stackoverflow.com/questions/47602097/pandas-groupby-to-to-csv. Those solutions do not work for me. I am trying to write code that separates each group of a groupby object and saves each group to its own Excel spreadsheet.

Below is a toy example of the code I used to build my pandas groupby object, with some of the columns.

Now I need to save each group from this object to a separate CSV file, or at least to a separate worksheet in Excel.

import pandas as pd

dff = pd.DataFrame({'SKU': ['001', '002', '003'],
                    'revenue_contribution_in_percentage': [0.2, 0.5, 0.3],
                    'BuyPrice': [2, 3, 4],
                    'SellPrice': [5, 6, 6],
                    'margin': [3, 3, 2],
                    'Avg_per_week': [3, 2, 5],
                    'StockOnHand': [4, 10, 20],
                    'StockOnOrder': [0, 0, 0],
                    'Supplier': ['ABC', 'ABC', 'ABZ'],
                    'SupplierLeadTime': [5, 5, 5],
                    'cumul_value': [0.4, 0.6, 1],
                    'class_mention': ['A', 'A', 'B'],
                    'std_week': [1, 2, 1],
                    'review_time': [2, 2, 2],
                    'holding_cost': [0.35, 0.35, 0.35],
                    'aggregate_order_placement_cost': [200, 230, 210]
                    })

I did the following to get an object grouped by supplier:

groups = [group.reset_index().set_index(['SKU'])[[
    'revenue_contribution_in_percentage',
    'BuyPrice',
    'SellPrice',
    'margin',
    'Avg_per_week',
    'StockOnHand',
    'StockOnOrder',
    'Supplier',
    'SupplierLeadTime',
    'cumul_value',
    'class_mention',
    'std_week',
    'review_time',
    'holding_cost',
    'aggregate_order_placement_cost',
    'periods']] for _, group in dff.groupby('Supplier')]
df_group = pd.DataFrame(groups).sum()
group_to_excel = df_group.to_csv('results.csv')

The output I would like to get is the following: two distinct datasets that can be saved in CSV format and look like this:

   SKU  revenue_contribution_in_percentage  BuyPrice  SellPrice  margin  \
0  001                                 0.2         2          5       3
1  002                                 0.5         3          6       3

   Avg_per_week  StockOnHand  StockOnOrder Supplier  SupplierLeadTime  \
0             3            4             0      ABC                 5
1             2           10             0      ABC                 5

   cumul_value class_mention  std_week  review_time  holding_cost  \
0          0.4             A         1            2          0.35
1          0.6             A         2            2          0.35

   aggregate_order_placement_cost
0                             200
1                             230

and

   SKU  revenue_contribution_in_percentage  BuyPrice  SellPrice  margin  \
0  003                                 0.3         4          6       2

   Avg_per_week  StockOnHand  StockOnOrder Supplier  SupplierLeadTime  \
0             5           20             0      ABZ                 5

   cumul_value class_mention  std_week  review_time  holding_cost  \
0            1             B         1            2          0.35

   aggregate_order_placement_cost
0                             210

At this point my code gives only a single worksheet (a horrendous one) with pretty much nothing on it. I am not sure what is wrong.
I would greatly appreciate some help on this one! Thanks a lot!

Answer 1

Score: 1

You don't need groupby here, since you are not aggregating anything. What you really want is to slice dff by each unique supplier and export each slice to its own file. Try this:

# Columns to include in each exported file
cols = [
    'SKU',
    'revenue_contribution_in_percentage',
    'BuyPrice',
    'SellPrice',
    'margin',
    'Avg_per_week',
    'StockOnHand',
    'StockOnOrder',
    'Supplier',
    'SupplierLeadTime',
    'cumul_value',
    'class_mention',
    'std_week',
    'review_time',
    'holding_cost',
    'aggregate_order_placement_cost'
]

# Filter dff for each distinct supplier and write that slice to its own CSV file
for supplier in dff['Supplier'].unique():
    sub_dff = dff[dff['Supplier'] == supplier][cols]
    sub_dff.to_csv(f'{supplier}_data.csv')
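For comparison, the same per-supplier export can also be written with groupby, because iterating over a groupby object yields (key, sub-DataFrame) pairs. The following is only a sketch under a few assumptions: the helper name save_groups_to_csv is hypothetical, the toy frame is a reduced version of the question's dff, the files go to a temporary directory, and index=False is used so the row index is left out of the files (the code above writes it).

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy data: a reduced version of the question's dff (only a few columns kept)
dff = pd.DataFrame({
    'SKU': ['001', '002', '003'],
    'BuyPrice': [2, 3, 4],
    'SellPrice': [5, 6, 6],
    'Supplier': ['ABC', 'ABC', 'ABZ'],
})


def save_groups_to_csv(df, by, out_dir):
    """Write one CSV per group; the helper name and file naming are illustrative."""
    out_dir = Path(out_dir)
    written = []
    # Iterating over a groupby yields (group key, sub-DataFrame) pairs
    for key, group in df.groupby(by):
        path = out_dir / f'{key}_data.csv'
        group.to_csv(path, index=False)  # index=False leaves out the row index
        written.append(path)
    return written


out_dir = tempfile.mkdtemp()  # temporary directory, to avoid cluttering the cwd
paths = save_groups_to_csv(dff, 'Supplier', out_dir)
print([p.name for p in paths])
```

For the separate-worksheet variant mentioned in the question, pandas.ExcelWriter together with to_excel(writer, sheet_name=...) inside the same loop should work in a similar way, assuming an Excel engine such as openpyxl is installed.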
