我想在我的Excel第一页创建一个摘要页面。

huangapple go评论97阅读模式
英文:

I want to create a summary page in the first page of my excel

问题

我有一个包含数据的CSV表格。我想根据文本文件将CSV中的数据类型减少到27种类型。筛选后,我希望Excel文件包含27个工作表,每个工作表上都有一个数据类型。现在,我想创建一个摘要工作表,第一列包含数据类型的名称,第二列包含各自工作表上每种数据类型的总行数。在最后一步中,我希望我的摘要工作表排在第一位。我的代码似乎在最后一步之前都能正常工作,但摘要工作表无法放在第一位。

import pandas as pd

# 将CSV文件加载到DataFrame中
df = pd.read_csv('my_file.csv')

# 从文本文件中读取数据类型
with open('data_types.txt', 'r') as file:
    data_types = file.read().splitlines()

# 筛选DataFrame,仅包含指定的数据类型
filtered_df = df[df['a'].isin(data_types)]

# 为每种数据类型创建一个单独的工作表
writer = pd.ExcelWriter('filtered_data.xlsx', engine='xlsxwriter')

summary_data = []
for data_type in data_types:
    # 为当前数据类型筛选DataFrame
    type_filtered_df = filtered_df[filtered_df['a'] == data_type]

    # 将筛选后的数据写入Excel文件的新工作表
    type_filtered_df.to_excel(writer, sheet_name=f'Type_{data_type}', index=False)

    # 添加摘要数据
    summary_data.append([data_type, len(type_filtered_df)])

# 创建摘要DataFrame
summary_df = pd.DataFrame(summary_data, columns=['数据类型', '总行数'])

# 为摘要DataFrame的两列添加标题行
summary_df_header = pd.DataFrame([['数据类型', '总行数']], columns=['数据类型'])
summary_df = pd.concat([summary_df_header, summary_df])

# 将DataFrame保存到'Summary'工作表
summary_df.to_excel(writer, sheet_name='摘要', index=False)

# 访问工作簿和摘要工作表
workbook = writer.book
summary_sheet = writer.sheets['摘要']

# 合并标题行的单元格
summary_sheet.merge_range('A1:B1', '数据类型')

# 重新排列工作表,使摘要工作表成为第一个工作表
sheets = workbook.sheetnames
summary_sheet_index = sheets.index('摘要')
sheets.insert(0, sheets.pop(summary_sheet_index))
for i, sheet_name in enumerate(sheets):
    workbook.set_sheet_order(sheet_name, i)

# 保存并关闭Excel文件
writer.save()
英文:

I have a table of data in my csv. I want to reduce the data_type in my csv to 27 types according to the text file. After the filter, I want the Excel to contain 27 pages with each data type on each page. Now I want to make a summary page with the first column containing the names of the data types and the second column containing the total amount of rows of each data type on their respective pages. For the last step, I want my summary page to be arranged as the first page. My codes seem to work fine until the last step whereas the summary page cannot be placed to the first page.

import pandas as pd

# Load the CSV file into a DataFrame
df = pd.read_csv('my_file.csv')

# Read the data types from a text file
with open('data_types.txt', 'r') as file:
    data_types = file.read().splitlines()

# Filter the DataFrame to include only the specified data types
filtered_df = df[df['a'].isin(data_types)]

# Create a separate sheet for each type of data
writer = pd.ExcelWriter('filtered_data.xlsx', engine='xlsxwriter')

summary_data = []
for data_type in data_types:
    # Filter the DataFrame for the current data type
    type_filtered_df = filtered_df[filtered_df['a'] == data_type]
    
    # Write the filtered data to a new sheet in the Excel file
    type_filtered_df.to_excel(writer, sheet_name=f'Type_{data_type}', index=False)
    
    # Append the summary data
    summary_data.append([data_type, len(type_filtered_df)])

# Create the summary DataFrame
summary_df = pd.DataFrame(summary_data, columns=['Data Type', 'Total Rows'])

# Add a header row for both columns in the summary DataFrame
summary_df_header = pd.DataFrame([['Header']], columns=['Data Type'])
summary_df = pd.concat([summary_df_header, summary_df])

# Save the DataFrame to the 'Summary' sheet
summary_df.to_excel(writer, sheet_name='Summary', index=False)

# Access the workbook and summary sheet
workbook = writer.book
summary_sheet = writer.sheets['Summary']

# Merge the cells for the header row
summary_sheet.merge_range('A1:B1', 'Header')

# Reorder the sheets so that the Summary sheet is the first sheet
sheets = workbook.sheetnames
summary_sheet_index = sheets.index('Summary')
sheets.insert(0, sheets.pop(summary_sheet_index))
for i, sheet_name in enumerate(sheets):
    workbook.set_sheet_order(sheet_name, i)

# Save and close the Excel file
writer.save()

答案1

得分: 1

在以下代码之前创建名为Summary的工作表:

# 为每种数据类型创建单独的工作表
writer = pd.ExcelWriter('filtered_data.xlsx', engine='xlsxwriter')
writer.book.add_worksheet('Summary')

# 你的代码
...

# 保存并关闭Excel文件
# writer.save()  # 对于xlsxwriter引擎,我不确定.save() 是否适用
writer.close()

提示:在你的循环中使用 groupby

for data_type, type_filtered_df in filtered_df.groupby('a'):
    # 将筛选后的数据写入Excel文件的新工作表
    type_filtered_df.to_excel(writer, sheet_name=f'Type_{data_type}', index=False)
    
    # 添加汇总数据
    summary_data.append([data_type, len(type_filtered_df)])
英文:

Create the Summary sheet before:

# Create a separate sheet for each type of data
writer = pd.ExcelWriter('filtered_data.xlsx', engine='xlsxwriter')
writer.book.add_worksheet('Summary')

# Your code
...

# Save and close the Excel file
# writer.save()  # I'm not sure about .save() for xlsxwriter engine
writer.close()

Tips: use groupby for your loop:

for data_type, type_filtered_df in filtered_df.groupby('a'):
    # Write the filtered data to a new sheet in the Excel file
    type_filtered_df.to_excel(writer, sheet_name=f'Type_{data_type}', index=False)
    
    # Append the summary data
    summary_data.append([data_type, len(type_filtered_df)])

huangapple
  • 本文由 发表于 2023年7月10日 11:03:31
  • 转载请务必保留本文链接:https://go.coder-hub.com/76650450.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定