Updating existing Excel file with Pandas and Openpyxl throws an AttributeError: property 'book' of 'OpenpyxlWriter' object has no setter
Question
I have been trying to update data in an existing Excel file. The process is as follows:
1) read the data from Excel
2) combine it with new data using pandas
3) save the combined dataframe into the original file
It keeps returning the following error, which I think comes from writing to (updating) the same file while reading it:
AttributeError Traceback (most recent call last)
Cell In[14], line 19
18 with pd.ExcelWriter(file_path, engine='openpyxl') as writer:
---> 19 writer.book = book
20 writer.sheets = {ws.title: ws for ws in book.worksheets}
AttributeError: property 'book' of 'OpenpyxlWriter' object has no setter
My code is below. Running it also erases the data in the original file and makes the file unusable:
import pandas as pd
from openpyxl import load_workbook

# Load the Excel file
file_path = 'original.xlsx'
update = 'update.xlsx'

# Open the file in read-only mode to prevent any locks
with open(file_path, "rb") as file:
    book = load_workbook(file)

# Combine original with update file
for sheet_name in ['sheet1', ...]:
    df1 = pd.read_excel(file_path, sheet_name=sheet_name)
    df2 = pd.read_excel(update, sheet_name=sheet_name)
    df2 = df2.iloc[::-1]
    df1 = pd.concat([df1, df2], ignore_index=True)
    df1 = df1.drop_duplicates(subset='column1', keep='last')

    # Write combined data to the sheet
    with pd.ExcelWriter(file_path, engine='openpyxl') as writer:
        writer.book = book  # this assignment raises the AttributeError
        writer.sheets = {ws.title: ws for ws in book.worksheets}
        # Set the sheet as the active sheet
        book.active = book.sheetnames.index(sheet_name)
        df1.to_excel(writer, sheet_name=sheet_name, index=False, startrow=1)
    print(f"Successfully updated '{sheet_name}' sheet in '{file_path}'.")
Answer 1
Score: 0
This is because book is a read-only property of the writer, so it cannot be assigned.
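As a minimal sketch of what "no setter" means, a Python property defined with only a getter rejects assignment with an AttributeError. The FakeWriter class and try_assign_book helper below are hypothetical names made up for illustration; this is not the pandas implementation, and the exact wording of the error message depends on the Python version.

```python
class FakeWriter:
    """Hypothetical stand-in for a writer object; not pandas code."""

    def __init__(self):
        self._book = "workbook created by the writer"

    @property
    def book(self):
        # Only a getter is defined, so the attribute can be read but not assigned.
        return self._book


def try_assign_book(writer, new_book):
    """Attempt the kind of assignment used in the question and report the outcome."""
    try:
        writer.book = new_book
    except AttributeError as exc:
        return f"AttributeError: {exc}"
    return "assignment succeeded"


writer = FakeWriter()
result = try_assign_book(writer, "another workbook")
print(result)
```

Reading writer.book still works here; only the assignment fails, which matches the line the traceback points at.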
Since you're only updating a file, you can try append mode (mode='a') together with the if_sheet_exists flag, both provided by pandas (see the pandas.ExcelWriter docs).
The result will look similar to this:
# Combine original with update file
for sheet_name in ['sheet1', 'sheet2']:
    df1 = pd.read_excel(file_path, sheet_name=sheet_name)
    df2 = pd.read_excel(update, sheet_name=sheet_name)
    df2 = df2.iloc[::-1]
    df1 = pd.concat([df1, df2], ignore_index=True)
    df1 = df1.drop_duplicates(subset='column1', keep='last')

    # Write combined data to the sheet
    with pd.ExcelWriter(file_path, mode='a', if_sheet_exists='replace', engine='openpyxl') as writer:
        df1.to_excel(writer, sheet_name=sheet_name, index=False, startrow=0)
    print(f"Successfully updated '{sheet_name}' sheet in '{file_path}'.")
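To try the approach without touching a real workbook, here is a self-contained sketch that builds a throwaway file in a temporary directory and then replaces one sheet in append mode. The file name demo.xlsx, the sample data, and the merge_into_sheet helper are assumptions made up for this example. It needs pandas with openpyxl installed, and a pandas version that supports if_sheet_exists.

```python
import os
import tempfile

import pandas as pd


def merge_into_sheet(file_path, sheet_name, new_rows, key):
    """Merge new_rows into an existing sheet and replace that sheet in the file."""
    existing = pd.read_excel(file_path, sheet_name=sheet_name)
    combined = pd.concat([existing, new_rows], ignore_index=True)
    # For duplicate keys, keep the row that came last (the update).
    combined = combined.drop_duplicates(subset=key, keep="last")
    # mode="a" opens the existing workbook; "replace" swaps out only this sheet.
    with pd.ExcelWriter(
        file_path, mode="a", if_sheet_exists="replace", engine="openpyxl"
    ) as writer:
        combined.to_excel(writer, sheet_name=sheet_name, index=False)
    return combined


# Build a small two-sheet workbook to act as the "original" file.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.xlsx")
with pd.ExcelWriter(demo_path, engine="openpyxl") as writer:
    pd.DataFrame({"column1": [1, 2], "value": ["a", "b"]}).to_excel(
        writer, sheet_name="sheet1", index=False
    )
    pd.DataFrame({"column1": [9], "value": ["z"]}).to_excel(
        writer, sheet_name="sheet2", index=False
    )

# Update sheet1 only: key 2 is overwritten, key 3 is new.
update_rows = pd.DataFrame({"column1": [2, 3], "value": ["B", "c"]})
merge_into_sheet(demo_path, "sheet1", update_rows, key="column1")

sheet1 = pd.read_excel(demo_path, sheet_name="sheet1")
sheet2 = pd.read_excel(demo_path, sheet_name="sheet2")
print(sheet1)
print(sheet2)
```

In this sketch sheet2 is never passed to the helper, so reading it back afterwards is a quick way to check that the other sheets in the workbook survive the update.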