openpyxl: get the XML source code of a worksheet without using zipfile

Question

from openpyxl import load_workbook

wb = load_workbook('file.xlsx')
ws = wb['Sheet1']

Is there any way to retrieve the xml code representing the ws object?
NOTE: I want to avoid using the zipfile module. Instead, I'm trying to extract the xml directly from ws.

I read through the openpyxl source code and experimented with lxml, without success.

Answer 1

Score: -1

I figured it out myself. Instead of extracting the XML by unzipping the saved workbook, you can capture it while the workbook is being saved. The wb.save method makes use of the ExcelWriter class, which I subclassed and whose write_worksheet method I overrode for this purpose:

import openpyxl
from openpyxl.writer.excel import *

class MyExcelWriter(openpyxl.writer.excel.ExcelWriter):
    def write_worksheet(self, ws):
        ws._drawing = SpreadsheetDrawing()
        ws._drawing.charts = ws._charts
        ws._drawing.images = ws._images
        if self.workbook.write_only:
            if not ws.closed:
                ws.close()
            writer = ws._writer
        else:
            writer = WorksheetWriter(ws)
            writer.write()
        ws._rels = writer._rels
        
        # my addition starts here
        if ws.title == 'My Sheet':
            with open(writer.out, 'r', encoding='utf8') as file:
                xml_code = file.read()

            # my code continues here...
        # my addition ends here

        self._archive.write(writer.out, ws.path[1:])
        self.manifest.append(ws)
        writer.cleanup()

openpyxl.writer.excel.ExcelWriter = MyExcelWriter

The write_worksheet method serializes the sheet to a temporary XML file, whose path is stored in writer.out. The file is deleted by writer.cleanup(), so it has to be read before that call.
Remember that from <module> import * is considered bad practice, so use it with caution.
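
If the XML is all you need and the workbook does not have to be saved, the same mechanism can probably be used without patching ExcelWriter. The following is only a minimal sketch, not part of the original answer: worksheet_xml is a hypothetical helper name, and it assumes that WorksheetWriter (the class the wildcard import above pulls in from openpyxl.writer.excel) can be instantiated on its own, as write_worksheet does for workbooks that are not write-only. These are openpyxl internals and may change between versions.

```python
from openpyxl import Workbook
from openpyxl.writer.excel import WorksheetWriter  # internal class, not public API


def worksheet_xml(ws):
    """Hypothetical helper: return the XML of one worksheet as a string."""
    writer = WorksheetWriter(ws)  # same call write_worksheet makes
    writer.write()                # serializes the sheet to a temporary file
    try:
        # writer.out holds the path of the temporary XML file
        with open(writer.out, "r", encoding="utf8") as file:
            return file.read()
    finally:
        writer.cleanup()          # removes the temporary file


# Small in-memory workbook so the sketch runs without an existing file.xlsx
wb = Workbook()
ws = wb.active
ws["A1"] = 42
xml_code = worksheet_xml(ws)
```

With a workbook loaded through load_workbook('file.xlsx'), the call would be worksheet_xml(wb['Sheet1']). Note that this only covers the worksheet part itself; relationships, drawings and shared workbook data are handled elsewhere during a real save.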

Posted on May 17, 2023 at 21:24:19
Original link: https://go.coder-hub.com/76272608.html