How to use openpyxl to add a new column to an existing Excel table

Question

I have a template spreadsheet with existing tables and formulas that refer to data in the tables.

As a template, the tables start with one row of empty data.

I want to add data (rows and columns) to the tables while keeping the formulas working.

There is an excellent post here describing how to add rows to an existing table. This works in my code.

A second post here attempts to add columns to a table by replacing the table with a new one. Unfortunately, this breaks the formulas.

Here's a snippet of my code:

    import os

    import openpyxl
    import pandas as pd
    from openpyxl.utils.dataframe import dataframe_to_rows

    # excelDumpFilename, sCSVLocation, sCSVFile, tableRowStart and
    # tableColStart are defined elsewhere in the script.
    sSheetName = 'test'
    workbook = openpyxl.load_workbook(excelDumpFilename)
    sheet = workbook[sSheetName]
    dfCSVData = pd.read_csv(os.path.join(sCSVLocation, sCSVFile))
    sheetRow = 2

    # loop around each row in the CSV data
    for CSVrow in dataframe_to_rows(dfCSVData, header=False, index=False):
        sheetCol = 2
        sheet.insert_rows(sheetRow, 1)
        # loop around each value of the data row
        for CSVvalue in CSVrow:
            sheet.cell(row=sheetRow, column=sheetCol).value = CSVvalue
            sheetCol += 1
        sheetRow += 1

    table = sheet.tables[sSheetName]
    currentTitleCount = len(table.column_names)
    tableRows = dfCSVData.shape[0]
    tableCols = dfCSVData.shape[1]

    # Update table titles for non-generic fields
    for index in range(currentTitleCount, tableCols):
        sheet.cell(row=2, column=tableColStart + index).value = dfCSVData.columns[index]
        table.column_names.append(dfCSVData.columns[index])

    newRange = openpyxl.worksheet.cell_range.CellRange(
        min_row=tableRowStart,
        min_col=tableColStart,
        max_row=tableRowStart + tableRows,
        max_col=tableColStart + tableCols - 1
        # max_col=tableColStart + currentTitleCount - 1
    )
    table.ref = newRange.coord
    table.autoFilter.ref = newRange.coord

    workbook.save(excelDumpFilename)

dfCSVData contains my new table data, but its dimensions are larger than those of the table in the template sheet.

As you can see, I update the template sheet with the extended column names.

I also attempt to alter the table's column_names directly:

    table.column_names.append(dfCSVData.columns[index])

but this makes no change (checked in the debugger).

Using tableRows extends the length of my table. This works.

Using tableCols (a larger number) corrupts the Excel file, and Excel removes the table entirely the next time the sheet is opened.

Help? Thanks

Answer 1

Score: 0

As I said in my comment, you should really simplify the example, say to growing a table from A1:D4 to A1:F6, and provide some sample data. That makes it much easier for people to help.

table.column_names is a property that just returns a list of column names built from the table's columns. That's why changing it has no effect. You need to update table.tableColumns instead. As you are updating the worksheet, you have two choices: either use the private method table._initialise_columns(), or do this manually by editing or extending table.tableColumns. It's worth looking at the openpyxl source to see how this works.
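
To illustrate why appending to column_names does nothing, here is a minimal sketch. It assumes a recent openpyxl 3.x release, where the column list lives in the attribute spelled tableColumns; the table name "demo" and the column names are made up for the example.

```python
from openpyxl.worksheet.table import Table, TableColumn

# A stand-alone table object with two columns (hypothetical names).
table = Table(
    displayName="demo",
    ref="A1:B2",
    tableColumns=[TableColumn(id=1, name="A"), TableColumn(id=2, name="B")],
)

# column_names builds a fresh list on every access, so appending to it
# only changes that temporary list, not the table itself.
table.column_names.append("C")
after_append = table.column_names

# Appending a TableColumn to tableColumns is what actually adds a column.
table.tableColumns.append(TableColumn(id=3, name="C"))
after_column = table.column_names

print(after_append)
print(after_column)
```

The first print still shows only the two original names; the third name appears only after the TableColumn has been appended.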

Unfortunately, it's essential that table.tableColumns exactly matches the worksheet: each column name must be identical to the string in the corresponding header cell. This makes it trickier to edit a table in openpyxl, because the table isn't really a worksheet object (this is how it's implemented in OOXML).
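
The manual route could look roughly like the sketch below, which grows a table from A1:D4 to A1:F6. It is not taken from the question's workbook: build_template() and extend_table() are hypothetical helpers, the template is generated in memory to stand in for a real file, and the code assumes a recent openpyxl 3.x release where loaded tables are available through ws.tables.

```python
import io

from openpyxl import Workbook, load_workbook
from openpyxl.utils import get_column_letter, range_boundaries
from openpyxl.worksheet.table import Table, TableColumn


def build_template():
    """Stand-in for a template file: a table named 'test' on A1:D4."""
    wb = Workbook()
    ws = wb.active
    ws.append(["A", "B", "C", "D"])          # header row (must be strings)
    for r in range(3):
        ws.append([r, r + 1, r + 2, r + 3])  # three data rows
    ws.add_table(Table(displayName="test", ref="A1:D4"))
    buf = io.BytesIO()
    wb.save(buf)
    buf.seek(0)
    return buf


def extend_table(ws, table, new_headers, extra_rows=0):
    """Grow the table right by len(new_headers) columns and down by extra_rows."""
    min_col, min_row, max_col, max_row = range_boundaries(table.ref)
    next_id = max(col.id for col in table.tableColumns) + 1
    for offset, header in enumerate(new_headers, start=1):
        # Header cell and TableColumn name are written from the same string
        # so that the table definition matches the worksheet.
        ws.cell(row=min_row, column=max_col + offset, value=str(header))
        table.tableColumns.append(TableColumn(id=next_id, name=str(header)))
        next_id += 1
    new_ref = "{}{}:{}{}".format(
        get_column_letter(min_col), min_row,
        get_column_letter(max_col + len(new_headers)), max_row + extra_rows,
    )
    table.ref = new_ref
    if table.autoFilter is not None:
        table.autoFilter.ref = new_ref


wb = load_workbook(build_template())
ws = wb["Sheet"]
extend_table(ws, ws.tables["test"], ["E", "F"], extra_rows=2)

# Round-trip through a save to check what actually gets written.
out = io.BytesIO()
wb.save(out)
out.seek(0)
reloaded = load_workbook(out)["Sheet"].tables["test"]
print(reloaded.ref, reloaded.column_names)
```

Because the existing table object is edited in place rather than replaced, its name stays the same, which is what formulas using structured references depend on. Whether Excel accepts the result for a particular template still needs to be checked by opening the saved file.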

Posted by huangapple on 2023-08-09 15:04:34
Source: https://go.coder-hub.com/76865349-2.html