How to use openpyxl to add a new column to an existing Excel table

Question

I have a template spreadsheet with existing tables and formulas that refer to data in those tables.

As a template, each table starts with a single row of empty data.

I want to add data (rows and columns) to the tables while keeping the formulas working.

There is an excellent post here (https://stackoverflow.com/questions/48657867/manipulate-existing-excel-table-using-openpyxl) describing how to add rows to an existing table. This works in my code.

A second post here (https://stackoverflow.com/questions/68560984/how-to-insert-new-column-in-the-exist-openpyxl-table/73207385#73207385) attempts to add columns to a table by replacing the table with a new one. Unfortunately, this breaks the formulas.

Here's a snippet of my code:

    import os

    import openpyxl
    import pandas as pd
    from openpyxl.utils.dataframe import dataframe_to_rows

    # excelDumpFilename, sCSVLocation, sCSVFile, tableRowStart and tableColStart
    # are assumed to be defined earlier in the script (not shown here)
    sSheetName = 'test'
    workbook = openpyxl.load_workbook(excelDumpFilename)
    sheet = workbook[sSheetName]
    dfCSVData = pd.read_csv(os.path.join(sCSVLocation, sCSVFile))
    sheetRow = 2
    # loop around each row in the CSV data
    for CSVrow in dataframe_to_rows(dfCSVData, header=False, index=False):
        sheetCol = 2
        sheet.insert_rows(sheetRow, 1)
        # loop around each value of the data row
        for CSVvalue in CSVrow:
            sheet.cell(row=sheetRow, column=sheetCol).value = CSVvalue
            sheetCol += 1
        sheetRow += 1
    # the table has the same name as the sheet
    table = sheet.tables[sSheetName]
    currentTitleCount = len(table.column_names)
    tableRows = dfCSVData.shape[0]
    tableCols = dfCSVData.shape[1]
    # Update table titles for non-generic fields
    for index in range(currentTitleCount, tableCols):
        sheet.cell(row=2, column=tableColStart + index).value = dfCSVData.columns[index]
        # attempt to register the new column name on the table
        table.column_names.append(dfCSVData.columns[index])
    # new table range covering the header row plus all data rows and columns
    newRange = openpyxl.worksheet.cell_range.CellRange(
        min_row=tableRowStart,
        min_col=tableColStart,
        max_row=tableRowStart + tableRows,
        max_col=tableColStart + tableCols - 1
        #max_col=tableColStart + currentTitleCount - 1
    )
    table.ref = newRange.coord
    table.autoFilter.ref = newRange.coord
    workbook.save(excelDumpFilename)

dfCSVData contains my new table data, but its dimensions are larger than those of the table in the template sheet.

You can see that I update the template sheet with the extended column names.

I also attempt to alter the table's column_names directly:
    table.column_names.append( dfCSVData.columns[index] )
but this fails to make any changes (checked in the debugger).

Extending the table's length using tableRows works.

Extending its width using tableCols (a larger number) corrupts the Excel file, and Excel removes the table entirely the next time the sheet is opened.

Can anyone help? Thanks.

Answer 1

Score: 0

As I said in my comment, you should really simplify the example to, say, extending a table from A1:D4 to A1:F6, and provide some sample data. That makes it a lot easier for people to help.

table.column_names is a property that just returns a list of column names built from the table's columns, which is why appending to it has no effect. You need to update table.tableColumns instead. As you are updating the worksheet, you have two choices: use the private method table._initialise_columns(), or do it manually by editing or extending table.tableColumns. It's worth looking at the openpyxl source to see how this works.
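The manual route could look something like the sketch below. It is only an illustration under some assumptions: extend_table is a hypothetical helper name, the table is called Table1, the new columns are appended to the right of the existing ones, and the workbook is built in memory rather than loaded from a real template. I haven't checked the resulting file in every Excel version, so treat it as a starting point.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.utils import get_column_letter, range_boundaries
from openpyxl.worksheet.table import Table, TableColumn


def extend_table(sheet, table, new_headers, extra_rows=0):
    """Hypothetical helper: widen (and optionally lengthen) an existing table."""
    min_col, min_row, max_col, max_row = range_boundaries(table.ref)
    next_id = max(col.id for col in table.tableColumns) + 1
    for offset, header in enumerate(new_headers, start=1):
        name = str(header)  # table headers must be strings
        # Write the header cell and register a TableColumn with the same name.
        sheet.cell(row=min_row, column=max_col + offset, value=name)
        table.tableColumns.append(TableColumn(id=next_id, name=name))
        next_id += 1
    new_ref = "{}{}:{}{}".format(
        get_column_letter(min_col), min_row,
        get_column_letter(max_col + len(new_headers)), max_row + extra_rows,
    )
    table.ref = new_ref
    if table.autoFilter is not None:
        table.autoFilter.ref = new_ref
    return new_ref


def demo():
    # Build a small stand-in for the template: a table covering A1:B2.
    template = Workbook()
    ws = template.active
    ws.append(["Name", "Qty"])
    ws.append(["apple", 3])
    ws.add_table(Table(displayName="Table1", ref="A1:B2"))
    buffer = BytesIO()
    template.save(buffer)

    # Reload it as one would load a real template, then grow it to A1:D3.
    buffer.seek(0)
    wb = load_workbook(buffer)
    ws = wb.active
    table = ws.tables["Table1"]
    extend_table(ws, table, ["Price", "Total"], extra_rows=1)
    for col, value in enumerate(["pear", 5, 0.4, 2.0], start=1):
        ws.cell(row=3, column=col, value=value)

    # Save and reload once more to see what actually ends up in the file.
    out = BytesIO()
    wb.save(out)
    out.seek(0)
    reloaded = load_workbook(out).active.tables["Table1"]
    return reloaded.ref, reloaded.column_names


if __name__ == "__main__":
    print(demo())
```

Because the existing table object is edited in place rather than replaced, its name and id stay the same, which is what formulas referring to the table depend on.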

Unfortunately, it's essential that table.tableColumns exactly matches the worksheet. This makes editing a table trickier in openpyxl, because the table isn't really a worksheet object (this is how it's implemented in OOXML).
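Before saving, it may help to check that the table definition and the header cells agree. The following is a rough sketch, not part of openpyxl: header_mismatches is a hypothetical helper, and the sample workbook and table name are made up for the demonstration.

```python
from openpyxl import Workbook
from openpyxl.utils import range_boundaries
from openpyxl.worksheet.table import Table, TableColumn


def header_mismatches(sheet, table):
    """Hypothetical check: list differences between header cells and table columns."""
    min_col, min_row, max_col, _ = range_boundaries(table.ref)
    header_cells = next(sheet.iter_rows(min_row=min_row, max_row=min_row,
                                        min_col=min_col, max_col=max_col))
    names = table.column_names
    problems = []
    if len(header_cells) != len(names):
        problems.append("ref spans {} columns but the table defines {}".format(
            len(header_cells), len(names)))
    for cell, name in zip(header_cells, names):
        if cell.value != name:
            problems.append("{}: cell has {!r}, table column is {!r}".format(
                cell.coordinate, cell.value, name))
    return problems


def demo():
    wb = Workbook()
    ws = wb.active
    ws.append(["Name", "Qty", "Price"])
    ws.append(["apple", 3, 0.5])
    # The ref already covers three columns, but only two are defined.
    table = Table(
        displayName="Table1",
        ref="A1:C2",
        tableColumns=[TableColumn(id=1, name="Name"), TableColumn(id=2, name="Qty")],
    )
    ws.add_table(table)
    before = header_mismatches(ws, table)
    # Add the missing column definition and check again.
    table.tableColumns.append(TableColumn(id=3, name="Price"))
    after = header_mismatches(ws, table)
    return before, after


if __name__ == "__main__":
    print(demo())
```

An empty list means the column count and names line up with the header row. It does not guarantee that Excel will accept the file, but a non-empty list points at the kind of mismatch described above.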

huangapple
  • Posted on August 9, 2023 at 15:04:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76865349.html