Apache POI createTable generates a corrupted file when a header cell contains a line break

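Question

I am using Apache POI 4.1.2 to create Excel files in Java. I have a piece of code that creates a table from existing cells, and it worked fine until a header cell contained a line break.

I tried changing the table's column name afterwards, but that did not fix anything.

Below is a minimal piece of code that reproduces the problem: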

    public void test() throws IOException {
        XSSFWorkbook wb = new XSSFWorkbook();
        XSSFSheet sheet = wb.createSheet();

        // headers
        XSSFRow headersRow = sheet.createRow(0);
        headersRow.createCell(0).setCellValue("Column1");
        headersRow.createCell(1).setCellValue("Column2");

        // a second row
        XSSFRow row = sheet.createRow(1);
        row.createCell(0).setCellValue(1);
        row.createCell(1).setCellValue(2);

        // create a table
        AreaReference area = wb.getCreationHelper().createAreaReference(
                new CellReference(sheet.getRow(0).getCell(0)),
                new CellReference(sheet.getRow(1).getCell(1))
        );
        XSSFTable table = sheet.createTable(area);

        // styling (no problem here)
        sheet.setColumnWidth(0, 5000);
        sheet.setColumnWidth(1, 5000);
        CTTable cttable = table.getCTTable();
        cttable.addNewTableStyleInfo();
        XSSFTableStyleInfo style = (XSSFTableStyleInfo) table.getStyle();
        style.setName("TableStyleMedium6");
        style.setShowColumnStripes(false);
        style.setShowRowStripes(true);
        cttable.addNewAutoFilter().setRef(area.formatAsString());
        CellStyle cellStyle = wb.createCellStyle();
        cellStyle.setWrapText(true);
        headersRow.getCell(0).setCellStyle(cellStyle);

        // this file is OK
        try (FileOutputStream outputStream = new FileOutputStream("C:\\tmp\\test.xlsx")) {
            wb.write(outputStream);
        }

        // add a line break in a header cell
        headersRow.getCell(0).setCellValue("Column1\nwith a line break");
        // this file has a problem
        try (FileOutputStream outputStream = new FileOutputStream("C:\\tmp\\test2.xlsx")) {
            wb.write(outputStream);
        }

        // this doesn't fix anything
        table.getColumns().get(0).setName("Column1");
        try (FileOutputStream outputStream = new FileOutputStream("C:\\tmp\\test3.xlsx")) {
            wb.write(outputStream);
        }

        // neither does this
        cttable.getTableColumns().getTableColumnList().get(0).setName("Column1");
        try (FileOutputStream outputStream = new FileOutputStream("C:\\tmp\\test4.xlsx")) {
            wb.write(outputStream);
        }
    }

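Excel loads test.xlsx properly, but complains about all the other files with the message "We found a problem with some content...".

After Excel repairs the files, everything is OK, but I would like to get rid of the warning message.

Any help will be appreciated.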

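Answer 1

Score: 1

This is an inaccuracy in XSSFTable.updateHeaders. This method is called when the table's XML is written, because the table column names must always stay in sync with the cell contents. For example, if the cell content is "Column1" and that cell is a column header of a table, then the table's column name must also be "Column1" (XML: <tableColumn id="1" name="Column1"/>).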

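Line feeds in column headers are a special case, however. If the cell content is "Column1\nwith a line break" and that cell is a column header of a table, then the column name in the table's XML must be written as <tableColumn id="1" name="Column1_x000a_with a line break"/>. So "\n" is replaced by "_x000a_", and likewise "\r" must be replaced by "_x000d_". This is because literal "\r\n" line breaks do not keep their meaning as line breaks in XML.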
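As a rough illustration of why a raw line break is not usable in the name attribute, the sketch below parses two hand-written tableColumn elements with the JDK's DOM parser. The class and method names (AttributeNewlineDemo, readNameAttribute) are made up for this example. It only demonstrates general XML attribute-value normalization and does not show what POI itself writes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class AttributeNewlineDemo {

    // Hypothetical helper: parses a one-element XML string and returns
    // the value of its "name" attribute as the parser reports it.
    static String readNameAttribute(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // A literal line feed inside the attribute value.
        String literal = "<tableColumn id=\"1\" name=\"Column1\nwith a line break\"/>";
        // The escaped form described in the answer.
        String escaped = "<tableColumn id=\"1\" name=\"Column1_x000a_with a line break\"/>";

        // The parser normalizes the literal line feed to a space.
        System.out.println(readNameAttribute(literal).contains("\n")); // false
        // The escaped form passes through the parser unchanged.
        System.out.println(readNameAttribute(escaped));
    }
}
```

The _x000a_ form survives parsing as ordinary text, which is presumably why the file format uses it for table column names.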

So XSSFTable.java - updateHeaders would have to be patched so that "\n" is replaced by "_x000a_" and "\r" is replaced by "_x000d_".

...
    public void updateHeaders() {
        XSSFSheet sheet = (XSSFSheet)getParent();
        CellReference ref = getStartCellReference();
        if (ref == null) return;

        int headerRow = ref.getRow();
        int firstHeaderColumn = ref.getCol();
        XSSFRow row = sheet.getRow(headerRow);
        DataFormatter formatter = new DataFormatter();

        if (row != null && row.getCTRow().validate()) {
            int cellnum = firstHeaderColumn;
            CTTableColumns ctTableColumns = getCTTable().getTableColumns();
            if(ctTableColumns != null) {
                for (CTTableColumn col : ctTableColumns.getTableColumnList()) {
                    XSSFCell cell = row.getCell(cellnum);
                    if (cell != null) {
                        String colName = formatter.formatCellValue(cell);
                        colName = colName.replace("\n", "_x000a_");
                        colName = colName.replace("\r", "_x000d_");
                        col.setName(colName);
                    }
                    cellnum++;
                }
            }
        }
        tableColumns = null;
        columnMap = null;
        xmlColumnPrs = null;
        commonXPath = null;
    }
...
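The two replace calls in the patched method can be tried in isolation without POI. The following is only a minimal sketch of the replacement rule; the class and method names (TableColumnNameEscaper, escapeLineBreaks) are invented for this example and are not part of the POI API.

```java
public class TableColumnNameEscaper {

    // Hypothetical helper mirroring the two replace calls from the patch:
    // line feed becomes _x000a_, carriage return becomes _x000d_.
    static String escapeLineBreaks(String name) {
        return name.replace("\n", "_x000a_").replace("\r", "_x000d_");
    }

    public static void main(String[] args) {
        // Header text as set on the cell in the question.
        System.out.println(escapeLineBreaks("Column1\nwith a line break"));
        // prints: Column1_x000a_with a line break
    }
}
```

A "\r\n" pair is handled character by character here, so it would become "_x000d__x000a_".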

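Since XSSFTable.updateHeaders is called while the table's XML is being written during XSSFWorkbook.write, there is no opportunity to change the table's XML at that point. Patching this method is therefore the only option.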
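To check what name actually ends up in a written file, one can read the table part straight out of the .xlsx package, which is a ZIP archive. This is a hedged sketch using only the JDK: the class and method names (TablePartDump, readPart) are made up, and the part name xl/tables/table1.xml is an assumption about where the first table is stored, so adjust it if your file differs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class TablePartDump {

    // Hypothetical helper: returns the text of one part inside the package,
    // or null if the archive has no entry with that name.
    static String readPart(String xlsxPath, String partName) {
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            ZipEntry entry = zip.getEntry(partName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default path taken from the question; pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "C:\\tmp\\test2.xlsx";
        // Assumed part name for the first table in the workbook.
        System.out.println(readPart(path, "xl/tables/table1.xml"));
    }
}
```

Comparing the tableColumn name attribute before and after Excel repairs the file is one way to confirm the escaping described above.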

huangapple
  • Posted on September 9, 2020, 22:49:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63814180.html