How to get the validations applied to a cell in an Excel file using JavaScript?

Question

The ExcelJS package has a dataValidation (or _dataValidation) attribute on the cell object. It gives the type of data in the cell, such as list, dropdown, decimal, boolean, etc.

But for some reason it throws a JavaScript heap out of memory error for some files. I increased the memory limit to 16 GB (the maximum I can use) with node --max-old-space-size=[size_in_GB*1024] index.js, but it still failed to parse.

So I am looking at alternative packages such as SheetJS and others, but I couldn't get all the data validations from them the way ExcelJS provides them, mainly the options in a dropdown of an Excel file.

Please help me solve this issue. Thank you.

PS: I have checked this question https://stackoverflow.com/questions/72019281/how-to-get-the-datatype-of-excel-cell-value-using-nodejs-version16-14-2-and-vue and convert-excel-to-json doesn't provide any validations on the cell.

I get this error while parsing one particular Excel file.

[Image: screenshot of the error]

This is the code snippet I used; I got the above error after a few minutes.

const ExcelJS = require('exceljs');
const fs = require('fs');

async function readExcelFile(filePath) {
  const workbook = new ExcelJS.Workbook();
  const stream = fs.createReadStream(filePath);

  // Loads the whole workbook into memory before anything can be inspected.
  await workbook.xlsx.read(stream);

  const worksheet = workbook.getWorksheet(1);

  worksheet.eachRow((row, rowNumber) => {
    row.eachCell((cell, colNumber) => {
      // cell.dataValidation holds the validation rule for this cell, if any.
      console.log(cell.dataValidation);
      console.log(`Cell value at row ${rowNumber}, column ${colNumber}: ${cell.value}`);
    });
  });
}

// Report failures instead of leaving the promise rejection unhandled.
readExcelFile('PAG KIC TAB A (1)(1).xlsx').catch(console.error);

Answer 1

Score: 0

I had a look at why the OOM happens, and the culprit is actually a defined name (MS docs, ExcelJS docs) from a sheet named "[31]IRRs UPDATE EACH QUARTER". It references a huge range, "C6:XFD1048576", which ExcelJS goes through cell by cell, using up all memory in the process. No wonder your workbook doesn't open in the MS Office cloud.
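To get a feel for the scale, the size of that range can be worked out directly. This is only illustrative arithmetic; the helper names columnToNumber and countCellsInRange are made up for this example and are not part of ExcelJS.

```javascript
// Convert a column label such as "C" or "XFD" to its 1-based index.
function columnToNumber(label) {
  let n = 0;
  for (const ch of label.toUpperCase()) {
    n = n * 26 + (ch.charCodeAt(0) - 64); // "A" has char code 65
  }
  return n;
}

// Count the cells covered by an A1-style range such as "C6:XFD1048576".
// Only simple single-area ranges are handled (hypothetical helper).
function countCellsInRange(range) {
  const match = /^\$?([A-Za-z]+)\$?(\d+):\$?([A-Za-z]+)\$?(\d+)$/.exec(range);
  if (!match) throw new Error(`Unsupported range: ${range}`);
  const [, startCol, startRow, endCol, endRow] = match;
  const columns = columnToNumber(endCol) - columnToNumber(startCol) + 1;
  const rows = Number(endRow) - Number(startRow) + 1;
  return columns * rows;
}

console.log(countCellsInRange("C6:XFD1048576")); // 17177690122
```

That is about 17.2 billion cells, so even a single byte per cell would come to roughly 16 GiB, before any per-object overhead.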

Depending on what kind of processing you need, one option is to inspect the offending sheet, check whether a defined name with such a huge range even makes sense, fix or remove it, and try again.
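One way to look for such defined names without loading the workbook through ExcelJS is to scan the workbook XML itself. The sketch below assumes you have already unzipped the .xlsx (it is a ZIP archive) and read xl/workbook.xml into a string; the function names, the sample XML, and the "reaches the last column or last row" heuristic are all assumptions made for illustration, and a real XML parser would be more robust than these regular expressions.

```javascript
// List <definedName> entries found in the text of xl/workbook.xml.
function listDefinedNames(workbookXml) {
  const pattern = /<definedName\b([^>]*)>([^<]*)<\/definedName>/g;
  const results = [];
  for (const [, attrs, ref] of workbookXml.matchAll(pattern)) {
    const nameMatch = /\bname="([^"]*)"/.exec(attrs);
    results.push({ name: nameMatch ? nameMatch[1] : "", ref });
  }
  return results;
}

// Heuristic: flag references that reach the last column (XFD)
// or the last row (1048576) of a worksheet.
function isSuspiciouslyLarge(ref) {
  return /\$?XFD\$?\d+/.test(ref) || /[A-Z]\$?1048576\b/.test(ref);
}

// Hypothetical sample standing in for the real xl/workbook.xml content.
const sampleWorkbookXml = `
<definedNames>
  <definedName name="Rates">Sheet1!$A$1:$B$10</definedName>
  <definedName name="Everything" localSheetId="0">Sheet2!$C$6:$XFD$1048576</definedName>
</definedNames>`;

const oversized = listDefinedNames(sampleWorkbookXml).filter((d) =>
  isSuspiciouslyLarge(d.ref)
);
console.log(oversized);
```

Any name flagged this way is a candidate for shrinking or removing in Excel before parsing the file again.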

Alternatively, you can use ExcelJS's streaming API, like this:

import ExcelJS from "exceljs";

async function readExcelFile(filePath: string) {
  // The streaming reader yields worksheets and rows one at a time.
  const workbookReader = new ExcelJS.stream.xlsx.WorkbookReader(filePath, {});

  for await (const worksheetReader of workbookReader) {
    for await (const row of worksheetReader) {
      row.eachCell((cell) => {
        console.log(cell.value);
      });
    }
  }
}

// Report failures instead of leaving the promise rejection unhandled.
readExcelFile("PAG KIC TAB A (1)(1).xlsx").catch(console.error);

As with streaming in general, it doesn't buffer all the data in memory but processes one batch at a time, so it is much more memory-efficient. However, I also found that this way of reading doesn't process defined names at all, which may be a no-go for you.
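If the dropdown options are the main goal, another possibility is to read them from the worksheet XML, where validations are stored as dataValidation elements. This is a minimal sketch, not ExcelJS functionality: it assumes the sheet XML (for example a file under xl/worksheets/ in the unzipped .xlsx) is already available as a string, the function name and sample XML are hypothetical, and the regular expressions only cover the simple cases shown.

```javascript
// Collect list-type (dropdown) validations from the text of a worksheet XML part.
function extractListValidations(sheetXml) {
  // [^>\/] before ">" skips self-closing <dataValidation .../> elements.
  const pattern = /<dataValidation\b([^>]*[^>\/])>([\s\S]*?)<\/dataValidation>/g;
  const results = [];
  for (const [, attrs, body] of sheetXml.matchAll(pattern)) {
    if (!/\btype="list"/.test(attrs)) continue;
    const sqref = (/\bsqref="([^"]*)"/.exec(attrs) || [])[1] || "";
    const formula = (/<formula1>([\s\S]*?)<\/formula1>/.exec(body) || [])[1] || "";
    // An inline list is a quoted, comma-separated string such as "Yes,No".
    const inline = /^"(.*)"$/.exec(formula.replace(/&quot;/g, '"'));
    results.push({
      sqref, // cells the rule applies to
      options: inline ? inline[1].split(",") : null,
      source: inline ? null : formula, // otherwise a range or name to resolve
    });
  }
  return results;
}

// Hypothetical sample standing in for real worksheet XML.
const sampleSheetXml = `
<dataValidations count="3">
  <dataValidation type="whole" allowBlank="1" sqref="C1"/>
  <dataValidation type="list" allowBlank="1" sqref="A2:A10">
    <formula1>"Yes,No,Maybe"</formula1>
  </dataValidation>
  <dataValidation type="list" sqref="B2:B10">
    <formula1>$F$1:$F$5</formula1>
  </dataValidation>
</dataValidations>`;

console.log(extractListValidations(sampleSheetXml));
```

When a list validation points at a range instead of an inline list, the options still have to be read from the referenced cells, which this sketch leaves out.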

huangapple
  • Posted on May 22, 2023, 13:22:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76303224.html