Question

How do I extract text from a table in a PDF file?

I cannot get the cells in the table. I tried running the LEADTOOLS example, but it does not detect the cells automatically.

> https://www.leadtools.com/help/leadtools/v20/dh/fo/iocrtablezonemanager.html

Can you give me some advice? Thanks, all.

Answer 1

Score: 0

For tables similar to the one in the image you posted, you should be able to find the cells using the IOcrPage.TableZoneManager.AutoDetectCells() method. This method is used in the OcrMultiEngineDemo project that ships with the current version of LEADTOOLS.

Here’s how you can test it:

1. Run the OCR Multi-Engine Demo.
2. Select the OmniPage OCR engine.
3. Open the image or PDF file that contains the table.
4. Draw a zone around the table.
5. Choose “Update Zones…” from the OCR->Zones menu.
6. In the “Update Zones” dialog, click “Detect Cells” as shown in the attached image.

If this doesn’t give you the result you expect, send the actual files you are testing with to support@leadtools.com and explain exactly how you tested.
