Google Document AI model not reading document in JSON

Question

I have been trying out the various processors (Form Parser, Document OCR, and the specialized ones). I am testing them on some purchase order PDFs, so I am using the "purchase order" processor. The PDF is scanned and parsed by the processor, but the JSON output does not contain structured key-value pairs.

Here is the output:

{ document: { text: 'Order -POS178463\nP\nTo\nDelivery Address\nPOSTURITE\nOcushield Ltd\nDoor 14 (060) -Posturite Goods In\nAccount No.\nOCU01\nDhruvin Patel\nTVS SCS Rico\nOrder No.\nPOS178463\nLaunch Lab, Floor 4\n215 Park Lane\nOrder Date\n24/04/2023\n124 Goswell Road\nMinworth\nCust Ref.\nLondon\nB35 6LJ\nEC1V 7DP\nUnited Kingdom\nUnited Kingdom\nProduct\nDescription\nQty\nUnit Price\nAmount\nOCUVDU27BZ\nAnti Blue Light Privacy Filter 27" W (16:9)\n10\n£45.32\n£453.20\nContact:\nGoods Total\n£453.20\nCust Ref:\nDelivery Notes:\nPosturite Limited\nt. +44 (0) 345 345 0010\ne. purchaseconfirmation@posturite.co.uk\nwww.posturite.co.uk\nB\nISO 9001\nISO 14001\nISO 27001\nISO 45001\nISOQAR\nUKAS\nENSTEMS\n0026\nThe Mill, Berwick, East Sussex BN26 6SZ, UK\nRegistered in England No. 2574809\nCertificate Number 5312\n' },
  humanReviewStatus: { state: 'SKIPPED' } }

I expected structured key-value pairs in the JSON output and am unsure why this does not work.

Answer 1

Score: 0

It looks like the Document JSON includes only the text field.

Are you sending a fieldMask with your request? A fieldMask limits which fields are returned in the Document object of the response. Refer to "Send a processing request" in the documentation for how to process a document with and without a fieldMask, and to "Handle the processing response" for how to extract the data from the output Document.
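To make the effect of a fieldMask concrete, here is a minimal sketch of building a request with and without one. It assumes the Node.js client library (@google-cloud/documentai), where the request object is passed to DocumentProcessorServiceClient.processDocument() and the mask is given as an object with a paths array. The helper name buildProcessRequest, the placeholder processor resource name, and the placeholder PDF bytes are hypothetical and not taken from the question.

```javascript
// Hypothetical helper: builds the request object that would be passed to
// DocumentProcessorServiceClient.processDocument() (assumed client library:
// @google-cloud/documentai). Nothing is sent to the API here.
function buildProcessRequest(processorName, pdfBytes, fieldMaskPaths) {
  const request = {
    name: processorName,
    rawDocument: {
      // The document content is sent base64-encoded.
      content: Buffer.from(pdfBytes).toString('base64'),
      mimeType: 'application/pdf',
    },
  };
  // Attach a mask only when paths are given. Without a mask, the response
  // Document is not restricted to a subset of fields.
  if (fieldMaskPaths && fieldMaskPaths.length > 0) {
    request.fieldMask = { paths: fieldMaskPaths };
  }
  return request;
}

// Placeholder values, for illustration only.
const processorName =
  'projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID';
const pdfBytes = Buffer.from('%PDF-1.4 placeholder');

// A mask of only "text" would limit the response to Document.text.
const maskedRequest = buildProcessRequest(processorName, pdfBytes, ['text']);
// No mask: entities and pages are not filtered out of the response.
const fullRequest = buildProcessRequest(processorName, pdfBytes);

console.log(Object.keys(maskedRequest));
console.log(Object.keys(fullRequest));
```

If the request in the question resembles maskedRequest, dropping the mask or adding paths such as "entities" to it would be the first thing to try.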

For the Form Parser, the key-value pairs are in the Document.pages.formFields field (the formFields of each page). For all entity extraction processors, such as the Purchase Order Parser, they are in the Document.entities field.
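As a rough sketch of reading those two locations, the helpers below turn a Document object into key-value pairs. They assume the usual Document shape: entities carry type and mentionText, and form fields carry fieldName and fieldValue whose textAnchor.textSegments index into Document.text. The helper names, the sample object, and the entity type "order_number" are illustrative assumptions, not real processor output.

```javascript
// Resolve a textAnchor to a string by slicing the full Document.text.
// startIndex may be omitted when it is 0, and indices may arrive as strings.
function anchorText(fullText, textAnchor) {
  if (!textAnchor || !textAnchor.textSegments) return '';
  return textAnchor.textSegments
    .map((seg) =>
      fullText.slice(Number(seg.startIndex || 0), Number(seg.endIndex)))
    .join('')
    .trim();
}

// Entity extraction processors: pairs come from Document.entities.
function entityPairs(document) {
  return (document.entities || []).map((entity) => ({
    key: entity.type,
    value: entity.mentionText,
  }));
}

// Form Parser: pairs come from the formFields of each page.
function formFieldPairs(document) {
  const pairs = [];
  for (const page of document.pages || []) {
    for (const field of page.formFields || []) {
      pairs.push({
        key: anchorText(document.text, (field.fieldName || {}).textAnchor),
        value: anchorText(document.text, (field.fieldValue || {}).textAnchor),
      });
    }
  }
  return pairs;
}

// Illustrative Document only; the entity type name is made up.
const sampleDocument = {
  text: 'Order No.\nPOS178463\n',
  entities: [{ type: 'order_number', mentionText: 'POS178463' }],
  pages: [{
    formFields: [{
      fieldName: { textAnchor: { textSegments: [{ endIndex: 9 }] } },
      fieldValue: {
        textAnchor: { textSegments: [{ startIndex: 10, endIndex: 19 }] },
      },
    }],
  }],
};

console.log(entityPairs(sampleDocument));
console.log(formFieldPairs(sampleDocument));
```

Both helpers return an empty array when the corresponding field is missing, which is what a response limited to the text field would produce.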

Posted by huangapple on June 5, 2023 at 22:54:49