Extracting serial and model number from an image

Question

I'm new to Data Science and have been tasked with extracting the serial and model numbers from images of machine faceplates, such as this (one of the cleaner images):
[Image: Extracting serial and model number from an image]
I have managed to OCR the text, but I am wondering whether deciding if a particular text item is a model or serial number would benefit from machine learning, or whether regular expression matching would be better.

If this is something machine learning can handle, where can I find tutorials that can help guide me along?

Answer 1

Score: 1

This particular example looks simple enough to handle with a regular expression on the OCR-extracted text. It is generally not a good idea to use "Machine Learning" everywhere; in this case it would be overkill.

That being said, if the images were more "complicated" (e.g. more text or numbers in the image), you might want to solve this with more advanced methods, such as Named Entity Recognition, or with a computer vision approach, by training an object detection/segmentation model.
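
For the regular expression route, a minimal sketch might look like the following. It assumes the OCR output contains a label such as "MODEL" or "SERIAL" followed by the value; the function name `find_labeled_value`, the label variants it accepts (NO, NUMBER, #), and the sample text are all hypothetical and would need adjusting to what your faceplates actually show.

```python
import re
from typing import Optional


def find_labeled_value(ocr_text: str, label: str) -> Optional[str]:
    """Return the token that follows `label` in the OCR text, or None.

    Assumed layout (hypothetical): LABEL, an optional "NO"/"NO."/"NUMBER"/"#",
    an optional separator (":", "." or "-"), then the value itself.
    """
    pattern = re.compile(
        rf"\b{re.escape(label)}"        # the label, e.g. MODEL or SERIAL
        r"\s*(?:(?:NO|NUMBER)\b\.?|#)?"  # optional "NO.", "NUMBER" or "#"
        r"\s*[:.\-]?\s*"                # optional separator and whitespace
        r"([A-Z0-9][A-Z0-9\-/]*)",      # the value: letters, digits, - and /
        re.IGNORECASE,
    )
    match = pattern.search(ocr_text)
    return match.group(1).upper() if match else None


# Made-up OCR output, not taken from the image in the question.
sample = "ACME CORP\nMODEL NO. XJ-200\nSERIAL NO: 4471-AB9\n115 V 60 HZ"
print(find_labeled_value(sample, "MODEL"))   # XJ-200
print(find_labeled_value(sample, "SERIAL"))  # 4471-AB9
```

A pattern like this only works while the labels are present and OCR reads them correctly. If the labels are often missing or misread, that is the point where the more advanced methods mentioned above may become worth the effort.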

huangapple
  • Posted on August 4, 2023, 01:49:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76830492.html