Installing a new font in Tesseract on Windows

Question

I have just installed the most recent version of Tesseract on Windows 10, but I find it does not work with seanchló, the old Irish script. Fortunately, someone has already done some work to address that, here:

https://github.com/kscanne/tesseract-gle-uncial

However, it seems I need a .traineddata file, which doesn't appear to be in that fairly old repository. Does anyone know how I might extract or generate this file and use it to read some 1910s-era documents?

Thanks very much!

Answer 1

Score: 1

I found this release by one of the contributors to the repository you linked. It appears to be a fork of that repo, and it contains the gle_uncial.traineddata file.

jimregan/tesseract-gle-uncial/releases

To use it, copy the file into your tessdata folder, wherever that is on your system, and pass its name without the extension (gle_uncial) to Tesseract OCR with the -l option.
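The two steps above (copy the file, then select it with -l) could be scripted roughly as follows. This is only a sketch under stated assumptions: the tessdata location shown is the usual default for the Windows installer and may differ on your machine, the file names downloads/gle_uncial.traineddata, page.png and page are hypothetical, and the helper functions are made up for illustration. Writing into Program Files normally needs an elevated prompt.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def install_traineddata(traineddata: Path, tessdata_dir: Path) -> Path:
    """Copy a .traineddata file into the tessdata folder; return the new path."""
    tessdata_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(traineddata, tessdata_dir))


def build_ocr_command(image: Path, output_base: Path,
                      lang: str = "gle_uncial",
                      tessdata_dir: Optional[Path] = None) -> List[str]:
    """Build the tesseract command line; lang is the file name minus .traineddata."""
    cmd = ["tesseract", str(image), str(output_base), "-l", lang]
    if tessdata_dir is not None:
        # Only needed when the tessdata folder is not where Tesseract looks by default.
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    return cmd


if __name__ == "__main__":
    # Assumed default install location on Windows; adjust to your setup.
    tessdata = Path(r"C:\Program Files\Tesseract-OCR\tessdata")
    # Hypothetical download location of the file from the release page.
    install_traineddata(Path("downloads/gle_uncial.traineddata"), tessdata)
    # Writes the recognised text to page.txt next to the script.
    subprocess.run(build_ocr_command(Path("page.png"), Path("page")), check=True)
```

Running `tesseract --list-langs` afterwards should list gle_uncial if the file ended up in the folder Tesseract actually reads.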

Posted by huangapple on 2023-06-19 01:16:19