Installing a new font in Tesseract on Windows
Question
I have just installed the most recent version of Tesseract on Windows 10, but I find it does not work with seanchló, the old Irish script. Fortunately, someone has done something to address that here:
https://github.com/kscanne/tesseract-gle-uncial
But it seems I need a .traineddata file, which doesn't appear to be in that fairly old repository. Does anyone know how I might extract or generate this file and use it to read some 1910s-era documents?
Thanks very much!
Answer 1
Score: 1
I found this release by one of the contributors to the repository you linked. It appears to be a fork of the repo and contains the gle_uncial.traineddata file.
jimregan/tesseract-gle-uncial/releases
To use it, just copy it to wherever your tessdata folder is and pass its name (gle_uncial, without the .traineddata extension) as the -l argument to Tesseract OCR.
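For anyone who would rather script the two steps above, here is a minimal Python sketch. The helper names (install_traineddata, build_ocr_command) and all paths are made up for illustration, and the tessdata location in the comments is only an assumption about a typical Windows install, so check where yours actually lives.

```python
import shutil
from pathlib import Path


def install_traineddata(traineddata: Path, tessdata_dir: Path) -> Path:
    """Copy a .traineddata file into the tessdata folder and return the new path."""
    tessdata_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(traineddata, tessdata_dir / traineddata.name))


def build_ocr_command(image: Path, output_base: Path, lang: str = "gle_uncial") -> list[str]:
    """Build the call: tesseract <image> <output base> -l <lang>.

    The language name is the .traineddata file name without its extension.
    """
    return ["tesseract", str(image), str(output_base), "-l", lang]


# Example usage (hypothetical paths, adjust to your machine):
#
#   import subprocess
#   downloaded = Path.home() / "Downloads" / "gle_uncial.traineddata"
#   tessdata = Path(r"C:\Program Files\Tesseract-OCR\tessdata")  # assumed install folder
#   install_traineddata(downloaded, tessdata)
#   subprocess.run(build_ocr_command(Path("page.png"), Path("page")), check=True)
```

With the usage lines uncommented, Tesseract should write the recognized text to page.txt next to the output base. Copying into a folder under Program Files may require an elevated prompt.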