Items tagged with: tesseract


@meatbag I'm on Linux, and the best tool I have found that works for me is #ocrmypdf https://github.com/ocrmypdf/OCRmyPDF
It uses #tesseract under the hood, and for static text it's okay. For tables and other material that is difficult to parse, it's not useful.
When a PDF already has a text layer, the tools I use for reading it include #firefox and #evince
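
For anyone who wants to script this, here is a minimal sketch of calling ocrmypdf from Python. It assumes the `ocrmypdf` command is installed and on the PATH. The helper names `build_ocrmypdf_command` and `add_text_layer`, the file names, and the default language `eng` are made up for illustration and are not part of OCRmyPDF itself. The `--skip-text` option leaves pages that already have text alone, which fits the mixed case described above.

```python
import shutil
import subprocess


def build_ocrmypdf_command(src, dst, language="eng", skip_text=True):
    """Return the argv list for a single ocrmypdf run (hypothetical helper)."""
    # -l selects the tesseract language pack to use for recognition
    cmd = ["ocrmypdf", "-l", language]
    if skip_text:
        # --skip-text: do not OCR pages that already contain text
        cmd.append("--skip-text")
    cmd += [src, dst]
    return cmd


def add_text_layer(src, dst, **options):
    """Run ocrmypdf if it is installed; return True when it exits with 0."""
    if shutil.which("ocrmypdf") is None:
        # Tool not found on PATH, nothing to run
        return False
    result = subprocess.run(build_ocrmypdf_command(src, dst, **options))
    return result.returncode == 0
```

Calling `add_text_layer("scan.pdf", "scan-ocr.pdf")` would then write a searchable copy next to the original, with the same caveat as above: tables and other hard-to-parse layouts may still come out poorly.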


I'm using #ocrDesktop, which uses #tesseract under the hood, and it works fine. Of course there are some inaccuracies here and there, but for checking VM status, reading a TeamViewer password and similar tasks it's enough. Here is a wiki article about it: https://wiki.archlinux.org/title/Ocrdesktop


@Jamie Teh @Asa Dotzler Since tesseract v4, accuracy is greatly improved. For adding a text layer to scanned PDF files I am using an app called #ocrmypdf, which uses #tesseract, and it works sufficiently well for me. There are even some unofficial builds of tesseract for #android. Besides self-packaged tesseract, there is ML Kit by Google, which people are using on Android for recognizing text in images. I understand that performance and interfacing with tesseract may not be that great.