Items tagged with: ocrmypdf


@meatbag I'm on Linux, and the best tool I have found for this is #ocrmypdf: https://github.com/ocrmypdf/OCRmyPDF
It uses #tesseract under the hood, and for plain static text it's okay. For tables and other material that is difficult to parse, it's not useful.
When a PDF already has a text layer, the tools I use for reading it include #firefox and #evince.
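
For anyone who wants to script this, here is a rough sketch of how the ocrmypdf command line could be driven from Python. The file names and the helper names (`build_ocrmypdf_cmd`, `add_text_layer`) are made up for illustration, and I am assuming the `ocrmypdf` executable is on your PATH. `-l` selects the tesseract language and `--skip-text` leaves pages that already have text alone.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocrmypdf_cmd(src, dst, lang="eng", skip_text=True):
    """Build the ocrmypdf argument list (hypothetical helper, not part of ocrmypdf)."""
    cmd = ["ocrmypdf", "-l", lang]
    if skip_text:
        # --skip-text: do not OCR pages that already contain a text layer
        cmd.append("--skip-text")
    cmd += [str(Path(src)), str(Path(dst))]
    return cmd


def add_text_layer(src, dst, **kwargs):
    """Run ocrmypdf on src and write the searchable PDF to dst."""
    # Assumption: ocrmypdf (and tesseract) are installed and on PATH.
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf not found on PATH")
    subprocess.run(build_ocrmypdf_cmd(src, dst, **kwargs), check=True)


if __name__ == "__main__":
    # Only prints the command; call add_text_layer() to actually run it.
    print(build_ocrmypdf_cmd("scan.pdf", "scan_ocr.pdf"))
```

Building the argument list separately from running it makes it easy to check what will be executed before touching any files.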


@Jamie Teh @Asa Dotzler Since tesseract v4, accuracy has greatly improved. For adding a text layer to scanned PDF files I use an app called #ocrmypdf, which uses #tesseract, and it works sufficiently well for me. There are even some unofficial builds of tesseract for #android. Apart from self-packaged tesseract, there is ML Kit by Google, which people use on Android for recognizing text in images. I understand that performance and interfacing with tesseract there may not be that great.