
Items tagged with: tesseract


I have just found a nice document scanning app for Android that can do automatic edge detection, cropping, multipage scanning, OCR, PDF export and more.
It's called #makeacopy and it uses the #tesseract engine to perform OCR directly on the device, with no internet connection required at all.
The app has almost full #a11y support for screen reader users in the sense that all the controls are clearly labelled and it's easy to navigate.
I couldn't resist and asked the developer whether it would be feasible to add screen reader compatible notifications that make the automatic edge detection accessible as well.
Now I'd appreciate comments from low vision screen reader users, mobility trainers, people assisting other blind people, or anyone else who can tell me whether my idea is viable and how much they like it.
Here is the link to the GitHub issue I have opened: github.com/egdels/makeacopy/is…

Thanks for looking into it.


@modulux @David Gerard For adding the missing text layer to scanned PDF files and OCR-ing other images, I like to use a Python-based command line app called ocrmypdf, which uses #tesseract as a dependency. It runs on Linux and Windows.
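
For anyone who wants to try it, here is a minimal sketch of calling ocrmypdf from Python. The file names scan.pdf and scan_ocr.pdf and the helper build_ocrmypdf_command are made up for illustration. The -l and --skip-text options come from the ocrmypdf command line, but check `ocrmypdf --help` for your installed version before relying on them.

```python
import os
import shutil
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, language="eng", skip_text=True):
    """Return the argument list for an ocrmypdf run (hypothetical helper)."""
    command = ["ocrmypdf", "-l", language]
    if skip_text:
        # Leave pages that already contain text untouched.
        command.append("--skip-text")
    command += [input_pdf, output_pdf]
    return command


if __name__ == "__main__":
    # Placeholder file names; replace them with your own scan.
    command = build_ocrmypdf_command("scan.pdf", "scan_ocr.pdf")
    if shutil.which("ocrmypdf") is None or not os.path.exists("scan.pdf"):
        # Nothing to run here, so just show what would be executed.
        print("Would run:", " ".join(command))
    else:
        subprocess.run(command, check=True)
```

With the defaults this amounts to running `ocrmypdf -l eng --skip-text scan.pdf scan_ocr.pdf` in a terminal.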


@meatbag I'm on Linux and the best I have found working for me is #ocrmypdf github.com/ocrmypdf/OCRmyPDF
It uses #tesseract under the hood and for static text it's okay. For tables and other material that is difficult to parse it's not useful.
When a PDF already has a text layer, the tools I use for reading it include #firefox and #evince


I'm using #ocrDesktop, which uses #tesseract under the hood, and it works fine. Of course there are some inaccuracies here and there, but for checking VM status, reading a TeamViewer password and similar tasks it's enough. Here is a wiki article about it. wiki.archlinux.org/title/Ocrde…
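
This is not how ocrdesktop itself is implemented, just a rough sketch of the general idea: hand an image such as a saved screenshot to the tesseract command line tool and read back the recognized text. The file name screenshot.png and both helper functions are assumptions for illustration. Passing `stdout` as the output base makes tesseract print the text instead of writing a file.

```python
import os
import shutil
import subprocess


def build_tesseract_command(image_path, language="eng"):
    """Return a tesseract call that prints recognized text to stdout."""
    # "stdout" as the output base means: print the text, do not write a file.
    return ["tesseract", image_path, "stdout", "-l", language]


def read_text_from_image(image_path, language="eng"):
    """Run tesseract on an image file and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Placeholder file name; point this at a real screenshot.
    image = "screenshot.png"
    if shutil.which("tesseract") is None or not os.path.exists(image):
        print("Would run:", " ".join(build_tesseract_command(image)))
    else:
        print(read_text_from_image(image))
```

Expect the same kind of inaccuracies as mentioned above, especially with small fonts or low contrast.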


@Jamie Teh @Asa Dotzler Since tesseract v4, accuracy has greatly improved. For adding a text layer to scanned PDF files I am using an app called #ocrmypdf, which uses #tesseract, and it works sufficiently well for me. There are even some unofficial builds of tesseract for #android. Besides self-packaged tesseract, there is ML Kit by Google, which people are using on Android for image recognition. Performance and interfacing with tesseract may not be that great, I understand.