Linux OCR: A review of free optical character recognition software
I've used Linux as my full-time desktop for seven years now. I have almost no reason to use Windows (other than stupid ExamSoft), and even when I do, I don't have much Windows software available. The one "hole" in my workflow has been OCR. For years, people have been able to scan a document and have it converted into real text. One of my old printers even came with OCR software included - for Windows of course. But when I've really needed OCR, I've just assumed that there were no high quality packages available for Linux.
Recently I decided to find out for myself (a complete OCR virgin) what is available, how to use it, and what the results are like. I installed every free OCR package I could find, and systematically tested them. They all work very differently, so I tried to design a simple test for my specific needs.
As a test subject, I entered the following line into a word processor: