doesntsuck.com

July 28, 2004

open source ocr links

http://www.linux-ocr.ekitap.gen.tr/
ocr link collection

http://tides.umiacs.umd.edu/description.html
Rapidly Retargetable Translingual Detection
The objective of this project is to rapidly create usable systems for translingual document detection that can be employed by analysts who are fluent in English to detect potentially important documents that are written in other languages.

http://www.claraocr.org/
Clara OCR is a free (GPL) OCR for systems that support the C library and the X Window System (e.g. most flavours of Unix). Clara OCR is intended for large scale digitalization projects. It features a powerful GUI and a web interface for cooperative digitalization of books.

http://www.gnu.org/software/ocrad/ocrad.html
GNU Ocrad is an OCR (Optical Character Recognition) program implemented as a filter and based on a feature extraction method. It reads a bitmap image in pbm format and produces text in byte (8-bit) or UTF-8 formats. Also includes a layout analyser able to separate the columns or blocks of text normally found on printed pages. Ocrad can be used as a stand-alone console application, or as a backend to other programs.
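
A rough sketch of what "implemented as a filter" can look like from a calling program, in Python. The helper names `make_pbm` and `ocr_with_ocrad` are made up for illustration. The sketch assumes an `ocrad` binary on the PATH that reads the image from standard input when given `-` and accepts `--format=utf8`; check `ocrad --help` on your install before relying on either.

```python
import shutil
import subprocess


def make_pbm(rows):
    """Encode rows of '0'/'1' characters as a plain-text (P1) PBM image.

    In PBM, 1 is a black pixel and 0 is a white pixel.
    """
    height = len(rows)
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all rows must have the same width")
    body = "\n".join(" ".join(r) for r in rows)
    return f"P1\n{width} {height}\n{body}\n".encode("ascii")


def ocr_with_ocrad(pbm_bytes, fmt="utf8"):
    """Pipe a PBM image through ocrad and return the recognized text.

    Returns None when no ocrad binary is found. The command-line
    arguments are assumptions, not taken from the description above.
    """
    exe = shutil.which("ocrad")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, f"--format={fmt}", "-"],
        input=pbm_bytes,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # A tiny 3x2 bitmap, only to show the file format; far too small to OCR.
    print(make_pbm(["010", "111"]).decode("ascii"))
```

For a real page you would pass the bytes of a scanned PBM file to `ocr_with_ocrad` instead of the toy bitmap.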

http://jocr.sourceforge.net/download.html
GOCR is an OCR (Optical Character Recognition) program, developed under the GNU General Public License. It converts scanned images of text back to text files. Joerg Schulenburg started the program, and now leads a team of developers.
GOCR can be used with different front-ends, which makes it very easy to port to different OSes and architectures. It can open many different image formats, and its quality has been improving on a daily basis.

Posted by yargevad at July 28, 2004 11:06 AM


This weblog is licensed under a Creative Commons License.