To convert alphanumeric characters in an image into plain text. To gain a general understanding of image processing.
S.N  Tools                      Description
1    JDK 6                      Development kit for Java programming
2    NetBeans 7.0               IDE for Java application development
3    Microsoft Windows & Linux  OS platforms for the application
4    Tortoise SVN               Version control software for project management
5    Sourceforge                Project management and configuration
6    Microsoft Office           Documentation
Takes an image as input. Converts it into plain text. Recognizes alphanumeric characters only. Lets the user edit and save the recognized text. (Screenshot: the loaded image alongside the converted, editable text.)
Get Image -> Binarization -> Thinning (bold strokes reduced to thin ones) -> Line Segmentation -> Character Segmentation -> Feature Extraction -> Matrix Matching -> Save Text
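The binarization step can be illustrated with a small sketch. The slide only names "Binarization", so the use of Otsu's method here is an assumption, as are the class name `OtsuBinarizer`, the `int[][]` grayscale input (values 0 to 255), and the convention that dark pixels are ink.

```java
// Hypothetical helper, not taken from the project source.
class OtsuBinarizer {

    /** Returns the threshold (0..255) that maximizes between-class variance. */
    static int otsuThreshold(int[][] gray) {
        int[] hist = new int[256];
        int total = 0;
        for (int[] row : gray) {
            for (int v : row) {
                hist[v]++;
                total++;
            }
        }
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }

        double sumLow = 0;        // sum of intensities in the class <= t
        int weightLow = 0;        // pixel count in the class <= t
        double bestVariance = -1;
        int bestThreshold = 0;
        for (int t = 0; t < 256; t++) {
            weightLow += hist[t];
            sumLow += (double) t * hist[t];
            if (weightLow == 0) continue;       // no pixels in the low class yet
            int weightHigh = total - weightLow;
            if (weightHigh == 0) break;         // every pixel is already in the low class
            double meanLow = sumLow / weightLow;
            double meanHigh = (sumAll - sumLow) / weightHigh;
            double diff = meanLow - meanHigh;
            double between = (double) weightLow * weightHigh * diff * diff;
            if (between > bestVariance) {
                bestVariance = between;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    /** Marks pixels at or below the Otsu threshold as ink (true). */
    static boolean[][] binarize(int[][] gray) {
        int threshold = otsuThreshold(gray);
        boolean[][] ink = new boolean[gray.length][];
        for (int r = 0; r < gray.length; r++) {
            ink[r] = new boolean[gray[r].length];
            for (int c = 0; c < gray[r].length; c++) {
                ink[r][c] = gray[r][c] <= threshold;
            }
        }
        return ink;
    }
}
```

Calling `OtsuBinarizer.binarize(gray)` on a grayscale matrix yields a boolean mask that the later thinning and segmentation stages could consume. A global threshold like this is simple, but it does nothing about noise or uneven lighting.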
Feature Extraction (zoning)
Based on zones: 5 horizontal and 5 vertical zones => 25 features
Based on upper and lower profiles: 10 vertical zones => 20 features
Based on left and right profiles: 10 horizontal zones => 20 features
Total number of features = 65
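A minimal sketch of how the 65-element feature vector might be assembled. The slide gives only the zone counts, so the feature definitions are assumptions: ink density for each of the 25 grid zones, and the average blank distance from the glyph edge to the first ink pixel for each profile band. The class name `ZoningFeatures` is hypothetical as well.

```java
// Hypothetical helper, not taken from the project source.
class ZoningFeatures {
    static final int GRID = 5;    // 5 x 5 zones -> 25 density features
    static final int BANDS = 10;  // 10 bands per direction -> 2 x 20 profile features

    /** Builds the 65-element vector from a rectangular glyph mask (true = ink). */
    static double[] extract(boolean[][] ink) {
        int h = ink.length, w = ink[0].length;
        double[] f = new double[GRID * GRID + 4 * BANDS];
        int k = 0;
        // 25 zone features: fraction of ink pixels in each grid cell.
        for (int zr = 0; zr < GRID; zr++) {
            for (int zc = 0; zc < GRID; zc++) {
                int count = 0, area = 0;
                for (int r = zr * h / GRID; r < (zr + 1) * h / GRID; r++) {
                    for (int c = zc * w / GRID; c < (zc + 1) * w / GRID; c++) {
                        area++;
                        if (ink[r][c]) count++;
                    }
                }
                f[k++] = area == 0 ? 0.0 : (double) count / area;
            }
        }
        k = profiles(ink, true, f, k);   // upper and lower profiles over column bands
        k = profiles(ink, false, f, k);  // left and right profiles over row bands
        return f;
    }

    /** Writes two normalized profile values per band and returns the next free index. */
    private static int profiles(boolean[][] ink, boolean column, double[] f, int k) {
        int lines = column ? ink[0].length : ink.length;   // columns or rows to scan
        int length = column ? ink.length : ink[0].length;  // pixels along each scan line
        for (int b = 0; b < BANDS; b++) {
            int start = b * lines / BANDS, end = (b + 1) * lines / BANDS;
            double near = 0, far = 0;
            for (int i = start; i < end; i++) {
                near += gap(ink, i, column, false);
                far += gap(ink, i, column, true);
            }
            int count = end - start;
            // An empty band (glyph smaller than 10 pixels) is treated as fully blank.
            f[k++] = count == 0 ? 1.0 : near / (count * (double) length);
            f[k++] = count == 0 ? 1.0 : far / (count * (double) length);
        }
        return k;
    }

    /** Counts blank pixels before the first ink pixel along one column or row. */
    private static int gap(boolean[][] ink, int index, boolean column, boolean fromEnd) {
        int length = column ? ink.length : ink[0].length;
        int n = 0;
        while (n < length) {
            int pos = fromEnd ? length - 1 - n : n;
            boolean on = column ? ink[pos][index] : ink[index][pos];
            if (on) break;
            n++;
        }
        return n;
    }
}
```

Every value is normalized to the range 0 to 1, so glyphs of different pixel sizes produce comparable vectors. This assumes each character has already been segmented and cropped to its bounding box.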
Choosing the correct algorithm. The algorithms were hard to implement. Even once implemented, the output was not accurate. The accuracy of matrix matching was limited.
Text in an image is converted to a text file. The simplest algorithm was used; accuracy is about 40%-60%.
Cannot recognize text in a noisy image. Cannot detect inclined (skewed) text in an image. Matrix matching is slow. Bad thinning and noise make some text unrecognizable.
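One plausible reason matrix matching is slow can be seen in a minimal sketch. The slides do not describe the matching step, so this assumes that each glyph's feature vector is compared against every stored template and the closest one, by squared Euclidean distance, wins. The class `MatrixMatcher` and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not taken from the project source.
class MatrixMatcher {
    private final Map<Character, double[]> templates =
            new LinkedHashMap<Character, double[]>();

    /** Stores one reference feature vector per character label. */
    void addTemplate(char label, double[] features) {
        templates.put(Character.valueOf(label), features.clone());
    }

    /** Returns the label of the nearest template, or '?' if none is comparable. */
    char classify(double[] features) {
        char best = '?';
        double bestDistance = Double.MAX_VALUE;
        // Linear scan: cost grows with (number of templates) x (vector length).
        for (Map.Entry<Character, double[]> entry : templates.entrySet()) {
            double[] template = entry.getValue();
            if (template.length != features.length) {
                continue; // skip templates built with a different feature layout
            }
            double distance = 0;
            for (int i = 0; i < features.length; i++) {
                double diff = features[i] - template[i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey().charValue();
            }
        }
        return best;
    }
}
```

Because every glyph is compared against every template, the work grows with the number of characters on the page times the number of templates. A noisy or badly thinned glyph also shifts the whole vector, which is consistent with the recognition failures listed above.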
Image input directly from a scanner. Recognition of PDF and other image formats. Nepali / Devanagari script support. Support for different fonts. Output in PDF or Word file format. Skew correction and noise reduction. Handwriting recognition. Neural network-based recognition.