Bijay Dahal {2008/BCT/509} Kabindra Shrestha {2008/BCT/516} Raj Kumar Shrestha {2008/BCT/527}

1 Bijay Dahal {2008/BCT/509} Kabindra Shrestha {2008/BCT/516} Raj Kumar Shrestha {2008/BCT/527}

2
- Convert alphanumeric characters in an image into plain text.
- Get a general idea of image processing.

3
S.N | Tools | Description
1 | JDK 6 | Development kit for Java programming
2 | NetBeans 7.0 | IDE for Java application development
3 | Microsoft Windows & Linux | OS platforms for the application
4 | Tortoise SVN | Version control software for project management
5 | Sourceforge | Project management and configuration
6 | Microsoft Office | Documentation

4
- Takes an image as input.
- Converts it into plain text.
- Recognizes alphanumeric characters only.
- Lets the user edit and save the recognized text.
Screenshot labels: Loaded Image; Converted Text; Editable

5 Flowchart: Get Image -> Binarization -> Thinning -> Line Segment -> Character Segment -> Feature Extraction -> Matrix Matching -> Save Text
Other labels on the chart: Bold, Thin
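
The Matrix Matching step named on this slide can be illustrated with a minimal sketch. The deck does not say what is compared, so this assumes the simplest variant: each segmented character is a fixed-size binary grid, and it is assigned the label of the stored template with the fewest differing cells. The class name, the `grid` helper, and the 3x3 templates are hypothetical and only serve the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MatrixMatchSketch {

    /** Builds a binary grid from rows of '0'/'1' characters (illustration helper). */
    static boolean[][] grid(String... rows) {
        boolean[][] g = new boolean[rows.length][];
        for (int y = 0; y < rows.length; y++) {
            g[y] = new boolean[rows[y].length()];
            for (int x = 0; x < rows[y].length(); x++) {
                g[y][x] = rows[y].charAt(x) == '1';
            }
        }
        return g;
    }

    /** Counts cells that differ between two grids of identical size. */
    static int mismatches(boolean[][] a, boolean[][] b) {
        int count = 0;
        for (int y = 0; y < a.length; y++) {
            for (int x = 0; x < a[y].length; x++) {
                if (a[y][x] != b[y][x]) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Returns the label of the closest template, or '?' if there are none. */
    static char classify(boolean[][] glyph, Map<Character, boolean[][]> templates) {
        char best = '?';
        int bestScore = Integer.MAX_VALUE;
        for (Map.Entry<Character, boolean[][]> e : templates.entrySet()) {
            int score = mismatches(glyph, e.getValue());
            if (score < bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<Character, boolean[][]> templates = new LinkedHashMap<Character, boolean[][]>();
        templates.put('I', grid("010", "010", "010"));
        templates.put('L', grid("100", "100", "111"));
        // An 'I' with one extra noisy pixel in the bottom-right corner.
        System.out.println(classify(grid("010", "010", "011"), templates));
    }
}
```

Every glyph is compared against every template cell by cell, so the cost grows with both the template count and the grid size.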

6
- Otsu Binarization Algorithm
- Hilditch Skeletonization Algorithm (Thinning)
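
As a rough illustration of the binarization step, the sketch below computes an Otsu threshold by picking the grey level that maximizes the between-class variance of the histogram. It is not the project's code: the input is assumed to be a flat array of 8-bit grey values (0-255), dark pixels are assumed to be ink, and the class and method names are hypothetical. Hilditch thinning is not shown.

```java
class OtsuSketch {

    /** Returns the Otsu threshold for an array of grey levels in 0..255. */
    static int otsuThreshold(int[] gray) {
        int[] hist = new int[256];
        for (int g : gray) {
            hist[g]++;
        }
        long total = gray.length;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }

        double sumBg = 0;   // sum of grey levels at or below t
        long wBg = 0;       // number of pixels at or below t
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            wBg += hist[t];
            if (wBg == 0) {
                continue;   // no pixels in the lower class yet
            }
            long wFg = total - wBg;
            if (wFg == 0) {
                break;      // every pixel is already in the lower class
            }
            sumBg += (double) t * hist[t];
            double meanBg = sumBg / wBg;
            double meanFg = (sumAll - sumBg) / wFg;
            // Between-class variance, up to a constant factor of 1/total^2.
            double between = (double) wBg * wFg * (meanBg - meanFg) * (meanBg - meanFg);
            if (between > bestVar) {
                bestVar = between;
                best = t;
            }
        }
        return best;
    }

    /** Marks pixels at or below the threshold as ink (true). */
    static boolean[] binarize(int[] gray) {
        int t = otsuThreshold(gray);
        boolean[] ink = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            ink[i] = gray[i] <= t;
        }
        return ink;
    }

    public static void main(String[] args) {
        int[] gray = {10, 10, 10, 200, 200, 200};
        System.out.println(otsuThreshold(gray));
    }
}
```

When several thresholds tie, this sketch keeps the lowest one; other implementations may resolve ties differently.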

7
- Generic Segmentation
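
The slide gives no detail on how segmentation is done. One common approach, shown here purely as an assumption, is projection-based line segmentation: scan the binarized image row by row and treat each run of consecutive rows that contain ink as one text line. The class and method names are hypothetical; the same idea applied to columns within a line would separate characters.

```java
import java.util.ArrayList;
import java.util.List;

class LineSegmentSketch {

    /**
     * Returns {firstRow, lastRow} pairs (inclusive) for each run of
     * consecutive rows that contain at least one ink pixel.
     */
    static List<int[]> segmentLines(boolean[][] ink) {
        List<int[]> lines = new ArrayList<int[]>();
        int start = -1;  // first row of the current run, or -1 if outside a run
        for (int y = 0; y < ink.length; y++) {
            boolean hasInk = false;
            for (boolean p : ink[y]) {
                if (p) {
                    hasInk = true;
                    break;
                }
            }
            if (hasInk && start < 0) {
                start = y;                          // a text line begins
            } else if (!hasInk && start >= 0) {
                lines.add(new int[] {start, y - 1}); // a text line ends
                start = -1;
            }
        }
        if (start >= 0) {
            lines.add(new int[] {start, ink.length - 1}); // line touches the bottom edge
        }
        return lines;
    }

    public static void main(String[] args) {
        boolean[][] ink = {
            {false, false, false},
            {true,  false, true},
            {false, true,  false},
            {false, false, false},
            {true,  true,  true},
        };
        for (int[] line : segmentLines(ink)) {
            System.out.println(line[0] + ".." + line[1]);
        }
    }
}
```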

8 Feature Extraction (zoning)
- Based on zones: 5 horizontal and 5 vertical zones => 25 features
- Based on upper and lower profiles: 10 vertical zones => 20 features
- Based on left and right profiles: 10 horizontal zones => 20 features
- Total number of features: 25 + 20 + 20 = 65
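
To make the zone-based part concrete, here is a minimal sketch that splits a binary character image into a 5 x 5 grid and uses the fraction of ink pixels in each zone as a feature, giving 25 values. The slide does not say which per-zone measure the project used, so ink density is an assumption, as are the class and method names. The 40 profile-based features are not shown.

```java
class ZoningSketch {

    /**
     * Splits the glyph into zones x zones cells and returns the ink density
     * of each cell in row-major order (zones * zones values in 0..1).
     */
    static double[] zoneFeatures(boolean[][] glyph, int zones) {
        int h = glyph.length;
        int w = glyph[0].length;
        double[] features = new double[zones * zones];
        for (int zy = 0; zy < zones; zy++) {
            int y0 = zy * h / zones;
            int y1 = (zy + 1) * h / zones;
            for (int zx = 0; zx < zones; zx++) {
                int x0 = zx * w / zones;
                int x1 = (zx + 1) * w / zones;
                int inkCount = 0;
                for (int y = y0; y < y1; y++) {
                    for (int x = x0; x < x1; x++) {
                        if (glyph[y][x]) {
                            inkCount++;
                        }
                    }
                }
                int area = (y1 - y0) * (x1 - x0);
                // Zones can be empty when the glyph is smaller than the grid.
                features[zy * zones + zx] = area == 0 ? 0.0 : (double) inkCount / area;
            }
        }
        return features;
    }

    public static void main(String[] args) {
        boolean[][] glyph = new boolean[10][10];
        glyph[0][0] = glyph[0][1] = glyph[1][0] = glyph[1][1] = true; // fill the top-left zone
        glyph[9][9] = true;                                           // one pixel in the last zone
        double[] f = zoneFeatures(glyph, 5);
        System.out.println(f.length + " features, first = " + f[0] + ", last = " + f[24]);
    }
}
```

Normalizing by zone area keeps the features comparable across characters of different sizes, which matters if the vectors are later compared against stored templates.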

9 Off days:
- Exam time: 25 days
- Dashain holidays: 15 days
- Tihar holidays: 3 days

10
- Choosing the correct algorithm.
- Algorithms were hard to implement.
- Some were implemented, but the output was not accurate.
- Accuracy of matrix matching.

11
- Text in an image is converted to a text file.
- Uses the simplest algorithm; accuracy is about 40%-60%.

12
- Can't recognize text in a noisy image.
- Can't detect inclined text in an image.
- Matrix matching is slow.
- Bad thinning and noise make some text unrecognizable.

13
- Scanner image input.
- Recognize PDF and other image formats.
- Nepali / Devanagari font support.
- Different fonts.
- Output in PDF or Word file format.
- Skew correction and noise reduction.
- Handwriting.
- Neural network.


