iLayout: Performance Evaluation
Dataset Sample Images
Architecture
Traditional OCR: Document image -> OCR -> Output text
OCR using i-Layout: Document image -> Page segmentation by i-Layout -> OCR -> Combining output from blocks -> Output text
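The i-Layout pipeline above can be sketched as a small driver function. This is a minimal sketch, not the i-Layout implementation: `segment`, `crop`, and `recognize` are hypothetical callables standing in for the page segmenter, the block cropper, and the OCR engine, and joining block outputs with newlines is an assumption about how the outputs are combined.

```python
from typing import Any, Callable, List


def ocr_with_layout(
    page: Any,
    segment: Callable[[Any], List[Any]],
    recognize: Callable[[Any], str],
    crop: Callable[[Any, Any], Any],
) -> str:
    """Segment the page, run OCR per block, and combine the block outputs.

    All three callables are hypothetical placeholders; `segment` is assumed
    to return blocks already sorted in reading order.
    """
    blocks = segment(page)
    texts = [recognize(crop(page, block)) for block in blocks]
    # Assumption: block outputs are combined by joining non-empty texts.
    return "\n".join(text for text in texts if text)


if __name__ == "__main__":
    # Toy stand-ins: the "page" is a dict of named blocks of text.
    toy_page = {"left": "col one", "right": "col two"}
    print(
        ocr_with_layout(
            toy_page,
            segment=lambda p: ["left", "right"],
            recognize=lambda img: img.upper(),
            crop=lambda p, block: p[block],
        )
    )
```

Traditional OCR corresponds to calling `recognize` once on the whole page, so the only difference between the two paths is the segmentation and recombination step.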
Evaluation Metrics
Goal: Systematic and detailed analysis of layout performance at every stage
- Intersection/Union based
- Error wise
- Reading Order Based Penalty
- Goal oriented (OCR accuracy)
Error-wise quantitative and qualitative analysis
Intersection/Union based measure
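As an illustration, an intersection/union measure between a ground-truth block and a predicted block might be computed as below. This is a sketch under assumptions: blocks are taken to be axis-aligned rectangles `(x0, y0, x1, y1)`, which the slides do not state, and the function names are illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed axis-aligned


def area(box: Box) -> float:
    """Area of a rectangle; degenerate or inverted boxes count as zero."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two rectangles (zero if disjoint)."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return area(overlap)


def iou(a: Box, b: Box) -> float:
    """Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones."""
    inter = intersection_area(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0
```

For example, `iou((0, 0, 10, 10), (5, 0, 15, 10))` has an overlap of 50 and a union of 150, giving 1/3.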
Error based evaluation
Can we quantify each error individually?
Error based performance evaluation for i-Layout on 100 pages (Telugu book)

Evaluation Measure          Score
Over-segmentation score     0.0984
Under-segmentation score    0.2923
False Alarm Score           0.5640
Missing Score               0.0045
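The slides do not give the formulas behind these four scores. One plausible way to quantify each error type individually is sketched below. Everything here is an assumption: blocks are axis-aligned rectangles, two blocks "match" when their overlap covers at least `thresh` of either block, and each score is a simple fraction of affected blocks.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed axis-aligned


def coverage(a: Box, b: Box) -> float:
    """Fraction of box a's area that is covered by box b."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    a_area = (a[2] - a[0]) * (a[3] - a[1])
    if w <= 0 or h <= 0 or a_area <= 0:
        return 0.0
    return (w * h) / a_area


def error_scores(gt: List[Box], pred: List[Box], thresh: float = 0.1) -> Dict[str, float]:
    """Hypothetical per-error scores, each a fraction of affected blocks."""
    gt_hits = [0] * len(gt)      # predicted blocks matched to each ground-truth block
    pred_hits = [0] * len(pred)  # ground-truth blocks matched to each predicted block
    for i, g in enumerate(gt):
        for j, p in enumerate(pred):
            if max(coverage(g, p), coverage(p, g)) >= thresh:
                gt_hits[i] += 1
                pred_hits[j] += 1

    def frac(count: int, total: int) -> float:
        return count / total if total else 0.0

    return {
        # one ground-truth block split into several predicted blocks
        "over_segmentation": frac(sum(h > 1 for h in gt_hits), len(gt)),
        # one predicted block merging several ground-truth blocks
        "under_segmentation": frac(sum(h > 1 for h in pred_hits), len(pred)),
        # predicted block with no ground-truth counterpart
        "false_alarm": frac(sum(h == 0 for h in pred_hits), len(pred)),
        # ground-truth block with no predicted counterpart
        "missing": frac(sum(h == 0 for h in gt_hits), len(gt)),
    }
```

Counting each error type separately is what allows a breakdown like the one in the table, rather than a single overlap number that hides which kind of mistake dominates.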
Less Penalty
Such splits do not affect OCR accuracy and should therefore be penalized less.
Goal oriented evaluation
Comparison of OCR accuracy before and after page segmentation using i-Layout

                    OCR Accuracy before iLayout    OCR Accuracy after iLayout
Language  #Pages    Char      Word                 Char      Word
Telugu    50        45.98     4.84                 70.14     46.97
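The slides do not say how character and word accuracy are computed. A common choice, shown here as a hedged sketch rather than the method actually used, is one minus the normalized Levenshtein edit distance against the ground-truth text, expressed as a percentage. The function names are illustrative.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev: List[int] = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def accuracy(ref: Sequence, hyp: Sequence) -> float:
    """Accuracy in percent: 100 * (1 - distance / len(ref)), floored at 0."""
    if not ref:
        return 100.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref)) * 100.0


def char_accuracy(ref_text: str, hyp_text: str) -> float:
    # Assumption: "characters" are Unicode code points; for Telugu or Hindi,
    # counting aksharas (grapheme clusters) would give different numbers.
    return accuracy(list(ref_text), list(hyp_text))


def word_accuracy(ref_text: str, hyp_text: str) -> float:
    # Assumption: words are whitespace-separated tokens.
    return accuracy(ref_text.split(), hyp_text.split())
```

Under this definition a single wrong character costs one whole word, which is one reason word accuracy can sit far below character accuracy on the same pages.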
Evaluation on failed pages
Evaluated on 130 images, previously reported as failures by CDAC-Noida.

                    OCR Accuracy after iLayout
Language  #Pages    Char      Word
Telugu    100       61.66     14.51
Hindi     30        68.87     46.07
Visual Results
[Figure panels: Consortium OCR, i-Layout]