1 Improving Interpretive Interfaces for Math Entry Richard Zanibbi Department of Computer Science Rochester Institute of Technology.

2 RIT Document and Pattern Recognition Lab (DPRL)
Goals:
1. Improve theory and tools for constructing and evaluating pattern recognition systems
2. Apply these to problems in document recognition and pen-based computing
Members:
- Richard Zanibbi
- Kurt Kluever (Master's student)
New members welcome!
http://www.cs.rit.edu/~rlaz/dprl.html

3 Current Directions:
1. Theory and Tools:
   - Tools for recognition module integration and evaluation, such as the Recognition Strategy Language (Zanibbi et al.)
   - Game-theoretic models of recognition problems and systems (e.g. for classifier combination)
   - Machine learning algorithms for system optimization
2. Applications:
   - Pen and image-based math entry (the lab maintains the open-source Freehand Formula Entry System (Smithies, Novins, Arvo, Zanibbi et al.))
   - Optical character recognition (OCR)
   - Image and text-based document retrieval
   - "CAPTCHAs" (for distinguishing humans from 'bots')
   - Table recognition, etc.

4 Interpretive Interfaces for Math Entry

5 Pen-Based Math Entry
Recognition Challenges:
- Large number of symbols (e.g. > 500 in LaTeX), many similar in structure (e.g. 0 and O)
- Layout of symbols on baselines can be ambiguous
- Little redundancy
- Context influences symbol identity and layout interpretation

6 Example: Freehand Formula Entry System/DRACULAE
Contributors:
- FFES was first developed as an MSc project at the University of Otago, New Zealand (Smithies, Novins), using CIT tools of Jim Arvo et al., in 1998
- Since then, contributors from Queen's University (CA), Concordia University (CA), and around the world (CMU, UC Berkeley, companies and non-profits in California and France)

7 DRACULAE (Zanibbi, 2002)
"Diagram Recognition Application for Computer Understanding of Large Algebraic Expressions"

8 DRACULAE: Layout Classes for Symbols
Symbol name defines class membership.

9 DRACULAE Layout Analysis: Sketch
Algorithm:
1. Symbols are assigned a layout type (class) based on symbol identity
2. Sort symbols left-to-right on the leftmost edge of their bounding boxes
3. Create a baseline structure tree with region node "Expression"
4. Recursively:
   a) Search right-to-left to locate the leftmost ("start") baseline symbol (using dominance rules for symbol layout class pairs)
   b) From the start symbol, search left-to-right in the symbol list for symbols adjacent on the baseline (**Zhang: fuzzy version)
   c) Add baseline symbols as children of the parent region node
   d) Place non-baseline symbols in lists associated with region nodes (e.g. for super/subsc/bleft etc.)
   e) Apply a-d to each new region, until no new regions are created
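The steps of this sketch can be illustrated with a much-simplified Python version. It is not DRACULAE's implementation: the class table, the threshold fractions, and the `Symbol`/`Node` types are hypothetical, the dominance rules of step 4a are replaced by simply taking the leftmost symbol as the start symbol, and only superscript and subscript regions are handled.

```python
from dataclasses import dataclass, field

@dataclass
class Symbol:
    name: str
    left: float
    top: float      # y grows downward, as in image coordinates
    right: float
    bottom: float

@dataclass
class Node:
    symbol: Symbol
    regions: dict = field(default_factory=dict)  # region label -> baseline (list of Node)

# Step 1: the symbol name alone decides the layout class (hypothetical table).
LAYOUT_CLASS = {"b": "ascender", "d": "ascender", "2": "ascender",
                "y": "descender", "p": "descender",
                "+": "operator", "-": "operator", "=": "operator"}

# Hypothetical (superscript, subscript) thresholds per class, as fractions of
# the bounding-box height measured from the top edge.
THRESHOLDS = {"ascender": (0.35, 0.85), "descender": (0.15, 0.65),
              "centered": (0.25, 0.75), "operator": (0.25, 0.75)}

def layout_class(sym):
    return LAYOUT_CLASS.get(sym.name, "centered")

def relation(ref, sym):
    """Classify sym as 'super', 'sub' or 'baseline' relative to ref."""
    sup, sub = THRESHOLDS[layout_class(ref)]
    height = ref.bottom - ref.top
    centre_y = (sym.top + sym.bottom) / 2
    if centre_y < ref.top + sup * height:
        return "super"
    if centre_y > ref.top + sub * height:
        return "sub"
    return "baseline"

def build_baseline(symbols):
    """Return the baseline of a region as a list of Nodes with nested regions."""
    symbols = sorted(symbols, key=lambda s: s.left)      # step 2
    if not symbols:
        return []
    baseline = [Node(symbols[0])]   # simplification of step 4a: leftmost symbol starts
    ref = baseline[0]               # symbol whose thresholds are currently in force
    for sym in symbols[1:]:                              # step 4b
        rel = relation(ref.symbol, sym)
        if rel == "baseline":
            node = Node(sym)
            baseline.append(node)                        # step 4c
            if layout_class(sym) != "operator":          # operators take no scripts here
                ref = node
        else:
            ref.regions.setdefault(rel, []).append(sym)  # step 4d
    for node in baseline:                                # step 4e: recurse into regions
        node.regions = {label: build_baseline(syms)
                        for label, syms in node.regions.items()}
    return baseline

def to_latex(baseline):
    parts = []
    for node in baseline:
        parts.append(node.symbol.name)
        if "super" in node.regions:
            parts.append("^{" + to_latex(node.regions["super"]) + "}")
        if "sub" in node.regions:
            parts.append("_{" + to_latex(node.regions["sub"]) + "}")
    return "".join(parts)

# Made-up bounding boxes for a handwritten "x^2 + y_i", given out of order.
example = [Symbol("y", 32, 10, 42, 24), Symbol("x", 0, 10, 10, 20),
           Symbol("i", 43, 20, 46, 28), Symbol("2", 11, 4, 16, 11),
           Symbol("+", 20, 12, 28, 18)]
tree = {"Expression": build_baseline(example)}           # step 3: root region node
print(to_latex(tree["Expression"]))                      # x^{2}+y_{i}
```

Keeping the class-dependent thresholds in one table mirrors the idea that layout classes, not individual symbols, determine where superscript and subscript regions begin.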

10 Expanding the View…
Integration of scanned and pen-based expressions:
- Infty system, FFES prototype (impl. Josh Zimler 2006)
Long-Term Goal: Flexible input and combination
Allow one to easily combine and then reformat/interpret:
- LaTeX, eqn, etc.
- MATLAB, Mathematica, etc.
- Handwritten expressions (tablet/mouse)
- Scanned images of handwritten or typeset expressions
- "Vector drawing" interface input, e.g. as in Xpress (Pollanen et al.)

11 Other Math Entry Interfaces
- Natural Log by Matsakis, Miller, and Viola (MIT)
- JIMHR: (Java-Based) Interactive Math Handwriting Recognizer, a merge and port of FFES/DRACULAE and the Natural Log system by Joy-Gong Ho (Acuitus Corp., USA)
- JMathNotes by Ernesto Tapia Rodriguez (Free University of Berlin)
- Infty by M. Suzuki et al. (Kyushu University, Japan)
- MathJournal by XThink Inc: first commercial pen-based math recognition system
- MathPad by Joseph LaViola
Links available: http://www.cs.rit.edu/~rlaz

12 The Recognition Strategy Language (RSL)

13 Motivation: A high-level language for pattern recognition algorithms
Table Recognition Survey (Zanibbi et al. 2004):
- Summarizes the literature in terms of observations, transformations, and inferences.
- Techniques studied are characterized as making the following types of inferences (decisions):
  - Parameter values (e.g. thresholds)
  - Interpretation Model Operations:
    - Segmentation (identifying regions of interest in data)
    - Classification (assigning types to regions)
    - Relating regions (e.g. topology (adjacencies))
    - Rejecting segments, classes, and region relationships
(Unanswered) Question: How should we combine recognition modules in a complex math entry system?

14 Example: Simple Table Structure Recognition Algorithm (Part 1)
    model regions
        Image Word Cell            % default: 'Region'
        Row Column
    end regions
    model relations                % default: 'contains'
        adjacent_right adjacent_below
    end relations
    recognition parameters
        sMaxRowSeparation 2        % millimetres
        sMaxColumnSeparation 2     % millimetres
        aResolution 300            % dpi; default
    end parameters

15
    strategy main
        adapt aResolution using getScanResolution()
            observing {Image} regions
        classify {Word} regions as {Cell}
        relate {Cell} regions with {adjacent_right}
            using defineRightAdjacency(sMaxRowSeparation,aResolution)
        segment {Cell} regions into {Row} regions
            using relationClosure()
            observing {adjacent_right} relations
        relate {Cell} regions with {adjacent_below}
            using defineLowerAdjacency(sMaxColumnSeparation,aResolution)
        segment {Cell} regions into {Column} regions
            using relationClosure()
            observing {adjacent_below} relations
        accept interpretations
    end strategy
Callouts on the listing: Trivial Decision; Observation Specification; External Decision Function; Decision type; Decision Function Parameters
Input: Params, Graph with Image, Word regions (BBs)
Output: Cells, Rows, Cols
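To make the data flow of this strategy concrete, here is a rough Python analogue. It is a sketch under assumptions, not RSL and not the original decision functions: the slides name defineRightAdjacency, defineLowerAdjacency and relationClosure but do not show their bodies, so the adjacency tests below (top edges within the row tolerance, left edges within the column tolerance) and the union-find closure are guesses. Words are given as (left, top, right, bottom) bounding boxes in pixels and are treated directly as cells.

```python
MM_PER_INCH = 25.4

def mm_to_pixels(mm, dpi):
    """Convert a separation in millimetres to pixels at the given resolution."""
    return mm / MM_PER_INCH * dpi

def define_right_adjacency(cells, max_row_sep_mm, dpi):
    """Assumed rule: b is right of a and their top edges differ by <= tolerance."""
    limit = mm_to_pixels(max_row_sep_mm, dpi)
    return [(i, j)
            for i, a in enumerate(cells) for j, b in enumerate(cells)
            if i != j and b[0] >= a[2] and abs(b[1] - a[1]) <= limit]

def define_lower_adjacency(cells, max_col_sep_mm, dpi):
    """Assumed rule: b is below a and their left edges differ by <= tolerance."""
    limit = mm_to_pixels(max_col_sep_mm, dpi)
    return [(i, j)
            for i, a in enumerate(cells) for j, b in enumerate(cells)
            if i != j and b[1] >= a[3] and abs(b[0] - a[0]) <= limit]

def relation_closure(count, pairs):
    """Group indices 0..count-1 into connected components of the relation."""
    parent = list(range(count))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i, j in pairs:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(count):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical input: four word boxes forming a 2 x 2 table, scanned at 300 dpi.
words = [(0, 0, 100, 30), (150, 2, 250, 32),
         (0, 60, 100, 90), (150, 61, 250, 91)]
cells = list(words)                                   # classify {Word} as {Cell}
rows = relation_closure(len(cells), define_right_adjacency(cells, 2, 300))
columns = relation_closure(len(cells), define_lower_adjacency(cells, 2, 300))
print(rows)      # [[0, 1], [2, 3]]
print(columns)   # [[0, 2], [1, 3]]
```

Unlike an RSL program, this version keeps no record of the decisions it makes; the logged decision history is what the later slides on hypothesis histories rely on.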

16 Running RSL Programs
1. Translate RSL Program to TXL (Using TXL)
2. Pass Input Graph (text file) to Program
3. Output (text files):
   - Accepted Structures (interpretations)
   - Log of all decisions and their outcomes

17 New Metrics Based on Hypothesis Histories: Historical Recall and Precision
[Figure: diagram relating Recognition Targets, Generated Hypotheses (A ∪ R), Correct Hypotheses, and False Negatives (F)]

18 Hypothesis History
Recall                 4/8 (50.0%)    2/8 (25.0%)    8/8 (100.0%)
Precision              4/12 (33.3%)   2/5 (40.0%)    8/8 (100.0%)
Historical Recall      4/8 (50.0%)    6/8 (75.0%)    8/8 (100.0%)
Historical Precision   4/12 (33.3%)   6/17 (35.3%)   8/19 (42.1%)
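The four metrics can be sketched in Python as set operations. This assumes, following the labels on slide 17, that A is the set of accepted hypotheses, R the set of rejected hypotheses, and A ∪ R everything generated; the function name and the hypothesis identifiers are made up for illustration. The example sets are chosen only so that their sizes reproduce the 2/8, 2/5, 6/8 and 6/17 entries of the table.

```python
from fractions import Fraction

def recognition_metrics(accepted, rejected, targets):
    """Conventional and historical recall/precision; accepted must be non-empty."""
    generated = accepted | rejected          # A ∪ R: every hypothesis ever made
    correct_now = accepted & targets         # correct hypotheses in the final result
    correct_ever = generated & targets       # correct hypotheses, accepted or rejected
    return {
        "recall": Fraction(len(correct_now), len(targets)),
        "precision": Fraction(len(correct_now), len(accepted)),
        "historical_recall": Fraction(len(correct_ever), len(targets)),
        "historical_precision": Fraction(len(correct_ever), len(generated)),
    }

# Hypothetical identifiers: t0..t7 are the 8 targets, w0..w10 are wrong hypotheses.
targets = {f"t{i}" for i in range(8)}
accepted = {"t0", "t1", "w0", "w1", "w2"}                           # 5 accepted, 2 correct
rejected = {"t2", "t3", "t4", "t5"} | {f"w{i}" for i in range(3, 11)}  # 12 rejected, 4 correct

for name, value in recognition_metrics(accepted, rejected, targets).items():
    print(f"{name}: {value} ({float(value):.1%})")
```

The gap between recall and historical recall counts correct hypotheses that were generated and later rejected, which is the information a decision log makes available.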

19 Cell Detection Results (Handley, 2001)
RSL Re-implementation on Table 'a038' (UW-III)
*Inference times shown are those affecting cells
0: Input (words and lines)
1: Classify words as cells
16: Merge 'horizontally close' cells
35: Merge cells sharing column, row assignments. Nearly 50% of correct cells rejected; new correct cells also detected
47: Two cells merged, producing column header 'Total pore space (percent)'
51: Merge header cells bounded by two horizontal lines
83: Merge cells sharing line and white space separators

20 RSL and Math Entry
Proposal: "MIN" System
- New interface for math entry and offline experiments
- Use RSL to define recognition strategies, capture results.
- (Really): testbed for studying recognition algorithms and their intelligent combination, organization, and deployment in practice.
Goals:
- Compare different approaches to recognizing mathematical expressions (from input to output) represented in RSL
- Allow flexible training, combination, and alteration of various recognition strategies.
- Extend RSL to accommodate math and other problem domains more effectively, while remaining abstract

21 (Some) Relevant Journals and Conferences
Journals:
- IEEE Trans. Pattern Analysis and Machine Intelligence
- Machine Learning
- Pattern Recognition
- Pattern Recognition Letters
- Artificial Intelligence
- Int'l J. Document Analysis and Recognition
- …
Conferences:
- Int'l Conf. Machine Learning
- IEEE Computer Vision and Pattern Recognition
- Computational Learning Theory (COLT)
- Int'l Conf. Document Analysis and Recognition
- Int'l Work. Document Analysis Systems
- …

22 Thank you. Questions?
Support: GCCIS Department of Computer Science
