21 October 2000 MathML & Math on the Web Illinois D-Lib Testbed: Technologies for Converting Legacy Mathematics for Display on the Web Timothy W. Cole.


1 21 October 2000 MathML & Math on the Web
Illinois D-Lib Testbed: Technologies for Converting Legacy Mathematics for Display on the Web
Timothy W. Cole, Thomas G. Habing, William H. Mischo
Grainger Engineering Library Information Center
University of Illinois at Urbana-Champaign
http://dli.grainger.uiuc.edu/Publications/MathMLConf/
thabing@uiuc.edu

2 Project Background & Objectives
Funded 1994-98 under DLI-I (NSF, DARPA, & NASA)
Continued 1998-2001 under CNRI's D-Lib Test Suite
Objectives:
– Construct a large-scale, multipublisher, markup-based full-text journal testbed.
– Investigate processing, indexing, normalization, retrieval, rendering, and linking.
– Study end-user searching behavior and needs.
Testbed contains 60,000 articles from 50 journal titles
– Received as SGML (various DTDs); converted to XML
– Content & support from AIP, APS, ASCE, IEE, ASM, ACM, Elsevier
– Additional support from IEEE, NRL, NTT Learning Systems

3 Project Background (cont.)
Accomplishments:
– Process & retrieve from multiple publishers & heterogeneous DTDs.
– SGML to XML conversion.
– Metadata extraction, representation, merging.
– Dynamic linking: forward/backward, from/to A & I DBs.
Current Investigations:
– Mathematics markup & rendering issues
– Metadata harvesting: replicative & distributed
– E-journal archiving
– Local resource resolution
– Asynchronous searching of multiple resources

4 Converting Legacy Markup to MathML
Goal: Convert publisher-specific XML math markup to standard presentation MathML
– Desired result: rendering work can then focus on a single solution
Ground rules:
– Minimize the need for human intervention
– Use standards-based techniques (e.g., XSLT, JavaScript, DOM)
– Embed MathML in the full XML document
– Validate the success of a conversion by the quality of its presentation
– Strive for consistency across MathML viewers
Scope:
– E.g., in 17,000 APS articles, more than 2.3 M instances of math (100 K of them block)
– http://dli.grainger.uiuc.edu/MathMLStyle/math_sample.htm

5 Mathematics Markup Transformations
Identify & translate mathematical character references
Identify & tokenize mathematical content
Recognize & transform mathematical markup (e.g., embellishments, script & limit schemata, etc.)
[Example: ISO 12083 Math (a, 2, i) → Presentational MathML (α, i, 2); element markup not preserved]
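The first two transformations (translating character references and tokenizing content) can be sketched as follows. This is a minimal illustration in Python, not the project's code, which the slides say used JavaScript called from XSLT. The entity table `ENTITY_TO_CHAR`, the function name `tokenize`, and the classification rules (letters become `mi`, numbers `mn`, everything else `mo`) are assumptions made for the example.

```python
import re

# Hypothetical entity table: a few ISO-style entity names mapped to Unicode.
# A real table would be far larger; these entries are illustrative only.
ENTITY_TO_CHAR = {"agr": "\u03b1", "bgr": "\u03b2", "plusmn": "\u00b1"}

# One alternative per token kind: entity reference, number, letter, other symbol.
TOKEN_RE = re.compile(r"&(\w+);|(\d+(?:\.\d+)?)|([A-Za-z])|(\S)")


def tokenize(text):
    """Return (mathml_tag, text) pairs for a run of character data."""
    tokens = []
    for entity, number, letter, other in TOKEN_RE.findall(text):
        if entity:
            # Unknown entities are passed through unchanged (an assumption).
            char = ENTITY_TO_CHAR.get(entity, "&" + entity + ";")
            tag = "mi" if char.isalpha() else "mo"
            tokens.append((tag, char))
        elif number:
            tokens.append(("mn", number))
        elif letter:
            tokens.append(("mi", letter))
        else:
            tokens.append(("mo", other))
    return tokens
```

For instance, `tokenize("&agr;+2x")` yields an identifier for α, an operator for `+`, a number for `2`, and an identifier for `x`. Multi-letter identifiers such as function names would need extra rules that this sketch does not attempt.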

6 Approach & Algorithm
For each XML document:
Identify mathematical nodes
Recursively apply templates to every child node within mathematical nodes:
– Look up entities & special characters, convert them to the appropriate MathML characters, & tokenize (JavaScript)
– Tokenize remaining #PCDATA (JavaScript)
– Convert postfix markup to MathML
– Re-tag one-to-one transformations
Transformed mathematical nodes replace the original mathematical nodes in the document
– Include default namespace attribute
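The postfix-conversion step can be illustrated with a short sketch. It is written in Python with `xml.etree.ElementTree` rather than the XSLT the project used, and it assumes the legacy children have already been tokenized. The element names `sup` and `inf`, the `SCRIPT_MAP` table, and both function names are assumptions for the example; only the MathML namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Assumed mapping from postfix script elements to MathML script schemata.
SCRIPT_MAP = {"sup": "msup", "inf": "msub"}


def convert_postfix(children):
    """Fold postfix script elements into MathML elements that wrap their base."""
    out = []
    for child in children:
        if child.tag in SCRIPT_MAP and out:
            # The script applies to the node just before it, so pull that
            # node back out and make it the first child of the wrapper.
            base = out.pop()
            wrapper = ET.Element(SCRIPT_MAP[child.tag])
            wrapper.append(base)
            script = ET.SubElement(wrapper, "mrow")
            script.extend(convert_postfix(list(child)))
            out.append(wrapper)
        else:
            out.append(child)
    return out


def to_mathml(source):
    """Build a <math> element carrying the default MathML namespace."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    math.extend(convert_postfix(list(source)))
    return math
```

Given `<f><mi>x</mi><sup><mn>2</mn></sup></f>`, `to_mathml` returns a `math` element whose only child is an `msup` holding the base `x` and the script `2`. A superscript followed by a subscript on the same base comes out as nested `msub`/`msup` here; combining them into `msubsup` would need an additional case.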

7 Approach & Algorithm (cont.)
Illustrative XSLT:...
THERE ARE FOUR MORE CASES TO HANDLE!

8 Remaining Issues
JavaScript from within XSLT
– Rely on MS-specific mechanisms to invoke extension functions
Inconsistent Rendering by MathML Viewers
– Validating against techexplorer, Amaya, Mozilla, MS IE (w/ CSS)
– Incomplete MathML implementations
Ambiguity & Overuse of [...]
– Limited impact on appearance
– Verbosity: 60% increase for inline, 15% increase for block
Character / glyph issues
– STIX project / Unicode update will provide some relief
Automated Checking for Errors / Problems
Rendering System Performance

9 Status
Developing publisher-specific XSLT stylesheets
– See sample transformed issue of Physical Review Letters
XSLT allows us to generate standard MathML from publisher-dependent SGML math markup
– Moves customization to the pre-processing stage
– Allows for a single, common rendering solution
– MathML can be rendered in some browsers / tools without the need to style (Mozilla, techexplorer, Mathematica)

