Presentation is loading. Please wait.

Presentation is loading. Please wait.

A Monadic-Memoized Solution for Left-Recursion Problem of Combinatory Parser Rahmatullah Hafiz 60-520 Fall, 2005.

Similar presentations


Presentation on theme: "A Monadic-Memoized Solution for Left-Recursion Problem of Combinatory Parser Rahmatullah Hafiz 60-520 Fall, 2005."— Presentation transcript:

1 A Monadic-Memoized Solution for Left-Recursion Problem of Combinatory Parser Rahmatullah Hafiz 60-520 Fall, 2005

2 Outline Part1: Basic Concepts Part2: Related Works Part3: Our Solution

3 Parsing  Process of determining if input string can be recognized by a set of rules for a specific language  Parser = Program that does parsing Input “2*5+6” Rules exp -> digit op Exp |digit digit -> 0|1|..|9 Op-> *|+ exp digit exp Top-Down Parsing digit 2 op *5+6

4 Top-Down Parsing  Recognition of Input starts at root and proceeds towards leaves  Left to Right recognition  Recursive-decent (back-tracking) parsing  If one rule fails then try another rule recursively  Comparatively easy to construct  Exponential in worst-case

5 Combinatory Parser  Parsers written in functional languages  Lazy-Functional Languages (Miranda, Haskell)  Can be used to parse  Natural Language (English)  Formal Language (Haskell)  Need to follow some rules  Context-Free Grammar

6 Why Lazy-Functional Language  Modular code and easy to implement  Higher-Order functions can represent BNF notations of CFG  Functions are First-Class citizens  Suitable for Top-Down Recursive-Decent fully backtracking parsing Higher-Order functions  Input/Output arguments could be another function(s)

7 *Frost and Launchbury (JFP, 1989)—NLI in Miranda *Hutton (JFP, 1992 )— Uses of Combinatory Parser Example BNF notation of CFG s ::= a s|empty We can read it like “s is either a then s or empty” => s = a then s or empty **Possible to write parsers exactly like this in LFL using-higher order functions

8 Example (cont.) --s :: a s|empty empty input = [input] a (x:xs) = if x==‘a’ then xs else [] (p `or` q ) input = p input ++ q input (p `then` q) input = if r == [] then [] else q r where r = p input s input = (a `then` s `or` empty) input --*Main> s "aaa“ --["","a","aa","aaa"]

9 Problem with Left-Recursive Grammar (“aa”, “a”) Right-recursive grammar s :: a s|a Input “aaa” s as a s a (“a”, “aa”) (“aaa”, “”) Terminates Left-recursive grammar s :: s a|a Input “aaa” a aaa aa a s s a s a Never Terminates s aaa “”

10 Example --s :: s a|empty empty input = [input] a (x:xs) = if x==‘a’ then xs else [] (p `or` q ) input = p input ++ q input (p `then` q) input = if r == [] then [] else q r where r = p input s input = (s `then` a `or` empty) input --*Main> s "aaa“ --(*** Exception: stack overflow Never Terminates

11 Why bother for Left-Recursive Parsing?  Easy to implement  Very modular  In case of Natural Language Processing  Expected ambiguity can be achieved easily  Non-left Recursive grammars might not generates all possible parses  As left recursive grammar’s root grows all the way to the bottom level, it ensures all possible parsing

12 Related Approaches  Approach leads to exponential time complexity Lickman (1995) – Fixed point solution  Most of the approaches deals with Transforming left-recursive grammar to non-left recursive grammar Frost (1992) – Guarding left-production with non- left productions Hutton (1992, 1996) – Grammar transformation

13 Related Approaches Transforming left-recursive grammar to non-left recursive grammar  violates semantic rules  structure of parse trees are completely different Example s::sa s::bs` [rule 1] [rule ??] |b s`::as`|empty [rule 2] [rule ??] s s a s b a s b S` a empty a

14 Our Approach: Monadic-Memoization * Frost and Hafiz (2005)  The idea is simple Let not the parse tree grow infinitely s s a s a Parser “fails” when Depth >= length s aaa a Left-recursive grammar s :: s a |a Input “aaa” Depth=1 Length=3 Depth=2 Length=3 Depth=3 Length=3 *Wadler (1985)-- How to replace failure by a list of successes

15 Monadic-Memoization (cont.) s s a s a  As the recognizer fails, the production rule ‘s::sa’ fails too  Control goes to upper level with ‘failure’  ‘backtracking’, parser tries ‘alternatives’ s aaa a Left-recursive grammar s :: s a |a Input “aaa” Depth=1 Length=3 Depth=2 Length=3 Depth=3 Length=3  If Alternative rule succeeds  Control goes to upper level with left un-recognized inputs  Recursive procedure

16 Monadic-Memoization (cont.) s Left-recursive grammar s :: s a |a Input “aaa” Depth=1 Length=3 Depth=2 Length=3 Depth=3 Length=3 [fail]  s::sa fails but s::a succeeds  “backtrack ing” s a s a s “” aa [fail] a a

17 Monadic-Memoization (cont.)  This approach is applicable in Mixed-Environment  Grammar may contain  Left-recursive production  non Left-recursive production s :: s a | a a :: b a | b  During parsing execution of one rule for same input may occur multiple time  Also need to keep track of input length and depth Top-Down Back-trucking is Exponential

18 Monadic-Memoization (cont.)  Memoization is helpful  The idea is Checking a ‘Memo’ table along with input to each recursive call  ‘Memo’ table contains List of previous parsed outputs for any input, paired with appropriate production Length and depth of current parse  Before parsing any input if “lookup to Memo table fails” then “perform parsing & update the memo table” else “return the result from table”

19 Monadic-Memoization (cont.) s s a b a “aa” Production ‘a’ “lookups” the memo table =(["","a","aa"],[(“a",[("aa",["","a","aa"]),("a",["","a"]),(10,11)])]) LR Production keeps track of length and depth Production ‘a’ “updates” the memo table

20 Monadic-Memoization (cont.)  Memoization reduces worst-case time complexity from exponential to O(n 3 )  The problem is Lazy-functional languages don’t let variable updating or keeping a global storage for the whole program  Need to pass around the ‘Memo’ table so that  All recursive parsing calls access ‘Memo’ table  if ‘Memo’ table is used as the function arguments  Code gets messy and error-prone

21 Monadic-Memoization (cont.) Or we can use ‘Monad’ *Derived from “Category Theory” -- Moggi (1989) *S.E approaches for LFLs -- Wadler (1990)  State, exception, I/O etc of LFL *Monadic Framework for Parsing –-Frost (2003)  Reusable  Complex tasks could be achieved by adding/modifying existing monadic objects  Structured computation

22 Monadic-Memoization (cont.) Monad is a triple (M, unit, bind)  ‘ M’ type constructor memo = [([Char],[([Char],[[Char]])],[(Int,Int)])] M inp = memo -> (inp, memo)  ‘unit’ :: aM a takes a value and returns the computation of the value Works as a ‘container’  ‘bind’ :: M a  (a  M b)  M b applies the computation ‘a  M b’ to the computation ‘M a’ and returns a computation ‘M b’ Ensures sequential computation

23 Monadic-Memoization (cont.) The mental picture of ‘how monad works’  Monad is a triple (M, unit, bind) M = Loader unit = tray bind = combiner *Picture source Newbern (2003)

24 Monadic-Memoization (cont.)  Transform combinatory parsers into Monadic object Example Original “Or” recognizer (p `or` q ) inp = p inp ++ q inp Monadic version (p `or` q) inp = p inp `bindS` f where f m = q inp`bindS`g where g n = unitS(nub(m ++ n)) Monadic Object ‘Or’ S::a or b Input = “ab” a b Memo1 “ab” Memo2 Parsed out update Memo Check LR lookup Memo

25 Monadic-Memoization (cont.) s s a b a “aa” Production ‘a’ “updates” the memo table Production ‘a’ “lookups” the memo table =(["","a","aa"],[(“a",[("aa",["","a","aa"]),("a",["","a"]),(10,11)])]) LR Production keeps track of length and depth “aa” Memo table propagation is ensured correctly => O(n 3 )

26 References 1.Frost, R. A. and Launchbury, E. J. (1989) Constructing natural language interpreter in a lazy functional language. The computer Journal – Special edition on Lazy functional Programming, 32(2) 3-4 2.Wadler, P. (1992) Monads for functional programming. Computer and systems sciences, Volume 118 3.Hutton, G. (1992) Higher-order functions for parsing. Journal of Functional Programming, 2(3):323-343, Cambridge University Press 4.Frost, R. A. (1992)Guarded attribute grammars: top down parsing and left recursive productions. SIGPLAN Notices 27(6): 72-75 5.Lickman, P. (1995) Parsing With Fixed Points. Masters thesis. Oxford University 6.Frost, R.A., Szydlowski, B. (1996) Memoizing Purely Functional Top- Down Backtracking Language Processors. Sci. Comput. Program. 27(3): 263-288 7.Hutton, G. (1998) Monadic parsing in Haskell. Journal of Functional Programming, 8(4):437-444, Cambridge University Press 8.Frost, R.A.(2003) Monadic Memoization towards Correctness-Preserving Reduction of Search. Canadian Conference on AI 2003: 66-80


Download ppt "A Monadic-Memoized Solution for Left-Recursion Problem of Combinatory Parser Rahmatullah Hafiz 60-520 Fall, 2005."

Similar presentations


Ads by Google