
1 Scheme in Python

2 Overview
- We’ll look at how to implement a simple Scheme interpreter in Python
- This is based on the Scheme in Scheme interpreter we studied before
- We’ll look at pyscheme 1.6, which was implemented by Danny Yoo as an undergraduate at Berkeley
- Since Python doesn’t optimize tail recursion, he uses trampolining, which we’ll introduce

3 What we need to do
- A representation for Scheme’s native data structures
  - Pairs (aka cons cells), symbols, strings, numbers, Booleans
- A reader that converts a stream of characters into a stream of s-expressions
  - We’ll introduce an intervening step: reading characters and converting them to tokens
- Implement various built-ins, e.g., cons, car, +, …

4 What we won’t need to do
- We can rely on Python for a number of very useful things
  - Representing numbers and strings
  - Garbage collection
  - Low-level I/O

5 atoms
- Atoms include strings, numbers, and symbols
- We’ll use Python’s native representation for strings and numbers
- Symbols in Scheme are interned: there is a unique object for each symbol read
- This is how they differ from strings, which are not interned
- Note: some Lisp implementations intern small integers

6 Symbols
    # A global dictionary that contains all known symbols
    __INTERNED_SYMBOLS = {}
    class __Symbol(str):
        """A symbol is just a special kind of string"""
        def __eq__(self, other):
            return self is other
        # Python 3 drops the inherited hash once __eq__ is defined, so restore it
        __hash__ = str.__hash__
    def symbol(s):
        """Returns symbol given string, creating new ones if needed"""
        if s not in __INTERNED_SYMBOLS:
            __INTERNED_SYMBOLS[s] = __Symbol(s)
        return __INTERNED_SYMBOLS[s]
    # Here are definitions of symbols that we should know
    scheme_false = symbol("#f")
    scheme_true = symbol("#t")
    __empty_symbol = symbol("")
    def isSymbol(s):
        return type(s) == type(__empty_symbol)

7 GCing Unused Symbols
- If the only reference to a symbol is from the global table of interned symbols, it can be garbage collected
- We’ll use Python’s weakrefs for this
- A weak reference is a reference that doesn’t protect an object from garbage collection
- Objects referenced only by weak references are considered unreachable (or "weakly reachable") and may be collected at any time

8 using weakrefs
    import weakref
    from UserString import UserString as __UserStr
    …
    __INTERNED_SYMBOLS = weakref.WeakValueDictionary({})
    …
    class __Symbol(__UserStr):
        …
    …
        if s not in __INTERNED_SYMBOLS:
            # make a temp strong reference
            newSymbol = __Symbol(s)
            __INTERNED_SYMBOLS[s] = newSymbol
        return __INTERNED_SYMBOLS[s]

9 Representing pairs
- The core of Scheme has only one kind of data structure, the list, and it is made up of pairs
- What Python types should we use?
  - A user-defined class, Pair
  - Lists
  - Tuples
  - Dictionary
  - Closures

10 Aside: pairs as closures
- Functions are very powerful
- We can use them to represent cons cells or pairs
- We don’t want to do this in practice
- But it shows the power of programming with functions

11
    (define (mycons theCar theCdr)
      ;; mycons returns a closure that takes a 2-arg function and applies
      ;; it to the two remembered values, i.e., the pair's car and cdr.
      (lambda (f) (f theCar theCdr)))
    (define (mycar cell)
      ;; mycar takes a pair closure and feeds it a 2-arg function that
      ;; just returns the first arg
      (cell (lambda (theCar theCdr) theCar)))
    (define (mycdr cell)
      ;; mycdr takes a pair closure and feeds it a 2-arg function that
      ;; just returns the second arg
      (cell (lambda (theCar theCdr) theCdr)))
    (define myempty
      ;; the empty list is just a function that always returns true.
      (lambda (f) #t))
    (define (mynull? cell)
      ;; a pair is not the empty list
      (eq? cell myempty))

12 example
    > (define p1 (mycons 1 (mycons 2 myempty)))
    > p1
    #<procedure>
    > (mycar p1)
    1
    > (mycdr p1)
    #<procedure>
    > (mycar (mycdr p1))
    2
    > (mycdr (mycdr p1))
    #<procedure>
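
The same trick carries over to Python, since Python functions are closures too. The sketch below is a direct translation of the Scheme definitions; the snake_case parameter names and `mynull` (Python identifiers cannot contain `?`) are the only changes.

```python
def mycons(the_car, the_cdr):
    # Return a closure that hands the two remembered values
    # to whatever 2-argument function it is given.
    return lambda f: f(the_car, the_cdr)


def mycar(cell):
    # Feed the pair a function that keeps only the first value.
    return cell(lambda the_car, the_cdr: the_car)


def mycdr(cell):
    # Feed the pair a function that keeps only the second value.
    return cell(lambda the_car, the_cdr: the_cdr)


def myempty(f):
    # The empty list: a function that ignores its argument.
    return True


def mynull(cell):
    # Only the empty-list closure itself counts as null.
    return cell is myempty


p1 = mycons(1, mycons(2, myempty))
```

Here `mycar(p1)` gives back 1 and `mycar(mycdr(p1))` gives back 2, mirroring the Scheme session.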

13 Representing pairs
- We’ll define a subclass of list to represent a pair
    class Pair(list): pass
- The cons function creates a new cons cell with a given car and cdr
    def cons(car, cdr): return Pair([car, cdr])
- Defining built-in functions for pairs will be easy
    def car(p): return p[0]
    def cdr(p): return p[1]
    def cadr(p): return car(cdr(p))
    def set_car(p, x): p[0] = x
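
Later slides call a helper named toPythonList without showing it. A rough Python 3 sketch of how such conversions could look with this Pair representation follows; `NIL` as a distinguished empty Pair and both helper names are assumptions made for illustration, not pyscheme's actual definitions.

```python
class Pair(list):
    """A cons cell stored as a two-element list: [car, cdr]."""


NIL = Pair()  # assumption: the empty list is one distinguished empty Pair


def cons(car, cdr):
    return Pair([car, cdr])


def car(p):
    return p[0]


def cdr(p):
    return p[1]


def to_scheme_list(items):
    """Build a chain of pairs ending in NIL from a Python list."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result


def to_python_list(p):
    """Walk a chain of pairs and collect the cars into a Python list."""
    out = []
    while p is not NIL:
        out.append(car(p))
        p = cdr(p)
    return out
```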

14 Lexical Analyzer
- Consume a string of characters, identify tokens, throw away comments and whitespace, and return a list of the remaining tokens
- Each token will be a (type, value) tuple like ('number', '3.145') or ('comment', ';; foo')
- Recognize tokens using regular expressions
- We won’t worry about efficiency

15 Token regular expressions
    import re
    PATTERNS = [
        ('whitespace', re.compile(r'(\s+)')),
        ('comment', re.compile(r'(;[^\n]*)')),
        ('(', re.compile(r'(\()')),
        (')', re.compile(r'(\))')),
        ('dot', re.compile(r'(\.\s)')),
        ('number', re.compile(r'([+\-]?(?:\d+\.\d+|\d+\.|\.\d+|\d+))')),
        # a symbol starts with a non-digit and may contain digits after that,
        # so add1 is one symbol and the pattern never matches the empty string
        ('symbol', re.compile(r'([a-zA-Z\+\=\?\!\@\#\$\%\^\&\*\-\/\.\>\<]'
                              r'[\w\+\=\?\!\@\#\$\%\^\&\*\-\/\.\>\<]*)')),
        ('string', re.compile(r'"(([^\"]|\\")*)"')),
        ('\'', re.compile(r'(\')')),
        ('`', re.compile(r'(`)')),
        (',', re.compile(r'(,)')),
    ]

16 Lex Examples
    >>> from lex import *
    >>> tokenize("")
    [(None, None)]
    >>> tokenize(" 1 2..3 1.3 -4")
    [('number', '1'), ('number', '2.'), ('number', '.3'), ('number', '1.3'), ('number', '-4'), (None, None)]
    >>> tokenize('foo 12.3foo +')
    [('symbol', 'foo'), ('number', '12.3'), ('symbol', 'foo'), ('symbol', '+'), (None, None)]
    >>> tokenize('(foo (bar ()))')
    [('(', '('), ('symbol', 'foo'), ('(', '('), ('symbol', 'bar'), ('(', '('), (')', ')'), (')', ')'), (')', ')'), (None, None)]

17 Raw string notation
    >>> s = '\nfoo\n'
    >>> s
    '\nfoo\n'
    >>> print s

    foo

    >>> s = r'\nfoo\n'
    >>> s
    '\\nfoo\\n'
    >>> print s
    \nfoo\n

18 tokenize()
    def tokenize(s):
        toks = []
        found = True
        while s and found:
            found = False
            for type, regex in PATTERNS:
                match_obj = regex.match(s)
                if match_obj:
                    # whitespace and comments are consumed but not kept
                    if type not in ('whitespace', 'comment'):
                        toks.append((type, match_obj.group(1)))
                    s = s[match_obj.span()[1]:]
                    found = True
                    break
            if not found:
                print("\nNo match '" + s + "' - tokenize")
        toks.append(EOF_TOKEN)
        return toks

19 tokenize() examples
    >>> from lex import *
    >>> tokenize('(a 1.0)')
    [('(', '('), ('symbol', 'a'), ('number', '1.0'), (')', ')'), (None, None)]
    >>> tokenize('(define (add1 x)(+ x 1))')
    [('(', '('), ('symbol', 'define'), ('(', '('), ('symbol', 'add1'), ('symbol', 'x'), (')', ')'), ('(', '('), ('symbol', '+'), ('symbol', 'x'), ('number', '1'), (')', ')'), (')', ')'), (None, None)]

20 parse
- Consume a sequence of tokens and produce a sequence of s-expressions
- Use a recursive descent parser
- We’ll handle just a few special cases, namely quote, backquote, and dotted pairs

21 Peeking and eating
    def peek(tokens):
        """Take a quick glance at the first token in our tokens list."""
        if len(tokens) == 0:
            raise ParserError("While peeking: ran out of tokens.")
        return tokens[0]

22 Peeking and eating
    def eat(tokens, desired_type):
        """If the type of the next token is desired_type, pop it from
        the list and return it, else return False"""
        if len(tokens) == 0:
            raise ParserError('No tokens left, seeking ' + desired_type)
        return tokens.pop(0) if tokens[0][0] == desired_type else False

23 Peeking and eating
    def eat_safe(tokens, tokenType):
        """Digest the first token in our tokens list, making sure that
        we're biting on the right tokenType of thing."""
        if len(tokens) == 0:
            raise ParserError("While trying to eat %s: ran out of tokens." % tokenType)
        if tokens[0][0] != tokenType:
            raise ParserError("Seeking %s got %s" % (tokenType, tokens[0]))
        return tokens.pop(0)

24 parse
    def parseExpression(tokens):
        if eat(tokens, '\''):
            return cons(symbol('quote'), cons(parseExpression(tokens), NIL))
        elif eat(tokens, '`'):
            return cons(symbol('quasiquote'), cons(parseExpression(tokens), NIL))
        elif eat(tokens, ','):
            return cons(symbol('unquote'), cons(parseExpression(tokens), NIL))
        elif eat(tokens, '('):
            return parse_list_members(tokens)
        elif peek(tokens)[0] in ('number', 'symbol', 'string'):
            return parse_atom(tokens)
        else:
            raise ParserError("Parsing: no alternatives")

25 parse_list_members()
    def parse_list_members(tokens):
        if eat(tokens, 'dot'):
            final = parseExpression(tokens)
            eat_safe(tokens, ')')
            return final
        if peek(tokens)[0] in ('\'', '`', ',', '(', 'number', 'symbol', 'string'):
            return cons(parseExpression(tokens), parse_list_members(tokens))
        if eat(tokens, ')'):
            return NIL
        # tokens is a list, so convert it before building the message
        raise ParserError("Can't finish list " + str(tokens))
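
For experimenting without the rest of pyscheme, here is a deliberately simplified Python 3 sketch of the same recursive descent idea. It is not the slides' parser: it builds nested Python lists instead of Pair objects, handles only quote (no quasiquote, unquote, or dotted pairs), and loops over list members rather than recursing on them. The function name `parse_expression` and the use of `SyntaxError` are choices made for this sketch.

```python
def parse_expression(tokens):
    """Consume one s-expression from the front of tokens.

    tokens is a list of (type, value) tuples in the style produced by
    tokenize(); it is modified in place.
    """
    if not tokens:
        raise SyntaxError("ran out of tokens")
    kind, text = tokens.pop(0)
    if kind == "'":
        # 'x is shorthand for (quote x)
        return ["quote", parse_expression(tokens)]
    if kind == "(":
        members = []
        while tokens and tokens[0][0] != ")":
            members.append(parse_expression(tokens))
        if not tokens:
            raise SyntaxError("missing closing parenthesis")
        tokens.pop(0)  # eat the ')'
        return members
    if kind == "number":
        return float(text) if "." in text else int(text)
    if kind in ("symbol", "string"):
        return text
    raise SyntaxError("unexpected token %r" % (kind,))
```

Fed the token list shown for `(define (add1 x)(+ x 1))`, this returns a nested list with `define` first, then the parameter list, then the body, and leaves only the end-of-input token behind.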

26 Recursive descent parsing
- Remember, one problem with recursive descent parsing is that the grammar has to be right recursive
- Another potential problem is recursing too deeply and exceeding the limit on the stack
- But maybe we can use tail recursion, which an interpreter or compiler can recognize and execute as iteration?
- Not in Python

27 Python doesn’t optimize tail recursion
    def fact0(n):
        # iterative factorial
        result = 1
        while n > 1:
            result *= n
            n -= 1
        return result
    def fact1(n):
        # simple recursive factorial
        return 1 if n == 1 else n * fact1(n - 1)
    def fact2(n, result=1):
        # tail recursive factorial
        return result if n == 1 else fact2(n - 1, n * result)

28 Try this
http://www.csee.umbc.edu/331/fall08/0101/code/python/pyscheme-1.7/src/fact.py

29 Default limit is 999
fact1(1000) and fact2(1000) both die
    >>> fact2(1000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "fact.py", line 17, in fact2
        return result if n==1 else fact2(n-1, n*result)
      File "fact.py", line 17, in fact2
      …
      File "fact.py", line 17, in fact2
        return result if n==1 else fact2(n-1, n*result)
    RuntimeError: maximum recursion depth exceeded

30 How to solve this?
- You can set the maximum recursion depth higher
    >>> import sys
    >>> sys.getrecursionlimit()
    1000
    >>> sys.setrecursionlimit(10000)
    >>> fact2(1100)
    53437084880926377034242155... 00000000L
- But this is not a general solution
- And Guido is on the record as not wanting to optimize tail recursion
  http://www.artima.com/forums/flat.jsp?forum=106&thread=147358
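
The limit is easy to observe from a program rather than a traceback. The sketch below assumes Python 3, where the error is reported as `RecursionError` (a subclass of the `RuntimeError` shown on the slide); `fact_tail`, `fact_loop` and `overflows` are names invented for this example.

```python
import sys


def fact_tail(n, result=1):
    # Tail recursive in form, but CPython still pushes a frame per call.
    return result if n <= 1 else fact_tail(n - 1, n * result)


def fact_loop(n):
    # The same computation written as a loop uses constant stack depth.
    result = 1
    while n > 1:
        result *= n
        n -= 1
    return result


def overflows(n):
    """Return True if fact_tail(n) exhausts the recursion limit."""
    try:
        fact_tail(n)
    except RecursionError:
        return True
    return False


# Deeper than the interpreter currently allows.
too_deep = sys.getrecursionlimit() + 100
```

With these definitions, `overflows(too_deep)` reports the failure while `fact_loop(too_deep)` finishes normally, which is the gap trampolining is meant to close.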

31 Trampoline Style
- A trampoline is a loop that iteratively invokes thunk-returning functions
  - A thunk is just a piece of code to perform a delayed computation (e.g., a closure)
- A single trampoline can express all control transfers of a program
- Converting a program to trampolined style is trampolining
  - This is a kind of continuation passing style of programming
- Trampolined functions can do tail recursive function calls in stack-oriented languages

32 Trampolining is one answer
- A way to program using CPS, Continuation Passing Style
- CPS is a style of programming where control is passed explicitly as continuations
- Trampolining is a simple way to eliminate recursion
- We’ll use a simple kind of trampolining
- Instead of making a recursive call, a procedure can bounce back up to its caller with a continuation, which can be called to proceed with the computation

33 Pogo
    from pogo import pogo, land, bounce
    def fact3(n):
        # factorial in a trampolined style
        return pogo(fact_tramp(n))
    def fact_tramp(n, result=1):
        return land(result) if n == 1 else bounce(fact_tramp, n - 1, n * result)

34 Variable length argument lists
    >>> def foo(*args):
            print "Number of arguments:", len(args)
            print "Arguments are: ", args
    >>> foo(1,2,3,'d',5)
    Number of arguments: 5
    Arguments are: (1, 2, 3, 'd', 5)
    >>> def bar(arg1, *rest):
            print …

35 pogo.py
    def bounce(function, *args):
        """Returns new trampolined value that continues bouncing"""
        return ('bounce', function, args)
    def land(value):
        """Returns new trampolined value that lands off trampoline"""
        return ('land', value)

36 It works
    >>> sys.setrecursionlimit(10)
    >>> fact3(100)
    93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000L
    >>> fact3(1000)
    4023872600770937735...00000000000000L

37 pogo.py
    import traceback
    def pogo(bouncer):
        try:
            while True:
                if bouncer[0] == 'land':
                    return bouncer[1]
                elif bouncer[0] == 'bounce':
                    # call the delayed function with its saved arguments
                    bouncer = bouncer[1](*bouncer[2])
                else:
                    raise TypeError("not a bouncer")
        except TypeError:
            traceback.print_exc()
            raise TypeError("not a bouncer")
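
Putting the pieces of pogo.py and the trampolined factorial into one file gives something that runs on its own. This is a Python 3 restatement of the slide code, lightly simplified: the traceback printing is left out, and the base case is written `n <= 1` so that 0 is handled too.

```python
def bounce(function, *args):
    """A trampolined value that asks the loop to call function(*args)."""
    return ('bounce', function, args)


def land(value):
    """A trampolined value that carries the final result."""
    return ('land', value)


def pogo(bouncer):
    """The trampoline: keep bouncing until something lands."""
    while True:
        if bouncer[0] == 'land':
            return bouncer[1]
        if bouncer[0] == 'bounce':
            bouncer = bouncer[1](*bouncer[2])
        else:
            raise TypeError("not a bouncer")


def fact_tramp(n, result=1):
    # Instead of calling itself, return a description of the next call.
    return land(result) if n <= 1 else bounce(fact_tramp, n - 1, n * result)


def fact3(n):
    return pogo(fact_tramp(n))
```

Each call to `fact_tramp` returns immediately, so the Python stack never grows beyond a couple of frames no matter how large `n` is.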

38 See pyscheme1.6
- Pyscheme1.6 is written in trampoline style
- Which was done by hand, as opposed to using an automatic trampoliner
- And which I’ve been undoing by hand

39
    def eval(exp, env):
        return pogo.pogo(teval(exp, env, pogo.land))
    def teval(exp, env, cont):
        if expressions.isIf(exp):
            return evalIf(exp, env, cont)
        …
    def evalIf(exp, env, cont):
        def c(predicate_val):
            if isTrue(predicate_val):
                return teval(ifConsequent(exp), env, cont)
            else:
                return teval(ifAlternative(exp), env, cont)
        return teval(expressions.ifPredicate(exp), env, c)
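
The role of the extra `cont` argument is easier to see on a small function than on a whole evaluator. The following Python 3 sketch applies the same continuation-passing pattern to the ordinary, non-tail-recursive factorial; `fact_cps` and `after_recursive_call` are illustrative names, and the three trampoline helpers are repeated only to keep the example self-contained.

```python
def bounce(function, *args):
    return ('bounce', function, args)


def land(value):
    return ('land', value)


def pogo(bouncer):
    while bouncer[0] == 'bounce':
        bouncer = bouncer[1](*bouncer[2])
    return bouncer[1]


def fact_cps(n, cont):
    """Factorial in continuation-passing style.

    cont is "what to do with the answer". The pending multiplication
    that plain recursion would leave on the stack is stored in a
    closure instead.
    """
    if n <= 1:
        return bounce(cont, 1)

    def after_recursive_call(value):
        return bounce(cont, n * value)

    return bounce(fact_cps, n - 1, after_recursive_call)


def fact(n):
    # land is the final continuation: it just delivers the result.
    return pogo(fact_cps(n, land))
```

This mirrors how `evalIf` builds the continuation `c` and passes it along instead of waiting for the predicate's value on the Python stack.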

40 eval
    # exp and env are parameters here, so the helper modules are written
    # out as expressions and environment (assumed names) to avoid shadowing
    def eval(exp, env):
        if expressions.isSelfEvaluating(exp):
            return exp
        if expressions.isVariable(exp):
            return environment.lookupVariableValue(exp, env)
        if expressions.isQuoted(exp):
            return evalQuoted(exp, env)
        if expressions.isAssignment(exp):
            return evalAssignment(exp, env)
        if expressions.isDefinition(exp):
            return evalDefinition(exp, env)
        if expressions.isIf(exp):
            return evalIf(exp, env)
        if expressions.isLambda(exp):
            return expressions.makeProcedure(expressions.lambdaParameters(exp),
                                             expressions.lambdaBody(exp), env)
        if expressions.isBegin(exp):
            return evalSequence(expressions.beginActions(exp), env)
        if expressions.isApplication(exp):
            return evalApplication(exp, env)
        raise SchemeError("Unknown expr, eval " + str(exp))

41 apply
    # env is a parameter here, so the helper modules are written out as
    # expressions and environment (assumed names) to avoid shadowing
    def apply(procedure, arguments, env):
        if expressions.isPrimitiveProcedure(procedure):
            return applyPrimProc(procedure, arguments, env)
        if expressions.isCompoundProcedure(procedure):
            newEnv = environment.extendEnvironment(
                expressions.procedureParameters(procedure),
                arguments,
                expressions.procedureEnvironment(procedure))
            return evalSequence(expressions.procedureBody(procedure), newEnv)
        raise SchemeError("Unknown proc - apply " + str(procedure))
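
To see eval and apply call each other without the supporting modules, here is a much smaller Python 3 sketch of the same dispatch. It makes several simplifying assumptions that differ from pyscheme: expressions are nested Python lists, symbols are plain strings, `#f` is Python's `False`, primitives are ordinary Python callables, and there is no trampolining. The names `evaluate`, `apply_procedure`, `lookup` and `global_env` are invented for the sketch.

```python
import operator


def lookup(name, env):
    # env is a list of dict frames, innermost first.
    for frame in env:
        if name in frame:
            return frame[name]
    raise NameError("Unbound variable " + name)


def evaluate(exp, env):
    if isinstance(exp, (int, float)):        # self-evaluating
        return exp
    if isinstance(exp, str):                 # variable reference
        return lookup(exp, env)
    head = exp[0]
    if head == 'quote':
        return exp[1]
    if head == 'if':
        _, test, consequent, alternative = exp
        chosen = consequent if evaluate(test, env) is not False else alternative
        return evaluate(chosen, env)
    if head == 'define':
        env[0][exp[1]] = evaluate(exp[2], env)
        return None
    if head == 'lambda':
        # remember parameters, body, and the defining environment
        return ('procedure', exp[1], exp[2:], env)
    procedure = evaluate(head, env)          # application
    arguments = [evaluate(arg, env) for arg in exp[1:]]
    return apply_procedure(procedure, arguments)


def apply_procedure(procedure, arguments):
    if callable(procedure):                  # primitive procedure
        return procedure(*arguments)
    _, params, body, defining_env = procedure
    if len(params) != len(arguments):
        raise TypeError("Mismatched vals and vars")
    new_env = [dict(zip(params, arguments))] + defining_env
    result = None
    for exp in body:                         # evaluate the body in sequence
        result = evaluate(exp, new_env)
    return result


global_env = [{'+': operator.add, '-': operator.sub,
               '*': operator.mul, '<': operator.lt}]
```

Defining a recursive procedure with `define` and then applying it exercises every branch: the lambda captures the global environment, and each call extends it with one new frame.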

42 Environments
- An environment will be a list of frames
- Each frame will be a Python dictionary with the variable names as keys and their values as values

43 env
    THE_EMPTY_ENVIRONMENT = []
    def enclosingEnvironment(env):
        return env[1:]
    def firstFrame(env):
        return env[0]
    def extendEnvironment(var_pairs, val_pairs, base_env):
        new_frame = {}
        vars = toPythonList(var_pairs)
        vals = toPythonList(val_pairs)
        if len(vars) != len(vals):
            raise SchemeError("Mismatched vals and vars")
        for (var, val) in zip(vars, vals):
            new_frame[var] = val
        # an environment is a list of frames, so wrap the new frame in a list
        return [new_frame] + base_env

44 Lookup a Variable Value
    def lookupVariableValue(var, env):
        while True:
            if env == THE_EMPTY_ENVIRONMENT:
                raise SchemeError("Unbound var " + var)
            frame = firstFrame(env)
            if var in frame:
                return frame[var]
            env = enclosingEnvironment(env)

45 Define/Set a Variable
    def defineVariable(var, val, env):
        firstFrame(env)[var] = val
    def setVariableValue(var, val, env):
        while True:
            if env == THE_EMPTY_ENVIRONMENT:
                raise SchemeError("Unbound variable -- SET! " + var)
            top = firstFrame(env)
            if var in top:
                top[var] = val
                return
            env = enclosingEnvironment(env)
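
The list-of-frames design can be exercised in isolation. The Python 3 sketch below follows the slides' environment functions but takes plain Python lists of names and values (no toPythonList conversion) and defines its own `SchemeError`; the snake_case names are this sketch's, not pyscheme's.

```python
class SchemeError(Exception):
    pass


THE_EMPTY_ENVIRONMENT = []


def extend_environment(variables, values, base_env):
    """Return a new environment: one fresh frame in front of base_env."""
    if len(variables) != len(values):
        raise SchemeError("Mismatched vals and vars")
    return [dict(zip(variables, values))] + base_env


def lookup_variable_value(var, env):
    # Search frames from innermost to outermost.
    for frame in env:
        if var in frame:
            return frame[var]
    raise SchemeError("Unbound variable " + var)


def define_variable(var, val, env):
    # define always writes into the innermost frame.
    env[0][var] = val


def set_variable_value(var, val, env):
    # set! updates the nearest frame that already binds var.
    for frame in env:
        if var in frame:
            frame[var] = val
            return
    raise SchemeError("Unbound variable -- SET! " + var)
```

Because extending an environment builds a new list but reuses the existing frame dictionaries, a `set!` performed through an inner environment is visible from the outer one, while a `define` in the inner environment is not.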

46 Builtins
- We’ll define a Python function to handle each of the primitive Scheme functions
- Many Lisp functions take any number of args:
    (+ 1 2) => 3
    (+ 1 2 3 4 5) => 15
    (+ ) => 0
- We can use Python’s (new) syntax for functions that take any number of args, e.g.:
  - If the last parameter in a function’s parameter list is preceded by a *, it’s bound to a tuple of the remaining args
    def add(*args): return sum(args)

47 Builtins
    import types
    def allNumbers(numbers):
        for n in numbers:
            if type(n) not in (types.IntType, types.LongType, types.FloatType):
                return 0
        return 1
    def schemeAdd(*numbers):
        if not allNumbers(numbers):
            raise SchemeError("prim + - non-numeric arg")
        return sum(numbers)
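
The `types.IntType` family used above exists only in Python 2. A possible Python 3 equivalent of the same check is sketched below; it assumes `numbers.Number` from the standard library as the test for "numeric" and defines its own `SchemeError`, and the function names are illustrative.

```python
import numbers


class SchemeError(Exception):
    pass


def all_numbers(values):
    # bool is a subclass of int in Python, so rule it out explicitly.
    return all(isinstance(v, numbers.Number) and not isinstance(v, bool)
               for v in values)


def scheme_add(*args):
    """Variadic +: any number of numeric arguments, including none."""
    if not all_numbers(args):
        raise SchemeError("primitive + : non-numeric argument")
    return sum(args)
```

Calling `scheme_add()` with no arguments returns 0, matching `(+ )` in Scheme, because `sum` of an empty tuple is 0.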

48 Setting up the initial environment
    def setupEnvironment():
        PRIME_PROCEDURES = [
            ["car", pair.car],
            ["cdr", pair.cdr],
            ["+", schemeAdd],
            ...
        ]
        init_env = env.extendEnvironment(
            pair.NIL, pair.NIL, env.THE_EMPTY_ENVIRONMENT)
        for name, proc in PRIME_PROCEDURES:
            p = cons(symbol("primitive"), cons(proc, NIL))
            # bind each primitive in the new environment, not in the env module
            defineVariable(symbol(name), p, init_env)
        defineVariable(symbol("#t"), symbol("#t"), init_env)
        defineVariable(symbol("#f"), symbol("#f"), init_env)
        return init_env

