Polyglot: An Extensible Compiler Framework for Java. Nathaniel Nystrom, Michael R. Clarkson, Andrew C. Myers. Cornell University.


1 Polyglot: An Extensible Compiler Framework for Java
  Nathaniel Nystrom, Michael R. Clarkson, Andrew C. Myers
  Cornell University

2 Language extension
  Language designers often create extensions to existing languages, e.g., C++, PolyJ, GJ, Pizza, AspectJ, Jif, ArchJava, ESCJava, Polyphonic C#, ...
  Want to reuse existing compiler infrastructure as much as possible
  Polyglot is a framework for writing compiler extensions for Java

3 Requirements
  Language extension:
    Modify both syntax and semantics of the base language
    Not necessarily backward compatible
  Goals:
    Easy to build and maintain extensions
    Extensibility should be scalable
    No code duplication
    Compilers for language extensions should be open to further extension

4 Rejected approaches
  In-place modification
  Macro languages
    Limited to syntax extensions
    Semantic checks after macro expansion
  [Figure: base compiler 1.0 -> (bug fixes & upgrades) -> base compiler 2.0; base compiler 1.0 -> (copy & modify) -> extension compiler 1.0; base compiler 2.0 -> (copy & modify, again) -> extension compiler 2.0; extension compiler 1.0 -> (bug fixes & upgrades, again) -> extension compiler 2.0]

5 Polyglot
  Base compiler is a complete Java front end
    25K lines of Java
    Name resolution, inner class support, type checking, exception checking, uninitialized variable analysis, unreachable code analysis, ...
  Can reuse and extend through inheritance

6 Scalable extensibility
  Changes to the compiler should be proportional to changes in the language.
  Most compiler passes are sparse:
  [Figure: matrix of AST nodes (+, if, x, e.f, =) against passes (name resolution, type checking, exception checking, constant folding)]

7 Non-scalable approaches
                                            Easy to add or modify
  Using                                     Passes    AST nodes
  Visitors                                  yes       no
  Pass as AST node method ("naive OO")      no        yes
  Polyglot                                  yes       yes

8 Polyglot architecture
  Base Polyglot compiler: Java source -> Java parser -> AST rewriting passes -> Code generator -> Java target
  Ext source -> Ext parser -> AST rewriting passes -> Code generator -> Java target
  Ext 2 source -> Ext 2 parser -> AST rewriting passes -> Code generator -> Java target

9 Architecture details
  Parser written using PPG
    Adds grammar inheritance to Java CUP
  AST nodes constructed using a node factory
    Decouples node types from implementation
  AST rewriting passes:
    Each pass lazily creates a new AST
    From naive OO: traverse AST invoking a method at each node
    From visitors: AST traversal factored out
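
The lazy rewriting described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not Polyglot's actual API: the classes `Expr`, `Lit`, `Var`, and `Add` and the `fold` pass are hypothetical names. Nodes are immutable, and a pass returns the same object when nothing beneath it changed, allocating a new node only when a child was rewritten.

```java
// Hypothetical mini-AST for illustration; these are not Polyglot's classes.
abstract class Expr {
    // One rewriting pass (constant folding). Returns `this` when nothing changes.
    abstract Expr fold();
}

final class Lit extends Expr {
    final int value;
    Lit(int value) { this.value = value; }
    @Override Expr fold() { return this; }
}

final class Var extends Expr {
    final String name;
    Var(String name) { this.name = name; }
    @Override Expr fold() { return this; }
}

final class Add extends Expr {
    final Expr lhs, rhs;
    Add(Expr lhs, Expr rhs) { this.lhs = lhs; this.rhs = rhs; }

    @Override Expr fold() {
        // As in naive OO, the traversal invokes a method at each node.
        Expr l = lhs.fold();
        Expr r = rhs.fold();
        if (l instanceof Lit && r instanceof Lit) {
            return new Lit(((Lit) l).value + ((Lit) r).value);
        }
        // Lazy copy: reuse this node unless a child actually changed.
        if (l == lhs && r == rhs) return this;
        return new Add(l, r);
    }
}
```

With this shape, a pass that changes nothing returns the original tree unchanged, and a pass that rewrites one subtree shares every untouched subtree with its input.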

10 Example: PAO
  Primitive types as subclasses of Object
  Changes type system, relaxes Java syntax
  Implementation: insert boxing and unboxing code where needed
  PAO source:
    HashMap m;
    m.put("two", 2);
    int v = (int) m.get("two");
  Translated Java:
    HashMap m;
    m.put("two", new Integer(2));
    int v = ((Integer) m.get("two")).intValue();
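
For reference, the translated form on this slide can be run as ordinary Java. The sketch below is an assumption-laden wrapper rather than PAO output: the class name `PaoTranslationDemo` and method `lookup` are hypothetical, the map is initialized (the slide only declares it), and `Integer.valueOf` stands in for the slide's `new Integer(2)`, which is deprecated in current Java.

```java
import java.util.HashMap;

// Hypothetical wrapper; only the boxing/unboxing pattern comes from the slide.
class PaoTranslationDemo {
    static int lookup() {
        HashMap<String, Object> m = new HashMap<>();
        // Boxing inserted where a primitive flows into an Object position.
        m.put("two", Integer.valueOf(2));
        // Unboxing inserted where an Object is cast back to a primitive.
        int v = ((Integer) m.get("two")).intValue();
        return v;
    }
}
```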

11 PAO implementation
  Modify parser and type-checking pass to permit e instanceof int
  Parser changes with PPG:
    include "java.cup"
    drop { rel_expr ::= rel_expr INSTANCEOF ref_type }
    extend rel_expr ::= rel_expr:a INSTANCEOF type:b
        {: RESULT = node_factory.Instanceof(a, b); :}
  Add one new pass to insert boxing and unboxing code

12 Implementing a new pass
  Want to extend Node interface with rewrite() method
    Default implementation: identity translation
    Specialized implementations: boxing and unboxing
  Mixin extensibility: extensions to a base class should be inherited by subclasses
  [Figure: class Node (typeCheck(), codeGen()) with subclasses If (cond, then, else) and Add (lhs, rhs), shown before and after rewrite() is added to every class]

13 Inheritance is inadequate
  [Figure: Node, If (cond, then, else), and Add (lhs, rhs) each declare typeCheck() and codeGen(); the subclasses PaoNode, PaoIf, and PaoAdd each add rewrite()]

14 Inheritance is inadequate
  [Figure: the full Node class hierarchy, every class declaring typeCheck() and codeGen(), duplicated as a parallel PaoNode hierarchy in which every class also declares rewrite()]

15 Extension objects
  Use composition to mix in methods and fields into AST node classes
  PAO extension objects are installed into all nodes by the node factory
  [Figure: Node, If (cond, then, else), and Add (lhs, rhs) each have an ext field pointing to a PaoExt object that provides rewrite(); the PaoExt object's own ext field is null]

16 Extension objects
  Extension objects have their own ext field to leave the extension open
  [Figure: the ext field of Node, If, and Add points to a PaoExt (rewrite()); the PaoExt's ext field points to a further extension object (typeCheck(), ext_type_info), whose ext field is null]
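
A minimal sketch of the extension-object idea, assuming simplified classes. `Node`, `If`, `PaoExt`, `ext`, and `rewrite()` follow the slides; `Ext.node`, `PaoFactory`, `attach`, and the string-valued `compareType` are hypothetical simplifications, not Polyglot's real API. The default `PaoExt` is mixed into every node by composition, and one node type gets a specialized extension object.

```java
// Illustrative sketch only; simplified from the slides.
class Ext {
    Node node;  // the AST node this extension object is attached to
    Ext ext;    // next extension object, or null; leaves the extension open
}

class Node {
    Ext ext;    // null in the base compiler
}

class If extends Node { }

class Instanceof extends Node {
    final String compareType;  // a string stands in for a real type object
    Instanceof(String compareType) { this.compareType = compareType; }
}

// Default implementation of the new pass: identity translation.
class PaoExt extends Ext {
    Node rewrite() { return node; }
}

// Specialized implementation for one node type.
class PaoInstanceofExt extends PaoExt {
    @Override Node rewrite() {
        Instanceof e = (Instanceof) node;
        if (!e.compareType.equals("int")) return e;
        return PaoFactory.newInstanceof("Integer");
    }
}

// Hypothetical factory that installs an extension object into each node.
final class PaoFactory {
    static <N extends Node> N attach(N n, Ext e) {
        e.node = n;
        n.ext = e;
        return n;
    }
    static If newIf() { return attach(new If(), new PaoExt()); }
    static Instanceof newInstanceof(String type) {
        return attach(new Instanceof(type), new PaoInstanceofExt());
    }
}
```

Note that `If` gains `rewrite()` without any `PaoIf` subclass: the behavior lives in the attached object, so the base hierarchy is not duplicated.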

17 Method invocation
  A method may be implemented in the node or in any one of several extension objects.
    Extension should call: node.ext.ext.typeCheck()
    Base compiler should call: node.typeCheck()
  Cannot hardcode the calls
  [Figure: Node (typeCheck(), codeGen()) -> ext: PaoExt (rewrite()) -> ext: extension object (typeCheck(), ext_type_info) -> ext: null]

18 Delegate objects
  Each node & extension object has a del field
  Delegate object implements same interface as node or ext
  Directs call to appropriate method implementation
    Ex: node.del.typeCheck()
    Ex: node.ext.del.rewrite()
  Run-time overhead < 2%
  [Figure: Node, PaoExt, and a further extension object (typeCheck(), ext_type_info), each with del and ext fields; the node's del points to a JavaDel whose typeCheck() is { node.ext.ext.typeCheck() } and whose codeGen() is { node.codeGen() }]
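
The dispatch shown in the figure can be sketched as below. This is an illustration under assumptions, not Polyglot's implementation: `NodeOps` and `TypeInfoExt` are hypothetical names, the passes return strings so the chosen implementation is visible, and only the node (not the extension objects) carries a `del` field here. `JavaDel`, `del`, `ext`, and the two redirected calls follow the slide.

```java
// Illustrative sketch only; interface and helper names are assumptions.
interface NodeOps {
    String typeCheck();
    String codeGen();
}

class Ext {
    Node node;
    Ext ext;
}

class Node implements NodeOps {
    Ext ext;
    NodeOps del = this;  // by default a node is its own delegate
    public String typeCheck() { return "base typeCheck"; }
    public String codeGen() { return "base codeGen"; }
}

class PaoExt extends Ext {
    Node rewrite() { return node; }
}

// A second-level extension object that replaces type checking.
class TypeInfoExt extends Ext {
    String typeCheck() { return "extended typeCheck"; }
}

// The delegate decides where each method's implementation lives,
// so callers never hardcode node.ext.ext.typeCheck().
class JavaDel implements NodeOps {
    private final Node node;
    JavaDel(Node node) { this.node = node; }
    public String typeCheck() { return ((TypeInfoExt) node.ext.ext).typeCheck(); }
    public String codeGen() { return node.codeGen(); }
}
```

Both the base compiler and any extension then make the same call, `node.del.typeCheck()`, and installing a different delegate changes which implementation runs.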

19 Scalable extensibility
  To add a new pass:
    Use an extension object to mix in a default implementation of the pass for the Node base class
    Use extension objects to mix in specialized implementations as needed
  To change the implementation of an existing pass:
    Use a delegate object to redirect to the method providing the new implementation
  To create an AST node type:
    Create a new subclass of Node
    Or, mix new fields into an existing node using an extension object

20 Polyglot family tree
  [Figure: tree rooted at Polyglot base (Java) containing the extensions parameterized types, Coffer, PolyJ, Jif, PAO, Jif/split, JMatch, and covariant return]

21 Results
  Can build small extensions in hours or days
  10% of base code is interfaces and factories
  Extension               # Tokens   % of Base
  Polyglot base (Java)    166K       100
  Jif                     129K       78
  JMatch                  108K       65
  Jif/split               99K        60
  PolyJ                   79K        48
  Coffer                  24K        14
  PAO                     6.1K       3.6
  parameterized types     3.2K       2
  covariant return        1.6K       1
  javac 1.1               132K       80

22 Related work
  Other extensible compilers
    e.g., CoSy, SUIF
    e.g., JastAdd, JaCo
  Macros
    e.g., EPP, Java Syntax Extender, Jakarta
    e.g., Maya
  Visitors
    e.g., staggered visitors, extensible visitors

23 Conclusions
  Several Java extensions have been implemented with Polyglot
  Programmer effort scales well with the size of the difference from Java
  Extension objects and delegate objects provide scalable extensibility
  Download from: http://www.cs.cornell.edu/projects/polyglot

24 Acknowledgments
  Brandon Bray       JMatch
  Michael Brukman    PPG
  Steve Chong        Jif, Jif/split, covariant return
  Matt Harren        JMatch
  Aleksey Kliger     JLtools, PolyJ
  Jed Liu            JMatch
  Naveen Sastry      JLtools
  Dan Spoonhower     JLtools
  Steve Zdancewic    Jif, Jif/split
  Lantian Zheng      Jif, Jif/split
  http://www.cs.cornell.edu/projects/polyglot

25 Questions?

26 Mixin extensibility
  Inheritance does not provide mixin extensibility: when a base class is extended, subclasses should inherit the changes
  [Figure: class Node (typeCheck(), codeGen()) with subclasses If (cond, then, else) and Add (lhs, rhs), shown before and after rewrite() is added to every class]

27 Other Polyglot features
  Quasi-quoting library
    Useful for translation from extension language AST to base language or intermediate language AST
      qqStmt("if (%e.relabelsTo(%e)) %s; else %s;",
             new Object[] { L, Li, then_body, else_body });
  Automatic separate compilation
    Serialize type information and store in the AST
    Encoded into the class file via javac
    Extracted from class file using reflection
  Data-flow analysis framework

28 PAO rewriting
  rewrite(ts) called for each AST node:
    class PaoExt extends Ext {
      // Default: identity translation
      Node rewrite(PaoTypeSystem ts) { return node(); }
    }
    class PaoInstanceofExt extends PaoExt {
      Node rewrite(PaoTypeSystem ts) {
        Instanceof e = (Instanceof) node();
        Type rtype = e.compareType();
        // e.g., "e instanceof int" -> "e instanceof Integer"
        if (rtype.isPrimitive())
          return e.compareType(ts.boxedType(rtype));
        else
          return e;
      }
    }

29 Node factories
  Each extension has a node factory (nf)
  To create a node of type T, call method nf.T()
    T() may return an extension-specific subclass of T
    T() attaches the extension and delegate objects to T via a call to extT()
  Mixin extensibility: if T is a subclass of S, then extT() will call extS()
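
The extT()-calls-extS() rule can be sketched as follows, taking T = If and S = Stmt. This is a simplified illustration, not Polyglot's real factory classes: `BaseNodeFactory`, `PaoNodeFactory`, `Stmt`, and the generic signatures are assumptions, and delegate objects are left out. Because every extT() chains to its supertype's method, overriding the one at the root installs the extension object on every node type.

```java
// Illustrative sketch only; factory class names and signatures are assumptions.
class Ext {
    Node node;
    Ext ext;
}

class Node { Ext ext; }
class Stmt extends Node { }
class If extends Stmt { }

class BaseNodeFactory {
    // nf.If(): create the node, then attach extension objects via extIf().
    If If() { return extIf(new If()); }

    // Base compiler: nothing to attach.
    <N extends Node> N extNode(N n) { return n; }
    // Stmt is a subclass of Node, so extStmt() calls extNode().
    <N extends Stmt> N extStmt(N n) { return extNode(n); }
    // If is a subclass of Stmt, so extIf() calls extStmt().
    If extIf(If n) { return extStmt(n); }
}

class PaoExt extends Ext {
    Node rewrite() { return node; }
}

class PaoNodeFactory extends BaseNodeFactory {
    // Overriding the root method is inherited by every node type.
    @Override <N extends Node> N extNode(N n) {
        PaoExt e = new PaoExt();
        e.node = n;
        n.ext = e;
        return n;
    }
}
```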

30 Results
  Can build small extensions in hours or days
  10% of base compiler code is interfaces and factories
  Extension                         Lexer   Parser   Total   % of Base
  javac 1.1 (excl. bytecode asm)                     119K    72
  Polyglot base                     2.7K    11K      166K    100
  Jif                               2.7K    5.4K     129K    78
  JMatch                            2.7K    14K      108K    65
  PolyJ                             2.7K    3.5K     79K     48
  Coffer                            2.7K             24K     14
  PAO                               2.7K    169      6.1K    3.6
  Param                             0       0        3.2K    2
  covariant return                  0       0        1.6K    1

31 Why not output bytecode?
  Wanted to be able to read the output
  The symmetry is satisfying
  Limitations of Java as a target language
    Scoping rules sometimes make it difficult to output Java code, especially with inner classes
    Lack of goto can make generated control flow inefficient

32 [Figure labels: Name resolution, Translation, Semantic checking]

