Using Ada 95 in a Compiler Course
SIGAda 2001
S. Tucker Taft, CTO, AverCom Corp., a Titan Company
October 3, 2001, Bloomington, MN

Outline
- What are we trying to teach in a compiler course?
- How can Ada help?
- The Approach Used in This Course
- The Ada Package Structure and What the Student Builds
- Conclusion

What are we trying to teach?
- Compiler Theory and Compiler Construction Techniques
- Tackling a large, complex problem and reducing it to manageable pieces (and getting it all to work!)
- Using the data structures and algorithms learned in earlier courses in creative ways; choosing the right ones to use in each circumstance
- Using Object-Oriented Programming Techniques in a large, real-world problem
- Using Ada in a large, real-world problem

How Can Ada Help?
- Minimize time wasted in debugging
- Emphasize high-level (package / subsystem) structure and interfaces
- Support an approach where the Professor provides the (visible part of the) package spec, while students provide the package (private part and) body
- Illustrate the value of abstracting even simple integral types like line numbers, hash codes, lexical levels

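The last point can be illustrated with a minimal sketch. The type names, ranges, and the hash function below are assumptions made for illustration, not taken from the course materials. Each quantity gets its own integer type, so mixing them without an explicit conversion is rejected at compile time.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Distinct_Ints is
   --  Hypothetical types: one distinct integer type per quantity.
   type Line_Number   is range 0 .. 1_000_000;
   type Lexical_Level is range 0 .. 255;
   type Hash_Code     is mod 2 ** 16;   --  modular: arithmetic wraps around

   --  Fold the characters of a string into a Hash_Code.
   function Hash (S : String) return Hash_Code is
      H : Hash_Code := 0;
   begin
      for I in S'Range loop
         H := H * 31 + Character'Pos (S (I));
      end loop;
      return H;
   end Hash;

   Cur_Line  : Line_Number   := 42;
   Cur_Level : Lexical_Level := 1;
begin
   Cur_Line  := Cur_Line + 1;    --  legal: both operands are Line_Number
   Cur_Level := Cur_Level + 1;   --  legal: both operands are Lexical_Level

   --  Cur_Line := Cur_Level;              --  rejected at compile time
   --  Cur_Line := Cur_Line + Cur_Level;   --  rejected at compile time

   --  Mixing the two requires a visible, explicit conversion.
   Cur_Line := Cur_Line + Line_Number (Cur_Level);

   Put_Line ("Line  =" & Line_Number'Image (Cur_Line));
   Put_Line ("Level =" & Lexical_Level'Image (Cur_Level));
   Put_Line ("Hash  =" & Hash_Code'Image (Hash ("Fruit")));
end Distinct_Ints;
```

Uncommenting either of the rejected assignments should produce a type-mismatch error from the compiler, which is the kind of mistake that would otherwise surface only during debugging.
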
The Approach Used in This Course
- Focus on phases and their abstractions
- Understand high-level structure:

  Phase                        Input             Output
  Lexing                       Source         => Lexemes
  Parsing                      Lexemes (leaves) => AST
  Semantics                    AST            => AAST/SymTab
  IR Generation                AAST/SymTab    => IR
  Optional Flow Optimization   IR             => (better) IR
  Instruction Selection        IR             => Pseudo Asm
  Register Allocation          Pseudo Asm     => Real Asm

Package / Subsystem Structure
- Package/Subsystem for each Phase
- Package/Subsystem for each Abstraction
[Diagram: the phases Lexer, Parser, Sem, IR Gen, Flow, Inst Sel, RegAlc, and Interp, connected through the abstractions Source, Lexemes, StringTab, ASTs, SymTab, IR, PAsm, Asm, and Output, with regions labeled Language-Specific and Machine-Specific]

Lexer and Parser Phases
Lexer:
- Abstractions
  - Source File (Abstract Stream of Characters)
  - Source position (File, Line, Column)
  - Lexeme/Token (Tagged Type Hierarchy)
  - String/Identifier/Reserved Word (Hash) Table
  - Error/Warning Message Generation
- Processing
  - Token Building (Finite State Automaton)
Parser:
- Abstractions
  - Abstract Syntax Tree (AST) -- Building Routines
- Processing
  - LALR Parsing and AST Building
Example statement: Fruit := Apple + Pear;

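Token building with a finite state automaton can be sketched as follows for the example statement. This is a simplified illustration under stated assumptions: the token kinds, state names, and the use of an enumeration in place of the tagged token hierarchy are all hypothetical, and identifiers are limited to letters and underscores.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Lex_Demo is
   --  Hypothetical token kinds and automaton states.
   type Token_Kind is (Identifier, Assign, Plus, Semicolon, Bad);
   type State is (Start, In_Ident, Seen_Colon);

   Source : constant String := "Fruit := Apple + Pear;";

   procedure Emit (Kind : Token_Kind; Text : String) is
   begin
      Put_Line (Token_Kind'Image (Kind) & " """ & Text & """");
   end Emit;

   function Is_Letter (C : Character) return Boolean is
   begin
      return C in 'A' .. 'Z' or else C in 'a' .. 'z' or else C = '_';
   end Is_Letter;

   S     : State    := Start;
   First : Positive := Source'First;   --  where the current token began
   Pos   : Positive := Source'First;   --  next character to examine
begin
   --  One extra iteration with a blank sentinel flushes the final token.
   while Pos <= Source'Last + 1 loop
      declare
         C : Character := ' ';
      begin
         if Pos <= Source'Last then
            C := Source (Pos);
         end if;

         case S is
            when Start =>
               First := Pos;
               if Is_Letter (C) then
                  S := In_Ident;
               elsif C = ':' then
                  S := Seen_Colon;
               elsif C = '+' then
                  Emit (Plus, "+");
               elsif C = ';' then
                  Emit (Semicolon, ";");
               elsif C /= ' ' then
                  Emit (Bad, (1 => C));
               end if;
               Pos := Pos + 1;

            when In_Ident =>
               if Is_Letter (C) then
                  Pos := Pos + 1;
               else
                  Emit (Identifier, Source (First .. Pos - 1));
                  S := Start;   --  C is re-examined from the Start state
               end if;

            when Seen_Colon =>
               if C = '=' then
                  Emit (Assign, ":=");
                  Pos := Pos + 1;
               else
                  Emit (Bad, ":");   --  C is re-examined from Start
               end if;
               S := Start;
         end case;
      end;
   end loop;
end Lex_Demo;
```

For the example statement this emits three identifiers separated by an assignment and a plus token, followed by a semicolon token. A course-sized lexer would also track source positions and intern identifiers in the string table.
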
Semantics Phase
Abstractions:
- Annotations for AST (Tagged Type Hierarchy)
- Lexical Visibility Stack (LVS) of Symbol Tables
  - Tables Hashed on String ID
  - Tables Stored as Annotations on Program Unit
  - Entries refer to Annotations on Individual Declarations
Processing:
- Walk AST
- Build LVS, Symbol Tables, and Annotations
- Look Up All Identifier References
- Implemented As Dispatching Operations of AST Node Type

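A lexical visibility stack can be sketched as a stack of per-scope tables searched from the innermost scope outward. The sketch below is an assumption-laden simplification: all names and size limits are hypothetical, each table is a small array searched linearly rather than a hash table, and entries hold only the identifier rather than a reference to a declaration's annotations.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure LVS_Demo is
   Max_Depth   : constant := 8;
   Max_Entries : constant := 16;

   subtype Ident_Name is String (1 .. 8);   --  blank-padded identifier
   type Ident_List is array (1 .. Max_Entries) of Ident_Name;

   type Scope is record
      Used  : Natural := 0;    --  number of identifiers declared so far
      Names : Ident_List;
   end record;

   Stack : array (1 .. Max_Depth) of Scope;
   Top   : Natural := 0;       --  current lexical level; 0 means empty

   --  Pad S with blanks to the fixed width (S must fit in 8 characters).
   function Pad (S : String) return Ident_Name is
      Result : Ident_Name := (others => ' ');
   begin
      Result (1 .. S'Length) := S;
      return Result;
   end Pad;

   procedure Enter_Scope is
   begin
      Top := Top + 1;
      Stack (Top).Used := 0;
   end Enter_Scope;

   procedure Leave_Scope is
   begin
      Top := Top - 1;
   end Leave_Scope;

   procedure Declare_Name (S : String) is
      Sc : Scope renames Stack (Top);
   begin
      Sc.Used := Sc.Used + 1;
      Sc.Names (Sc.Used) := Pad (S);
   end Declare_Name;

   --  Level of the innermost visible declaration of S, or 0 if none.
   function Lookup (S : String) return Natural is
      Key : constant Ident_Name := Pad (S);
   begin
      for L in reverse 1 .. Top loop
         for I in 1 .. Stack (L).Used loop
            if Stack (L).Names (I) = Key then
               return L;
            end if;
         end loop;
      end loop;
      return 0;
   end Lookup;
begin
   Enter_Scope;                  --  level 1: outer program unit
   Declare_Name ("Fruit");
   Declare_Name ("Apple");
   Enter_Scope;                  --  level 2: nested program unit
   Declare_Name ("Apple");       --  hides the outer Apple
   Put_Line ("Apple at level" & Natural'Image (Lookup ("Apple")));
   Put_Line ("Fruit at level" & Natural'Image (Lookup ("Fruit")));
   Leave_Scope;
   Put_Line ("Apple at level" & Natural'Image (Lookup ("Apple")));
   Put_Line ("Pear  at level" & Natural'Image (Lookup ("Pear")));
end LVS_Demo;
```

Inside the nested scope, Apple resolves to level 2 and Fruit to level 1; after leaving it, Apple resolves to level 1 again, and the undeclared Pear yields 0.
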
Interpreter
Abstractions:
- Run-Time Display (Analogous to LVS)
- Run-Time Value (Tagged Type Hierarchy)
Processing:
- Walk AST
- Build/Use Display of Values
- Follow into Subprogram Bodies
  - Analogous to Inlining at Compile Time
- Invoke Builtin RTS Subprograms (e.g. Put_Line)
- Implemented As Dispatching Ops on AST Node Type

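Implementing a phase as a dispatching operation on the AST node type might look like the following minimal sketch. The node types and the Eval operation are hypothetical, and plain Integer results stand in for the run-time value hierarchy; the display and subprogram calls are left out.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Dispatch_Demo is
   package AST is
      --  Root of a hypothetical AST node hierarchy.
      type Node is abstract tagged null record;
      function Eval (N : Node) return Integer is abstract;

      type Node_Ptr is access all Node'Class;

      type Literal is new Node with record
         Value : Integer;
      end record;
      function Eval (N : Literal) return Integer;

      type Binary_Op is new Node with record
         Op          : Character;
         Left, Right : Node_Ptr;
      end record;
      function Eval (N : Binary_Op) return Integer;
   end AST;

   package body AST is
      function Eval (N : Literal) return Integer is
      begin
         return N.Value;
      end Eval;

      function Eval (N : Binary_Op) return Integer is
         --  The operands are class-wide, so these calls dispatch on the
         --  tag of each child node.
         L : constant Integer := Eval (N.Left.all);
         R : constant Integer := Eval (N.Right.all);
      begin
         case N.Op is
            when '+'    => return L + R;
            when '*'    => return L * R;
            when others => raise Program_Error;
         end case;
      end Eval;
   end AST;

   use AST;

   --  Tree for the expression 2 + 3 * 4.
   Tree : constant Node_Ptr :=
     new Binary_Op'
       (Op    => '+',
        Left  => new Literal'(Value => 2),
        Right => new Binary_Op'
                   (Op    => '*',
                    Left  => new Literal'(Value => 3),
                    Right => new Literal'(Value => 4)));
begin
   Put_Line ("Result =" & Integer'Image (Eval (Tree.all)));
end Dispatch_Demo;
```

Each new node type adds its own Eval, and no case statement over node kinds is needed in the walker, which is the simplicity the dispatching approach offers.
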
Intermediate Representation (IR) Generator
Abstractions:
- Low-Level LVS and IR Symbol Table
  - Analogous to Run-Time Display
- IR (Tagged Type Hierarchy) -- Building Routines
  - IR Stream -- For Declarations, Statements, and Side-Effects
  - IR Trees -- For Pure Expression Evaluation
Processing:
- Walk Annotated AST (AAST)
- Build/Use Low-Level LVS and SymTab
- Generate IR
- Implemented as Dispatching Ops on AST Node Type

Instruction Selection
Abstractions:
- Temp (Virtual Register) Table
- Pseudo Assembly Instructions (Tagged Type Hierarchy) and Stream Thereof
- Database of IR (Tree) Patterns and Corresponding Pseudo Assembly Sequences
Processing:
- Walk IR
- Match IR Trees Against Database of Patterns
  - Maximal Munch (Top-Down) or Dynamic Programming (Bottom-Up)
  - Maximal Munch Can Be Implemented As Dispatching Ops of IR Tree Node Type
- Generate Instructions and Create/Use Temps (Virtual Registers)

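Maximal munch as a dispatching operation of the IR tree node type can be sketched as below. Everything here is an illustrative assumption: the node types, the three pseudo instructions (LI, ADDI, ADD), and the hard-coded tile choice that stands in for a pattern database. At each node the largest matching tile is tried first, and the remaining subtrees are munched recursively.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Munch_Demo is
   type Temp_Id is range 0 .. 999;      --  virtual register numbers
   Next_Temp : Temp_Id := 0;

   function New_Temp return Temp_Id is
   begin
      Next_Temp := Next_Temp + 1;
      return Next_Temp;
   end New_Temp;

   function Img (T : Temp_Id) return String is
      S : constant String := Temp_Id'Image (T);
   begin
      return "t" & S (S'First + 1 .. S'Last);   --  drop the leading blank
   end Img;

   package IR is
      type Node is abstract tagged null record;
      --  Emits code for the tree and returns the temp holding its value.
      function Munch (N : Node) return Temp_Id is abstract;

      type Node_Ptr is access all Node'Class;

      type Const is new Node with record
         Value : Integer;
      end record;
      function Munch (N : Const) return Temp_Id;

      type Add is new Node with record
         Left, Right : Node_Ptr;
      end record;
      function Munch (N : Add) return Temp_Id;
   end IR;

   package body IR is
      function Munch (N : Const) return Temp_Id is
         T : constant Temp_Id := New_Temp;
      begin
         Put_Line ("LI   " & Img (T) & "," & Integer'Image (N.Value));
         return T;
      end Munch;

      function Munch (N : Add) return Temp_Id is
         L, R, T : Temp_Id;
      begin
         if N.Right.all in Const'Class then
            --  Largest tile: Add (e, Const) becomes one add-immediate.
            L := Munch (N.Left.all);
            T := New_Temp;
            Put_Line ("ADDI " & Img (T) & ", " & Img (L) & ","
                      & Integer'Image (Const (N.Right.all).Value));
         else
            --  Fallback tile: munch both operands, then a plain add.
            L := Munch (N.Left.all);
            R := Munch (N.Right.all);
            T := New_Temp;
            Put_Line ("ADD  " & Img (T) & ", " & Img (L) & ", " & Img (R));
         end if;
         return T;
      end Munch;
   end IR;

   use IR;

   --  IR tree for (1 + 2) + 3.
   Tree : constant Node_Ptr :=
     new Add'(Left  => new Add'(Left  => new Const'(Value => 1),
                                Right => new Const'(Value => 2)),
              Right => new Const'(Value => 3));
   Result : Temp_Id;
begin
   Result := Munch (Tree.all);
   Put_Line ("Result in " & Img (Result));
end Munch_Demo;
```

With the add-immediate tile, the tree is covered by three instructions (one LI and two ADDI); using only the fallback tile would take five.
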
Register Allocation
Abstractions:
- Basic Blocks (Flow Graph)
- Live Sets (Temps alive at entry/exit of Basic Blocks)
- Register Map: Temp => Physical Register or Spill Location
Processing:
- Compute Live Sets at entry/exit of Basic Blocks
  - Iterate until they stabilize
  - Instance of more general iterative flow graph algorithms:
    - Start with ideal case (e.g. nothing alive)
    - Iterate away from that until stable
- Iterate until no more spills:
  - Generate Conflict/Affinity Matrix
  - Perform Register Coloring/Spilling/Coalescing
  - Produce Register Map
- Use Register Map to Produce Real Assembly Code

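The live-set iteration can be sketched with Boolean arrays as sets. The three-block flow graph, the temps, and the use/def sets below are made up for illustration. The loop starts from the ideal case (nothing alive) and repeatedly applies the standard equations, live-out = union of the successors' live-in and live-in = uses or (live-out and not defs), until nothing changes.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Liveness_Demo is
   type Temp is (T1, T2, T3);
   type Temp_Set is array (Temp) of Boolean;
   Empty : constant Temp_Set := (others => False);

   type Block is range 1 .. 3;
   type Block_Set is array (Block) of Boolean;

   --  Hypothetical flow graph: B1 -> B2, B2 -> B2 (loop), B2 -> B3
   --    B1: T1 := 0
   --    B2: T2 := T1 + 1;  T1 := T2
   --    B3: T3 := T2
   Successors : constant array (Block) of Block_Set :=
     (1 => (2 => True, others => False),
      2 => (2 | 3 => True, others => False),
      3 => (others => False));

   --  Temps read before any write in the block, and temps written.
   Uses : constant array (Block) of Temp_Set :=
     (1 => Empty,
      2 => (T1 => True, others => False),
      3 => (T2 => True, others => False));
   Defs : constant array (Block) of Temp_Set :=
     (1 => (T1 => True, others => False),
      2 => (T1 | T2 => True, others => False),
      3 => (T3 => True, others => False));

   --  Ideal starting point: nothing is alive anywhere.
   Live_In  : array (Block) of Temp_Set := (others => Empty);
   Live_Out : array (Block) of Temp_Set := (others => Empty);

   Changed         : Boolean := True;
   New_In, New_Out : Temp_Set;

   procedure Show (Label : String; S : Temp_Set) is
   begin
      Put (Label & " {");
      for T in Temp loop
         if S (T) then
            Put (" " & Temp'Image (T));
         end if;
      end loop;
      Put_Line (" }");
   end Show;
begin
   while Changed loop
      Changed := False;
      for B in reverse Block loop
         --  Live-out is the union of live-in over all successors.
         New_Out := Empty;
         for S in Block loop
            if Successors (B) (S) then
               New_Out := New_Out or Live_In (S);
            end if;
         end loop;
         New_In := Uses (B) or (New_Out and not Defs (B));

         if New_In /= Live_In (B) or else New_Out /= Live_Out (B) then
            Live_In (B)  := New_In;
            Live_Out (B) := New_Out;
            Changed := True;
         end if;
      end loop;
   end loop;

   for B in Block loop
      Show ("B" & Block'Image (B) & " live-in ", Live_In (B));
      Show ("B" & Block'Image (B) & " live-out", Live_Out (B));
   end loop;
end Liveness_Demo;
```

For this graph the sets stabilize with T1 live out of B1 and into B2, T1 and T2 live out of B2 (because of the loop and the use in B3), and T2 live into B3. The sets only ever grow and are bounded by the number of temps, so the loop terminates.
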
A Couple of Pedagogical Issues
- Use Dispatching Ops or Visitor Pattern?
- How Much and What Code To Provide?

Use Dispatching Ops or Visitor Pattern?
- Semantics, Interpreter, IRGen All Implementable As Dispatching Ops of AST Node
- Maximal Munch Instruction Selection Implementable As Dispatching Op of IR Node
- Visitor Pattern more complicated, and more work
  - Single Tree Walk Dispatching Op Takes Visitor Parameter
  - Create Visitor Type Extension for Each Phase
  - Use (Compile-Time) Overloading to select Dispatching Operation of Visitor Object to Call
  - OO Moral Equivalent of Switch/Case Statement?

Answer: Let Students Experiment and Choose
- Interesting Lesson in Tradeoffs between Simplicity, Flexibility, Maintainability
- Dispatching Operations are Simpler
- Visitor Pattern allows new phase to be added without touching AST abstraction
- But… Add a new AST node, and must track down all Phases and make sure Pre/Post-Visit operations are updated
  - Reminiscent of Switch/Case maintenance problems
  - But Hopefully many fewer of them to find

FYI: Visitor Example (short quiz next period)

    package AST is
       type AST_Node is abstract tagged …
       type Visitor_Root is abstract tagged null record;
       --  Single tree-walk operation; dispatches on the node type
       procedure Walk(Node    : access AST_Node;
                      Visitor : access Visitor_Root'Class) is abstract;
       ...

    with AST.Exprs, AST.Stmts, AST.Decls;
    package AST.Visitors is
       type AST_Visitor is abstract new AST.Visitor_Root with null record;
       --  One Pre_Visit/Post_Visit pair per kind of AST node
       procedure Pre_Visit(Visitor : access AST_Visitor;
                           Tree    : access AST.Exprs.Binary_Op);
       procedure Post_Visit(Visitor : access AST_Visitor;
                            Tree    : access AST.Exprs.Binary_Op);
       procedure Pre_Visit(Visitor : access AST_Visitor;
                           Tree    : access AST.Exprs.Unary_Op);
       procedure Post_Visit(Visitor : access AST_Visitor;
                            Tree    : access AST.Exprs.Unary_Op);
       procedure Pre_Visit(Visitor : access AST_Visitor;
                           Tree    : access AST.Stmts.Asgn_Stmt);
       procedure Post_Visit(Visitor : access AST_Visitor;
                            Tree    : access AST.Stmts.Asgn_Stmt);
       …

    with AST.Visitors, AST.Exprs, AST.Stmts, AST.Decls;
    package Interpreter is
       --  Each phase extends the visitor and overrides what it needs
       type Interp_Visitor is new AST.Visitors.AST_Visitor with …
       procedure Post_Visit(Visitor : access Interp_Visitor;
                            Tree    : access AST.Exprs.Binary_Op);
       procedure Post_Visit(Visitor : access Interp_Visitor;
                            Tree    : access AST.Exprs.Unary_Op);
       ...

How Much and What Code to Provide?
- Names and Explanations of Phases and Abstractions
- Actual Package Specs
- Package Specs and Sample Code for some of the operations
- Depends on Phase or Abstraction

How Much and What Code to Provide (contd)
- Abstractions => Provide Package Specs
- Processing => Explain algorithms
- Visitor Pattern vs. Disp. Op Experiment => Provide Sample Code as well

Conclusions
- A Compiler Course is a Treasure Trove of Learning Experiences
- Ada 95 is an excellent language for teaching a compiler course
  - Package / Subsystem Structure helps to reinforce Compiler phase/abstraction structure
  - Compile-time and run-time checks dramatically reduce debugging time
  - Readability should make the Professor Happy ;-)
- Someday real soon now...
  - There will be a simple compiler written in Ada 95 available for use in teaching