CS412/413 Introduction to Compilers and Translators, May 3, 1999. Lecture 34: Compiler-like Systems


1 CS412/413 Introduction to Compilers and Translators, May 3, 1999
Lecture 34: Compiler-like Systems
(Title diagram labels: JIT, bytecode interpreter, src-to-src translator, bytecode compiler)

2 CS 412/413 Introduction to Compilers and Translators -- Spring '99, Andrew Myers
Administration
Project code must be turned in Friday even if you haven’t finished the programming assignment. At minimum, turn in something that can compile programs.
If parts of the program are incomplete, include an explanation of the design and of the components that are working.

3 A Compiler
Phases of the “Compiler Classic” pipeline shown in the diagram: lexical analysis, parsing, HIR optimization, intermediate code generation, MIR optimization, code generation, LIR optimization, linking & loading.
Other diagram labels: front end, back end, bytecode compiler, JIT compiler, source-to-source translator, interpreter.

4 Other options
Compile from source to another high-level language (source-to-source translator) and let the other compiler deal with code generation, etc.
Compile from source to an intermediate code format for which a back end already exists (ucode, RTL, LCC, ...)
Compile from source to an intermediate code format executable by an interpreter:
–abstract syntax tree
–bytecodes
–threaded code (call- or pointer-threaded)
Advantage: architecture independence

5 Source-to-source translator
Idea: choose a well-supported high-level language (e.g. C, C++, Java) as the target.
Translate the AST to high-level language constructs instead of to IR, then pass the translated code off to the underlying compiler.
Advantage: easy, and can leverage good underlying compiler technology.
Examples: C++ (to C), PolyJ (to Java), Toba (JVM to C)
Disadvantages: the target language won’t support all features, optimization is harder in the target language, and the language may impose extra checks.

6 Compiling to C
C doesn’t impose extra checks, is reasonably close to assembly (but can’t support static exception tables!), and is widely available.
Mismatch: C doesn’t allow statements underneath expressions, so we need to translate and canonicalize in one step.
Translation of an expression into C is:
–a sequence of statements to be executed
–an expression to be evaluated afterward
E → S_1; …; S_n, E’

7 Translation rules
Translation can still be performed by a recursive traversal of the AST.
Iota → C rules:
assignment: if E → S, E’ then id = E; → S, id = E’;
block: if S_i → S_i’, E_i’ for each i, then { S_1; …; S_n } → S_1’; …; S_n’, E_n’
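
The "statements plus expression" scheme from the two slides above can be sketched as a recursive traversal. This is only an illustration under stated assumptions: the tuple-based AST encoding, the function name `translate`, the temporary names `t0`, `t1`, ..., and the use of `int` for every temporary are all hypothetical and not taken from the lecture.

```python
import itertools

def translate(e, fresh):
    """Translate an expression AST into (C statements, C expression).

    Hypothetical AST encoding (not from the lecture):
      ("num", 3), ("var", "x"), ("add", e1, e2),
      ("assign", "x", e), ("block", [e1, ..., en]) with n >= 1.
    `fresh` is an iterator of integers used to name temporaries.
    """
    kind = e[0]
    if kind == "num":
        return [], str(e[1])
    if kind == "var":
        return [], e[1]
    if kind == "add":
        s1, x1 = translate(e[1], fresh)
        s2, x2 = translate(e[2], fresh)
        if s2:
            # The statements of e2 may change what x1 reads, so save the
            # value of x1 in a temporary before they run (assumes int).
            t = f"t{next(fresh)}"
            s1 = s1 + [f"int {t} = {x1};"]
            x1 = t
        return s1 + s2, f"({x1} + {x2})"
    if kind == "assign":
        # E -> S, E'   gives   id = E -> S; id = E';  with value id
        s, x = translate(e[2], fresh)
        return s + [f"{e[1]} = {x};"], e[1]
    if kind == "block":
        # Hoist every element's statements; the value is the last E'.
        stmts, value = [], None
        for sub in e[1]:
            s, value = translate(sub, fresh)
            stmts += s
        return stmts, value
    raise ValueError(f"unknown node kind: {kind}")

# x + { x = 1; x }  must read the old x before the assignment runs.
example = ("add", ("var", "x"),
           ("block", [("assign", "x", ("num", 1)), ("var", "x")]))
stmts, expr = translate(example, itertools.count())
```

Because all side effects end up in the statement list, the returned expression is side-effect free, which is why the non-final expressions of a block can simply be dropped.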

8 Translating to Java
Same problems as C, plus: Java is type-safe (good in a HLL, not so good in an intermediate language!)
May need to use cast and instanceof expressions in generated code: slow in Java

9 Back Ends
Several standard intermediate code formats exist with back ends for various architectures, so the back ends can be reused:
–p-code: very old stack machine format
–UCODE: old Stanford/MIPS stack machine format
–Java bytecode: new stack machine format
–RTL: GNU gcc, etc.
–SUIF: Stanford format for optimization
–LCC: Lightweight C compiler

10 Intermediate code formats
Quadruples
–compact, similar to machine code, good for standard optimization techniques
Stack-machine
–easy to generate code for
–not compact (contrary to popular opinion)
–can be converted back into quadruples
–used by some (sort of) high-level languages: FORTH, PostScript, HP calculators

11 Stack machine format
Code is a sequence of stack operations (not necessarily on the same stack as the call stack)
push CONST : add CONST to the top of stack
pop : discard top of stack
store : store the element just below the top of stack into the memory location specified by the top of stack
load : replace top of stack with the contents of the memory location it points to
+, *, /, -, … : replace top two elements with the result of the operation

12 Generating code
Stack operations mostly don’t name operands (they are implicit), so they can be encoded in 1 byte
An expression is translated to code that leaves its value on the top of stack
Translation of E_1 + E_2 : ⟦E_1 + E_2⟧ → ⟦E_1⟧ ; ⟦E_2⟧ ; +
Translation of id = E : ⟦id = E⟧ → ⟦E⟧ ; push addr(id) ; store
Good for very simple code generation
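
The two translation rules above map directly onto a recursive code generator. The following is a minimal sketch, assuming a hypothetical tuple encoding of the AST and textual instructions; the choice to read a variable as `push addr(id)` followed by `load` is an assumption based on the `load` operation of the previous slide.

```python
def gen(e):
    """Return stack-machine code (a list of strings) that leaves the
    value of expression `e` on top of the stack.

    Hypothetical AST encoding (not from the lecture):
      ("const", 2), ("var", "a"), ("binop", "+", e1, e2), ("assign", "x", e)
    """
    kind = e[0]
    if kind == "const":
        return [f"push {e[1]}"]
    if kind == "var":
        # Assumed convention: push the address, then load through it.
        return [f"push addr({e[1]})", "load"]
    if kind == "binop":
        # [[E1 op E2]] -> [[E1]] ; [[E2]] ; op
        return gen(e[2]) + gen(e[3]) + [e[1]]
    if kind == "assign":
        # [[id = E]] -> [[E]] ; push addr(id) ; store
        return gen(e[2]) + [f"push addr({e[1]})", "store"]
    raise ValueError(f"unknown node kind: {kind}")

# x = a + b * 2
code = gen(("assign", "x",
            ("binop", "+", ("var", "a"),
             ("binop", "*", ("var", "b"), ("const", 2)))))
```

No register allocation or operand naming is needed: every case just concatenates the code of its children, which is what makes stack code so easy to generate.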

13 Verbosity
Values get “trapped” down low on the stack, especially with subexpression elimination
Need an instruction that re-pushes the element at a known stack index onto the top; this instruction is used often
Might as well have abstract register operands!
Result: less compact than a register-based format, and extra copies of data too

14 Stack machine → register machine
At each point in the code, keep track of the stack depth (if possible)
Assign temporaries according to depth
Replace stack operations with quadruples using these temporaries
Example: a + b*c (the number after ; is the stack index of the top element after the instruction)
push a ; 0    t0 = a
push b ; 1    t1 = b
push c ; 2    t2 = c
* ; 1         t1 = t1 * t2
+ ; 0         t0 = t0 + t1
Stack contents (slots 0, 1, 2): a, b, c after the pushes; a, b*c after *; a+b*c after +
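
The depth-tracking conversion on this slide can be sketched in a few lines. This is an illustration only: it assumes straight-line code, textual instructions of the form `push NAME` or a single arithmetic operator, and temporaries named `t0`, `t1`, ... as on the slide; the function name `to_quads` is hypothetical.

```python
def to_quads(code):
    """Convert straight-line stack code into quadruples.

    The temporary used for a stack slot is determined by its depth:
    slot 0 is t0, slot 1 is t1, and so on.
    """
    depth = 0          # number of elements currently on the simulated stack
    quads = []
    for instr in code:
        op, *args = instr.split()
        if op == "push":
            quads.append(f"t{depth} = {args[0]}")
            depth += 1
        elif op in {"+", "-", "*", "/"}:
            if depth < 2:
                raise ValueError("stack underflow")
            depth -= 1  # two operands are replaced by one result
            quads.append(f"t{depth - 1} = t{depth - 1} {op} t{depth}")
        else:
            raise ValueError(f"unsupported instruction: {instr}")
    return quads

quads = to_quads(["push a", "push b", "push c", "*", "+"])
```

Handling branches would additionally require that the stack depth agree on every path into a join point, which is the "if possible" caveat above.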

15 JIT compilers
Particularly widely available back end with a well-defined intermediate code (JVM bytecode)
Generate code by reconstructing registers as discussed
Compilation is done on the fly: generating code quickly is essential ⇒ generated code quality is usually low
HotSpot: new Sun JIT. High-quality optimization (esp. inlining and specialization), but used sparingly

16 Interpreters
“Why generate machine code at all? Just run it. Processors are really fast”
Options (with approximate slowdown):
–token interpreters (parsing on the fly): really slow (>1000x)
–AST interpreters: 300x
–threaded interpreters: 20-50x
–bytecode interpreters: 10-30x

17 AST interpreters
“Yet another recursive traversal”
For every node type in the AST, add a method Object evaluate(RunTimeContext r)
The evaluate method is implemented recursively:
Object PlusNode.evaluate(r) { return left.evaluate(r).plus(right.evaluate(r)); }
Variables, etc. are looked up in r; some help from the AST yields big speed-ups (e.g. precomputed variable locations)
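
As a runnable counterpart to the `evaluate` pseudocode above, here is a small sketch of an AST interpreter. The class names, the dictionary used as run-time context, and the `Slot` node illustrating precomputed variable locations are all assumptions made for this example.

```python
class Num:
    def __init__(self, value):
        self.value = value

    def evaluate(self, r):
        return self.value


class Var:
    """Variable looked up by name in the run-time context on every use."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, r):
        return r["globals"][self.name]


class Slot:
    """Variable whose location was precomputed as an index (faster lookup)."""
    def __init__(self, index):
        self.index = index

    def evaluate(self, r):
        return r["locals"][self.index]


class Plus:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self, r):
        # Recursive traversal: evaluate both children, then combine.
        return self.left.evaluate(r) + self.right.evaluate(r)


# Hypothetical context: named globals plus an indexed array of locals.
context = {"globals": {"x": 2}, "locals": [10, 20]}
tree = Plus(Var("x"), Plus(Num(3), Slot(1)))
result = tree.evaluate(context)
```

`Slot` replaces a name lookup with a list index that a front end could compute once, which is the kind of "help from the AST" the slide refers to.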

18 Threaded interpreters
Code is represented as a bunch of blocks of memory connected by pointers.
Blocks are either code blocks or pointer blocks.
Execution is (mostly) a depth-first traversal
Each block starts with a single jump instruction
(Diagram labels: JMP next; JMP, machine code, RET)

19 Threaded interpreters
Can call blocks directly regardless of kind
Conditional branches and other control structures are embedded in special machine code blocks that munge the stack
Interpreter proper is ~10 instructions of code: very space-efficient!
Long favored by FORTH fans: whole language runtime fits in 4K memory
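
The pointer-block/code-block structure can be modeled at a high level, although Python cannot show the actual jump instructions. In this sketch, which is an analogy rather than real threaded code, a code block is a Python function operating on a data stack, a pointer block is a list of references to other blocks, and Python's own call stack stands in for the interpreter's return stack. All names are hypothetical.

```python
def run(block, stack):
    """Execute a block by depth-first traversal."""
    if callable(block):
        block(stack)            # code block: stands in for machine code
    else:
        for sub in block:       # pointer block: visit each pointee in order
            run(sub, stack)

# Code blocks (primitives).
def lit(n):
    return lambda stack: stack.append(n)

def dup(stack):
    stack.append(stack[-1])

def add(stack):
    b = stack.pop()
    stack.append(stack.pop() + b)

def mul(stack):
    b = stack.pop()
    stack.append(stack.pop() * b)

# Pointer blocks are built purely by referring to other blocks.
square = [dup, mul]
program = [lit(3), square, lit(4), square, add]   # 3*3 + 4*4

stack = []
run(program, stack)
```

The interpreter proper is just the `run` function; everything else is either a primitive or a list of references, which mirrors why real threaded interpreters are so small.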

20 Call-threaded interpreters
Instead of pointers in pointer blocks, store CALL instructions
Depth-first traversal is done by the processor automatically
Faster, but requires (trivial) code generation
(Diagram labels: CALL, machine code, RET)

21 Bytecode interpreters
Problem with threaded interpreters: operations with immediate arguments require multiple words (code bloat)
Observation: most interpreter instructions can be encoded in less than a word, many even in just a byte (esp. with a stack-machine model)
Tradeoff: interpreter size vs. executable code size

22 Implementing a bytecode interpreter
A bytecode interpreter simulates a simple architecture (either a stack or a register machine)
Interpreter state:
–current code pointer
–current simulated return stack
–current registers or stack & stack pointer
Interpreter code is a big loop containing a switch over kinds of bytecode instructions
–one big function: optimization techniques work well. Ideally: no recursion on function calls
Result: about 10x slowdown if done right
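
The loop-plus-switch structure described above might look like the following. This is a minimal sketch for a stack machine: the opcode numbering, the one-byte immediates, and the instruction set itself are invented for illustration and do not correspond to any real bytecode format.

```python
# Hypothetical one-byte opcodes.
PUSH, ADD, MUL, DUP, CALL, RET, HALT = range(7)

def run(code):
    """Interpret `code` (a bytes object) and return the final data stack."""
    pc = 0          # current code pointer
    stack = []      # simulated operand stack
    rstack = []     # simulated return stack: no host recursion on CALL
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:              # one-byte immediate operand follows
            stack.append(code[pc])
            pc += 1
        elif op == ADD:
            b = stack.pop()
            stack.append(stack.pop() + b)
        elif op == MUL:
            b = stack.pop()
            stack.append(stack.pop() * b)
        elif op == DUP:
            stack.append(stack[-1])
        elif op == CALL:            # one-byte target address follows
            rstack.append(pc + 1)   # return to the byte after the operand
            pc = code[pc]
        elif op == RET:
            pc = rstack.pop()
        elif op == HALT:
            return stack
        else:
            raise ValueError(f"bad opcode {op} at {pc - 1}")

# Main code at offset 0; a "square" routine (DUP, MUL, RET) at offset 10.
program = bytes([PUSH, 3, CALL, 10,
                 PUSH, 4, CALL, 10,
                 ADD, HALT,
                 DUP, MUL, RET])
result = run(program)   # computes 3*3 + 4*4
```

Calls in the interpreted program only push onto `rstack`, so the whole interpreter stays inside one function, matching the "no recursion on function calls" advice.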

23 Summary
Building a new system for executing code doesn’t require construction of a full compiler
Cost-effective strategies: source-to-source translation or translation to an existing intermediate code format
Most material covered in this course is still helpful for these approaches
High performance: translate to C
Portability, extensibility: translate to Java or JVM

