CS 598 Scripting Languages Design and Implementation 12. Interpreter implementation.


1 CS 598 Scripting Languages Design and Implementation 12. Interpreter implementation

2 Definition
From Wikipedia: an interpreter is a computer program that directly executes, i.e. performs, instructions written in a programming or scripting language, without previously compiling them into a machine language program. An interpreter generally uses one of the following strategies for program execution:
1. Parse the source code and perform its behavior directly.
2. Translate source code into some efficient intermediate representation and immediately execute this.
3. Explicitly execute stored precompiled code made by a compiler which is part of the interpreter system.
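As a concrete illustration of strategy 1, here is a minimal sketch (my own, not from the slides) of an interpreter that walks an already-parsed expression tree and performs its behavior directly; the Node type and eval() are illustrative inventions:

  /* Direct execution of a parsed representation: no intermediate code. */
  #include <stdio.h>

  typedef struct Node {
      char op;                    /* '+' or '*' for operators, 0 for a literal */
      int value;                  /* payload when op == 0 */
      struct Node *left, *right;
  } Node;

  int eval(const Node *n) {
      if (n->op == 0)
          return n->value;        /* a literal evaluates to itself */
      int l = eval(n->left), r = eval(n->right);
      return n->op == '+' ? l + r : l * r;
  }

  int main(void) {
      Node two = {0, 2, NULL, NULL}, three = {0, 3, NULL, NULL};
      Node sum = {'+', 0, &two, &three};   /* the tree for "2 + 3" */
      printf("%d\n", eval(&sum));          /* prints 5 */
      return 0;
  }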

3 References
[DeCa90] Eddy H. Debaere and Jan M. Van Campenhout. Interpretation and Instruction Path Coprocessing. MIT Press, 1990.
[SmNa05] Jim Smith and Ravi Nair. Virtual Machines: Versatile Platforms for Systems and Processes (The Morgan Kaufmann Series in Computer Architecture and Design). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005.
Also the website on threaded code maintained by Anton Ertl: http://www.complang.tuwien.ac.at/forth/threaded-code.html

4 Compilers and interpreters (figure from [DeCa90])

5 Native code vs. the intermediate code of interpreters
There are two main approaches to implementing a dynamic language:
- Translate to native code and have the machine execute it directly.
- Translate to intermediate code and have an interpreter carry out the computation.
The two approaches can be compared along multiple dimensions, as discussed in [DeCa90].

6 Native code vs. the intermediate code of interpreters (cont.)
Execution speed: native code is faster.
Portability: intermediate code can enable portability.
Debugging: the semantic gap between native code and the HLL is greater than the gap between interpreter code (bytecode) and the HLL; the smaller gap makes debugging easier under interpretation.
Representation size: intermediate code is smaller.
Extensibility: with native code, changing the semantics of the HLL always requires changing the compiler. With an interpreter, the change can in some cases be implemented by changing the semantics of the intermediate code, which only requires changing the interpreter.
Interactivity: interpreters are better at supporting interactivity.

7 Overhead of interpretation: the interpretive loop
The execution mechanism of most interpreters does not differ from the execution mechanism of von Neumann machines. The method consists of repeatedly executing the following steps (a minimal sketch follows the list):
1. Locating the next intermediate instruction to be executed, retrieving it, and analyzing it. This involves:
   - reading and updating an (intermediate) program counter;
   - reading the contents of the intermediate instruction memory addressed by this program counter;
   - sectioning the intermediate instruction into different fields (format, opcode and operands);
   - storing those fields into a set of fixed locations, which act as an interface to the second step;
   - transferring control to the routine that corresponds to the opcode of the instruction just decoded (the second step).
2. Executing the semantics that correspond to the analyzed intermediate instruction.
3. Transferring control back to step 1 to fetch and decode the next instruction.
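The classic realization of this loop is a decode-and-dispatch ("switch") interpreter. Below is a minimal runnable sketch in C (my own, not from the slides); the PUSH_A/PUSH_B/ADD/HALT opcodes and the stack layout are illustrative inventions:

  #include <stdio.h>

  enum { PUSH_A, PUSH_B, ADD, HALT };

  int main(void) {
      int A = 2, B = 3;
      unsigned char code[] = { PUSH_A, PUSH_B, ADD, HALT };
      unsigned char *vpc = code;           /* the intermediate program counter (step 1) */
      int stack[16], *sp = stack;          /* sp points one past the top of stack */

      for (;;) {
          switch (*vpc++) {                /* fetch, decode, and dispatch (step 1) */
          case PUSH_A: *sp++ = A; break;   /* execute the semantics (step 2) */
          case PUSH_B: *sp++ = B; break;
          case ADD: { int addend = *--sp; sp[-1] += addend; break; }
          case HALT: printf("%d\n", sp[-1]); return 0;   /* prints 5 */
          }
          /* falling out of the switch transfers control back to step 1 (step 3) */
      }
  }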

8 Overhead of interpretation: unboxing, data allocation
Discussion in terms of the R interpreter. From Haichuan Wang, Peng Wu, and David A. Padua. Optimizing R VM: Allocation Removal and Path Length Reduction via Interpreter-level Specialization. CGO 2014: 295.

9 Overhead of interpretation: an example (figure)

10 Overhead of interpretation: SEXPREC objects (figure)
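The figure is not reproduced here. For orientation: every R value is a heap-allocated SEXPREC node that carries a type tag, garbage-collector bookkeeping, and an attribute pointer before any payload. The sketch below is a heavily simplified illustration of that layout, not R's actual definition (see Rinternals.h in the R sources for the real struct):

  /* Illustrative, simplified SEXPREC-like boxed value. */
  typedef struct sexprec {
      unsigned int type : 5;                 /* type tag (REALSXP, INTSXP, ...) */
      unsigned int mark : 1;                 /* GC mark bit */
      struct sexprec *attrib;                /* attributes (names, dim, class, ...) */
      struct sexprec *gc_next, *gc_prev;     /* GC node chain */
      union {
          struct { int length; double data[1]; } realvec;    /* numeric vector */
          struct { struct sexprec *car, *cdr, *tag; } list;  /* pairlist node */
      } u;
  } SEXPREC;

Even a scalar like 3.14 is represented as a length-1 vector inside such a node, so the header overhead and the allocation are paid on every boxed result.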

11 Overhead of interpretation: unboxing (figure)
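To illustrate what unboxing means here (my own sketch; Box and box_real() are inventions for this example, not R's API): with boxed values, even scalar addition must extract payloads from heap objects and allocate a fresh box for the result, while a handler specialized to scalar doubles keeps operands in registers and allocates nothing:

  #include <stdio.h>
  #include <stdlib.h>

  typedef struct { int type; double real; } Box;

  Box *box_real(double d) {        /* every boxed result costs an allocation */
      Box *b = malloc(sizeof *b);
      b->type = 1;
      b->real = d;
      return b;
  }

  /* Boxed: payload extraction plus a heap allocation per operation. */
  Box *add_boxed(Box *x, Box *y) { return box_real(x->real + y->real); }

  /* Unboxed/specialized: operands stay in registers, no allocation. */
  double add_unboxed(double x, double y) { return x + y; }

  int main(void) {
      Box *r = add_boxed(box_real(2.0), box_real(3.0));   /* leaks; a sketch only */
      printf("%g %g\n", r->real, add_unboxed(2.0, 3.0));  /* prints 5 5 */
      return 0;
  }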

12 Results obtained (figure)

13 Language-oriented architectures (figure from [DeCa90])

14 Language-oriented architectures
From Wikipedia: there is a wide variety of systems under [the heading of high-level language computer architecture]. The most extreme example is a Directly Executed Language (DEL), where the instruction set architecture of the computer equals the instructions of the HLL, and the source code is directly executable with minimal processing. In extreme cases the only compilation required is tokenizing the source code and feeding the tokens directly to the processor; this is found in stack-oriented programming languages running on a stack machine. For more conventional languages the HLL statements are grouped into instruction + arguments, and infix order is transformed to prefix or postfix order. DELs are typically only hypothetical, though they were advocated in the 1970s.
Disadvantages:
- Complex hardware
- Language-dedicated hardware

15 Language-oriented architectures
Language-directed architectures possess constructs that are borrowed from one or more HLLs. Examples:
- Access protection
- Complex addressing modes
- Loop instructions
- Decrement-and-branch instructions
- Multiple-operand instructions
- Array bound checking
Compilers must be able to exploit these constructs.

16 Language-oriented architectures
Language-corresponding architectures raise the level of the machine language to that of the HLL. An example would be a machine that executes Smalltalk bytecodes directly.
- Easy debugging.
- Small representation size.
- Compilers with low complexity.
- Complex hardware.

17 Efficiency of the interpreter: a decode-and-dispatch interpreter (figure from [SmNa05], annotating the indirect branch used for dispatch and the branch back to the dispatch loop)

18 Efficiency of the interpreter: another decode-and-dispatch interpreter (inlining subroutines)
From the Wikipedia entry on threaded code:

  bytecode:
    0 /*pushA*/
    1 /*pushB*/
    2 /*add*/

  top:
    i = decode(vpc++)
    addr = table[i]
    jump *addr

  pushA:
    *sp++ = A                      /* sp points one past the top of stack */
    jump top
  pushB:
    *sp++ = B
    jump top
  add:
    addend = *--sp                 /* pop the top operand */
    *(sp-1) = *(sp-1) + addend     /* add into the new top of stack */
    jump top

19 Indirect threaded interpretation (figure from [SmNa05]): the jump back to top is no longer needed.

20 Indirect threaded interpretation
From the Wikipedia entry on threaded code. First, the decode-and-dispatch sequence is replicated at the end of each handler, eliminating the jump back to top:

  bytecode:
    0 /*pushA*/
    1 /*pushB*/
    2 /*add*/

  top:
    i = decode(vpc++)
    addr = table[i]
    jump *addr

  pushA:
    *sp++ = A
    i = decode(vpc++)
    addr = table[i]
    jump *addr
  pushB:
    *sp++ = B
    i = decode(vpc++)
    addr = table[i]
    jump *addr
  add:
    addend = *--sp
    *(sp-1) = *(sp-1) + addend
    i = decode(vpc++)
    addr = table[i]
    jump *addr

Second, indirect threading proper: the thread holds pointers to words, and each word begins with the address of the code that implements it:

  thread:
    &i_pushA
    &i_pushB
    &i_add

  i_pushA: &push &A
  i_pushB: &push &B
  i_add:   &add

  push:
    *sp++ = *(*ip + 1)    /* fetch the operand stored after the code address */
    jump *(*ip++)
  add:
    addend = *--sp
    *(sp-1) = *(sp-1) + addend
    jump *(*ip++)

21 (figure from [DeCa90])

22 Using addresses instead of opcodes: direct threading (figure from [SmNa05])

23 Using addresses instead of opcodes: direct threading (figure from [SmNa05])

24 Using addresses instead of opcodes: direct threading
From the Wikipedia entry on threaded code. With a central dispatch loop:

  start:
    ip = &thread
  top:
    jump *ip++

  thread:
    &pushA
    &pushB
    &add
    ...

  pushA:
    *sp++ = A
    jump top
  pushB:
    *sp++ = B
    jump top
  add:
    addend = *--sp
    *(sp-1) = *(sp-1) + addend
    jump top

With the dispatch replicated at the end of each routine, the jump back to top disappears:

  thread:
    &pushA
    &pushB
    &add
    ...

  pushA:
    *sp++ = A
    jump *ip++
  pushB:
    *sp++ = B
    jump *ip++
  add:
    addend = *--sp
    *(sp-1) = *(sp-1) + addend
    jump *ip++
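Direct threading maps naturally onto the GCC/Clang labels-as-values extension. Below is a minimal runnable sketch (my own, not from the slides) of the replicated-dispatch variant; the pushA/pushB/add/halt labels and the stack layout are illustrative:

  #include <stdio.h>

  int main(void) {
      int A = 2, B = 3;
      int stack[16], *sp = stack;

      /* The thread: one code address per virtual instruction. */
      void *thread[] = { &&pushA, &&pushB, &&add, &&halt };
      void **ip = thread;

      goto **ip++;                    /* dispatch to the first instruction */

  pushA:
      *sp++ = A;
      goto **ip++;                    /* dispatch replicated after each handler */
  pushB:
      *sp++ = B;
      goto **ip++;
  add:
      { int addend = *--sp; sp[-1] += addend; }
      goto **ip++;
  halt:
      printf("%d\n", sp[-1]);         /* prints 5 */
      return 0;
  }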

25 (figure from [DeCa90])

26 Decode-and-dispatch vs. threaded code (figure from [SmNa05])

27 Call threading
From http://www.complang.tuwien.ac.at/forth/threaded-code.html:

  typedef void (* Inst)();    /* each virtual instruction is a C function */
  Inst *ip;                   /* the instruction pointer walks an array of function pointers */

  void inst1() { ... }

  void engine() {
    for (;;)
      (*ip++)();              /* call the routine for the next instruction */
  }
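A complete runnable version of this engine (my own sketch; the pushA/pushB/add/halt instructions and the running flag used to stop the loop are inventions for the example):

  #include <stdio.h>

  typedef void (*Inst)(void);

  static Inst *ip;                       /* virtual instruction pointer */
  static int stack[16], *sp = stack;
  static int A = 2, B = 3, running = 1;

  static void pushA(void) { *sp++ = A; }
  static void pushB(void) { *sp++ = B; }
  static void add(void)   { int addend = *--sp; sp[-1] += addend; }
  static void halt(void)  { running = 0; }

  static void engine(void) {
      while (running)
          (*ip++)();                     /* call the next instruction's routine */
  }

  int main(void) {
      Inst program[] = { pushA, pushB, add, halt };
      ip = program;
      engine();
      printf("%d\n", sp[-1]);            /* prints 5 */
      return 0;
  }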

28 Subroutine threaded code
From the Wikipedia entry on threaded code: the thread itself is a sequence of native call instructions, so no dispatch loop is needed:

  thread:
    call pushA
    call pushB
    call add
    ret

  pushA:
    *sp++ = A
    ret
  pushB:
    *sp++ = B
    ret
  add:
    addend = *--sp
    *(sp-1) = *(sp-1) + addend
    ret

29 (figure from [DeCa90])

30 Linked intermediate representation (figure from [DeCa90])

31 Performance experiments: http://www.complang.tuwien.ac.at/forth/threading/

