Presentation is loading. Please wait.

Presentation is loading. Please wait.

Introduction to Intermediate Code, virtual machine implementation

Similar presentations


Presentation on theme: "Introduction to Intermediate Code, virtual machine implementation"— Presentation transcript:

1 Introduction to Intermediate Code, virtual machine implementation
Lecture 27 Prof. Fateman CS 164 Lecture 27

2 Prof. Fateman CS 164 Lecture 27
Where do we go from here? testing AST generation Type checking Cleanup Interpreter in intermediate code/ assmblr Virtual machine out Prof. Fateman CS 164 Lecture 27

3 Prof. Fateman CS 164 Lecture 27
Details of Tiger  what the VM must support Details of the VM  what IC to generate in intermediate code/ assmblr Virtual machine out Prof. Fateman CS 164 Lecture 27

4 What is Intermediate Code? No single answer…
Any encoding further from the source and closer to the machine (in our case, no line/column numbers is one step!) Generally relatively portable May support several languages (C, C++, Pascal, Fortran) where the resources needed are similar. Languages with widely differing semantics may require different IC. E.g. Java supporting Scheme is hard. Prof. Fateman CS 164 Lecture 27

5 Forms of Intermediate Code
Virtual stack machine simplified “macro” machine 3-address code A :=B op C Register machine models with unlimited number of registers, mapped to real registers later Tree form (= lisp program, more or less) Prof. Fateman CS 164 Lecture 27

6 Using Intermediate Code
We need to make some progress towards the machine we have in mind Subject to manipulation, “optimization” Perhaps several passes Or, we can generate assembler Or, we could plop down in absolute memory locations the binary instructions needed to execute the program “compile-and-go” system Prof. Fateman CS 164 Lecture 27

7 In practice, given a tree representation like an AST
Instead of typechecking or interpreting the code, we are traversing it and putting together a list of instructions… generating code… that if run on a machine -- would execute the program. In other words instead of typechecking a loop,or going through a loop executing it, we write it out in assembler. Prof. Fateman CS 164 Lecture 27

8 Prof. Fateman CS 164 Lecture 27
The simplest example Compiling the Tiger program 3+4 The string “3+4” parses to (OpExp (1 . 2) + (IntExp (1 . 1) 3) (IntExp (1 . 3) 4) typechecker approves and says it is of type int. cleanup program produces (+ 3 4) We now want to produce code. Prof. Fateman CS 164 Lecture 27

9 Just a simple stack machine
Compiling (+ 3 4) (comp '(+ 3 4))  ((pushi 3) (pushi 4) (+)) Result is a list of assembly language instructions push immediate constant 3 on stack push immediate constant 4 on stack + = take 2 args off stack and push sum on stack Conventions: result is left on top of stack. Location of these instructions unspecified. Lengthy notes in ic file describe virtual machine. Prof. Fateman CS 164 Lecture 27

10 It doesn’t have to look like lisp if we write a printing program
(show-comp '(+ 3 4)) ;; prints out… pushi 3 pushi 4 + Prof. Fateman CS 164 Lecture 27

11 We need to have interacting programs, so lets make functions
;; We are shuffling around fns all the time in this compiler (defstruct fn name params code env) (top-comp1 '(+ 3 4)) #S(fn :name main :params nil :code ((args 0) (pushi 3) (pushi 4) (+) (exit 0)) :env (#(…))) (show-fn *) ;;prints this out args 0 pushi 3 pushi 4 + exit 0 nil ;;return value Conventions (args 0): no params. Exit is opcode to return to supervisor level of simulator. Not return from subroutine. Prof. Fateman CS 164 Lecture 27

12 How do we convert to machine code?
(setf r (top-comp1 '(+ 3 4))) #S(fn :name main :params nil :code ((args 0) (pushi 3) (pushi 4) (+) (exit 0)) :env (#(…))) Assembler does this… makes list into vector/array.. (assemble r) ;; MODIFIES :code part of r :code #((args 0) (pushi 3) (pushi 4) (+) (exit 0)) We have ignored :env. It is a pointer to an array of Global functions… print, chr, ord, flush, … Prof. Fateman CS 164 Lecture 27

13 A slightly more interesting example
(setf r (top-comp1 '(IfExp 1 "yes" "no")) ) #S(fn :name main :params nil …..) (show-fn r) ;; slightly more sophisticated show-fn; byte counts 0: args 0 4: pushi 1 8: jumpz L0003 12: pusha ="yes" 16: jump L0004 L0003: 24: pusha ="no" L0004: 32: exit 0 nil Prof. Fateman CS 164 Lecture 27

14 That example,assembled, has no labels
(assemble r) #S(fn :name main :params nil …..) (show-fn r) 0: args 0 4: pushi 1 8: jumpz 20 12: pusha ="yes" 16: jump 24 20: pusha ="no" 24: exit 0 nil Prof. Fateman CS 164 Lecture 27

15 The 2-pass assembler… (50 lines of lisp)
(in-package :tiger) (defun label-p(x)(symbolp x)) ; is x a label? (defun opcode (instr) (if (label-p instr) :label (first instr))) (defun args (instr) (if (listp instr) (rest instr))) (defun arg1 (instr) (if (listp instr) (second instr))) (defun arg2 (instr) (if (listp instr) (third instr))) (defsetf arg1 (instr) (val) `(setf (second ,instr) ,val)) ;; defsetf method makes (setf (arg1 x) foo) ;; mean the same as (setf (second x) foo) ;; we want to change arg1 in (opcode arg1 …) Prof. Fateman CS 164 Lecture 27

16 Prof. Fateman CS 164 Lecture 27
The 2-pass assembler… (defun assemble (fn) "Turn a list of instructions in a function into a vector." ;; count up the instructions and keep track of the labels ;; make a note of where the labels are (program counter ;; relative to start of function). Create a vector of instructions. ;; The transformed "assembled" program can support fast ;; jumps forward and back. (multiple-value-bind (length labels);extract 2 items from 1st pass (asm-first-pass (fn-code fn)) (setf (fn-code fn) ;; put vector back instead of list (asm-second-pass (fn-code fn) length labels)) fn)) Prof. Fateman CS 164 Lecture 27

17 The 1st pass counts up locations
(defun asm-first-pass (code) ;;return 2 values: the total code length ;;and the labels assoc list (let ((length 0) (labels nil)) (dolist (instr code) (if (label-p instr) (push (cons instr length) labels) ;;assume instructions are 4 bytes (incf length 4))) (values (/ length 4) labels))) Prof. Fateman CS 164 Lecture 27

18 The 2nd pass resolves labels, makes vector
(defun asm-second-pass (code length labels) ;;Put code into code-vector, adjusting for labels. ;;For each instruction that refers to a label, ;;put the position in for the label. (let ((addr 0) (code-vector (make-array length))) (dolist (instr code) (unless (label-p instr) (if (is instr '(jump jumpn jumpz save)) ;; a setf method that changes label to number (setf (arg1 instr) (cdr (assoc (arg1 instr) labels)))) (setf (aref code-vector addr) instr) (incf addr ))) code-vector)); returns a vector of code! Prof. Fateman CS 164 Lecture 27

19 The Tiger Virtual Machine
The easy parts Prof. Fateman CS 164 Lecture 27

20 It is a stack-based architecture simulated by a lisp program
The machine is started up running a particular function, f The program counter in that function is set to 0. In general the environment will look like a stack of arrays. This is a peculiarly lispish way of doing things, and could be done other, more efficient ways. On calling a new function a “macro” operation ARGS is used to copy arguments from the stack to the environment. This makes calling rather pricey. However, it makes return very neat. n-args is the number of args supplied to this function instr is a temp variable. Prof. Fateman CS 164 Lecture 27

21 We set up an update-program-counter/ execute loop
(defun machine (f) (let* ((code (fn-code f)) ;an array of instructions (pc 0) (env (fn-env f)) (stack nil) (n-args 0) ;main program has no arguments (instr nil)) (loop ;; pick out the instruction at the given pc (setf instr (elt code (ash pc -2))) ; address ¥ 4 (if *debug-pc* (format t "~%Executing ~s at pc ~s; code is ~s" (fn-name f) pc instr)) (incf pc 4) ;; we assume 4 bytes per instruction (case (opcode instr) ;; ALL THE OPCODES GO HERE Prof. Fateman CS 164 Lecture 27

22 Prof. Fateman CS 164 Lecture 27
Sample Opcodes (defun machine (f) ;;; … stuff left out (case (opcode instr) ;; Variable/stack manipulation instructions: (lvar ; (lvar arg1 arg2) ;; take the value in the environment at offset arg2 ;; in offset arg1, and push on the computation stack (push (elt (elt env (arg1 instr)) (arg2 instr)) stack)) (lset (setf (elt (elt env (arg1 instr)) (arg2 instr)) (top stack))) (pop (pop stack))…. Prof. Fateman CS 164 Lecture 27

23 Prof. Fateman CS 164 Lecture 27
Architecture for arithmetic is based on underlying lisp arithmetic, e.g. + ´ #’+ (defun machine (f) ;;; … stuff left out (case (opcode instr) ;; … more stuff left out ;; arithmetic: use defn from the interpreter ((+ - * / < > <= >= = =) (setf stack (cons (funcall (cadr (assoc (opcode instr) tig-init-env)) (second stack) (first stack)) (cddr stack)))) ;; Other: exit provides exit code.. ((exit nil) (return (top stack))) Prof. Fateman CS 164 Lecture 27

24 Machine + assembler in assemble.cl
250 lines of code, including comments  (defun run (exp) ;;exp is (cleanup ast) "compile and run tiger code" (machine (assemble (top-comp1 exp)))) Prof. Fateman CS 164 Lecture 27


Download ppt "Introduction to Intermediate Code, virtual machine implementation"

Similar presentations


Ads by Google