COMPILERS CLASS IV Er. Vikram Dhiman M.tech NIT jalandhar.

Introduction
Compiler: A compiler is a program that reads a program in one language (the source) and translates it into an equivalent program in another language (the target). An important role of the compiler is to report any errors in the source program that it detects during the translation process.

Compiler: Source code → COMPILER → Target code, with errors reported during translation.

Target code: The target code is usually an executable machine-language program. The user can run it to process inputs and produce outputs.
Input → Target Program → Output

[Figure: CPU block diagram with MDR, MAR, IR, PC, ID, Control Unit, Accumulator, registers R0, R1 and R2, ALU, and Memory]

Computer Architecture
Accumulator: While an arithmetic operation is in progress, an operand is held temporarily in the accumulator. The contents of a memory location are loaded through the accumulator, and the result is stored back to memory through the accumulator.

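MAR – Memory Address Register
MDR – Memory Data Register
Memory – Memory Block
Read, Write Signals: MAR holds the address of the memory location being accessed. On a read, the data from the memory block arrives in MDR; on a write, the contents of MDR are written into the memory block.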

Program Counter: The program counter contains the address of the next instruction. Initially it holds the address of the first instruction.
Instruction Memory: All instructions are stored here.
IR – Instruction Register
ID – Instruction Decoder

ALU – Arithmetic Logic Unit
– All the arithmetic operations are performed in the ALU.
– 2 terminals to the accumulator
– 1 terminal to the registers
Control Unit: The control unit coordinates the components of a computer system. It fetches the code of all of the instructions in the program.

Example: X ← [P] + [Q]
[Figure: instruction memory and data memory; P and Q are memory addresses]

Steps:
1. MAR is given the address.
2. The data goes to MDR (fetch P, Q).
3. P and Q pass through the accumulator to the registers.
4. The add operation is performed.
5. The result is passed back to the accumulator, and from the accumulator to MDR. A write is performed and the value is stored in X.

Address of P – 1000
Address of Q – 2000
Address of X – 2050
Solution:
LDA 1000
MOV R0
LDA 2000
ADD R0
STA 2050

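LDA – Loads the data at the given memory address into the accumulator
MOV – Copies the accumulator into a register; without this instruction the value in the accumulator would be overwritten by the next LDA
STA – Stores the value of the accumulator at the given memory address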
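The five-instruction solution can be traced with a small simulator. This is only a sketch: the instruction semantics are read off the slide's description, and the function name `run` and the sample values 7 and 5 stored at P and Q are assumptions made for illustration.

```python
def run(program, memory):
    """Execute a list of (opcode, argument) pairs on a one-accumulator machine."""
    acc = 0
    regs = {"R0": 0, "R1": 0, "R2": 0}
    for op, arg in program:
        if op == "LDA":      # load the accumulator from a memory address
            acc = memory[arg]
        elif op == "MOV":    # copy the accumulator into a register
            regs[arg] = acc
        elif op == "ADD":    # add a register to the accumulator
            acc = acc + regs[arg]
        elif op == "STA":    # store the accumulator at a memory address
            memory[arg] = acc
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# P is at 1000, Q at 2000, X at 2050, as on the slide.
program = [("LDA", 1000), ("MOV", "R0"), ("LDA", 2000), ("ADD", "R0"), ("STA", 2050)]
memory = {1000: 7, 2000: 5, 2050: 0}   # sample data values (assumed)
run(program, memory)
print(memory[2050])  # 12
```

Dropping the MOV R0 step would leave R0 at 0, so the second LDA would overwrite P and X would receive only Q.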

Compilers

Working of System Software

Linkers
– Insert code to resolve program library references.
– Combine object modules into an executable file.
– Are called automatically by the compiler.

Types of Linkers
Static Linker
– Links after compilation, before execution.
– Requires more disk space and memory.
– Faster and more portable.
Dynamic Linker
– Resolves external references during execution.
– Requires less disk space and memory.
– Execution time is longer.

Loaders
– Place the program into memory for execution.
– Are responsible for initiating the execution of the process.

Types of Loaders
Absolute Loader
– Instructions are placed directly at the location prescribed by the assembler.
– Disadvantage: leads to address relocation problems.
Relocating Loader
– Adjusts addresses in the executable to compensate for variations in the address at which loading starts.
– Disadvantage: memory references are bound to absolute addresses at the initial load time.

Types of Loaders (continued)
Dynamic Loader
– Loads only those files that are required at that time.
– Works like a dynamic linker.

Compilers

Analysis – Front End
– Split the source code into its constituent pieces (tokens).
– Arrange the pieces according to the grammatical rules (parse).
– Report errors.
Synthesis – Back End
– Produce intermediate code.
– Optimize the intermediate code.
– Generate target code.
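The first analysis step, splitting source text into tokens and reporting errors, might look like the sketch below. The token classes and the names `TOKEN_SPEC` and `tokenize` are assumptions chosen for illustration; a real front end recognizes a much richer set of tokens.

```python
import re

# Each token class is a (name, regular expression) pair; ERROR catches anything else.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, errors) for one line of source text."""
    tokens, errors = [], []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            errors.append(f"unexpected character {text!r} at column {match.start()}")
        else:
            tokens.append((kind, text))
    return tokens, errors


print(tokenize("x = a + 42"))
print(tokenize("x = 4 $ 2")[1])
```

The parser would then consume the token list and check it against the grammar; that step is left out here.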

Components of Compiler

Code Optimizer
Optimizes the target code in terms of:
– Size: reduce the size of the target code
– Time: reduce the execution time of the target code
– Power: generate code that consumes less power
– Space: generate code that requires less memory

Assemblers Section 2

Outline MS
Fundamental functions of an assembler
– A simple SIC assembler
– Assembler algorithm and data structures
Machine-dependent features
– Instruction formats and addressing modes (SIC/XE)
– Program relocation
Machine-independent features
– Literals
– Symbol-defining statements
– Expressions
– Program blocks
– Control sections and program linking
Design options: one-pass vs. multi-pass

MS
The structure above consists of:
1. Instruction Interpreter
2. Location Counter
3. Instruction Register
4. Working Registers
5. General Registers

The Instruction Interpreter hardware is basically a group of circuits that perform the operation specified by the instructions fetched from memory. The Location Counter, also called the Program Counter or Instruction Counter, simply points to the current instruction being executed.

The working registers are often called "scratch pads" because they are used to store temporary values while a calculation is in progress. The CPU interfaces with memory through the MAR and MBR.

MAR (Memory Address Register) – contains the address of the memory location to be read from or stored into
MBR (Memory Buffer Register) – contains a copy of the contents of the memory location specified by MAR
The memory controller transfers data between the MBR and the memory location specified by MAR.
The role of the I/O channels is to input or output information from memory.

Basic SIC Assembler Functions, Algorithm, and Data Structures

Fundamental Functions
– Mnemonic operation code → machine language
– Symbolic labels → machine addresses

Overview

Assembler Design: The two most important concerns are generating the symbol table and resolving forward references.
Symbol Table:
– It is created during pass 1.
– All the labels of the instructions are symbols.
– Each entry holds the symbol name and its address value.

Addressing
System/360 uses truncated addressing. That means that instructions do not contain complete addresses, but rather specify a base register and a positive offset from the address in that base register. In the case of System/360, the base address is contained in one of 15 general registers.
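Base-plus-offset addressing reduces to a single addition at execution time. The sketch below assumes the System/360 convention of a 12-bit displacement and an optional index register; the function name `effective_address` and the register contents are made up for illustration, and the 24-bit address wrap-around is ignored.

```python
def effective_address(registers, base, displacement, index=0):
    """Compute base register + displacement (+ index register, if any)."""
    if not 0 <= displacement <= 0xFFF:
        raise ValueError("displacement must fit in 12 bits (0 to 4095)")
    address = registers[base] + displacement
    if index != 0:           # an index field of 0 means "no indexing"
        address += registers[index]
    return address


registers = {15: 0x1000, 2: 8}        # sample register contents (assumed)
print(hex(effective_address(registers, base=15, displacement=16)))           # 0x1010
print(hex(effective_address(registers, base=15, displacement=16, index=2)))  # 0x1018
```

Because the instruction stores only the register number and the offset, moving the program requires changing nothing but the value loaded into the base register.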

Assembler functions
The basic assembler functions are:
– Translating mnemonic language code to its equivalent object code.
– Assigning machine addresses to symbolic labels.

Steps
An assembler can be designed to perform the following steps:
– Scanning (tokenizing)
– Parsing (validating the instructions)
– Creating the symbol table
– Resolving the forward references
– Converting into machine language

The design of the assembler, in other words:
– Convert mnemonic operation codes to their machine language equivalents
– Convert symbolic operands to their equivalent machine addresses
– Decide the proper instruction format
– Convert the data constants to internal machine representations
– Write the object program and the assembly listing

Forward reference:
– A reference to a symbol that is defined later in the program is called a forward reference.
– During pass 1, the symbol table holds no address value for such a symbol at the point where it is first used.

SIC Assembly Program
[Figure: fixed-format SIC assembly listing with columns for line numbers (for reference), addresses, labels, mnemonic opcodes, operands, and comments]

Syntax of Assembly Language
When writing a program in assembly language, it is necessary to observe specific rules so that the process of compiling it into executable "HEX code" runs without errors. These compulsory rules are called syntax, and there are only a few of them:
– Every program line may consist of a maximum of 255 characters.
– Every program line to be compiled must start with a symbol, label, mnemonic or directive.
– Text following the mark ";" in a program line is a comment and is ignored (not compiled) by the assembler.
– All the elements of one program line (labels, instructions etc.) must be separated by at least one space character. For better clarity, the TAB key is commonly used instead, so that it is easy to delimit columns with labels, directives etc. in a program.

Assembler Directives
Basic assembler directives (pseudo instructions):
– START: Specify the name and starting address of the program
– END: Indicate the end of the source program and (optionally) the first executable instruction in the program
– BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant
– WORD: Generate a one-word integer constant
– RESB: Reserve the indicated number of bytes for a data area
– RESW: Reserve the indicated number of words for a data area

SIC Assembler
Assembler's task:
– Convert mnemonic operation codes to their machine language equivalents
– Convert symbolic operands to their equivalent machine addresses (the difficult step, because of forward references)
– Build machine instructions in the proper format
– Convert data constants into internal machine representations (data formats)
– Write the object program and the assembly listing

[Figure: assembly program with object code, with a forward reference marked]

Important: the instructions would overlay our data in core, since the data itself occupies 4 * 300 = 1200 bytes. Moving the program to a different location is a process called relocation. The use of base registers facilitates this process.

Need of Table Processing
The symbol table (ST) is maintained by the assembler and is composed of multiple-word entries. With a linear search, the average time to find an entry is
T(avg) = [overhead associated with an entry probe] * (N/2)
where N is the number of entries in the table.
LOCCTR (Location Counter)
SYMBOL TABLE
FIRST 1000
CLOOP 1003
BUFFER 1039
RDREC 2039

A faster alternative is, for example, binary search.

Search time versus N: for small N we use linear search; otherwise we use binary search.
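The difference between the two searches can be seen by counting probes. The sketch below uses a made-up table of 1024 symbols; the names `linear_probes` and `binary_probes` are assumptions, and probe counts stand in for the per-probe overhead in the T(avg) formula.

```python
def linear_probes(table, key):
    """Number of entries examined by a front-to-back scan."""
    for probes, (name, _) in enumerate(table, start=1):
        if name == key:
            return probes
    return None


def binary_probes(table, key):
    """Number of entries examined by binary search on a sorted table."""
    lo, hi, probes = 0, len(table) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        name = table[mid][0]
        if name == key:
            return probes
        if name < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


# Hypothetical symbol table: 1024 (name, address) entries, sorted by name.
table = sorted((f"SYM{i:04d}", 0x1000 + 3 * i) for i in range(1024))
avg_linear = sum(linear_probes(table, name) for name, _ in table) / len(table)
avg_binary = sum(binary_probes(table, name) for name, _ in table) / len(table)
print(avg_linear)   # 512.5, close to N/2
print(avg_binary)   # roughly log2(N)
```

Binary search needs the table kept in sorted order, which is extra work for small tables; that cost is one reason linear search remains reasonable when N is small.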

Object Program Format
Header record
Col. 1        H
Col. 2~7      Program name
Col. 8~13     Starting address of object program (hex)
Col. 14~19    Length of object program in bytes (hex)
Text record
Col. 1        T
Col. 2~7      Starting address for object code in this record (hex)
Col. 8~9      Length of object code in this record in bytes (hex)
Col. 10~69    Object code, represented in hex (2 col. per byte)
End record
Col. 1        E
Col. 2~7      Address of first executable instruction in object program (hex)
1033-2038: Storage reserved by the loader
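The column layout maps directly onto fixed-width string formatting. The following sketch builds one record of each kind; the function names and the sample program name, addresses and object code are assumptions used only to exercise the layout.

```python
def header_record(name, start, length):
    """H, 6-column program name, 6 hex digits of start address, 6 of length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """T, 6 hex digits of start address, 2 of byte count, then the object code."""
    body = "".join(object_codes)
    if len(body) > 60:
        raise ValueError("a text record holds at most 60 hex columns (30 bytes)")
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_executable):
    """E followed by the address of the first executable instruction."""
    return f"E{first_executable:06X}"


print(header_record("COPY", 0x1000, 0x107A))                    # HCOPY  00100000107A
print(text_record(0x1000, ["141033", "482039", "001036"]))      # T00100009141033482039001036
print(end_record(0x1000))                                       # E001000
```

A full assembler would start a new text record whenever the 60-column limit is reached or a reserved storage area interrupts the object code.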

Two Pass SIC Assembler
Pass 1 (define symbols)
– Assign addresses to all statements in the program
– Save the addresses assigned to all labels for use in Pass 2
– Perform assembler directives, including those for address assignment, such as BYTE and RESW
Pass 2 (assemble instructions and generate object program)
– Assemble instructions (generate opcode and look up addresses)
– Generate data values defined by BYTE, WORD
– Perform processing of assembler directives not done during Pass 1
– Write the object program and the assembly listing

Data Structures
[Figure: Source program → Pass 1 → Intermediate file → Pass 2 → Object program]
– Operation Code Table (OPTAB)
– Symbol Table (SYMTAB)
– Location Counter (LOCCTR)

Design of assembler
Statement of problem: pseudo-op instruction START

Intermediate steps in assembling a program
We read the START instruction and note that it is a pseudo-op instruction. It gives JOHN as the name of this program, and the assembler must pass the name on to the loader. The USING pseudo-op tells the assembler that register 15 is the base register and that at execution time it will contain the address of the first instruction of the program.

Intermediate steps in assembling a program (continued)
USING only informs the assembler which register is the base register; it does not load that register, which is the job of an instruction such as BALR. Next comes a load instruction, L 1,FIVE. Since no index register is used, we put 0 in the index register field. We maintain the location counter (LC), which indicates the relative address of the instruction; it is incremented by 4 bytes (the length of the instruction).

Intermediate steps in assembling a program (continued)
The next instruction is an ADD. We look up the op-code, but we do not yet know the offset for FOUR. The same thing happens with the store instruction. The DC instruction is a pseudo-op directing us to define some data; the word will be stored at relative location 12 because the LC now has the value 12, having been incremented by the length of each instruction. Next the LC is 16, and the label TEMP has an associated value of 20.
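The walk-through can be replayed in two short passes: the first assigns location-counter values to the labels, and the second fills in the offsets that were unknown on first sight. The source listing below is reconstructed from the description and is an assumption, in particular the register operand 1 on the add and store instructions, the constants F'4' and F'5', and the DS operand 1F. The names `pass1` and `pass2` are also illustrative.

```python
# (label, operation, operand) triples; listing reconstructed from the walk-through (assumed).
SOURCE = [
    ("JOHN", "START", "0"),
    (None, "USING", "*,15"),
    (None, "L", "1,FIVE"),
    (None, "A", "1,FOUR"),
    (None, "ST", "1,TEMP"),
    ("FOUR", "DC", "F'4'"),
    ("FIVE", "DC", "F'5'"),
    ("TEMP", "DS", "1F"),
    (None, "END", ""),
]
# Bytes occupied by each statement; pseudo-ops START, USING and END occupy none.
LENGTH = {"L": 4, "A": 4, "ST": 4, "DC": 4, "DS": 4}


def pass1(source):
    """Record the location counter value of every label."""
    lc, symbols = 0, {}
    for label, op, _ in source:
        if label and op != "START":
            symbols[label] = lc
        lc += LENGTH.get(op, 0)
    return symbols


def pass2(source, symbols, base_reg=15):
    """Rewrite each symbolic operand as offset(index, base)."""
    lc, listing = 0, []
    for _, op, operand in source:
        if op in ("L", "A", "ST"):
            reg, symbol = operand.split(",")
            listing.append((lc, f"{op} {reg},{symbols[symbol]}(0,{base_reg})"))
        lc += LENGTH.get(op, 0)
    return listing


symbols = pass1(SOURCE)
print(symbols)                       # {'FOUR': 12, 'FIVE': 16, 'TEMP': 20}
for lc, text in pass2(SOURCE, symbols):
    print(lc, text)
```

The offsets equal the symbol values here only because the base register is assumed to hold the address of relative location 0.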

Format of databases
Use of databases by the assembler. This is the third step in our design procedure.

MOT table

Pseudo op table

OPTAB
Contents:
– Mnemonic operation codes
– Machine language equivalents
– Instruction format and length
During pass 1:
– Validate operation codes
– Find the instruction length to increase LOCCTR
During pass 2:
– Determine the instruction format
– Translate the operation codes to their machine language equivalents
Implementation: hash table

LOCCTR
A variable used to accumulate addresses during address assignment; when a label is encountered, LOCCTR gives the address to associate with it. LOCCTR is initialized to the beginning address specified in the START statement. After each source statement is processed during pass 1, the length of the instruction or data area is added to LOCCTR.
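A minimal pass 1 built around LOCCTR might look like the following sketch. It assumes the standard SIC conventions of 3-byte instructions and 3-byte words, treats any unrecognized opcode as an instruction instead of validating it against OPTAB, and uses a made-up program fragment; `statement_length` and `pass1` are illustrative names.

```python
def statement_length(opcode, operand):
    """Bytes occupied by one statement (SIC: instructions and words are 3 bytes)."""
    if opcode == "WORD":
        return 3
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "BYTE":
        kind, value = operand[0], operand[2:-1]      # C'EOF' or X'F1'
        return len(value) if kind == "C" else (len(value) + 1) // 2
    return 3                                         # assumed: a machine instruction


def pass1(lines):
    """Build SYMTAB from (label, opcode, operand) triples; return it with the program length."""
    symtab, locctr, start = {}, 0, 0
    for label, opcode, operand in lines:
        if opcode == "START":
            start = locctr = int(operand, 16)        # START operand is hexadecimal
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = locctr                   # label gets the current LOCCTR
        locctr += statement_length(opcode, operand)
    return symtab, locctr - start


program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("CLOOP", "LDA", "LENGTH"),
    (None, "ADD", "THREE"),
    ("EOF", "BYTE", "C'EOF'"),
    ("THREE", "WORD", "3"),
    ("RETADR", "RESW", "1"),
    ("LENGTH", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    (None, "END", "FIRST"),
]
symtab, length = pass1(program)
print({name: hex(addr) for name, addr in symtab.items()}, hex(length))
```

Note that RETADR and LENGTH are used as operands before they are defined; pass 1 does not need their values, which is what makes the two-pass split work.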

SYMTAB
Contents:
– Label name
– Label address
– Flags (to indicate error conditions)
– Data type or length
During pass 1:
– Store the label name and assigned address (from LOCCTR) in SYMTAB
During pass 2:
– Symbols used as operands are looked up in SYMTAB
Implementation:
– A dynamic hash table for efficient insertion and retrieval

Hence the process of the multi-pass assembler can be as follows:
Pass 1
– Assign addresses to all the statements.
– Save the addresses assigned to all labels to be used in Pass 2.
– Perform some processing of assembler directives such as RESW and RESB to find the length of data areas for assigning the address values.
– Define the symbols in the symbol table (generate the symbol table).

Pass 2
– Assemble the instructions (translating operation codes and looking up addresses).
– Generate data values defined by BYTE, WORD etc.
– Perform the processing of the assembler directives not done during Pass 1.
– Write the object program and assembler listing.
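The core of pass 2, translating one instruction once SYMTAB is complete, can be sketched as below. It assumes the standard SIC instruction layout (8-bit opcode, 1-bit index flag, 15-bit address) and the standard SIC opcode values. BUFFER and RDREC take their addresses from the symbol table shown earlier; the addresses of RETADR and LENGTH are assumed. The function name `assemble` is illustrative.

```python
# Subset of the SIC operation code table (mnemonic -> opcode).
OPTAB = {"LDA": 0x00, "STL": 0x14, "JSUB": 0x48, "STCH": 0x54}

# Symbol table as left behind by pass 1 (RETADR and LENGTH values assumed).
SYMTAB = {"RETADR": 0x1033, "LENGTH": 0x1036, "BUFFER": 0x1039, "RDREC": 0x2039}


def assemble(opcode, operand, symtab):
    """Return the 6-hex-digit object code for one SIC instruction."""
    if opcode not in OPTAB:
        raise ValueError(f"invalid operation code: {opcode}")
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    if symbol not in symtab:
        raise ValueError(f"undefined symbol: {symbol}")
    address = symtab[symbol]
    if indexed:
        address |= 0x8000            # set the index flag above the 15-bit address
    return f"{OPTAB[opcode]:02X}{address:04X}"


print(assemble("STL", "RETADR", SYMTAB))     # 141033
print(assemble("JSUB", "RDREC", SYMTAB))     # 482039
print(assemble("STCH", "BUFFER,X", SYMTAB))  # 549039
```

An undefined symbol is reported as an error at this point, because by pass 2 every label in the program should already be in the table.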

Chap 2 Two Pass Assembler
Pass 1
– Assign addresses to all statements in the program
– Save the values assigned to all labels for use in Pass 2
– Perform some processing of assembler directives
Pass 2
– Assemble instructions
– Generate data values defined by BYTE, WORD
– Perform processing of assembler directives not done in Pass 1
– Write the object program and the assembly listing

Assemblers modules

Pass 1 DB

Pass 2 DB

Pass 2 DB (continued)

Detail Pass 1 and Pass 2

Macro SECTION 3

Macro Language and the Macro Processor (Section 3)
Concept
– A macro instruction is a notational convenience for the programmer.
– It allows the programmer to write a shorthand version of a program (modular programming).
– The macro processor replaces each macro invocation with the corresponding sequence of statements (expanding).

Comparison of Macro Processor Designs
Single pass
– Every macro must be defined before it is called.
– A one-pass processor can alternate between macro definition and macro expansion.
– Nested macro definitions may be allowed, but nested calls are not.
Two-pass algorithm
– Pass 1: recognize macro definitions.
– Pass 2: recognize macro calls.
– Nested macro definitions are not allowed.
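A single-pass processor of the kind described, alternating between storing definitions and expanding calls, can be sketched in a few lines. The syntax assumed here (macro name in the label field, then MACRO and a comma-separated parameter list, body closed by MEND, parameters starting with &) follows the common SIC convention. The function name `expand` and the SWAP macro are made up for illustration; nested definitions and nested calls are not handled, and a missing MEND is not diagnosed.

```python
def expand(lines):
    """One-pass macro expansion: definitions must precede their calls."""
    macros, output, i = {}, [], 0
    while i < len(lines):
        parts = lines[i].split()
        if len(parts) >= 2 and parts[1] == "MACRO":
            # Definition: store the body up to MEND without emitting it.
            name = parts[0]
            params = parts[2].split(",") if len(parts) > 2 else []
            body = []
            i += 1
            while lines[i].strip() != "MEND":
                body.append(lines[i])
                i += 1
            macros[name] = (params, body)
        elif parts and parts[0] in macros:
            # Call: emit the body with each parameter replaced by its argument.
            params, body = macros[parts[0]]
            args = parts[1].split(",") if len(parts) > 1 else []
            for statement in body:
                for param, arg in zip(params, args):
                    statement = statement.replace(param, arg)
                output.append(statement)
        else:
            output.append(lines[i])
        i += 1
    return output


source = [
    "SWAP MACRO &A,&B",
    "  LDA &A",
    "  LDX &B",
    "  STA &B",
    "  STX &A",
    "  MEND",
    "  SWAP ALPHA,BETA",
    "  RSUB",
]
for line in expand(source):
    print(line)
```

Plain string replacement is a simplification: a parameter name that is a prefix of another (such as &A and &AB) would be substituted incorrectly.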
