1 Starting a Program The four stages that take a C++ program (or a program in any other high-level language) and execute it in main memory are: Compiler: C++ -> assembly code. Assembler: assembly code -> machine code (object file). Linker: object files -> executable. Loader: executable -> execution in memory.
3 The Compiler The compiler transforms the C++ program into an assembly language program, a symbolic form of the machine language. High-level language programs take far fewer lines than the equivalent assembly language, so programmer productivity is high. In 1975 many operating systems, compilers and assemblers were written in assembly language because compilers were inefficient and memories were small. Larger memories have since reduced the concern over program size, and optimizing compilers now produce assembly code nearly as good as that written by an expert programmer.
4 The Assembler Assembly language is the interface between high-level programming languages (PLs) and machine code. The assembler can also accept instructions that aren't implemented in hardware, called pseudoinstructions; their use simplifies translation and programming. For example, the pseudoinstruction move $t0,$t1 is converted by the assembler into the true machine instruction add $t0,$zero,$t1. The assembler also converts a branch to a faraway location into a branch and a jump.
5 The Object File The assembler turns the assembly code into an object file, which contains machine code, data, and information needed to place instructions in memory. The assembler must map the labels in assembly code to addresses in machine code. This information is kept in the symbol table. After converting all labels to addresses the symbol table contains the remaining labels that aren't defined, such as external data or procedures. Each C++ source file is translated into one assembly code file which is then translated to one object file.
6 Object File Structure The object file for Unix systems typically contains six parts: Object file header - the size and position of the other parts of the file. Text segment - the machine code. Data segment - the static data that comes with the program. Relocation information - identifies the instructions and data words that depend on absolute addresses when the program is loaded into memory. Symbol table - the remaining labels that are not defined, such as external references. Debugging information - links machine instructions to C++ statements.
7 The Linker If a program were compiled as a single unit, a change to one line would require compiling and assembling the whole program again. This is wasteful, because most of the code is untouched by the programmer; even code that he or she did not write, such as the standard libraries, would be recompiled. The alternative is to compile and assemble each procedure independently, so that a change to a procedure requires recompiling only that procedure. The link editor, or linker, takes all the independently assembled object files and links them together. The output of the linker is the executable file, or executable.
9 Linking Steps The linker works in 3 steps: Place code and data modules symbolically in memory. Determine the addresses of data and instruction labels. Patch both the internal and external references. The linker uses the relocation information and symbol table in each object module to resolve all undefined labels. These labels occur in branch and jump instructions and in data addresses. The linker finds the old addresses and replaces them with the new addresses. It is faster to "patch" the code than to recompile it.
10 Memory Locations If all the external references are resolved, the linker determines the memory location of every procedure and data item. Since the files were assembled separately, the assembler cannot know where a module's code and data will reside in memory relative to other modules. When the linker places a module in memory, all absolute references (memory addresses that are not relative to a register) must be relocated to their true addresses.
11 MIPS Memory Allocation The stack starts at the top of memory and grows down towards the data segment. The program code starts at 0x400000. The static data starts at 0x10000000. Dynamic data (data allocated by new) starts right after it and grows up towards the stack. The global pointer ($gp) is set to an address that makes it easy to access the static data.
15 The Executable File The executable file contains a header, the text segment and the data segment. The separate modules now reside together in the text and data segments. All the addresses that were unresolved before the link stage are now resolved, so the file can be run on the computer. During the debug stage of development the executable also contains debug information. After development is finished the file is stripped of debug information.
16 The Loader The loader performs the following steps (Unix): Reads the executable to find out the size of the text and data. Creates an address space large enough. Copies the instructions and data into memory. Copies parameters to the main program onto the stack. Initializes the machine's registers and sets the stack pointer. Jumps to a start-up procedure that copies the parameters into the argument registers and calls the main procedure (main() in C++).