Linker and Loader. Program building into four stages (C Program) Preprocessing (Preprocessor) It processes include files, conditional compilation instructions.

Slides:



Advertisements
Similar presentations
Programs in Memory Bryce Boe 2012/08/29 CS32, Summer 2012 B.
Advertisements

Fabián E. Bustamante, Spring 2007 Linking Today Static linking Object files Static & dynamically linked libraries Next time Exceptional control flows.
Program Development Tools The GNU (GNU’s Not Unix) Toolchain The GNU toolchain has played a vital role in the development of the Linux kernel, BSD, and.
Linking & Loading CS-502 Operating Systems
Chapter 3 Loaders and Linkers
Loaders and Linkers CS 230 이준원. 2 Overview assembler –generates an object code in a predefined format »COFF (common object file format) »ELF (executable.
Linkers and Loaders 1 Linkers & Loaders – A Programmers Perspective.
Mehmet Can Vuran, Instructor University of Nebraska-Lincoln Acknowledgement: Overheads adapted from those provided by the authors of the textbook.
Memory Management Chapter 7.
COMPILING OBJECTS AND OTHER LANGUAGE IMPLEMENTATION ISSUES Credit: Mostly Bryant & O’Hallaron.
Instructor: Erol Sahin
Winter 2015 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University Linking.
Generating Programs and Linking Professor Rick Han Department of Computer Science University of Colorado at Boulder.
“The course that gives CMU its Zip!”
CS 330 What software engineers need to know about linking and a few things about execution.
Guide To UNIX Using Linux Third Edition
Linking Topics Static linking Object files Static libraries Loading Dynamic linking of shared libraries CS213.
1 uClinux course Day 3 of 5 The uclinux toolchain, elf format and ripping a “hello world”
CS 201 Computer Systems Organization. Today’s agenda Overview of how things work Compilation and linking system Operating system Computer organization.
OBJECT MODULE FORMATS. The object module format we have employed as an educational device is called OMF (relocatable object format). It’s one of the earliest.
Carnegie Mellon 1 Linking Lecture, Apr. 11, 2013 These slides are from website which accompanies the book “Computer Systems: A.
MIPS coding. SPIM Some links can be found such as:
Linking and Loading Linker collects procedures and links them together object modules into one executable program. Why isn't everything written as just.
Topic 2d High-Level languages and Systems Software
CS 367 Linking Topics Static linking Object files Static libraries
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
Linking February 20, 2001 Topics static linking object files static libraries loading dynamic linking of shared libraries class16.ppt “The course.
Computer System Chapter 7. Linking Lynn Choi Korea University.
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /29/2013 Lecture 13: Compile-Link-Load Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE.
Linking Topics Static linking Object files Static libraries Loading.
Linking Topics Static linking Dynamic linking Case study: Library interpositioning.
Linking Ⅱ.
Processes and Threads-I Static and dynamic linking, Loading, Anatomy of a Process.
Linking October 5, 2002 Topics static linking object files static libraries loading dynamic linking of shared libraries Reading: Chapter 7 Problems: 7.8.
CS412/413 Introduction to Compilers and Translators April 14, 1999 Lecture 29: Linking and loading.
Linking Summer 2014 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University.
Computer System Organization. Overview of how things work Compilation and linking system Operating system Computer organization Today’s agenda.
1 CS503: Operating Systems Spring 2014 Part 0: Program Structure Dongyan Xu Department of Computer Science Purdue University.
Chapter 13 : Symbol Management in Linking
1 Linking. 2 Outline What is linking and why linking Complier driver Static linking Symbols & Symbol Table Suggested reading: 7.1~7.5.
CSc 453 Linking and Loading
CS252: Systems Programming Ninghui Li Based on Slides by Gustavo Rodriguez-Rivera Topic 2: Program Structure and Using GDB.
Linking I Topics Assembly and symbol resolution Static linking Systems I.
Computer System Chapter 7. Linking Lynn Choi Korea University.
Program Translation and Execution I: Linking Sept. 29, 1998 Topics object files linkers class11.ppt Introduction to Computer Systems.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Hello world !!! ASCII representation of hello.c.
Carnegie Mellon 1 Introduction to Computer Systems /18-243, fall th Lecture, Sep 29 th Instructors: Roger B. Dannenberg and Greg Ganger 1.
Linking Winter 2013 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University.
Operating Systems A Biswas, Dept. of Information Technology.
Object Files & Linking. Object Sections Compiled code store as object files – Linux : ELF : Extensible Linking Format – Windows : PE : Portable Execution.
Binding & Dynamic Linking Presented by: Raunak Sulekh(1013) Pooja Kapoor(1008)
Carnegie Mellon Introduction to Computer Systems /18-243, spring th Lecture, Feb. 26 th Instructors: Gregory Kesden and Markus Püschel.
Lecture 3 Translation.
Slides adapted from Bryant and O’Hallaron
CS 367 Linking Topics Static linking Make Utility Object files
Overview of today’s lecture
Linking Topics Static linking Object files Static libraries Loading
Linking & Loading.
Program Execution in Linux
CS-3013 Operating Systems C-term 2008
“The course that gives CMU its Zip!”
“The course that gives CMU its Zip!”
Linking.
Linking & Loading CS-502 Operating Systems
Introduction to Computer Systems
Program Execution in Linux
Linking & Loading CS-502 Operating Systems
Instructors: Majd Sakr and Khaled Harras
“The course that gives CMU its Zip!”
Presentation transcript:

Linker and Loader

Program building into four stages (C Program) Preprocessing (Preprocessor) It processes include files, conditional compilation instructions and macros. Command: $cpp hello.c hello.i hello.c is source program, hello.i is ASCII intermediate code Compiling (Compiler) It takes the output of the preprocessor and generates assembler source code Command: $cc hello.i –o hello.s Assembly (Assembler) It takes the assembly source code and produces an assembly listing with offsets. The assembler output is stored in an object file. Command:$as –o hello.o hello.s Linking (Linker) It takes one or more object files or libraries as input and combines them to produce a single (usually executable) file. Linux command for linker is ld

Example C Program int buf[2] = {1, 2}; int main() { swap(); return 0; } main.cswap.c extern intbuf[]; int*bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; }

Static Linking Programs are translated and linked using a compiler driver: unix>gcc -O2 -g –o p main.c swap.c unix>./p Linker (ld) Translators (cpp, cc1, as) main.c main.o Translators (cpp, cc1, as) swap.c swap.o p Source files Separately compiled relocatable object files Fully linked executable object file (contains code and data for all functions defined in main.c and swap.c )

Static Linking Unix ld program takes as input a collection of relocatable object files and command line arguments and generate as output a fully linked executable object file that can be loaded. Relocatable object file consist of code and data sections. Code section contains read-only instruction binary Data section contains initialized global variables and uninitialized global variables

Three Kinds of Object Files (Modules) Relocatable object file (.o file) Contains code and data in a form that can be combined with other relocatable object files to form executable object file. Each.o file is produced from exactly one source (.c ) file Executable object file ( a.out file) Contains code and data in a form that can be copied directly into memory and then executed. Shared object file (.so file) Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run-time. Called Dynamic Link Libraries (DLLs) by Windows

Executable and Linkable Format (ELF) Standard binary format for object files Originally proposed by AT&T System V Unix Later adopted by BSD Unix variants and Linux One unified format for Relocatable object files (.o ), Executable object files (a.out ) Shared object files (.so ) Generic name: ELF binaries

ELF Object File Format Elf header Word size, byte ordering, file type (.o, exec,.so), machine type, etc. Segment header table Page size, virtual addresses memory segments (sections), segment sizes..text section Code.rodata section Read only data: jump tables,....data section Initialized global variables.bss section Uninitialized global variables “Block Started by Symbol” “Better Save Space” ELF header Segment header table (required for executables).text section.rodata section.bss section.symtab section.rel.txt section.rel.data section.debug section Section header table 0.data section

ELF Object File Format (cont.).symtab section Symbol table Procedure and static variable names Section names and locations.rel.text section Relocation info for.text section Addresses of instructions that will need to be modified in the executable Instructions for modifying..rel.data section Relocation info for.data section Addresses of pointer data that will need to be modified in the merged executable.debug section Info for symbolic debugging ( gcc -g ) Section header table Offsets and sizes of each section ELF header Segment header table (required for executables).text section.rodata section.bss section.symtab section.rel.txt section.rel.data section.debug section Section header table 0.data section

Building Executable involves two tasks Linker performs two main tasks to build executable Symbol Resolution Object file define and reference symbols. Symbol resolution associates each symbol reference to its definition. Relocation Compiler and Assembler generate code and data sections that start at address zero. Linker relocates these sections by associating a memory location with each symbol definition.

Linker Symbols Global symbols Symbols defined by module m that can be referenced by other modules. E.g.: non- static C functions and non- static global variables. External symbols Global symbols that are referenced by module m but defined by some other module. Local symbols Symbols that are defined and referenced exclusively by module m. E.g.: C functions and variables defined with the static attribute. Local linker symbols are not local program variables

Resolving Symbols int buf[2] = {1, 2}; int main() { swap(); return 0; } main.c extern int buf[]; Int *bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; } swap.c Global External Local Global Linker knows nothing of temp Global

Relocating Code and Data main() main.o int *bufp0=&buf[0] swap() swap.o int buf[2]={1,2} Headers main() swap() 0 System code int *bufp0=&buf[0] intbuf[2]={1,2} System data More system code System data Relocatable Object FilesExecutable Object File.text.data.text.data.text.data.symtab.debug.data int *bufp1.bss System code static int *bufp1.bss Even though private to swap, requires allocation in.bss

Symbol Table It is built by the assembler using the symbols exported by the compiler in assembly language.s file. Assembler converts the.s file into.o relocatable obj file. This.o file is in ELF format which contain a section called.symtab. Each entry in.symtab is as below

Example Symbol Table Entries

Symbol Resolution Linker associates each symbol reference to its definition Symbol resolution is pretty straightforward for referencing local symbols that are defined in the same module. When assembler finds a symbol that is defined in the current module, it generates an entry in the symbol table deferring resolution by linker. If linker even does not fine its definition then it printa an error message.

Example;

Symbol Resolution for multiply defined global symbols If same symbol is defined in multiple object files, then linker prints an error. There are different cases and linker handles them based on type of symbol (strong/weak) and rules to deal with multiply defined symbols. Compiler exports each symbol as strong/weak to the assembler

Strong and Weak Symbols Strong Symbols Functions and initialized global variables are Strong Symbols. Weak Symbols Uninitialized global variables are weak symbols

Linker’s Symbol Rules Rule 1: Multiple strong symbols are not allowed Each item can be defined only once Otherwise: Linker error Rule 2: Given a strong symbol and multiple weak symbol, choose the strong symbol References to the weak symbol resolve to the strong symbol Rule 3: If there are multiple weak symbols, pick an arbitrary one Can override this with gcc –fno-common

Example

Linker Puzzles int x; p1() {} int x; p2() {} int x; int y; p1() {} double x; p2() {} int x=7; int y=5; p1() {} double x; p2() {} int x=7; p1() {} int x; p2() {} int x; p1() {} Link time error: two strong symbols ( p1 ) References to x will refer to the same uninitialized int. Is this what you really want? Writes to x in p2 might overwrite y ! Evil! Writes to x in p2 will overwrite y ! Nasty! Nightmare scenario: two identical weak structs, compiled by different compilers with different alignment rules. References to x will refer to the same initialized variable.

Linking with Static Libraries Linker reads the relocatable object files, links them and generates executable object file. Static Library is a collection of common object modules. These libraries are supplied as a input to linker. Linker copies only the referenced object modules and link them. Common functions such as atoi(),printf(),scanf(),strcpy etc are available in libc.a. Functions such as sin,cos,sqrt etc are available in libm.a.

Static Library Linking: Different Approaches Approach 1: Put all functions into a single source file Programmers link big object file into their programs Space and time inefficient Approach 2: Put each function in a separate source file Programmers explicitly link appropriate binaries into their programs More efficient, but burdensome on the programmer

Solution: Static Libraries Static libraries (. a archive files) Concatenate related relocatable object files into a single file with an header that describes the size and location of each member object module(called an archive). Enhance linker so that it tries to resolve unresolved external references by looking for the symbols in one or more archives. Linker will copy the object modules that are referenced by the program, which reduces size of the executable on the dosk and in memory. If an archive member file resolves reference, link it into the executable.

Creating Static Libraries Translator atoi.c atoi.o Translator printf.c printf.o libc.a Archiver (ar)... Translator random.c random.o unix>arrslibc.a \ atoi.oprintf.o … random.o C standard library Archiver allows incremental updates Recompile function that changes and replace.o file in archive.

Commonly Used Libraries libc.a (the C standard library) 8 MB archive of 1392 object files. I/O, memory allocation, signal handling, string handling, data and time, random numbers, integer math libm.a (the C math library) 1 MB archive of 401 object files. floating point math (sin, cos, tan, log, exp, sqrt, …)

Symbol Resolution with Static Libraries Programmer enters the.o files and.a files on the command line for linking $gcc main.c /usr/lib/libc.a /usr/lib/libm.a Linker scans it from left to right in sequential order to resolve the references Linker maintains three sets E : Set of relocatable object files that will be merged to form an executable U: set of unresolved references D: Set of symbols that are defined in previous input files.

Steps Linker determines whether the input file is object file or an archive If f is a object file, linker adds it to E, updates U and D to reflect the symbol definition and references in f and proceed to the next input file If the input file f is an archive, linker attempts to match the unresolved symbols in U against the symbol defined in archive. If there is any unresolved symbol in U, linker prints an error. If.a files are independent, then write in any order But order is imp if archive files are dependent

Relocation After symbol resolution, linker has to do relocation where it has to merge the input modules and assigns run-time addresses to each symbol. This is done in two steps Relocation sections and symbol definition Here linker merges all sections of the same type into a new aggregate section of the same type. (.dat or.text) Linker assigns run-time memory address to each new section, to each symbol Relocation symbol references within sections Using relocation entries Linker decides run-time addresses of each symbol here

ELF executable object file

Details of Segment header table

Loading Executing Object Files $./a.out or $./exe Invokes execve function or system call to invoke loader Loader copies the executable obj file from disk to memory and runs it from its entry point.

Shared Libraries Static libraries have the following disadvantages: Duplication in the stored executables (every function need std libc) Duplication in the running executables Minor bug fixes of system libraries require each application to explicitly relink Modern solution: Shared Libraries Object files that contain code and data that are loaded and linked into an application dynamically, at either load-time or run-time Also called: dynamic link libraries, DLLs,.so files