Interfacing with ELF files


1 Interfacing with ELF files
An introduction to the Executable and Linkable Format (ELF) binary file specification standard

2 Overview of source translation
[Diagram: the user-created files (C/C++ source and header files, assembly source files, makefile, linker script file) are processed by the preprocessor, compiler, and assembler, under control of the make utility, to produce object files. The archive utility collects object files into library files. The linker and locator combines object files and library files into a shared object file, a linkable image file, or an executable image file, and also emits a link map file.]

3 Executable versus Linkable
Linkable File: ELF Header; Program-Header Table (optional); Section 1 Data, Section 2 Data, Section 3 Data, ..., Section n Data; Section-Header Table
Executable File: ELF Header; Program-Header Table; Segment 1 Data, Segment 2 Data, Segment 3 Data, ..., Segment n Data; Section-Header Table (optional)

4 Role of the Linker
[Diagram: two linkable files (each consisting of an ELF Header, Section 1 Data through Section n Data, and a Section-Header Table) are combined by the linker into one executable file (consisting of an ELF Header, a Program-Header Table, and Segment 1 Data through Segment n Data)]

5 ELF Header
Fields, in order: e_ident[EI_NIDENT], e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
Section-Header Table: e_shoff, e_shentsize, e_shnum, e_shstrndx
Program-Header Table: e_phoff, e_phentsize, e_phnum, e_entry
NOTE: The sizes of these fields, and their arrangement, are slightly different for the ELF64 files that are produced by default on our x86_64 Linux workstations.
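As a minimal sketch of how these fields can be read, the following Python fragment unpacks the 52-byte ELF32 header in the field order listed above. The function name and the sample header bytes are illustrative assumptions, not part of the course's tools, and the sketch only accepts little-endian ELF32 files.

```python
import struct

# ELF32 header: 16-byte e_ident, then the remaining fields in the order listed above.
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # little-endian, 52 bytes in total

EHDR_FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry",
    "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
    "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx",
)

def parse_elf32_header(data: bytes) -> dict:
    """Return the ELF32 header fields as a dictionary (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic number")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch handles little-endian ELF32 only")
    return dict(zip(EHDR_FIELDS, ELF32_EHDR.unpack_from(data, 0)))

# Illustrative header for a linkable (ET_REL = 1) i386 (EM_386 = 3) file;
# the offsets and counts below are made-up sample values.
sample = ELF32_EHDR.pack(b"\x7fELF\x01\x01\x01" + bytes(9),
                         1, 3, 1, 0, 0, 0x80, 0, 52, 0, 0, 40, 5, 2)
hdr = parse_elf32_header(sample)
print(hdr["e_type"], hex(hdr["e_shoff"]), hdr["e_shnum"], hdr["e_shstrndx"])
```

The section-header table would then be found at file offset e_shoff, holding e_shnum entries of e_shentsize bytes each.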

6 Section-Headers
Fields, in order: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size, sh_link, sh_info, sh_addralign, sh_entsize
NOTE: These are for the ELF32 file-format.
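The sketch below decodes a single ELF32 section-header (ten 32-bit words) and summarizes its type and flags. The helper names are hypothetical; the numeric constants are the standard ELF values, and the sample header is an illustrative assumption.

```python
import struct

SHDR32 = struct.Struct("<10I")  # ten 32-bit words, 40 bytes per ELF32 section-header
SH_FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
             "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")

# A few standard section types and flag bits from the ELF specification.
SHT_NAMES = {0: "NULL", 1: "PROGBITS", 2: "SYMTAB", 3: "STRTAB", 8: "NOBITS"}
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4

def parse_section_header(data: bytes, offset: int = 0) -> dict:
    """Unpack one section-header found at the given file offset."""
    return dict(zip(SH_FIELDS, SHDR32.unpack_from(data, offset)))

def describe(sh: dict) -> str:
    """Produce a one-line summary of a section-header."""
    kind = SHT_NAMES.get(sh["sh_type"], hex(sh["sh_type"]))
    flags = "".join(letter for bit, letter in
                    ((SHF_WRITE, "W"), (SHF_ALLOC, "A"), (SHF_EXECINSTR, "X"))
                    if sh["sh_flags"] & bit)
    return f"{kind} flags={flags or '-'} offset={sh['sh_offset']:#x} size={sh['sh_size']:#x}"

# Illustrative header for an executable code section (sample values).
raw = SHDR32.pack(1, 1, SHF_ALLOC | SHF_EXECINSTR, 0, 0x34, 0x24, 0, 0, 4, 0)
print(describe(parse_section_header(raw)))
```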

7 Program-Headers
Fields, in order: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
NOTE: These are for the ELF32 file-format.
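A program-header can be decoded the same way. This sketch, using hypothetical helper names and made-up sample values, shows how p_offset, p_filesz, and p_memsz together tell a loader which file bytes to copy and how many trailing bytes to zero-fill.

```python
import struct

PHDR32 = struct.Struct("<8I")  # eight 32-bit words, 32 bytes per ELF32 program-header
PH_FIELDS = ("p_type", "p_offset", "p_vaddr", "p_paddr",
             "p_filesz", "p_memsz", "p_flags", "p_align")
PT_LOAD = 1  # standard ELF value for a loadable segment

def parse_program_header(data: bytes, offset: int = 0) -> dict:
    """Unpack one program-header found at the given file offset."""
    return dict(zip(PH_FIELDS, PHDR32.unpack_from(data, offset)))

def load_plan(ph: dict):
    """Return (file_start, file_end, zero_fill_bytes) for a PT_LOAD segment, else None."""
    if ph["p_type"] != PT_LOAD:
        return None
    return (ph["p_offset"],
            ph["p_offset"] + ph["p_filesz"],
            ph["p_memsz"] - ph["p_filesz"])

# Illustrative loadable segment: 0x100 bytes in the file, 0x180 bytes in memory.
raw = PHDR32.pack(PT_LOAD, 0x1000, 0x2000, 0x2000, 0x100, 0x180, 6, 0x1000)
print(load_plan(parse_program_header(raw)))
```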

8 Official ELF documentation
The official document that describes ELF file-formats for both the ‘linkable’ and the ‘executable’ files is available online on our CS630 course website (see ‘Resources’). Be aware that this document has been revised to accommodate programs that will be run on platforms which implement 64-bit addresses and processor registers.

9 Memory: Physical vs. Virtual
Portions of physical memory are “mapped” by the CPU into regions of each task’s ‘virtual’ address-space Virtual Address Spaces (4 GB) Physical address space (4 GB)

10 Linux ‘Executable’ ELF files
An Executable ELF32 file produced by the Linux linker is configured to execute in a private ‘virtual’ address space, whereby every program gets loaded at the identical virtual memory-address (i.e., 0x ). We will soon study the x86 CPU’s paging mechanism, which makes this possible (i.e., after we have finished Project #1).

11 Linux ‘Linkable’ ELF files
It is possible that some ‘linkable’ ELF files are self-contained (i.e., they may not need to be linked with any other object-files, or with any shared libraries). Our ‘manydots.o’ is one such example. So we can write our own system-code that can execute the instructions contained in a stand-alone ‘linkable’ object-module, using the CPU’s ‘segmented’ physical memory.

12 Our ‘loadmap.cpp’ utility
We created a tool that ‘parses’ a linkable ELF file, to identify each section’s length, type, and location within the object-module. For those sections containing the ‘text’ and ‘data’ for the program, we build segment-descriptors, based on where the linkable image-file will reside in physical memory. Then we jump to the ‘_start’ entry-point.
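The parsing step can be sketched in Python as follows. This is not the ‘loadmap.cpp’ source, only an illustration of the same idea: walk the section-header table and report each section’s name, type, offset, and size. The function names are hypothetical, and the demo image is a hand-built stand-in that contains no real code.

```python
import struct

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 header, 52 bytes
SHDR = struct.Struct("<10I")               # ELF32 section-header, 40 bytes

def list_sections(image: bytes) -> list:
    """Return (name, sh_type, sh_offset, sh_size) for every section of an ELF32 image."""
    eh = EHDR.unpack_from(image, 0)
    e_shoff, e_shentsize, e_shnum, e_shstrndx = eh[6], eh[11], eh[12], eh[13]
    headers = [SHDR.unpack_from(image, e_shoff + i * e_shentsize)
               for i in range(e_shnum)]
    strtab_off = headers[e_shstrndx][4]    # sh_offset of the section-name string table
    result = []
    for sh in headers:
        start = strtab_off + sh[0]         # sh_name is an index into that string table
        end = image.index(b"\0", start)    # names are NUL-terminated
        result.append((image[start:end].decode("ascii"), sh[1], sh[4], sh[5]))
    return result

def build_demo_image() -> bytes:
    """Build a tiny stand-in object-module (zero-filled section contents)."""
    names = b"\0.text\0.data\0.shstrtab\0"
    body = bytes(0x24) + bytes(0x16) + names      # .text, .data, .shstrtab contents
    sections = [
        (0, 0, 0, 0),                 # mandatory null section
        (1, 1, 0x34, 0x24),           # .text     (SHT_PROGBITS)
        (7, 1, 0x58, 0x16),           # .data     (SHT_PROGBITS)
        (13, 3, 0x6E, len(names)),    # .shstrtab (SHT_STRTAB)
    ]
    ehdr = EHDR.pack(b"\x7fELF\x01\x01\x01" + bytes(9), 1, 3, 1, 0, 0,
                     52 + len(body), 0, 52, 0, 0, 40, len(sections), 3)
    shdrs = b"".join(SHDR.pack(n, t, 0, 0, off, size, 0, 0, 1, 0)
                     for n, t, off, size in sections)
    return ehdr + body + shdrs

for name, sh_type, offset, size in list_sections(build_demo_image()):
    print(f"{name or '(null)':<10} type={sh_type} offset={offset:#06x} size={size:#06x}")
```

The demo image deliberately places ‘.text’ at offset 0x34 (size 0x24) and ‘.data’ at offset 0x58 (size 0x16), matching the values quoted for ‘linuxapp.o’ later in this presentation.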

13 32-bit versus 16-bit code
Linux’s compilers, and the ‘as’ assembler, can produce object-files that are intended to reside in ‘32-bit’ memory-segments (i.e., the D-bit in a code-segment descriptor is set to 1). This affects the CPU’s interpretation of all the machine-instructions it subsequently fetches. Our ‘as’ assembler can produce both 16-bit and 32-bit code (although its default is 64-bit code). We employ ‘.code32’ or ‘.code16’ directives.

14 Example: ‘as’ Listing
.code32
0x0000 01 D8 add %eax, %ebx
0x D8 add %ax, %bx
0x nop
.code16
0x D8 add %eax, %ebx
0x D8 add %ax, %bx
0x000B nop
.end
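The listing illustrates that the same operation needs a different encoding depending on the segment’s default size. A minimal model of that rule, assuming only the standard x86 behavior that the 0x66 operand-size prefix toggles between 16-bit and 32-bit operands, might look like this (the function name is hypothetical):

```python
def operand_size(d_bit: int, has_66_prefix: bool) -> int:
    """Return the effective operand size in bits for an instruction.

    d_bit is the D-bit of the current code-segment descriptor;
    has_66_prefix tells whether the instruction carries the 0x66
    operand-size override prefix.
    """
    default = 32 if d_bit else 16
    if has_66_prefix:
        return 16 if default == 32 else 32   # the prefix selects the non-default size
    return default

# In a 32-bit segment the 16-bit form needs the prefix;
# in a 16-bit segment it is the 32-bit form that needs it.
for d_bit in (1, 0):
    for prefix in (False, True):
        print(f"D={d_bit} prefix={prefix}: {operand_size(d_bit, prefix)}-bit operands")
```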

15 Demo-program
We created a Linux program (‘linuxapp.s’) that invokes two system-calls (‘write’ and ‘exit’).
We assembled it with the ‘as’ assembler:
$ as linuxapp.s -o linuxapp.o
This linkable ELF object-file ‘linuxapp.o’ should then be written to our hard-disk partition (‘/dev/sda4’) at sector 65, using the ‘dd’ utility:
$ dd if=linuxapp.o of=/dev/sda4 seek=65
It will then get loaded into memory by ‘cs630ipl’.

16 Memory-Map
Both ‘tryelf32.b’ and ‘linuxapp.o’ will get loaded into ram from sectors of the disk-partition by our ‘cs630ipl.b’ program-loader. ‘cs630ipl.b’ is read from the CS630 disk-partition on the hard disk via the ROM-BIOS bootstrap.
[Diagram: memory map, from high to low addresses: ‘linuxapp.o’ image at 0x , ‘tryelf32.b’ image at 0x , BOOT-LOADER at 0x00007C00, ROM-BIOS DATA at 0x , IVT]

17 Segment Descriptors
We created 32-bit segment-descriptors for the ‘text’ and ‘data’ sections of ‘linuxapp.o’ (in a Local Descriptor Table) with DPL=3.
For the ‘.text’ section: offset in ELF file = 0x34, size = 0x24
So its segment-descriptor is:
.word 0x0023, 0x8034, 0xFA01, 0x0040
(base-address = load-address + file-offset; segment-limit = size - 1)

18 Descriptors (continued)
For the ‘.data’ section: offset in ELF file = 0x58, size = 0x16
So its segment-descriptor is:
.word 0x0015, 0x8058, 0xF201, 0x0040
(base-address = load-address + file-offset; segment-limit = size - 1)
For our ring3 stack (not part of ELF file):
.word 0x0000, 0x0000, 0xF602, 0x00C0
Note: It’s an ‘expand-down’ data-segment!
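To check how those four words are assembled, here is a small sketch that packs a base-address, limit, access-rights byte, and flags nibble into the x86 descriptor layout. The function name is hypothetical, and the load-address 0x00018000 and stack base 0x00020000 are inferred from the descriptor words shown above rather than stated in the slides.

```python
def descriptor_words(base: int, limit: int, access: int, flags: int) -> list:
    """Pack an x86 segment-descriptor into four 16-bit words, lowest word first.

    access is the access-rights byte (P, DPL, S, type);
    flags is the high nibble of byte 6 (G, D/B, L, AVL).
    """
    w0 = limit & 0xFFFF                                   # limit bits 15..0
    w1 = base & 0xFFFF                                    # base bits 15..0
    w2 = ((access & 0xFF) << 8) | ((base >> 16) & 0xFF)   # access byte, base bits 23..16
    w3 = ((((base >> 24) & 0xFF) << 8)                    # base bits 31..24
          | ((flags & 0xF) << 4)                          # G, D/B, L, AVL
          | ((limit >> 16) & 0xF))                        # limit bits 19..16
    return [w0, w1, w2, w3]

LOAD_ADDRESS = 0x00018000   # assumption: inferred from the base fields in the slides

text = descriptor_words(LOAD_ADDRESS + 0x34, 0x24 - 1, 0xFA, 0x4)   # ring3 code, D=1
data = descriptor_words(LOAD_ADDRESS + 0x58, 0x16 - 1, 0xF2, 0x4)   # ring3 data, B=1
stack = descriptor_words(0x00020000, 0x00000, 0xF6, 0xC)            # expand-down, G=1, B=1

for name, words in (("text", text), ("data", data), ("stack", stack)):
    print(name, ", ".join(f"0x{w:04X}" for w in words))
```

With these inputs the sketch reproduces the three ‘.word’ lines given for the ‘.text’ section, the ‘.data’ section, and the ring3 stack.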

19 ‘Expand-Down’ segments
[Diagram: in a normal ‘expand-up’ data-segment, the valid offsets extend from the base-address up to the segment limit; in a special ‘expand-down’ data-segment, the valid offsets lie above the segment limit]

20 Task-State Segment
Because any system-calls (via int 0x80) will cause privilege-level transitions, we will need to set up a Task-State Segment (to store a ring0 stack-pointer SS0:ESP0):
theTSS: .long 0, 0, 0 # 3 longwords
Its segment-descriptor goes into our GDT:
.word 0x000B, theTSS, 0x8901, 0x0000

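21 Transition to Ring 3
Recall that we use ‘lret’ to enter ring-3:
pushw $userSS # ring3 stack-segment selector
pushw $0 # ring3 stack-pointer (SP)
pushw $userCS # ring3 code-segment selector
pushw $0 # ring3 instruction-pointer (IP): entry-point offset, assumed here to be 0
lret
(NOTE: This assumes we are coming from a 16-bit code-segment in protected-mode)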

22 System-Call Dispatcher
All system-calls get ‘vectored’ through our IDT’s interrupt-gate number 0x80. For ‘linuxapp.o’ we only need to implement two system-calls: ‘exit’ and ‘write’. But to simplify future enhancements, we use a ‘jump-table’ anyway (although for now it has a few ‘dummy’ entries, which can easily be modified later on).

23 System-Call ID-numbers
System-call ID #0 (it will never be needed)
System-call ID #1 is for ‘exit’ (required)
System-call ID #2 is for ‘fork’ (deferred)
System-call ID #3 is for ‘read’ (deferred)
System-call ID #4 is for ‘write’ (required)
System-call ID #5 is for ‘open’ (deferred)
System-call ID #6 is for ‘close’ (deferred)
(NOTE: over 300 system-calls exist in Linux)

24 Defining our jump-table
sys_call_table:
.long do_nothing # for service 0
.long do_exit # for service 1
.long do_nothing # for service 2
.long do_nothing # for service 3
.long do_write # for service 4
.equ NR_SYS_CALLS, ( . - sys_call_table)/4

25 Setting up IDT Gate 0x80
The Descriptor Privilege Level must be 3
The Gate-Type should be ‘386 Trap-Gate’
The entry-point will be our ‘isrSVC’ label
# Interrupt Descriptor Table’s entry for SuperVisor Call (int $0x80)
mov $0x80, %ebx # table-entry array-index
lea theIDT(, %ebx, 8), %di # descriptor offset-address
movw $isrSVC, %ss:0(%di) # entry-point offset’s loword
movw $privCS, %ss:2(%di) # selector for code-segment
movw $0xEF00, %ss:4(%di) # Gate-Type: 386 Trap-Gate
movw $0x0000, %ss:6(%di) # entry-point offset’s hiword
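The value 0xEF00 written into the third word can be derived from the three requirements above. The following sketch builds the four words of a gate-descriptor; the function name, the entry-point offset 0x1234, and the selector 0x0008 are placeholder assumptions standing in for ‘isrSVC’ and ‘privCS’.

```python
TRAP_GATE_386 = 0xF   # gate-type code for a 32-bit trap-gate

def gate_words(offset: int, selector: int, gate_type: int, dpl: int,
               present: bool = True) -> list:
    """Pack an IDT gate-descriptor into four 16-bit words, lowest word first."""
    access = (int(present) << 7) | ((dpl & 3) << 5) | (gate_type & 0x1F)
    return [
        offset & 0xFFFF,           # entry-point offset's loword
        selector & 0xFFFF,         # selector for code-segment
        access << 8,               # P, DPL, gate-type (low byte is reserved)
        (offset >> 16) & 0xFFFF,   # entry-point offset's hiword
    ]

# Each gate occupies 8 bytes, so gate 0x80 sits at this offset within the IDT.
entry_offset = 0x80 * 8

words = gate_words(offset=0x1234, selector=0x0008,
                   gate_type=TRAP_GATE_386, dpl=3)
print(hex(entry_offset), [f"0x{w:04X}" for w in words])
```

Setting DPL=3 is what permits ring3 code to execute ‘int $0x80’ without triggering a protection fault.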

26 Using our jump-table
isrSVC: # service-number is found in EAX
cmp $NR_SYS_CALLS, %eax
jb idok
xor %eax, %eax
idok: jmp *sys_call_table(, %eax, 4)
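The same bounds-checked dispatch can be modeled in a few lines of Python. The handler bodies here are placeholders that just return a label; only the table layout and the unsigned range-check mirror the assembly above.

```python
def do_nothing():
    return "nothing"

def do_exit():
    return "exit"

def do_write():
    return "write"

# Same layout as the assembly jump-table: index = system-call ID-number.
sys_call_table = [do_nothing, do_exit, do_nothing, do_nothing, do_write]
NR_SYS_CALLS = len(sys_call_table)

def dispatch(eax: int):
    """Select a handler by service-number, mapping out-of-range IDs to entry 0."""
    eax &= 0xFFFFFFFF          # 'jb' is an unsigned comparison on a 32-bit register
    if eax >= NR_SYS_CALLS:    # corresponds to falling through to 'xor %eax, %eax'
        eax = 0
    return sys_call_table[eax]()

for service in (1, 4, 3, 99, -1):
    print(service, dispatch(service))
```

Because the comparison is unsigned, a negative service-number is treated as a very large value and is routed to the ‘do_nothing’ entry as well.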

27 Our ‘exit’ service
When the application invokes the ‘exit’ system-call, our mini ‘operating system’ should leave protected-mode and return back to our boot-loader program. The ‘exit-code’ parameter (in %ebx) may just as well be discarded (since this isn’t yet a multitasking operating-system).

28 Our ‘write’ service
We only implement writing to the STDOUT device (i.e., the video display console). For most characters in the user’s buffer, we just write the ascii-code (and standard display-attribute) directly to video memory at the current cursor-location and advance the cursor (scrolling the screen if needed). Special ascii control-codes (‘\n’, ‘\r’, ‘\b’) are treated differently, as on a TTY device.
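The behavior described above can be modeled with a small text-grid simulation. This is a sketch under stated assumptions, not the course’s implementation: the class name is hypothetical, the screen is taken to be 80 columns by 25 rows, display-attributes are omitted, and ‘\n’ is assumed to move the cursor to the start of the next line.

```python
COLS, ROWS = 80, 25   # assumption: standard text-mode screen dimensions

class Console:
    """Toy model of a 'write' service that targets a text-mode display."""

    def __init__(self):
        self.cells = [[" "] * COLS for _ in range(ROWS)]
        self.row = 0
        self.col = 0

    def _next_line(self):
        self.col = 0
        self.row += 1
        if self.row == ROWS:                 # past the bottom: scroll up one line
            self.cells.pop(0)
            self.cells.append([" "] * COLS)
            self.row = ROWS - 1

    def write(self, buf: bytes) -> int:
        for code in buf:
            ch = chr(code)
            if ch == "\n":                   # newline: start of the next line
                self._next_line()
            elif ch == "\r":                 # carriage-return: start of this line
                self.col = 0
            elif ch == "\b":                 # backspace: one column left, if possible
                self.col = max(0, self.col - 1)
            else:                            # ordinary character: store and advance
                self.cells[self.row][self.col] = ch
                self.col += 1
                if self.col == COLS:
                    self._next_line()
        return len(buf)                      # number of bytes written

    def line(self, row: int) -> str:
        return "".join(self.cells[row]).rstrip()

console = Console()
console.write(b"Hi\nab\bc\rX")
print(repr(console.line(0)), repr(console.line(1)), (console.row, console.col))
```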

29 In-Class Exercise
The ‘manydots.s’ demo (to be used with Project #1) uses the ‘read’ system-call (in addition to the ‘write’ and ‘exit’ services). However, you could still ‘execute’ it, using our ‘tryelf32.s’ mini operating-system, by letting the ‘read’ service simply “do nothing” (or return with some kind of “hard-coded” buffer-contents). You just need to modify the LDT descriptors so they’ll conform to ELF sections in ‘manydots.o’.

