1
Build a GCC Cross Compiler for a Specific CPU
Chia-Tsun Wu D
2
Outline
Introduction to SoC
Motivation and project goal
Design a CPU
  Tools used to design the CPU hardware
  CPU specification
  CPU design flow
  Simulation and results
3
Outline
Build a GCC Cross Compiler
  GCC structure
  Knowledge needed to port GCC
  Build flow
  Build a GCC cross assembler and cross linker
  A simple test program
  Summary
Summary
4
Introduction to SoC
SoC: System on a Chip.
Highly integrated; a single chip includes:
  CPU
  System bus
  Peripherals
  Co-processor
  ...
Low cost, low area, high performance.
5
What is an SoC?
Portable / reusable IP
Embedded CPU
Embedded memory
Real-world interfaces (USB, PCI, Ethernet)
Software (both on-chip and off-chip)
Mixed-signal blocks
Programmable hardware (FPGAs)
> 500K gates
6
SoC Design Flow
[Flow diagram. Blocks, in order of appearance: System Specs; HW/SW Partitioning; Hardware Description; Software Description; HW Synthesis and Configuration; Software Generation & Parameterization; Interface Synthesis; Configuration Modules; Hardware Components; HW/SW Interfaces; Software Modules; HW/SW Integration and Cosimulation; Integrated System; System Evaluation; Design Coverification; System Validation]
7
Motivation and project goal
SoC has been the major design trend in recent years.
The CPU is one of the key components of an SoC design.
The development environment is the most important thing for a CPU.
Goal:
  Design a simple 32-bit RISC CPU
  Build a cross assembler and cross linker for a specific CPU
  Build a cross compiler for a specific CPU
8
Design a CPU
Specification
32-bit RISC-based CPU
General-purpose register architecture
32-bit (4 Gbyte) addressing
32-bit fixed instruction length (excluding immediate data)
MSB first
Reset address 0x000ffffc
No pipeline; one instruction cycle takes four clock cycles:
  Instruction fetch
  Instruction decode and data fetch
  Execution
  Write back
No interrupt
No timer
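The addressing and timing figures in this specification come down to simple arithmetic. The following is a minimal sketch in C; the function names are illustrative and do not come from the slides, and the only inputs are the address width and the four-cycle instruction period stated above.

```c
#include <stdint.h>

/* Clock cycles per instruction on this non-pipelined core:
 * fetch, decode/data fetch, execution, write back. */
#define CYCLES_PER_INSTRUCTION 4u

/* Number of bytes reachable with an address of the given width (1..63 bits).
 * Hypothetical helper, not part of the original design. */
uint64_t address_space_bytes(unsigned address_bits)
{
    return (uint64_t)1 << address_bits;
}

/* Instructions completed per second when every instruction
 * takes CYCLES_PER_INSTRUCTION clock cycles. */
uint64_t instructions_per_second(uint64_t clock_hz)
{
    return clock_hz / CYCLES_PER_INSTRUCTION;
}
```

With these definitions, `address_space_bytes(32)` gives 4294967296 bytes (4 Gbyte), and a 400 MHz clock would complete 100 million instructions per second.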
9
Registers
General-purpose registers R0~R15
  R13: accumulator
  R14: memory data pointer
  R15: stack pointer
Program counter (PC) (0x000ffffc after reset)
Program status (PS) (Sign flag, Zero flag, oVerflow flag, Carry flag)
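The register set above could be modeled in a simulator roughly as follows. This is only a sketch: the struct and macro names are hypothetical, the bit positions of the PS flags are assumptions (the slides list the flags but not their layout), and clearing the general-purpose registers at reset is also an assumption. Only the reset value of PC is taken from the slide.

```c
#include <stdint.h>
#include <string.h>

#define NUM_REGS     16
#define REG_ACC      13           /* R13: accumulator */
#define REG_DATA_PTR 14           /* R14: memory data pointer */
#define REG_SP       15           /* R15: stack pointer */
#define RESET_PC     0x000ffffcu  /* PC value after reset, from the slide */

/* Assumed bit positions inside PS; the slides do not specify them. */
#define PS_SIGN     (1u << 3)
#define PS_ZERO     (1u << 2)
#define PS_OVERFLOW (1u << 1)
#define PS_CARRY    (1u << 0)

typedef struct {
    uint32_t r[NUM_REGS];  /* R0..R15 */
    uint32_t pc;           /* program counter */
    uint32_t ps;           /* program status (S, Z, V, C) */
} cpu_state;

/* Put the model into its reset state. Zeroing the registers and PS
 * is an assumption; only the PC value is given in the specification. */
void cpu_reset(cpu_state *cpu)
{
    memset(cpu, 0, sizeof *cpu);
    cpu->pc = RESET_PC;
}
```

Keeping R13 to R15 as plain indices into the same array reflects that they are ordinary general-purpose registers with a conventional role.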
10
Instruction formats
General: OP Rn1, Rn2
  OP: 8 bits
  n: register number (0000: R0, 1111: R15)
Immediate: OP #data, Rn2
  #data: 32-bit data
Branch: OP Addr
  OP: 16 bits (low byte = 0x00)
  Addr: 32-bit branch address
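As an illustration of the General format, the sketch below packs an 8-bit opcode and two 4-bit register numbers into 16 bits. The field order (OP, then Rn1, then Rn2, most significant bit first) is an assumption based on the machine code listed for ADD (00000000Rn1Rn2); the function name is hypothetical, and how these 16 bits are placed within the 32-bit instruction length is not stated in the slides.

```c
#include <stdint.h>

/* Pack a General-format instruction: [ OP:8 | Rn1:4 | Rn2:4 ].
 * Field order is assumed from the ADD encoding shown in the deck. */
uint16_t encode_general(uint8_t op, unsigned rn1, unsigned rn2)
{
    return (uint16_t)(((unsigned)op << 8)   /* opcode in the high byte   */
                    | ((rn1 & 0xFu) << 4)   /* source register number    */
                    | (rn2 & 0xFu));        /* destination register      */
}
```

Under these assumptions, `ADD R1,R2` with opcode 0x00 encodes as 0x0012.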
11
Instruction sets
ADD Rn1,Rn2
  Machine code: 00000000 Rn1 Rn2
  Rn2=Rn1+Rn2
  Flag: SZVC
ADDC Rn1,Rn2
  Machine code: Rn1Rn2
SUB Rn1,Rn2
  Machine code: Rn1Rn2
  Rn2=Rn2-Rn1
SUBC Rn1,Rn2
  Machine code: Rn1Rn2
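The effect of ADD on the S, Z, V and C flags can be sketched in C as below. The slides only say that ADD updates SZVC; the flag definitions used here are the conventional ones (sign bit of the result, zero result, signed overflow, unsigned carry out) and are an assumption about this CPU. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    unsigned s;  /* sign: bit 31 of the result        */
    unsigned z;  /* zero: result is 0                 */
    unsigned v;  /* overflow: signed result overflows */
    unsigned c;  /* carry: unsigned carry out         */
} alu_flags;

/* Model of ADD Rn1,Rn2: returns the new Rn2 and fills in the flags. */
uint32_t alu_add(uint32_t rn1, uint32_t rn2, alu_flags *f)
{
    uint32_t sum = rn1 + rn2;  /* wraps modulo 2^32 */

    f->s = (sum >> 31) & 1u;
    f->z = (sum == 0);
    /* Overflow: operands share a sign and the result's sign differs. */
    f->v = ((~(rn1 ^ rn2) & (rn1 ^ sum)) >> 31) & 1u;
    /* Carry: the wrapped sum is smaller than an operand. */
    f->c = (sum < rn1);
    return sum;
}
```

For example, adding 1 to 0x7fffffff sets S and V but not C, while adding 1 to 0xffffffff sets Z and C but not V.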
12
Instruction sets
LDI #data,Rn2
  Machine code: 00001000000Rn2#Data
  Flag:
MOV Rn1,Rn2
  Machine code: Rn1Rn2
  Rn2=Rn1
RET
  Machine code:
  PC=[SP--]
JMP #Addr
  Machine code: #Addr
  PC=[Addr]
13
Tools used
Synopsys Design Compiler
Mentor Graphics ModelSim
Synopsys Apollo
TSMC 0.25um standard cell libraries
14
Design Flow
[Flow diagram: CPU specifications → RTL coding → function simulation (test bench) → Design Compiler (constraints) → gate-level simulation (test bench) → Apollo (constraints) → post-layout simulation (test bench) → tape-out]
15
Test vectors
LDI #0x0,R
LDI #0x1,R
LDI #0x2,R
LDI #0x3,R
LDI #0x4,R
LDI #0x5,R
LDI #0x6,R
LDI #0x7,R
LDI #0x8,R
LDI #0x9,R
LDI #0xa,R
LDI #0xb,R
LDI #0xc,R
LDI #0xd,R
LDI #0xe,R
LDI #0xf,R
ADD R0,R
ADDC R2,R
SUB R4,R
SUBC R6,R
MOV R8,R
JMP 0x
16
Simulation result
17
Synthesis results
TSMC 0.25um: Area 0.35 mm², Clock 400 MHz, Power 1.73 mW
UMC 0.18um: Area 0.19 mm², Clock 600 MHz, Power 1 mW
18
Build a GCC Cross Compiler
GCC structure
Knowledge needed to port GCC
Build flow
Build a GCC cross assembler and cross linker
Build a GCC cross compiler
A simple test program
Summary
19
GCC Execution
20
The Structure of Compiler
21
The Structure of GCC
22
GCC Code Generation
The back end matches patterns from the machine description against the intermediate format (RTL).
The machine description works like a template.
The machine description includes:
  type bit widths and memory alignment
  instruction patterns
  register classes
  peephole optimization rules
23
GCC Code Generation (cont’d)
24
Example of RTL
(plus:SI (reg:SI 8) (const_int 123))
Adds two 4-byte integer (SImode) operands.
First operand is a register:
  The register is also a 4-byte integer.
  The register number is 8.
Second operand is a constant integer:
  The value is “123”.
  The mode is VOIDmode (not given).
25
Templates
Used for three purposes:
  Generating RTL from the parse tree.
  Generating machine insns from RTL.
  Specifying parameters about instructions.
Sample template for a RISC machine (figure not reproduced).
26
GCC Porting and Retargeting
Porting to new machines/processors
  Documented in the “Using and Porting the GCC” book, which is self-contained.
  Done by describing the machine, not by describing how to compile for the machine.
Using GCC as a back end for other languages
  Poorly documented.
  Few examples.
  See the GNAT, GNU Cobol, and Fortran ports.
In both cases, copy from similar ports.
27
How to port GCC
In directory gcc-xxx/gcc/config/machine/
machine.h
  Contains C macros that define general attributes of the machine.
machine.md
  Contains RTL expressions that define the instruction set.
  Input to programs that produce .h and .c files.
machine.c
  Machine-dependent functions; normally things too large to put cleanly into the above two files.
28
How to port GCC (cont’d)
29
gcc/config --Architecture characteristic key
H  A hardware implementation does not exist.
M  A hardware implementation is not currently being manufactured.
S  A Free simulator does not exist.
L  Integer registers are narrower than 32 bits.
Q  Integer registers are at least 64 bits wide.
N  Memory is not byte addressable, and/or bytes are not eight bits.
F  Floating point arithmetic is not included in the instruction set.
I  Architecture does not use IEEE format floating point numbers.
C  Architecture does not have a single condition code register.
B  Architecture has delay slots.
D  Architecture has a stack that grows upward.
l  Port cannot use ILP32 mode integer arithmetic.
30
gcc/config --Architecture characteristic key
q  Port can use LP64 mode integer arithmetic.
r  Port can switch between ILP32 and LP64 at runtime. (Not necessarily supported by all subtargets.)
c  Port uses cc0.
p  Port does not use define_peephole.
f  Port does not define prologue and/or epilogue RTL expanders.
g  Port does not define TARGET_ASM_FUNCTION_(PRO|EPI)LOGUE.
m  Port does not use define_constants.
b  Port does not use '"* ..."' notation for output template code.
d  Port uses DFA scheduler descriptions.
h  Port contains old scheduler descriptions.
a  Port generates multiple inheritance thunks using TARGET_ASM_OUTPUT_MI(_VCALL)_THUNK.
t  All insns either produce exactly one assembly instruction, or trigger a define_split.
e  <arch>-elf is not a supported target.
s  <arch>-elf is the correct target to use with the simulator in /cvs/src.
31
gcc/config --Architecture characteristic key
Gcc-config.txt
32
define_peephole
In addition to instruction patterns, the `md' file may contain definitions of machine-specific peephole optimizations.
The combiner does not notice certain peephole optimizations when the data flow in the program does not suggest that it should try them. For example, sometimes two consecutive insns related in purpose can be combined even though the second one does not appear to use a register computed in the first one. A machine-specific peephole optimizer can detect such opportunities.
33
define_splits
Often you can rewrite a single insn as a list of individual insns, each corresponding to one machine instruction.
The compiler splits the insn if there is reason to believe that it might improve instruction or delay slot scheduling.
Splits are evaluated after the combiner pass and before the scheduling passes, so they are the perfect place to put intelligence that optimizes for speed and instruction length.
Ex: If we are loading a small negative constant, we can save space and time by loading the positive value and then sign extending it.
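The small-negative-constant example can be checked with a short C model. This is a sketch under stated assumptions: it supposes a target with a short load-immediate that holds an unsigned 8-bit value and a separate sign-extend-byte instruction. The slide names neither instruction, and all function names here are hypothetical.

```c
#include <stdint.h>

/* Step 1: short load-immediate; the register receives an
 * unsigned 8-bit value, zero-extended to 32 bits. */
uint32_t load_imm8(uint8_t imm)
{
    return imm;
}

/* Step 2: sign-extend the low byte of a register to 32 bits.
 * Written with unsigned arithmetic so the wraparound is well defined. */
uint32_t sign_extend_byte(uint32_t reg)
{
    uint32_t low = reg & 0xFFu;
    return (low ^ 0x80u) - 0x80u;
}

/* The two-instruction sequence a split could emit for a constant
 * in the range -128..-1, instead of one long 32-bit immediate load.
 * Returns the resulting 32-bit register contents. */
uint32_t load_small_negative(int32_t value)
{
    uint8_t low_byte = (uint8_t)(value & 0xFF);  /* the "positive value" */
    return sign_extend_byte(load_imm8(low_byte));
}
```

For any constant from -128 to -1 the sequence leaves the same bit pattern in the register as a full 32-bit load would, which is what makes the split a valid replacement.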
34
define_expand
On some target machines, some standard pattern names for RTL generation cannot be handled with a single insn, but a sequence of RTL insns can represent them. For these target machines, you can write a `define_expand' to specify how to generate the sequence of RTL.
A `define_expand' is an RTL expression that looks almost like a `define_insn'; but, unlike the latter, a `define_expand' is used only for RTL generation and it can produce more than one RTL insn.
The combiner pass only cares about reducing the number of instructions; it does not care about instruction lengths or speeds.
35
define_insn
Push and pop: movsi_push, movsi_pop
Move: movqi_unsigned_register_load, movqi_signed_register_load, *movqi_internal, movhi, movhi_unsigned_register_load, movhi_signed_register_load, *movhi_internal, movsi, movsi_internal, movdi, *movdi_insn, movsf, *movsf_internal, *movsf_constant_store
Signed conversions from a smaller integer to a larger integer: extendqisi2, extendhisi2, zero_extendqisi2, zero_extendhisi2
Addition: add_to_stack, addsi3, addsi_regs, addsi_small_int, addsi_big_int, *addsi_for_reload
Subtraction: subsi3
Multiplication: mulsidi3, umulsidi3, mulhisi3, umulhisi3, mulsi3
Negation: negsi2
Shifts: ashlsi3, ashrsi3, lshrsi3
36
define_insn
Logical operations: andsi3, iorsi3, xorsi3, one_cmplsi2
Comparisons: cmpsi, *cmpsi_internal
Branches: beq, bne, blt, ble, bgt, bge, bltu, bleu, bgtu, bgeu, *branch_true, *branch_false
Calls & jumps: call, call_value, jump, indirect_jump, tablejump
Function prologues and epilogues: prologue, epilogue, return_from_func, leave_func, enter_func
Miscellaneous: nop, blockage
37
define_insn “addsi_regs”
(define_insn "addsi_regs"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "%0")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "add %2, %0"
)
; (set value x)   chapter 9.15, p. 110
;   value = x
; (plus:m x y)
;   x + y, carried out in mode m
38
define_insn “addsi_regs” (cont’d)
; (match_operand:m n predicate constraint)   chapter 10.4, p. 131
;   matches operand number n if the predicate is satisfied
;   n counts from 0
;   for each number n, there is only one match_operand expression
;   predicate is the name of a C function; it returns 0 on failure
;     general_operand: checks that the operand is a constant, a register, or a memory reference
;     register_operand: checks that the operand is a register
;     immediate_operand: checks that the operand is immediate data
;   constraint: describes one kind of operand that is permitted
;     r: register
;     m: any kind of memory operand
;     o: only an offsettable memory operand
;     V: only a memory operand that is not offsettable
;     <: memory operand with autodecrement addressing
;     >: memory operand with autoincrement addressing
;     i: immediate integer operand
;     0-9: an operand that matches the specified operand number is allowed
39
Build a GCC Cross Compiler
[Flow diagram. Binutils: configure → make → make install. GCC: machine description → configure → make → make install → GCC compiler.]
40
Build a GCC Cross Assembler and Cross Linker
Binutils: ver 2.14
configure --target=fr30-elf --prefix=dir
make
make install
41
Build a GCC Cross Compiler
GCC: ver 3.3.1
../configure --target=fr30-elf --prefix=dir --enable-languages=c
make
make install
42
A simple C program to test the cross compiler
int test(int i,int j,int k)
{
    int a;
    int b;
    a= ;
    b= ;
    a+=k;
    b+=j;
    a++;
    b--;
    i += a + b;
    return i;
}
fr30-elf-gcc -S -O2 t.c
43
A simple C program to test the cross compiler (cont’d)
    .file "t.c"
    .text
    .p2align 2
    .globl test
    .type test:
    mov r4, r ;
    ldi:32 # , r ; ;
    ldi:32 # , r ; ;
    add r6, r ;
    add r5, r ;
    add r1, r ;
    add r2, r ;
    ret
    .size test, .-test
    .ident "GCC: (GNU) (cygming special)"
44
A simple C program to test the cross compiler (cont’d)
45
Summary
Studying RTL is more important than studying the MD.
Build the cross assembler and cross linker before building the cross compiler.
There is little documentation on porting GCC as a cross compiler.
Modifying an existing MD is easier than creating a new one.
“The main goal of GCC was to make a good, fast compiler for machines in the class that the GNU system aims to run on: 32-bit machines that address 8-bit bytes and have several general registers.” -- Richard Stallman.
For a GIEE student, designing a new CPU seems easier than building a cross compiler.