
2 ARM System - On - Chip Architecture
INTRODUCTION
ARM is a RISC processor.
It is used in applications that need small size and high performance.
Simple architecture, and therefore low power consumption.

3 TIMELINE (1/2)
1985: Acorn Computer Group manufactures the first commercial RISC microprocessor.
1990: A joint venture involving Acorn and Apple leads to the founding of Advanced RISC Machines (ARM).
1991: ARM6, the first embeddable RISC microprocessor.
1992-1994: Various companies adopt ARM (Sharp, Samsung); in 1993 ARM7, the first multimedia microprocessor, is introduced.

4 TIMELINE (2/2)
1995: Introduction of Thumb and ARM8.
1996-2000: Alcatel, Hyundai, Philips and Sony use ARM; in 1999 ARM cooperates with Ericsson on the development of Bluetooth.
2000-2002: ARM's share of the 32-bit embedded RISC microprocessor market is 80%. ARM Developer Suite is introduced.

5 THE ARM ARCHITECTURE

6 GENERAL INFO (1/2)
AIM: simple design
Load-store architecture
32-bit data bus
3 addressing modes

7 GENERAL INFO (2/2)
Simple architecture + simple instruction set + code density
Result: small size, low power consumption

8 Registers
37 registers in total: 31 general-purpose 32-bit registers (including the PC) and 6 status registers (the CPSR plus five SPSRs).
7 modes of operation.
Each mode sees a different set of registers and has a different level of control over the CPSR.

9 ARM Programming Model
[Figure: register organization by mode]
User mode:       r0-r12, r13, r14, r15 (PC), CPSR (all usable in user mode)
FIQ mode:        r8_fiq-r12_fiq, r13_fiq, r14_fiq, SPSR_fiq
SVC mode:        r13_svc, r14_svc, SPSR_svc
Abort mode:      r13_abt, r14_abt, SPSR_abt
IRQ mode:        r13_irq, r14_irq, SPSR_irq
Undefined mode:  r13_und, r14_und, SPSR_und
The banked registers and the SPSRs are usable in system modes only.

10 CPSR
N: Negative
Z: Zero
C: Carry
V: Overflow
Q: Saturation (for enhanced DSP instructions)
[Figure: ARM CPSR format]
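
The CPSR format figure is not part of the transcript. The sketch below assumes the standard ARM layout, with N, Z, C, V and Q in bits 31 down to 27, and shows how the flags could be read out of a 32-bit CPSR value. The name `decode_cpsr_flags` is illustrative only and does not come from the slides.

```python
# Bit positions follow the standard ARM CPSR layout. This is an assumption,
# because the slide's figure is missing from the transcript.
FLAG_BITS = {"N": 31, "Z": 30, "C": 29, "V": 28, "Q": 27}


def decode_cpsr_flags(cpsr: int) -> dict:
    """Return each condition flag of a 32-bit CPSR value as 0 or 1."""
    return {name: (cpsr >> bit) & 1 for name, bit in FLAG_BITS.items()}


if __name__ == "__main__":
    # 0x60000010 has Z and C set; the low five bits select user mode.
    print(decode_cpsr_flags(0x60000010))
```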

11 Memory Organization
Address bus: 32 bits
1 word = 32 bits

12 Instruction Set
Three instruction types:
Data processing
Data transfer
Control flow

13 Supervisor mode
Code running in user mode cannot perform privileged operations directly; the operating system handles them on its behalf.
A supervisor call switches the processor to system level, where the requested system function is carried out.

14 I/O System
ARM handles peripherals as memory-mapped devices with interrupt support.
Interrupts:
IRQ: normal interrupt
FIQ: fast interrupt

15 Exceptions
Exceptions:
Interrupts
Supervisor calls
Traps
When an exception takes place:
The return address (derived from the PC) is copied to r14_exc, and the CPSR is saved in SPSR_exc.
The operating mode changes to the corresponding exception mode.
The PC is loaded with the exception handler's vector address.
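
The exception entry sequence can be sketched as a small state update. The vector addresses and mode names below are the standard ARM ones rather than values taken from the slide. The CPSR save into the SPSR is standard ARM behavior, included here as an assumption. `take_exception` and the dictionary layout are hypothetical, chosen only for illustration.

```python
# Standard ARM exception vectors: exception -> (mode entered, vector address).
VECTORS = {
    "reset": ("svc", 0x00),
    "undefined": ("und", 0x04),
    "swi": ("svc", 0x08),
    "prefetch_abort": ("abt", 0x0C),
    "data_abort": ("abt", 0x10),
    "irq": ("irq", 0x18),
    "fiq": ("fiq", 0x1C),
}


def take_exception(state: dict, kind: str) -> dict:
    """Return the processor state after entering exception `kind`.

    `state` is a hypothetical model with the keys "pc", "cpsr" and "mode".
    """
    mode, vector = VECTORS[kind]
    new_state = dict(state)
    new_state["r14_" + mode] = state["pc"]     # PC value copied to r14_exc
    new_state["spsr_" + mode] = state["cpsr"]  # assumption: CPSR saved in SPSR_exc
    new_state["mode"] = mode                   # switch to the exception mode
    new_state["pc"] = vector                   # PC takes the vector address
    return new_state


if __name__ == "__main__":
    before = {"pc": 0x8004, "cpsr": 0x10, "mode": "usr"}
    print(take_exception(before, "irq"))
```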

16 ARM programming model
[Figure: register organization by mode, repeated from slide 9]

17 THE ARM INSTRUCTION SET

18 Data Processing Instructions (1/2)
Arithmetic operations:
        ADD     r0, r1, r2      ; r0 := r1 + r2, flags not updated
        ADDS    r0, r1, r2      ; r0 := r1 + r2, flags updated
Logical operations:
        AND     r0, r1, r2      ; r0 := r1 AND r2
Register movement:
        MOV     r0, r2          ; r0 := r2
Comparison:
        CMP     r1, r2          ; set flags on r1 - r2
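
The difference between ADD and ADDS is the flag update. As a rough model, assuming 32-bit two's-complement registers, the flags produced by ADDS could be computed as follows. The function name `adds` is illustrative only.

```python
MASK32 = 0xFFFFFFFF


def adds(a: int, b: int):
    """Model ADDS on two 32-bit operands: return (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b                      # unbounded Python integer sum
    result = full & MASK32            # what the 32-bit register keeps
    flags = {
        "N": result >> 31,            # sign bit of the result
        "Z": int(result == 0),
        "C": int(full > MASK32),      # carry out of bit 31
        # Signed overflow: both operands differ in sign from the result.
        "V": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags


if __name__ == "__main__":
    print(adds(0x7FFFFFFF, 1))
```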

19 Data Processing Instructions (2/2)
Operands:
Immediate operands:
        ADD     r3, r3, #1              ; r3 := r3 + 1
Shifted register operands:
        ADD     r3, r2, r1, LSL #3      ; r3 := r2 + 8 * r1
Miscellaneous data processing instructions:
Multiplication:
        MUL     r4, r3, r2              ; r4 := r3 * r2 (low 32 bits)
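
A shifted register operand passes the second source register through the shifter before the addition. A minimal model of `ADD rd, rn, rm, LSL #shift` is sketched below. It assumes 32-bit registers and a shift amount between 0 and 31, and the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def lsl(value: int, amount: int) -> int:
    """Logical shift left of a 32-bit value; bits shifted past bit 31 are lost."""
    return (value << amount) & MASK32


def add_shifted(rn: int, rm: int, shift: int) -> int:
    """Model ADD rd, rn, rm, LSL #shift and return the value written to rd."""
    return (rn + lsl(rm, shift)) & MASK32


if __name__ == "__main__":
    # ADD r3, r2, r1, LSL #3 with r2 = 2 and r1 = 1
    print(add_shifted(2, 1, 3))  # 2 + 8 * 1 = 10
```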

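20 Data transfer instructions
Load and store instructions:
        LDR     r0, [r1]            ; r0 := mem32[r1]
        STR     r0, [r1]            ; mem32[r1] := r0
Offset:
        LDR     r0, [r1, #4]        ; r0 := mem32[r1 + 4]
Post-indexed:
        LDR     r0, [r1], #16       ; r0 := mem32[r1], then r1 := r1 + 16
Auto-indexed (pre-indexed with write-back):
        LDR     r0, [r1, #16]!      ; r1 := r1 + 16, then r0 := mem32[r1]
Multiple data transfers:
        LDMIA   r1, {r0, r2, r5}    ; load r0, r2, r5 from consecutive words starting at r1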

21 Examples (auto-indexed)
PRE:    r0 = 0x00000000
        r1 = 0x00009000
        mem32[0x00009000] = 0x01010101
        mem32[0x00009004] = 0x02020202
        LDR     r0, [r1, #4]!
POST:   r0 = 0x02020202
        r1 = 0x00009004

22 Examples (offset)
PRE:    r0 = 0x00000000
        r1 = 0x00009000
        mem32[0x00009000] = 0x01010101
        mem32[0x00009004] = 0x02020202
        LDR     r0, [r1, #4]
POST:   r0 = 0x02020202
        r1 = 0x00009000

23 Examples (post-indexed)
PRE:    r0 = 0x00000000
        r1 = 0x00009000
        mem32[0x00009000] = 0x01010101
        mem32[0x00009004] = 0x02020202
        LDR     r0, [r1], #4
POST:   r0 = 0x01010101
        r1 = 0x00009004
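
The three LDR examples differ only in when the base register is updated. The sketch below models memory as a dictionary of word addresses and registers as a dictionary of names. Both choices, and the function names, are assumptions made for illustration.

```python
def ldr_offset(mem, regs, rd, rn, offset):
    """LDR rd, [rn, #offset]: the base register is left unchanged."""
    regs[rd] = mem[regs[rn] + offset]


def ldr_pre_indexed(mem, regs, rd, rn, offset):
    """LDR rd, [rn, #offset]!: the base is updated first, then used."""
    regs[rn] += offset
    regs[rd] = mem[regs[rn]]


def ldr_post_indexed(mem, regs, rd, rn, offset):
    """LDR rd, [rn], #offset: the old base is used, then updated."""
    regs[rd] = mem[regs[rn]]
    regs[rn] += offset


if __name__ == "__main__":
    mem = {0x9000: 0x01010101, 0x9004: 0x02020202}
    regs = {"r0": 0, "r1": 0x9000}
    ldr_post_indexed(mem, regs, "r0", "r1", 4)
    print(hex(regs["r0"]), hex(regs["r1"]))  # 0x1010101 0x9004
```

Running each function on the PRE state from the slides gives the same POST values as shown there.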

24 Examples (LDMIA)
PRE:    mem32[0x80018] = 0x03
        mem32[0x80014] = 0x02
        mem32[0x80010] = 0x01
        r0 = 0x00080010
        LDMIA   r0!, {r1-r3}
POST:   r0 = 0x0008001c
        r1 = 0x00000001
        r2 = 0x00000002
        r3 = 0x00000003

25 Examples (LDMIB)
PRE:    mem32[0x8001c] = 0x04
        mem32[0x80018] = 0x03
        mem32[0x80014] = 0x02
        mem32[0x80010] = 0x01
        r0 = 0x00080010
        LDMIB   r0!, {r1-r3}
POST:   r0 = 0x0008001c
        r1 = 0x00000002
        r2 = 0x00000003
        r3 = 0x00000004
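
LDMIA increments the address after each transfer and LDMIB increments it before. A small model of both, assuming word transfers and registers indexed by number, is given below. The function `ldm` and its parameters are hypothetical.

```python
def ldm(mem, regs, base, reg_list, before=False, writeback=True):
    """Model LDMIA (before=False) or LDMIB (before=True).

    The lowest-numbered register is always loaded from the lowest address.
    """
    addr = regs[base]
    for reg in sorted(reg_list):
        if before:
            addr += 4            # increment before: step, then load
        regs[reg] = mem[addr]
        if not before:
            addr += 4            # increment after: load, then step
    if writeback:                # the "!" suffix on the base register
        regs[base] = addr


if __name__ == "__main__":
    mem = {0x80010: 1, 0x80014: 2, 0x80018: 3, 0x8001C: 4}
    regs = {0: 0x80010}
    ldm(mem, regs, base=0, reg_list=[1, 2, 3])  # LDMIA r0!, {r1-r3}
    print(regs)
```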

26 Conditional execution
Instructions can be executed conditionally, without branches:
        CMP     r2, r3          ; subtract and set flags
        ADDGE   r4, r5, r6      ; if r2 >= r3
        SUBLT   r4, r5, r6      ; else (r2 < r3)

27 Conditional execution mnemonics
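
The mnemonic table from this slide is not part of the transcript. As a stand-in, the sketch below lists the standard ARM condition codes as predicates on the N, Z, C and V flags, together with a model of the flags that CMP produces. The mapping follows the ARM architecture. The Python names are illustrative only.

```python
MASK32 = 0xFFFFFFFF

# Standard ARM condition codes expressed as predicates on the flags.
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,                      # equal
    "NE": lambda f: f["Z"] == 0,                      # not equal
    "CS": lambda f: f["C"] == 1,                      # unsigned higher or same
    "CC": lambda f: f["C"] == 0,                      # unsigned lower
    "MI": lambda f: f["N"] == 1,                      # negative
    "PL": lambda f: f["N"] == 0,                      # positive or zero
    "VS": lambda f: f["V"] == 1,                      # overflow
    "VC": lambda f: f["V"] == 0,                      # no overflow
    "HI": lambda f: f["C"] == 1 and f["Z"] == 0,      # unsigned higher
    "LS": lambda f: f["C"] == 0 or f["Z"] == 1,       # unsigned lower or same
    "GE": lambda f: f["N"] == f["V"],                 # signed >=
    "LT": lambda f: f["N"] != f["V"],                 # signed <
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"], # signed >
    "LE": lambda f: f["Z"] == 1 or f["N"] != f["V"],  # signed <=
    "AL": lambda f: True,                             # always
}


def cmp_flags(a: int, b: int) -> dict:
    """Flags produced by CMP a, b, which computes a - b on 32-bit values."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "N": result >> 31,
        "Z": int(result == 0),
        "C": int(a >= b),                        # carry set means no borrow
        "V": ((a ^ b) & (a ^ result)) >> 31,     # signed overflow on subtract
    }


def condition_passes(cond: str, flags: dict) -> bool:
    """Return True if an instruction with suffix `cond` would execute."""
    return CONDITIONS[cond](flags)


if __name__ == "__main__":
    flags = cmp_flags(5, 3)
    print(condition_passes("GE", flags), condition_passes("LT", flags))
```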

28 Control flow instructions
Branch instruction:
        B       label
Conditional branch:
        BNE     label
Branch and link:
        BL      label
Example:
        BL      loop
        …
loop    …
        …
        MOV     pc, r14         ; return

29 Example 1
        AREA    ARMex, CODE, READONLY   ; Name this block of code ARMex
        ENTRY                           ; Mark first instruction to execute
start   MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        ADD     r0, r0, r1              ; r0 = r0 + r1
stop    MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0x123456                ; ARM semihosting SWI
        END                             ; Mark end of file

30 Example 2
        AREA    subrout, CODE, READONLY ; Name this block of code
        ENTRY                           ; Mark first instruction to execute
start   MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        BL      doadd                   ; Call subroutine
stop    MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0x123456                ; ARM semihosting SWI
doadd   ADD     r0, r0, r1              ; Subroutine code
        MOV     pc, lr                  ; Return from subroutine
        END                             ; Mark end of file

31 ARM ORGANIZATION AND IMPLEMENTATION

32 3-Stage Pipeline (ARM7, 80 MHz)
Fetch
Decode
Execute
Throughput: 1 instruction / cycle

33 5-stage pipeline (1/2)
Program execution time: T_prog = (N_inst × CPI) / f_clk
Ways to reduce it:
Increase the clock rate f_clk: requires logic simplification in each pipeline stage.
Reduce CPI: reduce the number of multi-cycle instructions.
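
The equation image on this slide is missing from the transcript. The sketch assumes the usual relation T_prog = N_inst × CPI / f_clk and shows how a higher clock rate and a lower CPI both shorten execution time. The instruction count and CPI values are hypothetical and are not measurements of any ARM core.

```python
def program_time(n_inst: int, cpi: float, f_clk_hz: float) -> float:
    """Execution time in seconds: T_prog = N_inst * CPI / f_clk."""
    return n_inst * cpi / f_clk_hz


if __name__ == "__main__":
    n = 1_000_000                        # hypothetical instruction count
    t3 = program_time(n, 1.9, 80e6)      # hypothetical CPI at 80 MHz
    t5 = program_time(n, 1.5, 150e6)     # hypothetical CPI at 150 MHz
    print(f"80 MHz: {t3 * 1e3:.2f} ms, 150 MHz: {t5 * 1e3:.2f} ms")
```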

34 5-stage pipeline (ARM9, 150 MHz) (2/2)
Fetch
Decode
Execute
Buffer / Data
Write-back

35 ARM coprocessor interface
ARM supports up to 16 coprocessors, which can be emulated in software.
Each coprocessor has up to 16 general-purpose registers.
ARM is a load-store architecture.
Coprocessors usually handle on-chip functions, such as cache and memory management.

36 ARCHITECTURAL SUPPORT FOR HIGH – LEVEL LANGUAGES

37 Floating-point accelerator (1/2)
For floating-point operations, ARM has the FPE software emulator and the FPA10 hardware floating-point accelerator.
FPA10 includes:
Coprocessor interface
Load/store unit
Register bank (eight 80-bit registers)
ALU (adder, multiplier, divider)

38 Floating-point accelerator (2/2)

39 APCS (1/2)
APCS (ARM Procedure Call Standard) is a set of rules governing procedure entry and exit in C code.
It assigns specific roles to the general-purpose registers (r0-r3: arguments, r4-r8: register variables, r10: stack limit, etc.).
Procedure entry and exit:
        BL      Loop            ; call: return address saved in lr (r14)
        …
Loop    …
        MOV     pc, lr          ; return

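40 APCS (2/2)
C code:
        void f1(int a) { f2(a); }
Assembly code (the argument is passed on a full descending stack, as on the slide):
f1      LDR     r0, [r13]               ; load argument a from the stack
        STR     r14, [r13, #-4]!        ; push the return address
        STR     r0, [r13, #-4]!         ; push a as the argument for f2
        BL      f2                      ; call f2
        ADD     r13, r13, #4            ; discard the pushed argument
        LDR     r15, [r13], #4          ; pop the return address into the PC
[Figure: stack pointer positions 0, 4, 8, 16]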

41 THUMB PROGRAMMER’S MODEL

42 General information
Thumb objective: code density.
Thumb has a 16-bit instruction set: a subset of the ARM instruction set is encoded in 16 bits.
With appropriate use, great benefits can be achieved in terms of:
Power efficiency
Enhanced performance

43 Going in and out of Thumb mode
Use the BX instruction in ARM state, e.g. BX r0.
If r0[0] is 1, the T bit in the CPSR is set to 1 and the PC is loaded with the address formed by the remaining bits of r0.
Instructions are assembled as 16-bit Thumb instructions when the appropriate assembler directive is used.
A BX instruction executed in Thumb state, with bit 0 of the register clear, returns to ARM state.
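
The state switch performed by BX depends only on bit 0 of the target register. A minimal model, assuming a 32-bit register value and ignoring alignment rules for ARM-state targets, is sketched below. The function name `bx` is illustrative.

```python
def bx(target: int):
    """Model BX: return (new_pc, thumb_state) for a 32-bit register value.

    Bit 0 selects the state (1 = Thumb, 0 = ARM); the remaining bits
    form the branch address.
    """
    thumb_state = bool(target & 1)
    new_pc = target & 0xFFFFFFFE      # clear bit 0 to obtain the address
    return new_pc, thumb_state


if __name__ == "__main__":
    print(bx(0x8001))   # branch to 0x8000 in Thumb state
    print(bx(0x8000))   # branch to 0x8000 in ARM state
```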

44 The Thumb programmer’s model
[Figure: Thumb registers]

45 ARM vs. Thumb (1/3)
Thumb:
Code size about 70% of the equivalent ARM code
40% more instructions
45% faster code with 16-bit memory
Requires about 30% less external memory
ARM:
40% faster code when coupled with a 32-bit memory

46 ARM vs. Thumb (2/3)
If performance is critical: ARM
If cost and power consumption are critical: Thumb

47 ARM and Thumb interaction
A 32-bit ARM system can switch to Thumb mode for specific routines in order to meet power and memory constraints.
A 16-bit system can use an on-chip 32-bit memory for ARM-state routines, and a 16-bit off-chip memory with Thumb code for the rest of the application.

48 Example 3
        AREA    ThumbSub, CODE, READONLY ; Name this block of code
        ENTRY                           ; Mark first instruction to execute
        CODE32                          ; Subsequent instructions are ARM
header  ADR     r0, start + 1           ; Processor starts in ARM state,
        BX      r0                      ; so small ARM code header used
                                        ; to call Thumb main program
        CODE16                          ; Subsequent instructions are Thumb
start   MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        BL      doadd                   ; Call subroutine
stop    MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0xAB                    ; Thumb semihosting SWI
doadd   ADD     r0, r0, r1              ; Subroutine code
        MOV     pc, lr                  ; Return from subroutine
        END                             ; Mark end of file

49 Example 4
Implement the following pseudocode in ARM and Thumb assembly. Which is more efficient in terms of execution time, and which in terms of code size?
        if r1 > r2 then
            r3 = r4 + r5
            r6 = r4 - r5
        else
            r3 = r4 - r5
            r6 = r4 + r5

50 Example 5
Write an ARM assembly program that loads data from memory location 0x40, sets bits 3 to 5, clears bits 0 to 2 and leaves the remaining bits unchanged. Test it using 0xAD as input data.
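
To check an assembly solution to Example 5, the expected result can be computed with a short reference model. This sketch is not the requested ARM program. It only reproduces the bit manipulation, which in ARM assembly would typically correspond to an ORR followed by a BIC. The mask names and `transform` are hypothetical.

```python
SET_MASK = 0b00111000     # bits 3 to 5
CLEAR_MASK = 0b00000111   # bits 0 to 2


def transform(value: int) -> int:
    """Set bits 3-5, clear bits 0-2, leave all other bits unchanged."""
    value |= SET_MASK                  # like ORR with 0x38
    value &= ~CLEAR_MASK & 0xFFFFFFFF  # like BIC with 0x07
    return value


if __name__ == "__main__":
    print(hex(transform(0xAD)))   # 0xb8
```

For the test input 0xAD (binary 1010 1101) the expected output is 0xB8 (binary 1011 1000).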

51 ARCHITECTURAL SUPPORT FOR SYSTEM DEVELOPMENT

52 The ARM memory interface A basic ARM memory system

53 AMBA (1/4)
Advanced Microcontroller Bus Architecture (AMBA):
Advanced High-performance Bus (AHB)
Advanced System Bus (ASB)
Advanced Peripheral Bus (APB)
AMBA objectives:
Technology independence
To encourage modular system design

54 AMBA (2/4)
[Figure: a typical AMBA-based system]

55 AMBA (3/4)
AHB bus:
Burst transactions
Split transactions
Data bus: 64 to 128 bits

56 AMBA (4/4)
AMBA Design Kit (ADK): an environment that assists designers in developing AMBA-based components and SoC designs.

57 Signal Processing Support (1/2)
Piccolo DSP coprocessor.
Various data memories for maximizing throughput.

58 Signal Processing Support (2/2) Piccolo

59 MEMORY HIERARCHY

60 Memory hierarchy
Moving down the table: larger size, lower speed.
Memory type      Size             Speed
Registers        32-bit           a few ns
On-chip cache    8-32 Kbytes      10 ns
Off-chip cache   100-200 Kbytes   10-30 ns
RAM              Mbytes           100 ns

61 On-chip memory
Necessary for performance.
Some systems prefer on-chip RAM to an on-chip cache: it is simpler, cheaper and less power-hungry.

62 Cache types
Cache types:
Unified cache
Separate instruction and data caches
Performance: hit rate and miss rate
Compulsory miss: the first time an address is accessed
Capacity miss: the cache is full
Conflict miss: two addresses compete for the same place in the cache

63 Replacement policy and implementation
Replacement policy:
Least Recently Used (LRU)
Least Frequently Used (LFU)
Data prediction
Implementation:
Fully associative
Direct-mapped
Set-associative

64 Direct-mapped cache (1/2)
A line of data is stored together with its address tag in a location selected by the index bits of the address.

65 Direct-mapped cache (2/2)
Each memory location has one specific place in the cache.
Tag and data can be accessed at the same time.
The tag RAM is smaller than the data RAM and therefore faster, so the tag comparison can complete within the data RAM access time.
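
A direct-mapped lookup can be sketched as splitting the address into a tag and an index and comparing the stored tag. The cache geometry below (4 lines of 16 bytes) is a hypothetical choice made to keep the example small, and the class name is illustrative only.

```python
class DirectMappedCache:
    """Tag store of a direct-mapped cache; data contents are not modeled."""

    def __init__(self, num_lines: int = 4, line_bytes: int = 16):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines    # one tag per cache line

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the line and return False."""
        line_number = address // self.line_bytes
        index = line_number % self.num_lines   # the only place this line can go
        tag = line_number // self.num_lines
        hit = self.tags[index] == tag
        self.tags[index] = tag                 # a miss replaces the old line
        return hit


if __name__ == "__main__":
    cache = DirectMappedCache()
    # 0x00 and 0x40 map to the same index, so they evict each other.
    print([cache.access(a) for a in (0x00, 0x04, 0x40, 0x00)])
```

In this trace the second access hits because it falls in the same line as the first, while the last access misses: 0x40 replaced the line holding 0x00, which is a conflict miss.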

66 2 – way set – associative cache. (1/3)

67 Set-associative cache (2/3)
A set-associative cache offers n possible places (ways) for each index, yielding an n-way set-associative cache.
Two addresses that would compete for the same spot in a direct-mapped cache can be stored in different ways and accessed independently.

68 Set-associative cache (3/3)
Selecting which way to replace:
Random allocation
Least recently used (LRU)
Round-robin (cyclic)
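
The effect of associativity and LRU replacement can be illustrated with a small model. The geometry (4 sets, 2 ways, 16-byte lines) and the class name are hypothetical. An `OrderedDict` per set keeps the tags in least-recently-used order.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Tag store of an n-way set-associative cache with LRU replacement."""

    def __init__(self, num_sets: int = 4, ways: int = 2, line_bytes: int = 16):
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        # One ordered mapping per set; the oldest entry is the LRU line.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, allocate a line and return False."""
        line_number = address // self.line_bytes
        index = line_number % self.num_sets
        tag = line_number // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)        # mark as most recently used
            return True
        if len(lines) == self.ways:
            lines.popitem(last=False)     # evict the least recently used line
        lines[tag] = True
        return False


if __name__ == "__main__":
    cache = SetAssociativeCache()
    # 0x00 and 0x40 share a set but can now coexist in its two ways.
    print([cache.access(a) for a in (0x00, 0x40, 0x00, 0x40)])
```

With two ways, the two addresses that conflict in a direct-mapped cache of the same number of sets both hit on their second access.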

69 Fully associative (1/2)

70 Write strategies
Write-through: all write operations are passed to main memory.
Write-through with buffered write: write operations are passed to main memory through the write buffer.
Copy-back (write-back): write operations update only the cache; main memory is updated later, when the modified line is replaced.
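
The difference in main-memory traffic between write-through and copy-back can be estimated with a rough count. The sketch makes strong simplifying assumptions: the cache is large enough that no line is replaced during the sequence, and every modified line is written back exactly once at the end. Write-through counts individual store operations while copy-back counts whole-line write-backs, so the two numbers are not in the same unit. The function name and the 16-byte line size are hypothetical.

```python
def memory_writes(store_addresses, policy: str, line_bytes: int = 16) -> int:
    """Count main-memory write operations for a sequence of store addresses."""
    if policy == "write-through":
        # Every store is passed on to main memory.
        return len(store_addresses)
    if policy == "copy-back":
        # Stores only mark lines as modified; each modified line is
        # written back once when it is finally replaced or flushed.
        modified_lines = {addr // line_bytes for addr in store_addresses}
        return len(modified_lines)
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    stores = [0x00, 0x04, 0x08, 0x10]
    print(memory_writes(stores, "write-through"), memory_writes(stores, "copy-back"))
```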

71 Cache feature summary

72 ‘Perfect’ cache performance

73 MMU (1/3)
Two memory management approaches:
Segmentation
Paging

74 MMU (2/3)
Segmented memory management:

75 MMU (3/3)
Paging memory management:

76 ARCHITECTURAL SUPPORT FOR OPERATING SYSTEMS

77 CP15
On-chip coprocessor that controls the MMU, cache and protection unit.
Control takes place through CP15 registers, accessed with coprocessor register-transfer instructions (MRC, MCR) executed in supervisor mode.

78 Protection Unit
Simpler alternative to the MMU.
Requires simpler software and hardware.
Does not use translation tables; it uses 8 protection regions instead.

79 ARM DEVELOPER SUITE

80 ARMULATOR (1/2)
ARMulator: an emulator of various ARM processors.
Allows project development in C, C++ or assembly.
It comes with a debugger, compilers and an assembler; the complete toolset is called the ARM Developer Suite (ADS).

81 ARMULATOR (2/2)
Possible project options:
ARM and Thumb interworking
Mixing C, C++ and assembly
Code for ROM
Exception handlers
MM

82 ARMULATOR TUTORIAL
CODEWARRIOR ENVIRONMENT
