1
Embedded System Design Center
ARM Processor ARM7TDMI Sai Kumar Devulapalli M.S. Ramaiah School of Advanced Studies
2
The Birth of ARM. Acorn could not find a processor on the market that met its needs, so it decided to design a new one. Designing a new processor normally demands great investment and experience. Luckily, the papers describing the Berkeley RISC I design had been published. After some custom modifications by Acorn, a new RISC processor was born: the ARM (Acorn RISC Machine, later Advanced RISC Machine).
3
History of ARM
Acorn, a computer manufacturer
1983: Acorn Limited held a dominant position in the UK personal computer market with the Rockwell 6502 (8-bit) CPU. The 16-bit CISC CPUs then available were slower than standard memory parts and had long interrupt latencies.
Acorn designed the first commercial RISC CPU: the Acorn RISC Machine (ARM).
1990: Advanced RISC Machines was formed to broaden the market beyond Acorn's product range.
4
History of ARM (continued)
1990: Startup with 12 engineers and 1 CEO. No patents, no customers, very little money.
Mid-1990s: T.I. licensed the ARM7, which was incorporated into a chip for mobile phones.
Spring 1998: IPO, creating 13 millionaires.
5
What is RISC/CISC?
Reduced Instruction Set Computer (RISC)
  Fewer instructions available.
  Fewer addressing modes.
  For example: ARM, NEC VR series.
Complex Instruction Set Computer (CISC)
  More instructions available.
  Many addressing modes.
  For example: Intel x86.
6
Advantages of RISC
Smaller die size: simple instructions allow a simple processor, which requires fewer transistors.
Shorter development time: a simple processor takes less effort to design.
Higher performance?
Disadvantage: a more complex compiler.
7
The ARM programmers' model
ARM is a Reduced Instruction Set Computer (RISC). It has:
  a large, regular register file, in which any register can be used for any purpose
  a load-store architecture: instructions which reference memory just move data and do no processing; processing uses values in registers only
  fixed-length instructions: the 32-bit ARM instruction set and the 16-bit Thumb instruction set
8
Main Features
A large set of general-purpose registers
A load-store architecture
3-address instructions
Conditional execution of every instruction
Very powerful load and store multiple register instructions
Ability to perform a general shift and a general ALU operation in one instruction that executes in one clock cycle
9
ARM7TDMI
The ARM7TDMI is the current low-end ARM core. It is widely used across a range of applications, notably in digital mobile telephones.
The origin of the name ARM7TDMI:
  ARM7: a 3 volt compatible rework of the ARM6 32-bit integer core.
  T: the Thumb 16-bit compressed instruction set.
  D: on-chip Debug support, enabling the processor to halt in response to a debug request.
  M: an enhanced Multiplier, with higher performance than its predecessors and yielding a full 64-bit result. Four extra instructions are provided, which perform 32 x 32 -> 64 multiplications and 32 x 32 + 64 -> 64 multiply-accumulates.
  I: EmbeddedICE hardware, giving on-chip breakpoint and watchpoint support.
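The 64-bit multiply results mentioned above can be modelled in a few lines. This is a minimal Python sketch of the arithmetic only, not of the hardware; the function names borrow the ARM mnemonics UMULL, SMULL and UMLAL, and the (RdLo, RdHi) return order is a choice made for this illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def umull(rm, rs):
    """Unsigned 32 x 32 -> 64 multiply; returns (RdLo, RdHi)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, (product >> 32) & MASK32


def smull(rm, rs):
    """Signed 32 x 32 -> 64 multiply; returns (RdLo, RdHi)."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, (product >> 32) & MASK32


def umlal(rdlo, rdhi, rm, rs):
    """Unsigned multiply-accumulate: 64-bit {RdHi, RdLo} += Rm * Rs."""
    acc = ((rdhi & MASK32) << 32) | (rdlo & MASK32)
    acc = (acc + (rm & MASK32) * (rs & MASK32)) & MASK64
    return acc & MASK32, acc >> 32
```

For instance, umull(0xFFFFFFFF, 0xFFFFFFFF) needs both result words, which is exactly the case a 32-bit-only multiplier cannot represent.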
10
Data Types
Byte (8-bit): placed on any byte boundary.
Half-word (16-bit): aligned to two-byte boundaries.
Word (32-bit): aligned to four-byte boundaries.
11
Processor Modes
The ARM has six operating modes:
  User (unprivileged mode under which most tasks run)
  Fast interrupt request mode, FIQ (entered when a high-priority (fast) interrupt is raised)
  Interrupt mode, IRQ (entered when a low-priority (normal) interrupt is raised)
  Supervisor mode, SVC (entered on reset and when a Software Interrupt instruction is executed)
  Abort mode, ABT (used to handle memory access violations)
  Undefined mode, UND (used to handle undefined instructions)
ARM Architecture Version 4 adds a seventh mode:
  System mode, SYS (privileged mode using the same registers as User mode)
12
ARM programming model
[Figure: the registers r0 to r14 and r15 (PC), shown alongside the CPSR, whose top bits (from bit 31 down) hold the N, Z, C and V flags.]
13
Endianness
The relationship between bit and byte/word ordering defines endianness:
  little-endian: bit 31 | byte 3 | byte 2 | byte 1 | byte 0 | bit 0
  big-endian:    bit 31 | byte 0 | byte 1 | byte 2 | byte 3 | bit 0
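The two byte orderings can be seen by storing the same word both ways. The following is a small Python sketch using the standard struct module; the value 0x11223344 is just an illustrative choice, and index 0 of each byte string stands for the lowest memory address (byte 0).

```python
import struct

word = 0x11223344

# Byte 0 (lowest address) comes first in each byte string.
little = struct.pack("<I", word)  # little-endian: byte 0 holds bits 7..0
big = struct.pack(">I", word)     # big-endian: byte 0 holds bits 31..24

print(little.hex())
print(big.hex())
```

Reading the bytes back with the wrong ordering yields a different word, which is why both sides of a memory interface must agree on endianness.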
14
The Instruction Pipeline
The ARM uses a pipeline to increase the speed of the flow of instructions to the processor. This allows several operations to be undertaken simultaneously, rather than serially.
Rather than pointing to the instruction being executed, the PC points to the instruction being fetched:
  PC      FETCH    Instruction fetched from memory
  PC - 4  DECODE   Decoding of registers used in instruction
  PC - 8  EXECUTE  Register(s) read from register bank; shift and ALU operation; register(s) written back to register bank
Notes:
Each instruction is one word (32 bits, or 4 bytes), so adjacent pipeline stages are one word apart; hence the offsets of 4 and 8 used here.
Most instructions execute in a single cycle, which helps to keep the pipeline operating efficiently: it stalls only if the executing instruction takes several cycles. Thus in every cycle the processor can be fetching one instruction, decoding another and executing a third.
Typically the PC can be assumed to be the address of the current instruction plus 8. Cases where this does not hold include:
  When an exception is taken, the address stored in LR varies (see the Exception Handling module for more details).
  When the PC is used in some data processing operations, its value is unpredictable (see the datasheet).
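The PC, PC - 4 and PC - 8 relationship can be traced cycle by cycle. This is a simplified Python sketch that assumes every instruction takes a single cycle and that there are no branches or stalls; the function names and the start address 0x8000 are made up for the illustration.

```python
def stage_address(start, index, count):
    """Address of instruction number `index`, or None if no such instruction."""
    return start + 4 * index if 0 <= index < count else None


def pipeline_trace(start, count):
    """Trace a 3-stage pipeline over `count` single-cycle ARM instructions."""
    rows = []
    for cycle in range(count + 2):  # two extra cycles drain the pipeline
        rows.append({
            "cycle": cycle,
            "pc": start + 4 * cycle,  # the PC tracks the fetch address
            "fetch": stage_address(start, cycle, count),
            "decode": stage_address(start, cycle - 1, count),
            "execute": stage_address(start, cycle - 2, count),
        })
    return rows


for row in pipeline_trace(0x8000, 3):
    print(row)
```

In every row where an instruction is executing, the PC is 8 bytes ahead of it, which is the "current instruction plus 8" rule.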
15
CPU Pipeline Stages
Fetch
  Instruction is fetched from memory and placed in the instruction pipeline.
  In a data transfer instruction, the address is sent to the address register.
Decode
  Instruction is decoded.
  Datapath control signals are prepared for the next cycle.
  Instruction owns the decode logic but not the datapath.
  In data transfer instructions, the ALU holds the address component to compute the auto-indexing modification if required.
Execute
  Instruction owns the datapath.
  Register bank is read.
  An operand is shifted.
  ALU result is generated.
  Result is written back into the destination register.
16
ARM7TDMI core
17
The Registers
ARM has 37 registers in total, all of which are 32 bits long:
  30 general-purpose registers
  5 dedicated saved program status registers
  1 dedicated program counter
  1 dedicated current program status register
However, these are arranged into several banks, with the accessible bank being governed by the processor mode. Each mode can access:
  a particular set of r0-r12 registers
  a particular r13 (the stack pointer) and r14 (the link register)
  r15 (the program counter)
  cpsr (the current program status register)
The exception-handling modes (the privileged modes other than System) can also access a particular spsr (saved program status register).
18
30 general-purpose, 32-bit registers
Fifteen general-purpose registers are visible at any one time, depending on the current processor mode, as r0, r1, ... ,r13, r14. By convention, r13 is used as a stack pointer (sp) in ARM assembly language. The C and C++ compilers always use r13 as the stack pointer. In User mode, r14 is used as a link register (lr) to store the return address when a subroutine call is made. It can also be used as a general-purpose register if the return address is stored on the stack. In the exception handling modes, r14 holds the return address for the exception, or a subroutine return address if subroutine calls are executed within an exception. r14 can be used as a general-purpose register if the return address is stored on the stack.
19
Saved Program Status Registers (SPSRs)
The SPSRs are used to store the CPSR when an exception is taken. One SPSR is accessible in each of the exception-handling modes. User mode and System mode do not have an SPSR because they are not exception-handling modes.
20
The program counter (pc)
The program counter is accessed as r15 (or pc). It is incremented by one word (four bytes) for each instruction in ARM state, or by two bytes in Thumb state. Branch instructions load the destination address into the program counter. You can also load the program counter directly using data processing instructions. For example, to return from a subroutine, you can copy the link register into the program counter using:
  MOV pc,lr
During execution, r15 does not contain the address of the currently executing instruction. The address of the currently executing instruction is typically pc - 8 for ARM, or pc - 4 for Thumb.
21
The Current Program Status Register (CPSR)
The CPSR holds:
  copies of the Arithmetic Logic Unit (ALU) status flags
  the current processor mode
  interrupt disable flags
The ALU status flags in the CPSR are used to determine whether conditional instructions are executed or not. On Thumb-capable processors, the CPSR also holds the current processor state (ARM or Thumb).
22
ARM Register Organisation
General registers and program counter. Each mode sees r0-r15; the banked registers listed below replace the User-mode registers of the same number:
  User32 / System:  r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r13 (sp), r14 (lr), r15 (pc)
  FIQ32:            r8_fiq, r9_fiq, r10_fiq, r11_fiq, r12_fiq, r13_fiq, r14_fiq
  Supervisor32:     r13_svc, r14_svc
  Abort32:          r13_abt, r14_abt
  IRQ32:            r13_irq, r14_irq
  Undefined32:      r13_undef, r14_undef
Program status registers:
  All modes:        cpsr
  FIQ32:            spsr_fiq
  Supervisor32:     spsr_svc
  Abort32:          spsr_abt
  IRQ32:            spsr_irq
  Undefined32:      spsr_undef
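The banking scheme can be checked by counting the physical registers it implies. Below is a Python sketch of the register organisation listed on this slide; the dictionary layout, the mode keys and the helper names are assumptions made for the illustration.

```python
# Registers that each mode banks (replaces with its own private copy).
BANKED = {
    "user": [],
    "system": [],
    "fiq": ["r8", "r9", "r10", "r11", "r12", "r13", "r14"],
    "svc": ["r13", "r14"],
    "abt": ["r13", "r14"],
    "irq": ["r13", "r14"],
    "undef": ["r13", "r14"],
}


def physical_register(mode, name):
    """Map an architectural register name to the physical register used in `mode`."""
    return f"{name}_{mode}" if name in BANKED[mode] else name


def all_physical_registers():
    """Collect every distinct register reachable from any mode."""
    regs = {"cpsr"}
    for mode in BANKED:
        for i in range(16):
            regs.add(physical_register(mode, f"r{i}"))
        if mode not in ("user", "system"):
            regs.add(f"spsr_{mode}")  # User and System have no SPSR
    return regs


print(len(all_physical_registers()))
```

The count matches the 37 registers quoted earlier: 30 general-purpose registers, the program counter, the CPSR and five SPSRs.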
23
Accessing Registers using ARM Instructions
No breakdown of currently accessible registers: all instructions can access r0-r14 directly.
Most instructions also allow use of the PC.
Specific instructions allow access to the CPSR and SPSR.
24
The Program Status Registers (CPSR and SPSRs)
Layout: bits 31 to 28 hold the condition code flags N, Z, C and V; bits 7, 6 and 5 hold I, F and T; bits 4 to 0 hold the mode.
Condition code flags: copies of the ALU status flags (latched if the instruction has the "S" bit set).
  N = Negative result from ALU
  Z = Zero result from ALU
  C = ALU operation Carried out
  V = ALU operation oVerflowed
Interrupt disable bits:
  I = 1 disables the IRQ.
  F = 1 disables the FIQ.
T bit (Architecture v4T only):
  T = 0: processor in ARM state
  T = 1: processor in Thumb state
Mode bits: M[4:0] define the processor mode.
Notes:
The Current Program Status Register (CPSR) can be considered an extension of the PC. It contains the condition code flags (N, Z, C, V), the interrupt (FIQ, IRQ) disable bits, the mode bits and the T bit.
Software must never change the value of the T bit. If this happens, the processor will enter an unpredictable state.
The lower 28 bits are known as the "control bits". Bits other than the specified interrupt and mode bits are reserved for future processors, and no program should depend on their values.
The condition codes in the CPSR are preserved or updated depending on the value of the S bit in the instruction. Some instructions alter the condition flags regardless of "S", i.e. CMN, CMP, TST and TEQ (which return no other result).
The mode field is bigger than it needs to be, for historical reasons. Only six modes are valid on pre-ARM Architecture v4 chips.
SPSRs: there are also five other PSRs, the Saved Program Status Registers, one for each exception-handling mode, into which a copy of the CPSR is loaded when an exception occurs.
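Decoding a PSR value makes the field layout concrete. This Python sketch assumes the standard ARM bit positions (N, Z, C, V in bits 31 to 28; I, F, T in bits 7, 6, 5) and the M[4:0] mode encodings from the ARM architecture documentation, which the slide itself does not list; the function and table names are illustrative.

```python
# M[4:0] encodings (taken from ARM architecture documentation, not the slide).
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_psr(value):
    """Split a 32-bit CPSR/SPSR value into its named fields."""
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
        "I": (value >> 7) & 1,   # 1 = IRQ disabled
        "F": (value >> 6) & 1,   # 1 = FIQ disabled
        "T": (value >> 5) & 1,   # 1 = Thumb state
        "mode": MODES.get(value & 0x1F, "invalid"),
    }


print(decode_psr(0x600000D3))
```

The example value has Z and C set, both interrupts disabled, ARM state and Supervisor mode.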
25
Condition Flags
Negative (N = 1)
  Logical instruction: no meaning.
  Arithmetic instruction: bit 31 of the result has been set; indicates a negative number in signed operations.
Zero (Z = 1)
  Logical instruction: result is all zeroes.
  Arithmetic instruction: result of operation was zero.
Carry (C = 1)
  Logical instruction: after a shift operation, a '1' was left in the carry flag.
  Arithmetic instruction: result was greater than 32 bits.
oVerflow (V = 1)
  Result was greater than 31 bits; indicates a possible corruption of the sign bit in signed numbers.
Notes (examples that set each flag; the S suffix is needed for the flags to be updated):
  N flag: SUBS r0, r1, r2 where r1 < r2
  Z flag: SUBS r0, r1, r2 where r1 = r2 (also used for results of logical operations)
  C flag: ADDS r0, r1, r2 where r1 + r2 > 0xFFFFFFFF
  V flag: ADDS r0, r1, r2 where r1 + r2 > 0x7FFFFFFF (if the numbers are signed, the ALU sign bit will be corrupted; the answer is okay for unsigned numbers but wrong for signed ones)
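The flag rules above can be tried out on concrete operands. This is a minimal Python model of how a flag-setting add and subtract would produce N, Z, C and V; the function names are invented here, and the subtract follows the ARM convention that C is set when no borrow occurs, which the slide does not spell out.

```python
MASK32 = 0xFFFFFFFF


def flags_add(a, b):
    """Model a flag-setting add: return (result, (N, Z, C, V))."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(full > MASK32)                       # carry out of bit 31
    v = ((a ^ result) & (b ^ result)) >> 31      # same-sign inputs, different-sign result
    return result, (n, z, c, v)


def flags_sub(a, b):
    """Model a flag-setting subtract: return (result, (N, Z, C, V))."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)                              # C = 1 means no borrow
    v = ((a ^ b) & (a ^ result)) >> 31           # different-sign inputs, result sign flipped
    return result, (n, z, c, v)


print(flags_add(0x7FFFFFFF, 1))
```

Adding 1 to 0x7FFFFFFF sets N and V but not C: the unsigned answer is fine while the signed one has overflowed.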
26
The Program Counter (R15)
When the processor is executing in ARM state:
  All instructions are 32 bits in length.
  All instructions must be word aligned.
  Therefore the PC value is stored in bits [31:2], with bits [1:0] equal to zero (as an instruction cannot be halfword or byte aligned).
R14 is used as the subroutine link register (LR) and stores the return address, calculated from the PC, when Branch with Link operations are performed. Thus, to return from a linked branch:
  MOV r15,r14
or
  MOV pc,lr
Notes:
The PC is held in register R15. Instructions are always a word long, so they must be aligned with word boundaries. Addresses are specified in bytes, however, so the 2 least significant bits will be zero: bits [31:2] store the word address and bits [1:0] are always zero.
R15 may:
  be specified as the base register (Rn) in Load and Store instructions, allowing PC-relative addressing to be used.
  be the destination register, Rd, for instructions. When the "S" bit is not set, the operation overwrites R15 without affecting the CPSR. When the "S" bit is set, in a privileged mode, SPSR_<mode> is transferred to the CPSR at the same time as R15 is loaded.
27
Internal Organization of ARM
Two main blocks: datapath and decoder.
Register bank (r0 to r15):
  two read ports to the A-bus and B-bus
  one write port from the ALU-bus
  additional read/write ports for the program counter, r15
Barrel shifter: shifts or rotates the second operand by any number of bits.
ALU: performs arithmetic and logic functions.
Address register and incrementer: hold either the PC address (with increment) or an operand address.
28
Datapath activity during data processing instruction
  SUB r0, r1, #128 ; r0 := r1 - 128
Subtract instruction in which one operand is a constant.
The constant 128, encoded in the instruction, passes through the barrel shifter with a shift of zero to produce 128.
The ALU operates on the operands and writes the result back to register r0.
The PC value in the address register is incremented and copied back to r15 and the address register.
29
Internal Organization
The data register holds data read from or written to memory.
The instruction decoder decodes machine code instructions to produce control signals for the datapath.
In single-cycle data processing instructions, data values are read on the A-bus and B-bus, and the result from the ALU is written back into the register bank.
The PC value in the address register is incremented and copied back to r15 and the address register; this allows new instructions to be fetched ahead of time (instruction pre-fetch).
30
ARM7TDMI Microprocessor: Data Processing Instructions
31
Data processing Instructions
Largest family of ARM instructions, all sharing the same instruction format.
Contains:
  Arithmetic operations
  Comparisons (no results, just set condition codes)
  Logical operations
  Data movement between registers
Remember, this is a load/store architecture: these instructions work only on registers, NOT memory.
They each perform a specific operation on one or two operands:
  The first operand is always a register, Rn.
  The second operand is sent to the ALU via the barrel shifter.
We will examine the barrel shifter shortly.
32
Arithmetic and logical Instructions: General Format
  Opcode{Cond}{S} Rd, Rn, Operand2
{Cond}: conditional execution of the instruction, e.g. GT = greater than, LT = less than.
{S}: set the flag bits in the status register after execution.
Operand2: takes various forms (immediate, register, shifted register). You can easily check all the combinations in the ARM quick reference.
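The {Cond} field is tested against the N, Z, C and V flags before an instruction is allowed to execute. The Python table below sketches those tests using the standard ARM condition mnemonics; the slide only names GT and LT, so the rest of the table and the helper name are additions for illustration.

```python
# Each entry answers: given flags (n, z, c, v), does the instruction execute?
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,             # equal
    "NE": lambda n, z, c, v: z == 0,             # not equal
    "CS": lambda n, z, c, v: c == 1,             # carry set / unsigned >=
    "CC": lambda n, z, c, v: c == 0,             # carry clear / unsigned <
    "MI": lambda n, z, c, v: n == 1,             # negative
    "PL": lambda n, z, c, v: n == 0,             # positive or zero
    "VS": lambda n, z, c, v: v == 1,             # overflow
    "VC": lambda n, z, c, v: v == 0,             # no overflow
    "HI": lambda n, z, c, v: c == 1 and z == 0,  # unsigned >
    "LS": lambda n, z, c, v: c == 0 or z == 1,   # unsigned <=
    "GE": lambda n, z, c, v: n == v,             # signed >=
    "LT": lambda n, z, c, v: n != v,             # signed <
    "GT": lambda n, z, c, v: z == 0 and n == v,  # signed >
    "LE": lambda n, z, c, v: z == 1 or n != v,   # signed <=
    "AL": lambda n, z, c, v: True,               # always
}


def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with suffix `cond` would execute."""
    return CONDITIONS[cond](n, z, c, v)


# Flags as left by comparing 1 with 2: N=1, Z=0, C=0, V=0.
print(condition_passed("LT", 1, 0, 0, 0))
```

An instruction whose condition fails is simply skipped, which is what lets short if/else sequences avoid branches.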
33
Arithmetic Operations
Operations are:
  ADD  operand1 + operand2
  ADC  operand1 + operand2 + carry
  SUB  operand1 - operand2
  SBC  operand1 - operand2 + carry - 1   (subtract with carry)
  RSB  operand2 - operand1               (reverse subtract)
  RSC  operand2 - operand1 + carry - 1   (reverse subtract with carry)
Syntax:
  <Operation>{<cond>}{S} Rd, Rn, Operand2
Examples:
  ADD r0, r1, r2
  SUBGT r3, r3, #1
  RSBLES r4, r5, #5
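The six formulas can be written out directly, and the carry variants are what make multi-word arithmetic possible. This Python sketch mirrors the formulas on the slide with 32-bit wraparound; the add64 helper, which chains an add with an ADC to add two 64-bit values held as (low, high) word pairs, is an illustrative addition.

```python
MASK32 = 0xFFFFFFFF


def add(op1, op2):
    return (op1 + op2) & MASK32


def adc(op1, op2, carry):
    return (op1 + op2 + carry) & MASK32


def sub(op1, op2):
    return (op1 - op2) & MASK32


def sbc(op1, op2, carry):
    return (op1 - op2 + carry - 1) & MASK32


def rsb(op1, op2):
    return (op2 - op1) & MASK32


def rsc(op1, op2, carry):
    return (op2 - op1 + carry - 1) & MASK32


def add64(a_lo, a_hi, b_lo, b_hi):
    """64-bit add built from a flag-setting add of the low words plus ADC of the high words."""
    carry = (a_lo + b_lo) >> 32      # carry flag produced by the low-word add
    return add(a_lo, b_lo), adc(a_hi, b_hi, carry)


print(add64(0xFFFFFFFF, 0, 1, 0))
```

RSB with operand2 equal to zero gives negation, which is one reason the reversed form exists: operand2 is the only operand that can be an immediate or a shifted register.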
34
Register operand
# (I) = 0 indicates that the second operand is specified in a register, which can also be shifted. Two field layouts are available:
  #shift | Sh | Rm    Ex: ADD R0, R1, R2, LSL #3
  Rs | Sh | Rm        Ex: ADD R0, R1, R2, LSL R3
where:
  #shift: immediate shift length
  Rs: register holding the shift length
  Sh: shift type
  Rm: register used to hold the second operand
35
Shift operations
Guided by the "Sh" field in the format:
  Sh = 00  Logical Shift Left: LSL operation
  Sh = 01  Logical Shift Right: LSR operation
  Sh = 10  Arithmetic Shift Right: ASR operation
  Sh = 11  Rotate Right: ROR operation
Sh = 11 with #shift = 0 (which would otherwise be ROR #0) is used for the RRX operation.
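The five shift operations can be modelled on 32-bit values as below. This Python sketch returns both the shifted result and the shifter carry-out; the function names follow the mnemonics, and the accepted shift ranges (LSL 0 to 31, LSR and ASR 1 to 32, ROR 1 to 31) are assumptions based on the usual ARM immediate-shift rules rather than anything stated on the slide.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, n, carry_in=0):
    """Logical shift left by n (0..31); returns (result, carry_out)."""
    value &= MASK32
    if n == 0:
        return value, carry_in               # no shift: carry unchanged
    return (value << n) & MASK32, (value >> (32 - n)) & 1


def lsr(value, n):
    """Logical shift right by n (1..32); zeros enter from the left."""
    value &= MASK32
    return value >> n, (value >> (n - 1)) & 1


def asr(value, n):
    """Arithmetic shift right by n (1..32); the sign bit is replicated."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> n) & MASK32, (value >> (n - 1)) & 1


def ror(value, n):
    """Rotate right by n (1..31)."""
    value &= MASK32
    result = ((value >> n) | (value << (32 - n))) & MASK32
    return result, result >> 31


def rrx(value, carry_in):
    """Rotate right by one bit through the carry flag (33-bit rotate)."""
    value &= MASK32
    return (carry_in << 31) | (value >> 1), value & 1


# ADD R0, R1, R2, LSL #3 with R1 = 10 and R2 = 2 gives 10 + (2 << 3).
print((10 + lsl(2, 3)[0]) & MASK32)
```

Because the shift happens on the way into the ALU, a multiply by a small constant such as 9 can be a single ADD of a register to itself shifted left by 3.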
36
Using the Barrel Shifter: The Second Operand
[Figure: Operand 1 goes directly to the ALU; Operand 2 passes through the barrel shifter before reaching the ALU, which produces the result.]
Operand 2 can be either of the following.
Register, optionally with a shift operation applied. The shift value can be either:
  a 5-bit unsigned integer, or
  specified in the bottom byte of another register.
Immediate value:
  an 8-bit number,
  which can be rotated right through an even number of positions.
  The assembler will calculate the rotate for you from the constant.
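Whether a constant fits the "8-bit number rotated right by an even amount" form can be checked by brute force, which is roughly what an assembler has to do. The following Python sketch is one way to do it, assuming a 4-bit rotate field that counts in steps of two; encode_immediate is a hypothetical helper name, not a real assembler API.

```python
MASK32 = 0xFFFFFFFF


def encode_immediate(value):
    """Return (imm8, rot) with value == imm8 rotated right by 2*rot, or None."""
    value &= MASK32
    for rot in range(16):
        n = 2 * rot
        # Rotating left by n undoes a rotate right by n.
        imm8 = ((value << n) | (value >> (32 - n))) & MASK32
        if imm8 <= 0xFF:
            return imm8, rot
    return None  # constant cannot be expressed as a single immediate


print(encode_immediate(0xFF000000))
print(encode_immediate(0x102))
```

A constant such as 0x102 fails because its set bits would need an odd rotation; such values have to be built in more than one instruction or loaded from memory.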
37
Logical Operations
Operations are:
  AND  operand1 AND operand2
  EOR  operand1 EOR operand2
  ORR  operand1 OR operand2
  BIC  operand1 AND NOT operand2   (i.e. bit clear)
Syntax:
  <Operation>{<cond>}{S} Rd, Rn, Operand2
Examples:
  AND r0, r1, r2
  BICEQ r2, r3, #7
  EORS r1, r3, r0
38
Comparisons
The only effect of the comparisons is to UPDATE THE CONDITION FLAGS. Thus there is no need to set the S bit.
Operations are:
  CMP  operand1 - operand2, but result not written
  CMN  operand1 + operand2, but result not written
  TST  operand1 AND operand2, but result not written
  TEQ  operand1 EOR operand2, but result not written
Syntax:
  <Operation>{<cond>} Rn, Operand2
39
Comparisons: Examples
  CMP r0, r1
  TSTEQ r2, #5
CMP R1, Operand2, e.g. CMP R1, R2: computes [R1] - [R2] and sets N, Z, C and V in the CPSR.
TST R1, Operand2, e.g. TST R1, R2: computes [R1] AND [R2] and updates the condition flags.
40
Data Movement
Operations are:
  MOV  Rd = operand2
  MVN  Rd = NOT operand2
Note that these make no use of operand1.
Syntax:
  <Operation>{<cond>}{S} Rd, Operand2
Examples:
  MOV r0, r1
  MVN r0, r1
  MOVS r2, #10
  MVNEQ r1, #0
41
Quiz
[Flowchart: Start. Is r0 = r1? If yes, Stop. If no, is r0 > r1? If yes, r0 = r0 - r1; if no, r1 = r1 - r0. Then repeat from the first test.]
Convert the GCD algorithm given in this flowchart into:
  1) "Normal" assembler, where only branches can be conditional.
  2) ARM assembler, where all instructions are conditional, thus improving code density.
The only instructions you need are CMP, B and SUB.
42
Quiz - Sample Solutions
"Normal" Assembler
  gcd   cmp r0, r1        ;reached the end?
        beq stop
        blt less          ;if r0 < r1
        sub r0, r0, r1    ;subtract r1 from r0
        bal gcd
  less  sub r1, r1, r0    ;subtract r0 from r1
        bal gcd
  stop
ARM Conditional Assembler
  gcd   cmp r0, r1        ;if r0 > r1
        subgt r0, r0, r1  ;subtract r1 from r0
        sublt r1, r1, r0  ;else subtract r0 from r1
        bne gcd           ;reached the end?
Notes: cycle counts. A taken branch costs 3 cycles; every other instruction costs 1 cycle, including an instruction whose condition fails (marked x).
r0 = r1 = 1
  Normal: cmp 1, beq 3 (total 4)
  Cond:   cmp 1, subgt x 1, sublt x 1, bne x 1 (total 4)
r0 = 1, r1 = 2
  Normal: cmp 1, beq x 1, blt 3, sub 1, bal 3, cmp 1, beq 3 (total 13)
  Cond:   cmp 1, subgt x 1, sublt 1, bne 3, cmp 1, subgt x 1, sublt x 1, bne x 1 (total 10)
r0 = 2, r1 = 1
  Normal: cmp 1, beq x 1, blt x 1, sub 1, bal 3, cmp 1, beq 3 (total 11)
  Cond:   cmp 1, subgt 1, sublt x 1, bne 3, cmp 1, subgt x 1, sublt x 1, bne x 1 (total 10)
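The cycle counts in the notes can be reproduced by stepping through both solutions. This Python sketch simulates the two loops under the cost model the notes appear to use (3 cycles for a taken branch, 1 cycle for anything else, including a skipped conditional instruction); that model, the constant names and the function names are assumptions of this illustration, and the inputs are assumed to be positive integers.

```python
TAKEN_BRANCH = 3  # assumed cost of a branch that is taken
OTHER = 1         # assumed cost of any other instruction, executed or skipped


def gcd_branching(r0, r1):
    """'Normal' assembler version: only branches are conditional."""
    cycles = 0
    while True:
        cycles += OTHER                       # cmp r0, r1
        if r0 == r1:
            return r0, cycles + TAKEN_BRANCH  # beq stop (taken)
        cycles += OTHER                       # beq stop (not taken)
        if r0 < r1:
            cycles += TAKEN_BRANCH            # blt less (taken)
            r1 -= r0                          # less: sub r1, r1, r0
        else:
            cycles += OTHER                   # blt less (not taken)
            r0 -= r1                          # sub r0, r0, r1
        cycles += OTHER + TAKEN_BRANCH        # the sub, then bal gcd


def gcd_conditional(r0, r1):
    """ARM version: subgt/sublt are predicated on the flags from cmp."""
    cycles = 0
    while True:
        cycles += OTHER                       # cmp r0, r1
        gt, lt = r0 > r1, r0 < r1             # flags stay fixed until the next cmp
        if gt:
            r0 -= r1                          # subgt r0, r0, r1
        if lt:
            r1 -= r0                          # sublt r1, r1, r0
        cycles += 2 * OTHER                   # subgt and sublt, executed or not
        if not (gt or lt):
            return r0, cycles + OTHER         # bne gcd (not taken)
        cycles += TAKEN_BRANCH                # bne gcd (taken)


for pair in [(1, 1), (1, 2), (2, 1)]:
    print(pair, gcd_branching(*pair), gcd_conditional(*pair))
```

The conditional version uses four instructions instead of seven and takes the same number of cycles or fewer in these three cases, which is the code-density point the quiz is making.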