Lecture 3. ARM Instructions Prof. Taeweon Suh Computer Science Education Korea University ECM583 Special Topics in Computer Systems.



ARM
[Figure slides: ARM overview, ARM Partners, ARM (as of 2008), ARM Processor Portfolio. Source: 2008 Embedded SW Insight Conference]

Abstraction
- Abstraction helps us deal with complexity
  - Hide lower-level detail
- Instruction set architecture (ISA)
  - An abstract interface between the hardware and the lowest-level software

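A Typical Memory Hierarchy in a Computer
On-chip components: CPU core with register file, ITLB and DTLB, L1I (instruction cache), L1D (data cache), and L2 cache. Off-chip: main memory (DRAM) and secondary storage (disk).

Level                       Speed (cycles)   Size (bytes)
Register file               1/2's            100's
L1 caches (L1I, L1D)        1's              10K's
L2 cache                    10's             M's
Main memory (DRAM)          100's            G's
Secondary storage (disk)    10,000's         T's

Cost per byte is highest at the register file and lowest at the disk.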

Typical and Essential Instructions
- Each CPU provides many instructions
  - It would be confusing and complicated to study every instruction a CPU provides
  - But there are essential instructions that all CPUs commonly provide
- Instruction categories
  - Arithmetic and logical (integer)
  - Memory access instructions: load and store
  - Branch
- Registers in ARM: R0, R1, R2, … R15, CPSR, SPSR

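Levels of Program Code (ARM)
High-level language program (in C):
    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }
C compiler produces the assembly language program (v is held in R4 and k in R5):
    swap: MOV R2, R5, LSL #2   ; R2 = k * 4
          ADD R2, R4, R2       ; R2 = address of v[k]
          LDR R12, [R2]        ; R12 = v[k]
          LDR R10, [R2, #4]    ; R10 = v[k+1]
          STR R10, [R2]        ; v[k] = old v[k+1]
          STR R12, [R2, #4]    ; v[k+1] = old v[k]
          B   exit
Assembler produces the machine (object, binary) code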

CISC vs RISC
- CISC (Complex Instruction Set Computer)
  - One assembly instruction does many (complex) jobs
  - Variable-length instructions
  - Example: x86 (Intel, AMD)
- RISC (Reduced Instruction Set Computer)
  - Each assembly instruction does a small (unit) job
  - Fixed-length instructions
  - Load/store architecture
  - Examples: MIPS, ARM

ARM Architecture
- ARM is a RISC (Reduced Instruction Set Computer) architecture
  - The x86 instruction set is CISC (Complex Instruction Set Computer), even though x86 processors are pipelined internally
- Suitable for embedded systems
  - Very small implementation (low price)
  - Low power consumption (longer battery life)

ARM Registers
- ARM has 31 general-purpose registers and 6 status registers (32 bits each)
- Unbanked registers: R0 ~ R7
  - Each refers to the same 32-bit physical register in all processor modes
  - They are completely general-purpose registers, with no special uses implied by the architecture
- Banked registers: R8 ~ R14
  - R8 ~ R12 have no dedicated special purposes
    - FIQ mode has its own dedicated copies for fast interrupt processing
  - R13 and R14 are dedicated to special purposes in each mode

R13, R14, and R15
- Some registers in ARM are used for special purposes
  - R15 == PC (Program Counter)
    - x86 calls it the IP (Instruction Pointer): the "EIP" register
  - R14 == LR (Link Register)
  - R13 == SP (Stack Pointer)

CPSR
- The Current Program Status Register (CPSR) is accessible in all modes
- It contains the condition flags, the interrupt disable bits, and the current processor mode

CPSR bits
[Figure: CPSR bit layout]
- ARM: 32-bit mode
- Thumb: 16-bit mode
- Jazelle: special mode for Java acceleration

Interrupt
- An interrupt is an asynchronous signal from hardware indicating the need for attention, or a synchronous event in software indicating the need for a change in execution
  - A hardware interrupt causes the processor (CPU) to save its state of execution via a context switch and begin executing an interrupt handler
  - A software interrupt is usually implemented as an instruction in the instruction set, which causes a context switch to an interrupt handler, similar to a hardware interrupt
- Interrupts are a commonly used technique for communication between the CPU and peripheral devices
- Operating systems also use interrupts extensively (for example, the timer interrupt) for task (process, thread) scheduling

Hardware Interrupt in ARM
- IRQ
  - Normal interrupt request, raised by asserting the IRQ pin
  - Program jumps to 0x0000_0018
- FIQ
  - Fast interrupt request, raised by asserting the FIQ pin
  - Has a higher priority than IRQ
  - Program jumps to 0x0000_001C

Software Interrupt in ARM
- ARM has an instruction for software interrupts
  - The SWI instruction
- Software interrupts are commonly used by the OS for system calls
  - Examples: open(), close(), etc.

Exception Vectors in ARM
[Figure: exception vector table]

Exception Priority in ARM
[Figure: exception priority table]

ARM Instruction Overview
- ARM is a RISC machine, so the instruction length is fixed
  - In ARM mode, instructions are 32 bits wide
  - In Thumb mode, instructions are 16 bits wide
- Most ARM instructions can be conditionally executed
  - They have their normal effect only if the N (Negative), Z (Zero), C (Carry), and V (Overflow) flags in the CPSR satisfy the condition specified in the instruction
  - If the flags do not satisfy the condition, the instruction acts as a NOP (No Operation): it has no effect and execution advances to the next instruction

ARM Instruction Format
[Figure: instruction encodings for the arithmetic and logical instructions, the memory access (load/store) instructions, the branch instructions, and the software interrupt instruction]

Condition Field
[Figure: condition field encodings]
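The condition field figure did not survive in this transcript, so the following is a minimal C sketch of how the 4-bit condition field is evaluated against the N, Z, C, V flags. The encodings are the standard ARM condition codes; the function name `condition_passed` and the 4-bit `nzcv` packing are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag positions inside a 4-bit NZCV value (same order as CPSR bits 31..28). */
#define FLAG_N 0x8u
#define FLAG_Z 0x4u
#define FLAG_C 0x2u
#define FLAG_V 0x1u

/* Returns true if an instruction whose condition field is `cond` would
   execute under the given flags; otherwise the instruction acts as a NOP. */
bool condition_passed(uint8_t cond, uint8_t nzcv)
{
    bool n = (nzcv & FLAG_N) != 0;
    bool z = (nzcv & FLAG_Z) != 0;
    bool c = (nzcv & FLAG_C) != 0;
    bool v = (nzcv & FLAG_V) != 0;

    switch (cond & 0xFu) {
    case 0x0: return z;               /* EQ: equal */
    case 0x1: return !z;              /* NE: not equal */
    case 0x2: return c;               /* CS/HS: unsigned higher or same */
    case 0x3: return !c;              /* CC/LO: unsigned lower */
    case 0x4: return n;               /* MI: negative */
    case 0x5: return !n;              /* PL: positive or zero */
    case 0x6: return v;               /* VS: overflow */
    case 0x7: return !v;              /* VC: no overflow */
    case 0x8: return c && !z;         /* HI: unsigned higher */
    case 0x9: return !c || z;         /* LS: unsigned lower or same */
    case 0xA: return n == v;          /* GE: signed greater or equal */
    case 0xB: return n != v;          /* LT: signed less than */
    case 0xC: return !z && (n == v);  /* GT: signed greater than */
    case 0xD: return z || (n != v);   /* LE: signed less or equal */
    case 0xE: return true;            /* AL: always */
    default:  return false;           /* 0xF: treated as "never" in this sketch */
    }
}
```

For example, after a compare of two equal values (flags nZCv), `condition_passed(0x0, FLAG_Z | FLAG_C)` is true, so a MOVEQ would execute while a MOVNE would be skipped.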

Data Processing Instructions
- Move instructions
- Arithmetic instructions
- Logical instructions
- Comparison instructions
- Multiply instructions

Execution Unit in ARM
[Figure: Rn goes to the ALU with no pre-processing; Rm passes through the barrel shifter (pre-processing) to form N; the ALU result is written to Rd]

Move Instructions
Syntax: <instruction>{cond}{S} Rd, N

MOV   Move a 32-bit value into a register                 Rd = N
MVN   Move the NOT of the 32-bit value into a register    Rd = ~N

Move Instructions – MOV
- MOV loads a value into the destination register (Rd) from another register, a shifted register, or an immediate value
  - Useful for setting initial values and transferring data between registers
  - It updates the carry flag (C), negative flag (N), and zero flag (Z) if the S bit is set
    - C is set from the result of the barrel shifter
- [Figure: MOV encoding. SBZ: should be zero]

    MOV  R0, R0           ; move R0 to R0, thus no effect
    MOV  R0, R0, LSL #3   ; R0 = R0 * 8
    MOV  PC, R14          ; (R14: link register) used to return to the caller
    MOVS PC, R14          ; PC <- R14 (lr), CPSR <- SPSR
                          ; used to return from an interrupt or exception

MOV Example
Before: cpsr = nzcv, r0 = 0x0000_0000, r1 = 0x8000_0004
    MOVS r0, r1, LSL #1
After:  cpsr = nzCv, r0 = 0x0000_0008, r1 = 0x8000_0004

Rm with Barrel Shifter
    MOVS r0, r1, LSL #1   ; the shift applied to Rm is encoded in the instruction

Shift operation (for Rm)               Syntax
Immediate                              #immediate
Register                               Rm
Logical shift left by immediate        Rm, LSL #shift_imm
Logical shift left by register         Rm, LSL Rs
Logical shift right by immediate       Rm, LSR #shift_imm
Logical shift right by register        Rm, LSR Rs
Arithmetic shift right by immediate    Rm, ASR #shift_imm
Arithmetic shift right by register     Rm, ASR Rs
Rotate right by immediate              Rm, ROR #shift_imm
Rotate right by register               Rm, ROR Rs
Rotate right with extend               Rm, RRX

LSL: Logical Shift Left, LSR: Logical Shift Right, ASR: Arithmetic Shift Right, ROR: Rotate Right, RRX: Rotate Right with Extend
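To make the shift operations concrete, here is a small C sketch of a barrel shifter that returns both the shifted operand N and the shifter carry-out. The names `barrel_shift`, `rrx`, and `ShiftResult` are hypothetical, and the sketch only covers shift amounts 0 to 31, with amount 0 meaning "no shift" (the real encodings give some zero amounts a special meaning).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { LSL, LSR, ASR, ROR } ShiftType;

typedef struct {
    uint32_t value;  /* N, the shifted second operand */
    bool carry;      /* carry-out of the barrel shifter */
} ShiftResult;

ShiftResult barrel_shift(uint32_t rm, ShiftType type, unsigned amount, bool carry_in)
{
    ShiftResult r = { rm, carry_in };
    assert(amount < 32);              /* assumption of this sketch */
    if (amount == 0)
        return r;                     /* value and carry pass through unchanged */

    switch (type) {
    case LSL:
        r.carry = (rm >> (32 - amount)) & 1u;        /* last bit shifted out on the left */
        r.value = rm << amount;
        break;
    case LSR:
        r.carry = (rm >> (amount - 1)) & 1u;         /* last bit shifted out on the right */
        r.value = rm >> amount;
        break;
    case ASR:
        r.carry = (rm >> (amount - 1)) & 1u;
        r.value = rm >> amount;
        if (rm & 0x80000000u)
            r.value |= ~(0xFFFFFFFFu >> amount);     /* replicate the sign bit */
        break;
    case ROR:
        r.value = (rm >> amount) | (rm << (32 - amount));
        r.carry = (r.value >> 31) & 1u;              /* bit rotated into the top */
        break;
    }
    return r;
}

/* RRX: rotate right by one bit through the carry flag. */
ShiftResult rrx(uint32_t rm, bool carry_in)
{
    ShiftResult r;
    r.value = (rm >> 1) | ((uint32_t)carry_in << 31);
    r.carry = rm & 1u;
    return r;
}
```

With r1 = 0x8000_0004, `barrel_shift(0x80000004u, LSL, 1, false)` yields the value 0x0000_0008 with carry set, which matches the MOVS example (cpsr goes from nzcv to nzCv).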

Arithmetic Instructions
Syntax: <instruction>{cond}{S} Rd, Rn, N

ADC   add two 32-bit values with carry                    Rd = Rn + N + carry
ADD   add two 32-bit values                               Rd = Rn + N
RSB   reverse subtract of two 32-bit values               Rd = N - Rn
RSC   reverse subtract of two 32-bit values with carry    Rd = N - Rn - !C
SBC   subtract two 32-bit values with carry               Rd = Rn - N - !C
SUB   subtract two 32-bit values                          Rd = Rn - N

Arithmetic Instructions – ADD
- ADD adds two operands, placing the result in Rd
  - Use the S suffix to update the condition flags
  - The addition may be performed on signed or unsigned numbers

    ADD  R0, R1, R2           ; R0 = R1 + R2
    ADD  R0, R1, #256         ; R0 = R1 + 256
    ADDS R0, R2, R3, LSL #1   ; R0 = R2 + (R3 << 1) and update flags

Arithmetic Instructions – ADC
- ADC adds two operands plus the carry bit, placing the result in Rd
  - Because it adds in the carry, it can be used to add numbers larger than 32 bits
  - Use the S suffix to update the condition flags

64-bit 1st operand: R4 (low word) and R5 (high word)
64-bit 2nd operand: R8 (low word) and R9 (high word)
64-bit result: R0 (low word) and R1 (high word)
    ADDS R0, R4, R8   ; R0 = R4 + R8 and set carry accordingly
    ADCS R1, R5, R9   ; R1 = R5 + R9 + (carry flag)
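The ADDS/ADCS pair can be mirrored in C to see where the carry comes from. This is only a sketch: `add64` and `U64Pair` are hypothetical names, and the carry is recovered by checking whether the low-word sum wrapped around.

```c
#include <stdint.h>

typedef struct { uint32_t lo, hi; } U64Pair;

/* 64-bit addition built from two 32-bit additions, as ADDS + ADCS do. */
U64Pair add64(uint32_t a_lo, uint32_t a_hi, uint32_t b_lo, uint32_t b_hi)
{
    U64Pair r;
    r.lo = a_lo + b_lo;               /* ADDS R0, R4, R8 */
    uint32_t carry = (r.lo < a_lo);   /* C flag: the low-word sum wrapped past 2^32 */
    r.hi = a_hi + b_hi + carry;       /* ADCS R1, R5, R9 */
    return r;
}
```

For instance, adding 0x0000_0000_FFFF_FFFF and 1 wraps the low word to 0 and carries 1 into the high word.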

Arithmetic Instructions – SUB
- SUB subtracts operand 2 from operand 1, placing the result in Rd
  - Use the S suffix to update the condition flags
  - The subtraction may be performed on signed or unsigned numbers

    SUB  R0, R1, R2           ; R0 = R1 - R2
    SUB  R0, R1, #256         ; R0 = R1 - 256
    SUBS R0, R2, R3, LSL #1   ; R0 = R2 - (R3 << 1) and update flags

Arithmetic Instructions – SBC
- SBC subtracts operand 2 from operand 1, also subtracting the inverse of the carry flag, and places the result in Rd
  - Because it uses the carry bit, it can be used to subtract numbers larger than 32 bits
  - Use the S suffix to update the condition flags

64-bit 1st operand: R4 (low word) and R5 (high word)
64-bit 2nd operand: R8 (low word) and R9 (high word)
64-bit result: R0 (low word) and R1 (high word)
    SUBS R0, R4, R8   ; R0 = R4 - R8 and set carry accordingly
    SBC  R1, R5, R9   ; R1 = R5 - R9 - !(carry flag)
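The same idea works for subtraction. In this sketch (hypothetical names `sub64` and `U64Pair`), the C flag after SUBS is modeled as "no borrow occurred", so SBC subtracts an extra 1 only when the carry is clear.

```c
#include <stdint.h>

typedef struct { uint32_t lo, hi; } U64Pair;

/* 64-bit subtraction built from two 32-bit subtractions, as SUBS + SBC do. */
U64Pair sub64(uint32_t a_lo, uint32_t a_hi, uint32_t b_lo, uint32_t b_hi)
{
    U64Pair r;
    r.lo = a_lo - b_lo;                /* SUBS R0, R4, R8 */
    uint32_t carry = (a_lo >= b_lo);   /* C = 1 means no borrow was needed */
    r.hi = a_hi - b_hi - (1u - carry); /* SBC R1, R5, R9: Rn - N - !C */
    return r;
}
```

Subtracting 1 from 0x0000_0001_0000_0000 borrows from the high word: the low word becomes 0xFFFF_FFFF and the high word becomes 0.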

Examples
Before: r0 = 0x0000_0000, r1 = 0x0000_0002, r2 = 0x0000_0001
    SUB r0, r1, r2
After:  r0 = 0x0000_0001, r1 = 0x0000_0002, r2 = 0x0000_0001

Before: r0 = 0x0000_0000, r1 = 0x0000_0077
    RSB r0, r1, #0          ; r0 = 0x0 - r1
After:  r0 = 0xFFFF_FF89, r1 = 0x0000_0077

Before: r0 = 0x0000_0000, r1 = 0x0000_0005
    ADD r0, r1, r1, LSL #1  ; r0 = r1 + (r1 << 1)
After:  r0 = 0x0000_000F, r1 = 0x0000_0005

Examples
Before: cpsr = nzcv, r1 = 0x0000_0001
    SUBS r1, r1, #1
After:  cpsr = nZCv, r1 = 0x0000_0000
Why is the C flag set (C = 1)?
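One way to answer the question is to compute the subtraction the way an adder does it, as Rn + NOT(operand) + 1, and take C from the adder's carry-out. The C sketch below does exactly that; `subs` and `AluResult` are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t result;
    bool n, z, c, v;
} AluResult;

/* Models SUBS/CMP: result = rn - op2, with flags taken from the adder. */
AluResult subs(uint32_t rn, uint32_t op2)
{
    /* Subtraction is done as rn + ~op2 + 1 in a 33-bit-wide sum. */
    uint64_t wide = (uint64_t)rn + (uint64_t)(uint32_t)~op2 + 1u;
    AluResult r;
    r.result = (uint32_t)wide;
    r.n = (r.result >> 31) & 1u;                        /* sign bit of the result */
    r.z = (r.result == 0);
    r.c = (wide >> 32) & 1u;                            /* carry out = NOT borrow */
    r.v = (((rn ^ op2) & (rn ^ r.result)) >> 31) & 1u;  /* signed overflow */
    return r;
}
```

For 1 - 1 the adder computes 0x0000_0001 + 0xFFFF_FFFE + 1 = 0x1_0000_0000, so the 32-bit result is 0 (Z = 1) and there is a carry out of bit 31 (C = 1). In other words, C = 1 after a subtraction means that no borrow occurred.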

Logical Instructions
Syntax: <instruction>{cond}{S} Rd, Rn, N

AND   logical bitwise AND of two 32-bit values    Rd = Rn & N
ORR   logical bitwise OR of two 32-bit values     Rd = Rn | N
EOR   logical exclusive OR of two 32-bit values   Rd = Rn ^ N
BIC   logical bit clear                           Rd = Rn & ~N

Logical Instructions – AND
- AND performs a logical AND between the two operands, placing the result in Rd
  - It is useful for masking bits

    AND R0, R0, #3   ; keep bits zero and one of R0 and discard the rest

Logical Instructions – EOR
- EOR performs a logical exclusive OR between the two operands, placing the result in the destination register
  - It is useful for inverting certain bits

    EOR R0, R0, #3   ; invert bits zero and one of R0

Examples
Before: r0 = 0x0000_0000, r1 = 0x0204_0608, r2 = 0x1030_5070
    ORR r0, r1, r2
After:  r0 = 0x1234_5678

Before: r1 = 0b1111, r2 = 0b0101
    BIC r0, r1, r2
After:  r0 = 0b1010
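The four logical instructions map directly onto C's bitwise operators. The short sketch below (function names are made up for illustration) can be used to check the ORR and BIC examples by hand.

```c
#include <stdint.h>

/* Each function mirrors one ARM logical instruction: Rd = f(Rn, N). */
uint32_t arm_and(uint32_t rn, uint32_t n) { return rn & n;  }  /* AND: mask bits   */
uint32_t arm_orr(uint32_t rn, uint32_t n) { return rn | n;  }  /* ORR: set bits    */
uint32_t arm_eor(uint32_t rn, uint32_t n) { return rn ^ n;  }  /* EOR: invert bits */
uint32_t arm_bic(uint32_t rn, uint32_t n) { return rn & ~n; }  /* BIC: clear bits  */
```

Here `arm_orr(0x02040608, 0x10305070)` gives 0x12345678 and `arm_bic(0xF, 0x5)` gives 0xA, the same results as the two examples.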

Comparison Instructions
Syntax: <instruction>{cond} Rn, N   (no S suffix is needed; these instructions always update the flags)

CMN   compare negated                          Flags set as a result of Rn + N
CMP   compare                                  Flags set as a result of Rn - N
TEQ   test for equality of two 32-bit values   Flags set as a result of Rn ^ N
TST   test bits of a 32-bit value              Flags set as a result of Rn & N

- The comparison instructions update the cpsr flags according to the result, but do not affect other registers
- After the flags have been set, the information can be used to change program flow by using conditional execution

Comparison Instructions – CMP
- CMP compares two values by subtracting the second operand from the first operand
  - Note that there is no destination register
  - It only updates the cpsr flags based on the result

    CMP R0, R1

Comparison Instructions – CMN
- CMN compares one value with the 2's complement of a second value
  - It performs the comparison by adding the 2nd operand to the 1st operand
  - This is equivalent to subtracting the negative of the 2nd operand from the 1st operand
  - Note that there is no destination register
  - It only updates the cpsr flags based on the result

    CMN R0, R1

Comparison Instructions – TST
- TST tests bits of two 32-bit values by logically ANDing the two operands
  - Note that there is no destination register
  - It only updates the cpsr flags based on the result
- TEQ sets the flags by logically exclusive-ORing the two operands

Examples
Before: cpsr = nzcv, r0 = 4, r9 = 4
    CMP r0, r9
After:  cpsr = nZCv, r0 = 4, r9 = 4

Branch Instructions
Syntax: B{cond} label
        BL{cond} label

B    branch             pc = label
BL   branch with link   pc = label
                        lr = address of the next instruction after the BL

- A branch instruction changes the flow of execution or is used to call a routine
  - This type of instruction allows programs to have subroutines, if-then-else structures, and loops

B, BL
- B (branch) and BL (branch with link) are used for conditional or unconditional branches
  - BL is used for subroutine (procedure, function) calls
  - To return from a subroutine, use
        MOV PC, R14   ; (R14: link register) used to return to the caller
- Branch target address
  - Sign-extend the 24-bit signed immediate (2's complement) to 30 bits
  - Left-shift the result by 2 bits
  - Add it to the current PC (actually, PC+8)
  - Thus, the branch target can be ±32MB away from the current instruction
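The target calculation can be written out as a few lines of C. This is a sketch with a hypothetical function name; `instr_addr` is the address of the branch instruction itself, so the PC value it sees is `instr_addr + 8`.

```c
#include <stdint.h>

/* Computes the target of B/BL from the instruction's address and its
   24-bit immediate field. */
uint32_t branch_target(uint32_t instr_addr, uint32_t imm24)
{
    uint32_t offset = (imm24 & 0x00FFFFFFu) << 2;  /* word offset -> 26-bit byte offset */
    if (offset & 0x02000000u)                      /* sign bit of the 26-bit value */
        offset |= 0xFC000000u;                     /* sign-extend to 32 bits */
    return instr_addr + 8u + offset;               /* PC reads as instruction address + 8 */
}
```

An immediate of 0xFFFFFE encodes an offset of -8 bytes, so `branch_target(0x8000, 0xFFFFFE)` is 0x8000: a branch to itself. The largest positive immediate, 0x7FFFFF, reaches just under 32MB forward, which is where the ±32MB range comes from.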

Examples
Forward branch:
              B   forward
              ADD r1, r2, #4
              ADD r0, r6, #2
              ADD r3, r7, #4
    forward:  SUB r1, r2, #4

Backward branch:
    backward: ADD r1, r2, #4
              SUB r1, r2, #4
              ADD r4, r6, r7
              B   backward

Subroutine call:
              BL  my_subroutine
              CMP r1, #5
              MOVEQ r1, #0
              …
    my_subroutine:
              MOV pc, lr      ; return from subroutine

Memory Access Instructions
- Load-store (memory access) instructions transfer data between memory and CPU registers
  - Single-register transfer
  - Multiple-register transfer
  - Swap instruction

Single-Register Transfer
LDR     Load a word into a register                   Rd ← mem32[address]
STR     Store a word from a register to memory        Rd → mem32[address]
LDRB    Load a byte into a register                   Rd ← mem8[address]
STRB    Store a byte from a register to memory        Rd → mem8[address]
LDRH    Load a half-word into a register              Rd ← mem16[address]
STRH    Store a half-word from a register to memory   Rd → mem16[address]
LDRSB   Load a signed byte into a register            Rd ← SignExtend(mem8[address])
LDRSH   Load a signed half-word into a register       Rd ← SignExtend(mem16[address])

LDR (Load Register)
- LDR loads a word from a memory location into a register
  - The memory location is specified in a very flexible manner with an addressing mode

    ; Assume R1 = 0x0000_2000
    LDR R0, [R1]        ; R0 ← [R1]
    LDR R0, [R1, #16]   ; R0 ← [R1+16]; address 0x0000_2010

STR (Store Register)
- STR stores a word from a register to a memory location
  - The memory location is specified in a very flexible manner with an addressing mode

    ; Assume R1 = 0x0000_2000
    STR R0, [R1]        ; [R1] ← R0
    STR R0, [R1, #16]   ; [R1+16] ← R0

Load-Store Addressing Mode
Indexing method            Data                 Base address register updated?   Example
Preindex with writeback    Mem[base + offset]   Yes (base + offset)              LDR r0, [r1, #4]!
Preindex                   Mem[base + offset]   No                               LDR r0, [r1, #4]
Postindex                  Mem[base]            Yes (base + offset)              LDR r0, [r1], #4

"!" indicates that the instruction writes the calculated address back to the base address register.

Before: r0 = 0x0000_0000, r1 = 0x0009_0000, Mem32[0x0009_0000] = 0x0101_0101, Mem32[0x0009_0004] = 0x0202_0202

    LDR r0, [r1, #4]!
After: r0 ← mem[0x0009_0004], r0 = 0x0202_0202, r1 = 0x0009_0004

    LDR r0, [r1, #4]
After: r0 ← mem[0x0009_0004], r0 = 0x0202_0202, r1 = 0x0009_0000

    LDR r0, [r1], #4
After: r0 ← mem[0x0009_0000], r0 = 0x0101_0101, r1 = 0x0009_0004
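The three indexing methods differ only in which address is used for the access and whether the base register changes. The following C sketch models them over a tiny two-word memory; the array `mem`, the constant `MEM_BASE`, and the function names are all assumptions made for the illustration.

```c
#include <stdint.h>

#define MEM_BASE 0x00090000u

/* Two words of memory starting at MEM_BASE, as in the slide's "Before" state. */
static const uint32_t mem[2] = { 0x01010101u, 0x02020202u };

static uint32_t mem32(uint32_t addr) { return mem[(addr - MEM_BASE) / 4]; }

/* LDR rd, [rn, #off]  : access base + offset, base unchanged */
uint32_t ldr_preindex(uint32_t *rn, int32_t off)
{
    return mem32(*rn + (uint32_t)off);
}

/* LDR rd, [rn, #off]! : access base + offset, then write it back to the base */
uint32_t ldr_preindex_wb(uint32_t *rn, int32_t off)
{
    *rn += (uint32_t)off;
    return mem32(*rn);
}

/* LDR rd, [rn], #off  : access the base, then add the offset to the base */
uint32_t ldr_postindex(uint32_t *rn, int32_t off)
{
    uint32_t value = mem32(*rn);
    *rn += (uint32_t)off;
    return value;
}
```

Starting each time from r1 = 0x0009_0000, the three calls with offset 4 reproduce the three "After" states: the preindex forms load 0x0202_0202, the postindex form loads 0x0101_0101, and only the writeback and postindex forms leave r1 at 0x0009_0004.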

Multiple Register Transfer – LDM, STM
- LDM (Load Multiple) loads general-purpose registers from sequential memory locations
- STM (Store Multiple) stores general-purpose registers to sequential memory locations

Syntax: <LDM|STM>{cond}<addressing mode> Rn{!}, <registers>{^}

Addressing mode   Description        Start address      End address        Rn!
IA                Increment After    Rn                 Rn + 4 x N - 4     Rn + 4 x N
IB                Increment Before   Rn + 4             Rn + 4 x N         Rn + 4 x N
DA                Decrement After    Rn - 4 x N + 4     Rn                 Rn - 4 x N
DB                Decrement Before   Rn - 4 x N         Rn - 4             Rn - 4 x N

N is the number of registers in the list.

LDM, STM - Multiple Data Transfer
- In a multiple data transfer, the register list is given in curly brackets {}
- The order in which you list the registers does not matter
  - They are always stored from the lowest-numbered register to the highest
- A useful shorthand is "-"
  - It specifies the first and last registers of a range

    STMFD R13!, {R0, R1}     ; R13 is updated
    LDMFD R13!, {R1, R0}     ; R13 is updated
    STMFD R13!, {R0-R12}     ; R13 is updated appropriately
    LDMFD R13!, {R0-R12}     ; R13 is updated appropriately

Examples
Before: Mem32[0x80018] = 0x3, Mem32[0x80014] = 0x2, Mem32[0x80010] = 0x1
        r0 = 0x0008_0010, r1 = 0x0000_0000, r2 = 0x0000_0000, r3 = 0x0000_0000
    LDMIA r0!, {r1-r3}
After:  Mem32[0x80018] = 0x3, Mem32[0x80014] = 0x2, Mem32[0x80010] = 0x1
        r0 = 0x0008_001C, r1 = 0x0000_0001, r2 = 0x0000_0002, r3 = 0x0000_0003
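A rough C model of LDMIA with writeback shows why the result looks this way. The function name, the bit-mask representation of the register list, and the flat `mem` array are assumptions of this sketch; it also assumes the base register is not itself in the list.

```c
#include <stdint.h>

/* LDMIA rn!, {list}: bit i of reg_mask selects register i.
   `mem` holds consecutive words starting at byte address `mem_base`. */
void ldmia_wb(uint32_t regs[16], unsigned rn, uint16_t reg_mask,
              const uint32_t *mem, uint32_t mem_base)
{
    uint32_t addr = regs[rn];                  /* IA: the first access is at Rn */
    for (unsigned i = 0; i < 16; i++) {
        if (reg_mask & (1u << i)) {
            regs[i] = mem[(addr - mem_base) / 4];
            addr += 4;                         /* increment after each transfer */
        }
    }
    regs[rn] = addr;                           /* "!": write back Rn + 4 x N */
}
```

The loop always walks from the lowest register number to the highest, which is why the order written in the register list does not matter.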

Stack Operation
- Multiple data transfer instructions (LDM and STM) are used to load and store multiple words of data from/to main memory

IA: Increment After, IB: Increment Before, DA: Decrement After, DB: Decrement Before
FA: Full Ascending (in stack), FD: Full Descending (in stack), EA: Empty Ascending (in stack), ED: Empty Descending (in stack)

Stack    Other    Description
STMFA    STMIB    Pre-incremental store
STMEA    STMIA    Post-incremental store
STMFD    STMDB    Pre-decremental store
STMED    STMDA    Post-decremental store
LDMED    LDMIB    Pre-incremental load
LDMFD    LDMIA    Post-incremental load
LDMEA    LDMDB    Pre-decremental load
LDMFA    LDMDA    Post-decremental load
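As an illustration of the full descending pair, the sketch below models STMFD as a push (decrement before, i.e. STMDB) and LDMFD as a pop (increment after, i.e. LDMIA). The `Stack` type, its size, and the word-index stack pointer are simplifications invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

typedef struct {
    uint32_t mem[STACK_WORDS];
    unsigned sp;   /* index of the last word pushed; STACK_WORDS when empty */
} Stack;

static unsigned count_regs(uint16_t mask)
{
    unsigned n = 0;
    for (unsigned i = 0; i < 16; i++)
        n += (mask >> i) & 1u;
    return n;
}

/* STMFD sp!, {list}: full descending push. */
void stmfd(Stack *s, const uint32_t regs[16], uint16_t mask)
{
    unsigned n = count_regs(mask);
    assert(n <= s->sp);                 /* enough room left on the stack */
    s->sp -= n;                         /* decrement before storing */
    unsigned slot = s->sp;
    for (unsigned i = 0; i < 16; i++)
        if (mask & (1u << i))
            s->mem[slot++] = regs[i];   /* lowest register at the lowest address */
}

/* LDMFD sp!, {list}: full descending pop. */
void ldmfd(Stack *s, uint32_t regs[16], uint16_t mask)
{
    assert(s->sp + count_regs(mask) <= STACK_WORDS);
    for (unsigned i = 0; i < 16; i++)
        if (mask & (1u << i))
            regs[i] = s->mem[s->sp++];  /* increment after loading */
}
```

Pushing {R0, R1} and then popping the same list restores both registers and leaves the stack pointer where it started, which is the usual way registers are saved at subroutine entry and restored at exit.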

SWAP Instruction
Syntax: SWP{B}{cond} Rd, Rm, [Rn]

SWP    Swap a word between memory and a register    tmp = mem32[Rn]; mem32[Rn] = Rm; Rd = tmp
SWPB   Swap a byte between memory and a register    tmp = mem8[Rn]; mem8[Rn] = Rm; Rd = tmp

- SWP swaps the contents of memory with the contents of a register
  - It is a special case of a load-store instruction
  - It performs the swap atomically, meaning that it does not release the bus until it is done with both the read and the write
  - It is useful for implementing semaphores and mutual exclusion (mutex) in an OS

Before: mem32[0x9000] = 0x1234_5678, r0 = 0x0000_0000, r1 = 0x1111_2222, r2 = 0x0000_9000
    SWP r0, r1, [r2]
After:  mem32[0x9000] = 0x1111_2222, r0 = 0x1234_5678, r1 = 0x1111_2222, r2 = 0x0000_9000

Semaphore Example
    spin: LDR r1, =semaphore   ; r1 holds the address of the semaphore
          MOV r2, #1
          SWP r3, r2, [r1]     ; r3 = old value, semaphore = 1
          CMP r3, #1
          BEQ spin             ; old value was 1: already taken, so retry
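The same spin loop can be expressed in portable C with the C11 `atomic_exchange` operation, which plays the role of SWP here: it writes 1 and returns the old value in one atomic step. This is a sketch of the idea rather than a production lock; the `Semaphore` typedef and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_uint Semaphore;   /* 0 = free, 1 = taken */

/* One attempt: swap in 1 and report whether the old value was 0 (free). */
bool sem_try_acquire(Semaphore *s)
{
    return atomic_exchange(s, 1u) == 0u;
}

/* Spin until the swap returns 0, like the SWP / CMP / BEQ loop. */
void sem_acquire(Semaphore *s)
{
    while (atomic_exchange(s, 1u) == 1u)
        ;   /* someone else holds it, so try again */
}

void sem_release(Semaphore *s)
{
    atomic_store(s, 0u);
}
```

Because the read and the write cannot be separated, two contenders can never both see the old value 0, which is the property that makes the swap usable for mutual exclusion.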

Miscellaneous but Important Instructions
- Software interrupt instruction
- Program status register instructions

SWI (Software Interrupt)
- The SWI instruction causes a software interrupt
  - Operating systems use it for system calls
  - The 24-bit immediate value is ignored by the ARM processor, but the SWI exception handler in an operating system can use it to determine which operating system service is being requested

Syntax: SWI{cond} SWI_number

SWI   Software interrupt    lr_svc (r14) = address of the instruction following the SWI
                            pc = 0x8
                            cpsr mode = SVC
                            cpsr I bit = 1 (masks interrupts)

To return from the software interrupt, use
    MOVS PC, R14   ; PC <- R14 (lr), CPSR <- SPSR

Example
Before: cpsr = nzcVqift_USER, pc = 0x0000_8000, lr = 0x003F_FFF0, r0 = 0x12
    0x0000_8000   SWI 0x...
After:  cpsr = nzcVqIft_SVC, spsr = nzcVqift_USER, pc = 0x0000_0008, lr = 0x0000_8004, r0 = 0x12

SWI handler example
    SWI_handler:
        STMFD sp!, {r0-r12, lr}       ; push registers to the stack
        LDR   r10, [lr, #-4]          ; r10 = the SWI instruction itself
        BIC   r10, r10, #0xff000000   ; clear the top 8 bits: r10 = SWI number
        BL    interrupt_service_routine
        LDMFD sp!, {r0-r12, pc}^      ; return from the SWI handler
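What the handler's LDR and BIC do can be seen in a couple of lines of C. In the ARM encoding, a SWI instruction has the condition in bits 31:28, the pattern 1111 in bits 27:24, and the 24-bit SWI number in bits 23:0; the function names below are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the 32-bit ARM instruction word is a SWI (bits 27:24 == 1111). */
bool is_swi(uint32_t instr)
{
    return ((instr >> 24) & 0xFu) == 0xFu;
}

/* The SWI number is the low 24 bits; this is what clearing the top byte leaves. */
uint32_t swi_number(uint32_t instr)
{
    return instr & 0x00FFFFFFu;
}
```

For the instruction word 0xEF123456 (an unconditional SWI), `swi_number` returns 0x123456. The handler finds the instruction word at lr - 4 because lr holds the address of the instruction after the SWI.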

Program status register instructions
Syntax: MRS{cond} Rd, <cpsr|spsr>
        MSR{cond} <cpsr|spsr>_<fields>, Rm
        MSR{cond} <cpsr|spsr>_<fields>, #immediate

MRS   Copy a program status register to a general-purpose register   Rd = psr
MSR   Copy a general-purpose register to a program status register   psr[field] = Rm
MSR   Copy an immediate value to a program status register           psr[field] = immediate

* fields can be any combination of control (c), extension (x), status (s), and flags (f)

Field       Bits       Contents
Flags       [31:24]    N, Z, C, V
Status      [23:16]
eXtension   [15:8]
Control     [7:0]      I, F, T, Mode

MSR & MRS
- MSR: Move the value of a general-purpose register or an immediate constant to the CPSR or the SPSR of the current mode
- MRS: Move the value of the CPSR or the SPSR of the current mode into a general-purpose register

    MSR CPSR_all, R0    ; Copy R0 into CPSR
    MSR SPSR_all, R0    ; Copy R0 into SPSR
    MRS R0, CPSR_all    ; Copy CPSR into R0
    MRS R0, SPSR_all    ; Copy SPSR into R0

- To change the operating mode, use the following code

    ; Change to the supervisor mode
    MRS R0, CPSR          ; Read CPSR
    BIC R0, R0, #0x1F     ; Clear the current mode bits with the bit clear instruction
    ORR R0, R0, #0x13     ; Set the mode bits to Supervisor mode
    MSR CPSR_c, R0        ; Write the result back to CPSR
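The read-modify-write in the mode change is ordinary bit manipulation, shown here in C. The macro and function names are illustrative; the mode values are the standard ARM encodings for the low five bits of the CPSR.

```c
#include <stdint.h>

#define MODE_MASK 0x1Fu   /* CPSR bits [4:0] hold the processor mode */
#define MODE_USR  0x10u
#define MODE_FIQ  0x11u
#define MODE_IRQ  0x12u
#define MODE_SVC  0x13u

/* Mirrors BIC R0, R0, #0x1F followed by ORR R0, R0, #mode. */
uint32_t set_mode(uint32_t cpsr, uint32_t mode)
{
    return (cpsr & ~MODE_MASK) | (mode & MODE_MASK);
}
```

For example, `set_mode(0x600000D0, MODE_SVC)` gives 0x600000D3: only the mode bits change, while the flags and the interrupt mask bits are preserved. Writing with `CPSR_c` limits the update to the control field for the same reason.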

(Assembly) Language
- There is no golden way to learn a language
- You have to use it and practice to get used to it

Backup Slides

Overflow/Underflow
- Overflow/underflow: the result of an addition or subtraction exceeds the magnitude that can be represented with the allocated number of bits
- Overflow/underflow is a problem in computers because the number of bits that hold a number is fixed
  - For this reason, computers detect and flag the occurrence of an overflow/underflow
- Detecting an overflow/underflow after the addition of two binary numbers depends on whether the numbers are considered signed or unsigned

Overflow/Underflow in Unsigned Numbers
- When two unsigned numbers are added, an overflow is detected from the end carry out of the most significant position
  - If the end carry is "1", there is an overflow
- When two unsigned numbers are subtracted, an underflow is detected when the end carry is "0"

Subtraction of Unsigned Numbers
- An unsigned number is either positive or zero
  - There is no sign bit
  - So, n bits can represent numbers from 0 to $2^n - 1$
    - For example, 4 bits can represent 0 to 15 ($= 2^4 - 1$)
  - To declare an unsigned number in the C language: unsigned int a;
    - x86 allocates 32 bits for "unsigned int"
- Subtraction of unsigned integers (M, N)
  - M – N in binary can be done as follows: $M + (2^n - N) = M - N + 2^n$
  - If M ≥ N, the sum produces an end carry, which is $2^n$
    - The subtraction result is zero or a positive number
  - If M < N, the sum does not produce an end carry, since it is equal to $2^n - (N - M)$
    - Unsigned underflow: the subtraction result is negative, and an unsigned number can't represent negative numbers
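The rule can be tried out with small bit widths. The C sketch below (names are hypothetical) computes M + (2^n − N) for an n-bit width and reports both the n-bit result and the end carry.

```c
#include <assert.h>

typedef struct {
    unsigned diff;       /* low `bits` bits of the sum */
    unsigned end_carry;  /* 1 if M >= N, 0 if the subtraction underflowed */
} SubResult;

/* Subtracts n from m in `bits`-bit unsigned arithmetic by adding 2^bits - n. */
SubResult sub_nbit(unsigned m, unsigned n, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);         /* keep the sum well inside unsigned */
    unsigned modulus = 1u << bits;           /* 2^bits */
    assert(m < modulus && n < modulus);
    unsigned sum = m + (modulus - n);        /* M + (2^n - N) */
    SubResult r;
    r.end_carry = (sum >> bits) & 1u;
    r.diff = sum & (modulus - 1u);
    return r;
}
```

With 4 bits, 9 − 5 becomes 9 + 11 = 20 = 16 + 4, so the end carry is 1 and the result is 4. For 5 − 9 the sum is 5 + 7 = 12, with no end carry; 12 equals 16 − (9 − 5), which signals the underflow.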

Overflow/Underflow in Signed Numbers
- With signed numbers, an overflow/underflow can't occur for an addition if one number is positive and the other is negative
  - Adding a positive number to a negative number produces a result whose magnitude is equal to or smaller than the larger of the original numbers
- An overflow may occur in an addition if both numbers are positive
  - When x and y both have sign bits of 0 (positive numbers): if the sum has a sign bit of 1, there is an overflow
- An underflow may occur in an addition if both numbers are negative
  - When x and y both have sign bits of 1 (negative numbers): if the sum has a sign bit of 0, there is an underflow

Overflow/Underflow in Signed Numbers
- 8-bit signed number addition: (+72) + (+57) = (+129)
  - What is the largest positive number that 8 bits can represent?
- 8-bit signed number addition: (-127) + (-6) = (-133)
  - What is the smallest negative number that 8 bits can represent?
Slide from H.H. Lee, Georgia Tech

Overflow/Underflow in Signed Numbers
- So, we can detect overflow/underflow with the following logic
  - Suppose that we add two k-bit numbers: $x_{k-1} x_{k-2} \dots x_0 + y_{k-1} y_{k-2} \dots y_0 = s_{k-1} s_{k-2} \dots s_0$
  - $\text{Overflow} = x_{k-1}\, y_{k-1}\, \overline{s_{k-1}} + \overline{x_{k-1}}\, \overline{y_{k-1}}\, s_{k-1}$
- There is an easier formula
  - Let the carries out of the full adders adding the two numbers be $c_{k-1} c_{k-2} \dots c_0$
  - $\text{Overflow} = c_{k-1} \oplus c_{k-2}$
  - If a 0 ($c_{k-2}$) is carried in, the only way that a 1 ($c_{k-1}$) can be carried out is if $x_{k-1} = 1$ and $y_{k-1} = 1$
    - Adding two negative numbers results in a non-negative number
  - If a 1 ($c_{k-2}$) is carried in, the only way that a 0 ($c_{k-1}$) can be carried out is if $x_{k-1} = 0$ and $y_{k-1} = 0$
    - Adding two positive numbers results in a negative number
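Both formulas can be checked against each other for 8-bit numbers (k = 8). The sketch below implements the sign-bit formula and the carry formula separately; the function names are invented for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Overflow from the sign bits: x7*y7*!s7 + !x7*!y7*s7. */
bool overflow_by_signs(uint8_t x, uint8_t y)
{
    uint8_t s = (uint8_t)(x + y);
    bool xs = (x >> 7) & 1u;
    bool ys = (y >> 7) & 1u;
    bool ss = (s >> 7) & 1u;
    return (xs && ys && !ss) || (!xs && !ys && ss);
}

/* Overflow from the carries: c7 XOR c6. */
bool overflow_by_carries(uint8_t x, uint8_t y)
{
    unsigned c6 = (((unsigned)x & 0x7Fu) + ((unsigned)y & 0x7Fu)) >> 7;  /* carry into bit 7 */
    unsigned c7 = ((unsigned)x + (unsigned)y) >> 8;                      /* carry out of bit 7 */
    return (c6 ^ c7) != 0;
}
```

Both functions flag (+72) + (+57), where the carry into the sign bit is 1 but the carry out is 0, and (-127) + (-6), where the carry in is 0 but the carry out is 1. Looping over all 256 x 256 input pairs confirms that the two formulas always agree.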

Overflow/Underflow Detection in Signed Numbers
[Figure: a 4-bit adder built from four full adders (inputs A0 to A3 and B0 to B3, sums S0 to S3), and an n-bit adder/subtractor whose overflow/underflow signal is derived from the carries Cn and Cn-1]
Slide from H.H. Lee, Georgia Tech

Recap
- Unsigned numbers
  - Overflow could occur when 2 unsigned numbers are added
    - An end carry of "1" indicates an overflow
  - Underflow could occur when 2 unsigned numbers are subtracted
    - An end carry of "0" indicates an underflow (minuend < subtrahend)
- Signed numbers
  - Overflow could occur when 2 signed positive numbers are added
  - Underflow could occur when 2 signed negative numbers are added
  - The overflow flag indicates both overflow and underflow

Recap
- Binary numbers in the 2's complement system are added and subtracted by the same basic addition and subtraction rules as used for unsigned numbers
  - Therefore, computers need only one common hardware circuit to handle both types (signed and unsigned numbers) of arithmetic
- The programmer must interpret the results of addition or subtraction differently, depending on whether the numbers are assumed to be signed or unsigned

ARM Flags
- In general, a computer has several flags (registers) that indicate the state of operations such as addition and subtraction
  - N: Negative
  - Z: Zero
  - C: Carry
  - V: Overflow
- We have only one adder inside a computer
  - The CPU compares signed or unsigned numbers by subtracting them using the adder
  - The CPU sets the flags depending on the result of the operation
  - These flags provide enough information to judge whether one number is bigger or smaller than the other

ARM Flags (Cont)