Microprocessor system architectures – ARMv8 Jakub Yaghob.

1 Microprocessor system architectures – ARMv8 Jakub Yaghob

2 ARM architecture
  - RISC
    - Large uniform register file
    - Load/store architecture
    - Simple addressing modes
  - Execution states: AArch64 and AArch32
  - Architecture profiles
    - A: application profile
    - R: real-time profile
    - M: microcontroller profile

3 Execution states – AArch64
  - 31 64-bit general-purpose registers
    - X30: procedure link register
  - 64-bit PC, SPs, ELRs (exception link registers)
  - 32 128-bit SIMD registers
  - Single instruction set: A64
  - Exception levels EL0-EL3
  - 64-bit virtual addressing
  - Each system register name carries a suffix that indicates the lowest EL with access
  - PSTATE (process state)

4 Execution states – AArch32
  - 13 32-bit general-purpose registers
  - 32-bit PC, SP, LR (link register)
  - Some registers banked for each execution mode
  - Single ELR (return from Hyp)
  - 32 64-bit SIMD registers
  - A32 instruction set: fixed-length encoding, compatible with ARMv7
  - T32 instruction set: variable-length encoding, compatible with ARMv7 Thumb
  - 32-bit virtual addressing
  - CPSR (current program status register)

5 Supported data types, cryptographic extension
  - Integer: B, H, W, D, Q (8, 16, 32, 64, 128 bits)
  - Floating point (IEEE 754): HP, SP, DP (half, single, double precision)
  - Cryptographic extension
    - Operates on the vector register file
    - AES, SHA1, SHA2-256

6 Memory model
  The ARM memory model supports:
  - Generating an exception on an unaligned memory access
  - Restricting access by applications to specified areas of memory
  - Translating virtual addresses provided by executing instructions into physical addresses
    - AArch64: 64-bit addressing; the TCR (Translation Control Register) determines the VA range; EL0+EL1 have 2 independent VA ranges, each with its own TCR
    - AArch32: 32-bit addressing; the TCR determines the VA range; the OS can split the VA range into 2 subranges for EL0+EL1 with separate TCR
  - Altering the interpretation of multi-byte data between big-endian and little-endian
  - Controlling the order of accesses to memory
  - Controlling caches and address translation structures
  - Synchronizing access to shared memory by multiple PEs

7 Application architecture – AArch64
  - 31 general-purpose registers R0-R30
    - 64-bit GP registers X0-X30
      - X30: procedure link register
    - 32-bit GP registers W0-W30
    - Register encoding 1Fh (31) is used for ZR (zero register)
  - 32 vector registers V0-V31
  - FPCR, FPSR: floating-point control and status registers
  - Current stack pointer: SP (64-bit), WSP (32-bit)
  - PC: 64-bit
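The relationship between the X and W views of a register can be sketched in a few lines of Python. This is a toy model, not an emulator: the class and method names are hypothetical, and it assumes that encoding 31 always means the zero register (in real A64 encodings, 31 selects either ZR or SP depending on the instruction).

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


class RegisterFile:
    """Toy model of the AArch64 general-purpose registers X0-X30."""

    def __init__(self):
        self.regs = [0] * 31  # X0..X30

    def read_x(self, n):
        # Assumption of this sketch: encoding 31 always reads as zero (ZR).
        return 0 if n == 31 else self.regs[n]

    def write_x(self, n, value):
        if n != 31:  # writes to the zero register are discarded
            self.regs[n] = value & MASK64

    def read_w(self, n):
        # Wn is the low 32 bits of Xn.
        return self.read_x(n) & MASK32

    def write_w(self, n, value):
        # Writing Wn zero-extends: the upper 32 bits of Xn become zero.
        self.write_x(n, value & MASK32)


rf = RegisterFile()
rf.write_x(0, 0x1122334455667788)
low_half = rf.read_w(0)        # 0x55667788
rf.write_w(0, 0xDEADBEEF)
whole = rf.read_x(0)           # 0x00000000DEADBEEF
```

The zero-extension on W writes is the detail most often missed: after `write_w`, nothing of the old upper half survives.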

8 Application architecture – AArch64 – vector registers

9 Application architecture – AArch64 – PSTATE
  - Process state for EL0
  - Data processing flags
    - N: negative
    - Z: zero
    - C: carry
    - V: overflow
  - Exception masking bits
    - D: debug mask
    - A: system error mask
    - I: IRQ mask
    - F: FIQ mask
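How the four data processing flags are derived from an addition can be shown with a short Python sketch. The function name `adds64` is hypothetical; it models only a flag-setting 64-bit add and ignores everything else about instruction execution.

```python
MASK64 = (1 << 64) - 1


def adds64(a, b):
    """Return (result, flags) for a 64-bit add that sets N, Z, C and V."""
    a &= MASK64
    b &= MASK64
    full = a + b                 # unbounded Python integer
    result = full & MASK64       # what fits in a 64-bit register
    n = result >> 63             # N: top bit of the result
    z = int(result == 0)         # Z: result is zero
    c = int(full > MASK64)       # C: unsigned carry out of bit 63
    # V: signed overflow, i.e. both operands differ in sign from the result.
    v = (((a ^ result) & (b ^ result)) >> 63) & 1
    return result, {"N": n, "Z": z, "C": c, "V": v}


# Largest positive signed value plus one overflows into the sign bit.
result, flags = adds64(0x7FFFFFFFFFFFFFFF, 1)
```

Here `flags` is `{"N": 1, "Z": 0, "C": 0, "V": 1}`: the result looks negative and the signed overflow flag is set, but there is no unsigned carry.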

10 System registers
  - Register naming: suffix _ELx, x ∈ {0,1,2,3}
  - General system control registers
  - Debug registers
  - Generic timer registers
  - Performance monitor registers (optional)
  - Trace registers (optional)
  - Generic Interrupt Controller (GIC) CPU interface registers (optional)

11 Software control and EL0
  - Exception handling
    - Interrupts
    - Memory system aborts
    - Undefined instructions
    - System calls
    - Secure monitor or Hypervisor traps
  - System instructions for control flow
    - WFI: Wait For Interrupt
    - WFE: Wait For Event
    - YIELD: hint
    - Can enter low-power state
  - Cache management
    - Must be enabled by EL1
  - Debug events
    - BKPT: breakpoint (A32/T32; the A64 breakpoint instruction is BRK)
    - DBG: hint to the debug system
    - HLT: entry to Debug state

12 Caches and memory hierarchy
  - Point of Unification (PoU)
    - The instruction cache (IC) and data cache (DC) see the same copy of a memory location
  - Point of Coherency (PoC)
    - All agents that can access memory are guaranteed to see the same copy of a memory location

13 Memory types
  - Normal
    - Bulk memory operations, R/W, R/O
  - Device
    - Speculative reads forbidden
    - Additional attributes
      - Gathering (G/nG): nG prevents merging of multiple reads or writes into one access
      - Reordering (R/nR): nR preserves access order and synchronization requirements
      - Early write acknowledgement (E/nE): with E, a write can be acknowledged other than at the end point
  - Shareability: non-shareable, inner shareable, outer shareable
  - Cacheability: non-cacheable, write-through cacheable, write-back cacheable

14 Alignment
  - Instruction alignment
    - A64 instructions must be word-aligned
  - Data alignment
    - Unaligned access to any Device memory causes an Alignment fault
    - Normal memory: SCTLR_ELx.A configures unaligned access behavior
      - Generate an Alignment fault
      - Perform an unaligned access
    - An unaligned access
      - Is not guaranteed to be atomic
      - Takes a number of additional cycles
      - Can abort more than once on memory exceptions
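The data alignment rules above amount to a small decision procedure, sketched here in Python. All names are hypothetical, and the sketch assumes the access size is a power of two and that "aligned" means naturally aligned to that size.

```python
class AlignmentFault(Exception):
    """Raised when the modelled access would take an Alignment fault."""


def check_access(address, size, memory_type, sctlr_a):
    """Classify a data access as 'aligned' or 'unaligned', or fault.

    memory_type is "normal" or "device"; sctlr_a models the SCTLR_ELx.A bit.
    """
    if address % size == 0:
        return "aligned"
    if memory_type == "device":
        # Unaligned access to Device memory always faults.
        raise AlignmentFault(f"unaligned Device access at {address:#x}")
    if sctlr_a:
        # Normal memory with alignment checking enabled.
        raise AlignmentFault(f"unaligned access at {address:#x} with A=1")
    # Normal memory, checking disabled: performed, but not guaranteed atomic.
    return "unaligned"
```

For example, `check_access(0x1001, 4, "normal", sctlr_a=False)` returns `"unaligned"`, while the same access with `sctlr_a=True` or to `"device"` memory raises `AlignmentFault`.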

15 Endian support
  - Instruction endianness
    - A64 instructions are always little-endian
  - Data endianness
    - SCTLR_EL1.E0E configures data endianness for EL0
    - SCTLR_ELx.EE configures data endianness at EL1 or higher
  - Instructions for reversing byte order in registers
    - REV16, REV32, REV64
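The effect of the three byte-reversal instructions on a 64-bit register value can be illustrated in Python. This sketch models only the 64-bit general-purpose register forms; the helper `_reverse_chunks` is a hypothetical name introduced for the example.

```python
MASK64 = (1 << 64) - 1


def _reverse_chunks(value, chunk_bytes):
    """Reverse the byte order inside each chunk of a 64-bit value."""
    data = (value & MASK64).to_bytes(8, "little")
    out = b"".join(
        data[i:i + chunk_bytes][::-1] for i in range(0, 8, chunk_bytes)
    )
    return int.from_bytes(out, "little")


def rev16(value):
    """Reverse bytes within each 16-bit halfword."""
    return _reverse_chunks(value, 2)


def rev32(value):
    """Reverse bytes within each 32-bit word."""
    return _reverse_chunks(value, 4)


def rev64(value):
    """Reverse all eight bytes of the doubleword."""
    return _reverse_chunks(value, 8)


x = 0x1122334455667788
swapped = (rev16(x), rev32(x), rev64(x))
# (0x2211443366558877, 0x4433221188776655, 0x8877665544332211)
```

Converting a 64-bit big-endian value loaded on a little-endian configuration corresponds to `rev64`; the narrower variants handle packed 16-bit or 32-bit fields.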

16 Synchronization and semaphores
  - Load-exclusive instructions: LDXP, LDXR, LDXRH, LDXRB
  - Store-exclusive instructions: STXP, STXR, STXRH, STXRB
  - Clear-exclusive: CLREX
  - Should scale on multiprocessor systems
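The load-exclusive/store-exclusive pattern is easiest to see as a retry loop. The Python sketch below is a deliberately simplified model: `ExclusiveMemory` and `atomic_increment` are hypothetical names, the monitor tracks a single exact address per PE (real monitors track an implementation-defined granule), and nothing here is actually concurrent.

```python
class ExclusiveMemory:
    """Toy memory with one exclusive monitor per processing element (PE)."""

    def __init__(self):
        self.mem = {}
        self.monitor = {}  # pe id -> address currently marked exclusive

    def ldxr(self, pe, addr):
        # Load and mark the address as exclusive for this PE.
        self.monitor[pe] = addr
        return self.mem.get(addr, 0)

    def stxr(self, pe, addr, value):
        """Return 0 on success and 1 on failure, like the STXR status result."""
        if self.monitor.get(pe) != addr:
            return 1
        self.mem[addr] = value
        # A successful store clears every PE's monitor on this address.
        for other in [p for p, a in self.monitor.items() if a == addr]:
            del self.monitor[other]
        return 0

    def clrex(self, pe):
        # Drop this PE's exclusive marking without storing.
        self.monitor.pop(pe, None)


def atomic_increment(memory, pe, addr):
    """Retry the load-exclusive/store-exclusive pair until the store succeeds."""
    while True:
        old = memory.ldxr(pe, addr)
        if memory.stxr(pe, addr, old + 1) == 0:
            return old + 1
```

If two PEs both execute `ldxr` on the same address, only the first `stxr` succeeds; the second returns 1 and that PE must loop back to its load, which is what lets the scheme work without a global lock.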

17 Exception levels
  - Exception levels EL0-EL3
    - EL0: unprivileged execution, applications
    - EL1: OS kernel
    - EL2: supports virtualization of non-secure operation, hypervisor
    - EL3: supports switching between two security states (secure state, non-secure state), secure monitor
  - All implementations must include EL0 and EL1
  - Stack pointer register selection: SP_ELx

18 Exception levels

19 Exception mechanism
  - Saved Program Status Register (SPSR)
    - Saves PE state on taking an exception
    - SPSR_ELx is used for an exception taken to ELx
    - When returning from an exception, PE state is restored to the state stored in SPSR_ELx
  - Exception link registers
    - ELR_ELx holds the preferred exception return address
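The save-and-restore role of SPSR_ELx and ELR_ELx can be sketched as a small state machine in Python. The class `ProcessingElement` and its flat `pstate` dictionary are hypothetical simplifications; the sketch keeps only the current EL and the D, A, I, F mask bits, and assumes the handler address is supplied by the caller.

```python
class ProcessingElement:
    """Toy PE model with just enough state to show SPSR_ELx and ELR_ELx."""

    def __init__(self):
        self.pc = 0
        # Simplified PSTATE: current EL plus the four exception mask bits.
        self.pstate = {"EL": 0, "D": 0, "A": 0, "I": 0, "F": 0}
        self.spsr = {}  # target EL -> saved PSTATE (models SPSR_ELx)
        self.elr = {}   # target EL -> return address (models ELR_ELx)

    def take_exception(self, target_el, handler_address, return_address):
        # Save the interrupted state into the target EL's registers.
        self.spsr[target_el] = dict(self.pstate)
        self.elr[target_el] = return_address
        # Switch to the target EL with D, A, I and F masked.
        self.pstate["EL"] = target_el
        for mask_bit in "DAIF":
            self.pstate[mask_bit] = 1
        self.pc = handler_address

    def exception_return(self):
        # Restore PC and PSTATE from the current EL's ELR and SPSR.
        current_el = self.pstate["EL"]
        self.pc = self.elr[current_el]
        self.pstate = dict(self.spsr[current_el])


pe = ProcessingElement()
pe.pc = 0x1000
pe.take_exception(target_el=1, handler_address=0x80000400, return_address=0x1004)
pe.exception_return()  # back at EL0 with pc == 0x1004
```

Because each target EL has its own SPSR and ELR, a nested exception taken from EL1 to EL2 would not overwrite the state saved for the EL0-to-EL1 entry.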

20 Exception vectors
  - Vector Base Address Register (VBAR)
    - One for each ELx
    - Defines the base address of the vector table at that ELx
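Given the base address in VBAR, the address of a particular handler follows from fixed offsets in the AArch64 vector table: four groups of four entries, each entry 0x80 bytes. The Python sketch below computes that address; the dictionary keys and the function name are hypothetical labels chosen for the example.

```python
# Offset of each group, selected by where the exception came from.
SOURCE_OFFSETS = {
    "current_el_sp0": 0x000,    # current EL, using SP_EL0
    "current_el_spx": 0x200,    # current EL, using SP_ELx
    "lower_el_aarch64": 0x400,  # lower EL running in AArch64
    "lower_el_aarch32": 0x600,  # lower EL running in AArch32
}

# Offset of each entry within a group, selected by exception type.
TYPE_OFFSETS = {
    "synchronous": 0x000,
    "irq": 0x080,
    "fiq": 0x100,
    "serror": 0x180,
}


def vector_address(vbar, source, kind):
    """Return the handler entry address for the given source and type."""
    if vbar % 0x800 != 0:
        raise ValueError("vector table base must be 2 KB aligned")
    return vbar + SOURCE_OFFSETS[source] + TYPE_OFFSETS[kind]


# An IRQ arriving from a lower EL in AArch64, with a table at 0x80000:
entry = vector_address(0x80000, "lower_el_aarch64", "irq")  # 0x80480
```

A system call from an AArch64 application would use the `"synchronous"` entry of the same `"lower_el_aarch64"` group, at offset 0x400.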

21 System calls
  - SVC: Supervisor call exception
    - EL0 calls the OS at EL1
  - HVC: Hypervisor call exception
    - For EL1 and higher
  - SMC: Secure monitor call exception
    - For EL1 and higher

22 Virtual Memory System Architecture
  - VMSA
    - Provides the MMU
    - The MMU translates VAs to PAs independently for each ELx and security state
    - AArch64 supports VAs and PAs of up to 48 bits

23 Address translation system
  - VMSAv8-64
    - Translation Table Base Register (TTBR)
    - Translation Control Register (TCR)
    - Up to four levels of address lookup
    - IA (input address) of up to 48 bits
    - OA (output address) of up to 48 bits
    - A translation granule size of 4K, 16K, or 64K

24 4K translation granule

25 16K translation granule

26 64K translation granule
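How a 48-bit input address is divided among the lookup levels for each granule size can be sketched in Python. The function `split_va` is a hypothetical helper; it assumes 8-byte table descriptors, so a granule of 2^g bytes holds 2^(g-3) entries and each full level resolves g-3 address bits, with the topmost level taking whatever bits remain.

```python
def split_va(va, granule_bits=12, va_bits=48):
    """Split a virtual address into per-level table indices and page offset.

    granule_bits is 12, 14 or 16 for the 4K, 16K and 64K granules.
    Returns (indices, offset), where indices maps lookup level -> index.
    """
    offset = va & ((1 << granule_bits) - 1)
    bits_per_level = granule_bits - 3  # 8-byte descriptors per table
    indices = {}
    low = granule_bits
    level = 3                          # level 3 is the final lookup level
    while low < va_bits:
        high = min(low + bits_per_level, va_bits)
        indices[level] = (va >> low) & ((1 << (high - low)) - 1)
        low = high
        level -= 1
    return indices, offset


va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
indices_4k, offset_4k = split_va(va, granule_bits=12)
# indices_4k == {3: 4, 2: 3, 1: 2, 0: 1}, offset_4k == 5
```

With the 4K granule this yields four 9-bit indices (levels 0 to 3) and a 12-bit offset. With 16K the level 0 index shrinks to a single bit, and with 64K only levels 1 to 3 are used, level 1 covering the top 6 bits.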

27 Translation table entries – levels 0-2

28 Translation table entries – level 3

29 Attribute fields

30 MMU faults
  - All types of MMU exceptions
    - Alignment fault
    - Permission fault
    - Translation fault
    - Address size fault
    - Synchronous external abort on a translation table walk
    - Access flag fault
    - TLB conflict abort

31 Translation Lookaside Buffers (TLB)
  - TLB
    - Caches results from translation table walks
    - Global pages
    - Process-specific pages
      - Address Space Identifier (ASID)
        - Implementation-defined size, 8 or 16 bits
    - Virtual Machine Identifier (VMID)
  - Concept of locked entries
    - Optional for implementation
  - Maintenance instructions
    - TLBI {,Xt}
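The way ASID and VMID tags let one TLB hold entries for many processes and virtual machines can be sketched with a dictionary in Python. The `TLB` class and its method names are hypothetical; the model ignores capacity, replacement, locked entries, and the actual TLBI operation encodings.

```python
class TLB:
    """Toy TLB keyed by (VMID, ASID, virtual page number)."""

    def __init__(self):
        # Global pages are stored with ASID None so they match any ASID.
        self.entries = {}

    def insert(self, vmid, asid, vpn, ppn, global_page=False):
        key = (vmid, None if global_page else asid, vpn)
        self.entries[key] = ppn

    def lookup(self, vmid, asid, vpn):
        """Return the cached physical page number, or None on a miss."""
        for key in ((vmid, asid, vpn), (vmid, None, vpn)):
            if key in self.entries:
                return self.entries[key]
        return None  # a miss would trigger a translation table walk

    def invalidate_asid(self, vmid, asid):
        # Drop the process-specific entries of one ASID; global pages stay.
        self.entries = {
            key: ppn for key, ppn in self.entries.items()
            if not (key[0] == vmid and key[1] == asid)
        }


tlb = TLB()
tlb.insert(vmid=0, asid=1, vpn=0x10, ppn=0xA0)
tlb.insert(vmid=0, asid=2, vpn=0x10, ppn=0xB0)
tlb.insert(vmid=0, asid=1, vpn=0x20, ppn=0xC0, global_page=True)
```

Because the ASID is part of the key, the same virtual page maps to different physical pages for ASID 1 and ASID 2 without a flush on context switch, while the global page at `vpn=0x20` hits for both.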

