ARM 2007 Chapter 15 The Future of the Architecture by John Rayfield Optimization Technique in Embedded System (ARM)

1 ARM 2007 liangalei@sjtu.edu.cn Chapter 15 The Future of the Architecture by John Rayfield Optimization Technique in Embedded System (ARM) LiangAlei@SJTU.edu.cn, 2008 April

2 Overview
In 1999, ARM began planning the future of the architecture
– What is the future direction of the architecture?
– This work resulted in ARMv6.
» First implemented as the ARM1136J-S
Challenges for the future
– DSP and video processing for consumer-electronics (CE) devices;
– Mixed little- and big-endian data, e.g. for TCP/IP;
– Synchronization methods for multiprocessor systems;
– Power efficiency (computation per mW).
The future after ARMv6
– ARM TrustZone

3 15.1 Advanced DSP & SIMD support in ARMv6
SIMD
– Advantage: better code density and lower power, since fewer instructions take less time.
– Price of this efficiency: reduced flexibility.
Lightweight SIMD
– Slices the existing 32-bit datapath into four 8-bit or two 16-bit lanes.
» So the peak speedup is 2x (16-bit) or 4x (8-bit).
ARMv6 includes this “lightweight” SIMD.
– SADD8, UADD8, etc.
– SADD16, UADD16, etc.
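
The lane slicing described on this slide can be sketched in portable C. This is only a behavioural model, not ARM code: the function names `uadd8_model` and `uadd16_model` are made up for illustration, and the GE flags that the real instructions also update are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Model of UADD8: four independent 8-bit adds in one 32-bit word.
   Each lane wraps modulo 256 and never carries into its neighbour.
   (The GE flags set by the real instruction are not modelled.) */
uint32_t uadd8_model(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        result |= ((x + y) & 0xFFu) << (8 * lane);
    }
    return result;
}

/* Model of UADD16: two independent 16-bit adds in one 32-bit word. */
uint32_t uadd16_model(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0xFFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}

/* Usage: a lane that overflows wraps on its own; no carry crosses lanes. */
void simd_model_check(void)
{
    assert(uadd8_model(0x01FF10F0u, 0x01012020u) == 0x02003010u);
    assert(uadd16_model(0xFFFF0001u, 0x00010002u) == 0x00000003u);
}
```

A plain 32-bit add of the first pair of operands would give 0x03003110, because carries would ripple across the byte boundaries. Suppressing those carries is what lets one instruction do the work of four.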

4 ARMv6 Instructions
SIMD arithmetic instructions
Pack instructions
– PKHTB Rd, Rn, Rm // Rd = top half of Rn, bottom half of Rm
– PKHBT Rd, Rn, Rm // Rd = bottom half of Rn, top half of Rm
Complex arithmetic instruction
– SMUSD Rt, Ra, Rb // Rt = Ra(re)*Rb(re) - Ra(im)*Rb(im)
Cryptographic multiplication
– UMAAL Rl, Rh, Rm, Rs // Rh:Rl = Rm*Rs + Rh + Rl
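
The arithmetic behind these instructions can be sketched in C as a behavioural model. The function names are hypothetical, the pack models ignore the optional shift operand, and the SMUSD model assumes the real part sits in the bottom halfword and the imaginary part in the top halfword.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of v without relying on
   implementation-defined narrowing casts. */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* PKHBT (no shift): bottom half from Rn, top half from Rm. */
uint32_t pkhbt_model(uint32_t rn, uint32_t rm)
{
    return (rn & 0x0000FFFFu) | (rm & 0xFFFF0000u);
}

/* PKHTB (no shift): top half from Rn, bottom half from Rm. */
uint32_t pkhtb_model(uint32_t rn, uint32_t rm)
{
    return (rn & 0xFFFF0000u) | (rm & 0x0000FFFFu);
}

/* SMUSD: signed bottom*bottom minus signed top*top.
   With re in the bottom half and im in the top half (an assumption),
   this is the real part of a complex multiply. */
int32_t smusd_model(uint32_t ra, uint32_t rb)
{
    return sext16(ra) * sext16(rb) - sext16(ra >> 16) * sext16(rb >> 16);
}

/* UMAAL: Rh:Rl = Rm*Rs + Rh + Rl, returned here as one 64-bit value.
   The sum can never overflow 64 bits, which is why the instruction
   suits multi-word (cryptographic) multiplication. */
uint64_t umaal_model(uint32_t rl, uint32_t rh, uint32_t rm, uint32_t rs)
{
    return (uint64_t)rm * rs + rh + rl;
}

/* Usage: (3+4i)*(5+6i) has real part 3*5 - 4*6 = -9. */
void dsp_model_check(void)
{
    assert(pkhbt_model(0x11112222u, 0x33334444u) == 0x33332222u);
    assert(smusd_model(0x00040003u, 0x00060005u) == -9);
}
```

The largest possible UMAAL input, with all four operands equal to 0xFFFFFFFF, yields exactly 0xFFFFFFFFFFFFFFFF, so the two extra 32-bit addends come for free.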

5 15.2 System support additions to ARMv6
Set the current endianness
– SETEND <spec> // spec = BE or LE
Reverse byte order
– REV Rd, Rm // Rd = Rm with its four bytes reversed
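
What REV computes can be sketched in C. This is a behavioural model with a hypothetical function name, not the instruction itself; on an ARMv6 target the whole function corresponds to a single REV.

```c
#include <assert.h>
#include <stdint.h>

/* Model of REV Rd, Rm: swap byte 0 with byte 3 and byte 1 with byte 2,
   converting a 32-bit word between little- and big-endian layouts. */
uint32_t rev_model(uint32_t rm)
{
    return ((rm & 0x000000FFu) << 24) |
           ((rm & 0x0000FF00u) << 8)  |
           ((rm & 0x00FF0000u) >> 8)  |
           ((rm & 0xFF000000u) >> 24);
}

/* Usage: reversing twice returns the original word. */
void rev_model_check(void)
{
    assert(rev_model(0x12345678u) == 0x78563412u);
    assert(rev_model(rev_model(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```

A little-endian core handling big-endian network fields, as in TCP/IP headers, would need this conversion on every 32-bit field, which is the motivation the overview gives for the mixed-endian support.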

6 15.2.2 Exception Processing
ARMv6 adds instructions that make it more efficient for an OS to save the return state of an interrupt or exception on a stack.

7 15.2.3 Multiprocessing Synchronization Primitives
As System-on-Chip (SoC) architectures have become more sophisticated, ARM cores are now often found in devices with many processing units that compete for shared resources.

8 Atomic Synchronization
Before ARMv6, the SWP instruction was used to keep semaphores coherent.
– But SWP is a bottleneck: it is a blocking instruction that holds the bus for its whole read-then-write, stalling every other bus master.
LDREX/STREX in ARMv6
– Rely on an exclusive-access monitor in the memory system.
– LDREX loads a value from M[x] into Rn, on the assumption that it will not change while it is being used.
– STREX stores a value to M[x]; its status result indicates whether M[x] was modified between the preceding LDREX and the STREX, in which case the store fails and the sequence must be retried.
– Multiple readers, exclusive writer.
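
The load, modify, conditional-store, retry pattern can be sketched with C11 atomics. This is an illustration under stated assumptions rather than ARM code: `counter_try_acquire` and `counter_release` are hypothetical names, and although compilers targeting ARMv6 and later commonly lower a compare-exchange to an LDREX/STREX loop, that depends on the toolchain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Try to take one unit from a counting semaphore without blocking.
   Returns true if a unit was taken, false if the count was zero. */
bool counter_try_acquire(atomic_int *count)
{
    int seen = atomic_load(count);      /* plays the role of LDREX */
    while (seen > 0) {
        /* Plays the role of STREX: the store happens only if *count
           still equals 'seen'. On failure 'seen' is refreshed with the
           current value and the loop retries. */
        if (atomic_compare_exchange_weak(count, &seen, seen - 1))
            return true;
    }
    return false;
}

/* Give one unit back to the semaphore. */
void counter_release(atomic_int *count)
{
    atomic_fetch_add(count, 1);
}

/* Usage: one unit available, so the second acquire must fail. */
void counter_check(void)
{
    atomic_int sem;
    atomic_init(&sem, 1);
    assert(counter_try_acquire(&sem));
    assert(!counter_try_acquire(&sem));
    counter_release(&sem);
    assert(atomic_load(&sem) == 1);
}
```

Unlike SWP, nothing here holds the bus between the load and the store. A competing writer simply causes the conditional store to fail, and only the loser pays for the retry.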

9 Organization of ARMv6
Most sophisticated ARM pipeline
– 8 stages, with separate pipelines for load/store and multiply/accumulate.
Hit-under-N-miss
– Parallel Load Store Unit (LSU)
– Decouples pipeline execution from the completion of loads and stores.
Physical cache (instead of a virtual cache)
– Reduces cache flushing on context switches.
– Furthermore, saves the power consumed by the resulting memory accesses (up to ~20% improvement).

11 15.4 Future Technologies beyond ARMv6
In 2003, ARM made further technology announcements including TrustZone and Thumb-2.

12 15.4.1 TrustZone
TrustZone is an architecture extension
– First introduced in the ARM1176JZ-S.
Reason
– Operating systems are now so complex that it is very hard to verify the security and correctness of the software.
– The ARM solution is to add new operating “states” in which only a small, verifiable software kernel runs, providing services to the larger OS.
– The microprocessor core then takes a role in controlling system peripherals, which may be available only to the secure “state” through new signals exported on the bus interface.
TrustZone is most useful in devices that carry out content downloads, such as cell phones and other portable devices with network connections.

14 15.4.2 Thumb-2
Thumb-2 is an architecture extension
– Designed to increase performance at high code density.
– Allows a blend of 32-bit ARM-like instructions with 16-bit Thumb instructions.
Thumb-2 was announced in October 2003.
– To be implemented in the ARM1156T2-S.
– Details were not public at the time of writing.

15 Summary
The ARM architecture is not static.
– It is continually developed and improved to suit the applications required by today’s consumer devices.
– ARMv5TE was very successful at adding some DSP support to ARM; ARMv6 extends that DSP support and adds support for large multiprocessor systems.
ARM still concentrates on one of its key benefits, code density, and has recently announced Thumb-2.
The new focus on security with TrustZone gives ARM a lead in this area.

