
1 Access Map Pattern Matching Prefetch: Optimization Friendly Method
Yasuo Ishii (1), Mary Inaba (2), and Kei Hiraki (2)
(1) NEC Corporation, (2) The University of Tokyo

2 Background
- The speed gap between processor and memory has widened.
- Many techniques have been proposed to hide the long memory latency.
  - Hardware (HW) data prefetching has become more important.
- Many HW prefetchers have been proposed.

3 Conventional Methods
Prefetchers use:
1. Instruction address
2. Memory access order
3. Memory address
Optimizations scramble this information:
  - Out-of-order memory access
  - Loop unrolling

4 Limitation of Stride Prefetch [Chen+95]
[Figure: out-of-order memory access. Access 1 to Access 4 touch cache lines in the memory address space (labels 0xAAFF to 0xAB06) out of address order, next to a table entry reading "0xABFF 0xAB04 2 steady".]
for (int i=0; i
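To make the limitation concrete, the following is a minimal sketch of a single stride-table entry. It is deliberately simplified: it predicts only after seeing the same stride twice in a row, which stands in for the "steady" state on the slide. The class name, the two-in-a-row rule, and the example addresses are assumptions for illustration, not the exact table design of [Chen+95].

```python
class StrideEntry:
    """Simplified one-entry stride detector (illustrative assumption)."""

    def __init__(self):
        self.last_addr = None  # most recent cache-line address seen
        self.stride = None     # most recently observed stride
        self.steady = False    # True once the same stride repeats

    def access(self, addr):
        """Record an access; return the predicted next address or None."""
        if self.last_addr is not None:
            new_stride = addr - self.last_addr
            self.steady = new_stride == self.stride and new_stride != 0
            self.stride = new_stride
        self.last_addr = addr
        return addr + self.stride if self.steady else None


def predictions(addresses):
    """Feed a sequence of addresses to a fresh entry and collect predictions."""
    entry = StrideEntry()
    return [entry.access(a) for a in addresses]


# Hypothetical sequences: the same four lines, in order and reordered.
in_order = [0xAB00, 0xAB01, 0xAB02, 0xAB03]
out_of_order = [0xAB00, 0xAB02, 0xAB01, 0xAB03]
```

With the in-order sequence the entry locks onto stride 1 from the third access on; with the reordered sequence the observed stride keeps changing (2, -1, 2), so the entry never becomes steady and issues no prediction.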

5 Weakness of Conventional Methods
- Out-of-order memory access
  - Scrambles the memory access order
  - The prefetcher cannot detect address correlations
- Loop unrolling
  - Requires additional table entries
  - Each entry is trained slowly
- An optimization-friendly prefetcher is required

6 Access Map Pattern Matching
- Pattern matching
  - Order-free prefetching
  - Optimization-friendly prefetching
- Access map
  - Map-based history
  - 2-bit state map
  - Each state is attached to a cache block

7 State Diagram for Each Cache Block
- Init: initialized state
- Access: already accessed
- Prefetch: prefetch request issued
- Success: prefetched data accessed
[Figure: state diagram over Init, Access, Prefetch, and Success, with transitions labeled Access and Prefetch]
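The four states can be sketched as a small transition function. The transitions below are read off the state descriptions on this slide (a demand access moves Init to Access and Prefetch to Success; issuing a prefetch moves Init to Prefetch); the enum and function names are hypothetical, and any transition not listed is assumed to leave the state unchanged.

```python
from enum import IntEnum


class State(IntEnum):
    """Four per-block states, so each one fits in 2 bits."""
    INIT = 0      # initialized, nothing known about the block
    ACCESS = 1    # block was accessed by a demand request
    PREFETCH = 2  # a prefetch request was issued for the block
    SUCCESS = 3   # prefetched data was later accessed


def on_demand_access(state):
    """State after the block is touched by a demand access."""
    if state == State.INIT:
        return State.ACCESS
    if state == State.PREFETCH:
        return State.SUCCESS
    return state  # ACCESS and SUCCESS are assumed to stay as they are


def on_prefetch_issued(state):
    """State after a prefetch request is issued for the block."""
    return State.PREFETCH if state == State.INIT else state
```

Because the map stores only these states and not the order of arrival, two access sequences that touch the same blocks leave the same map behind.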

8 Memory Access Pattern Map
- Corresponds to the memory address space
  - Cache line granularity
[Figure: a zone of the memory address space, divided into cache lines, maps onto a memory access pattern map of per-line states (I, A, P, S) that feeds the pattern match logic]
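As a rough illustration of how an address lands in such a map, the sketch below splits a byte address into a zone number and a line index inside that zone. The 256 lines per zone come from the "256 states / map" figure on the configuration slide; the 64-byte line size and the function name are assumptions.

```python
LINE_BYTES = 64        # assumed cache-line size (not stated in the slides)
LINES_PER_ZONE = 256   # "256 states / map" on the configuration slide


def locate(addr):
    """Split a byte address into (zone number, line index within the zone)."""
    line = addr // LINE_BYTES
    return line // LINES_PER_ZONE, line % LINES_PER_ZONE


zone, index = locate(0x12345)  # hypothetical address
```

Under these assumptions a zone covers 64 x 256 = 16,384 bytes, and each line of the zone has one 2-bit state in the map.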

9 Pattern Matching Logic
- Access map shifter
- Pattern detector
- Pipeline register
- Prefetch selector
[Figure: block diagram of the pattern matching logic. Labels: Addr, memory access pattern map (e.g. IAAAIAIIIA), access map shifter, priority encoder and adder, prefetch request (e.g. Addr+2), feedback path.]

10 Parallel Pattern Matching
- Detects patterns from the memory access map
  - Detects address correlations in parallel
  - Searches candidates effectively
[Figure: memory access pattern map of per-line states (I, A, S)]
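The idea of matching patterns on the map can be sketched in software. The slides show only the block diagram, so the matching rule used here is an assumption for illustration: when line t is accessed, line t+k becomes a candidate if lines t-k and t-2k were already accessed, and symmetrically for negative strides. The function name and the boolean map are also hypothetical; hardware would evaluate all strides in parallel, whereas this loop checks them one by one.

```python
def prefetch_candidates(accessed, t, max_stride=None):
    """Return line indexes suggested for prefetch when line t is accessed.

    accessed: list of booleans, one per cache line in the zone
              (True if the line was accessed; a simplification of the
              2-bit state map).
    t:        index of the line being accessed now.
    """
    n = len(accessed)
    if max_stride is None:
        max_stride = n // 2

    def hit(i):
        # True only for in-range lines that were already accessed.
        return 0 <= i < n and accessed[i]

    candidates = []
    for k in range(1, max_stride + 1):
        # Forward: t-k and t-2k were touched, so t+k is a likely next line.
        if hit(t - k) and hit(t - 2 * k) and t + k < n and not accessed[t + k]:
            candidates.append(t + k)
        # Backward: t+k and t+2k were touched, so t-k is a likely next line.
        if hit(t + k) and hit(t + 2 * k) and t - k >= 0 and not accessed[t - k]:
            candidates.append(t - k)
    return candidates


# Hypothetical zone of 16 lines; lines 2, 0, 4 arrive in scrambled order.
zone_map = [False] * 16
for line in (2, 0, 4):
    zone_map[line] = True
forward = prefetch_candidates(zone_map, 4)
```

Since the function looks only at which lines are marked, the result is the same whether lines 0 and 2 arrived as 0, 2 or as 2, 0, which is the order-free property the slides emphasize.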

11 AMPM Prefetch
- The memory address space is divided into zones
- Detects hot zones
- Memory access map table
  - LRU replacement
- Pattern matching
[Figure: hot zones in the memory address space are held in the memory access map table (per-line states P, S, A, I); an access to a zone goes through the pattern match logic, which emits prefetch requests]
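A table that keeps maps only for recently used ("hot") zones and replaces them in LRU order can be sketched as follows. This is a software illustration under stated assumptions, not the hardware design: the class and method names are hypothetical, the states are stored as single letters, and the default sizes are taken from the configuration slide (52 maps, 256 states per map).

```python
from collections import OrderedDict


class AccessMapTable:
    """Fully associative table of per-zone access maps with LRU replacement."""

    def __init__(self, num_maps=52, lines_per_zone=256):
        self.num_maps = num_maps
        self.lines_per_zone = lines_per_zone
        self.maps = OrderedDict()  # zone number -> list of per-line states

    def lookup(self, zone):
        """Return the map for a zone, allocating (and evicting) if needed."""
        if zone in self.maps:
            self.maps.move_to_end(zone)        # mark as most recently used
        else:
            if len(self.maps) >= self.num_maps:
                self.maps.popitem(last=False)  # evict least recently used zone
            # A newly tracked zone starts with every block in Init ("I").
            self.maps[zone] = ["I"] * self.lines_per_zone
        return self.maps[zone]


# Tiny hypothetical table to show the replacement order.
table = AccessMapTable(num_maps=2, lines_per_zone=4)
table.lookup(1)
table.lookup(2)
table.lookup(1)   # zone 1 becomes most recently used
table.lookup(3)   # table is full, so zone 2 is evicted
```

Evicting a zone discards its access history, so a zone that becomes hot again starts over from all-Init.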

12 Features of AMPM Prefetcher
- Pattern-matching-based prefetching
  - Map-based history
  - Optimization-friendly prefetching
- Parallel pattern matching
  - Searches candidates effectively
  - Complexity-effective implementation

13 Configuration for DPC Competition
- AMPM prefetcher
  - Fully associative, 52 maps, 256 states per map
- Adaptive stream prefetcher [Hur+ 2006]
  - 16 histograms, 8 stream length
- MSHR configuration
  - 16 entries for demand requests (default)
  - 32 entries for prefetch requests (additional)

14 Budget Count
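The budget table itself did not survive in the transcript, but the state storage implied by the configuration slide can be worked out directly. The sketch below counts only the 2-bit states of the access maps; tags, LRU bits, the adaptive stream prefetcher, and the MSHRs are left out because the transcript gives no numbers for them, so this is a lower bound and not the total budget.

```python
NUM_MAPS = 52          # "52 maps" on the configuration slide
STATES_PER_MAP = 256   # "256 states / map" on the configuration slide
BITS_PER_STATE = 2     # "2-bit state map"

state_bits = NUM_MAPS * STATES_PER_MAP * BITS_PER_STATE
state_bytes = state_bits // 8

print(f"access map state storage: {state_bits} bits ({state_bytes} bytes)")
```

That is 26,624 bits, or 3,328 bytes, for the per-line states alone.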

15 Methodology
- Simulation environment
  - DPC framework
  - Skips the first 4000M instructions and evaluates the following 100M instructions
- Benchmark
  - SPEC CPU2006 benchmark suite
  - Compile options: "-O3 -fomit-frame-pointer -funroll-all-loops"

16 IPC Measurement
- Improves performance by 53%
- Improves performance in all benchmarks

17 L2 Cache Miss Count
- Reduces L2 cache misses by 76%

18 Related Works
- Sequence-based prefetching
  - Sequential Prefetch [Smith+ 1978]
  - Stride Prefetching Table [Fu+ 1992]
  - Markov Predictor [Joseph+ 1997]
  - Global History Buffer [Nesbit+ 2004]
- Adaptive prefetching
  - AC/DC [Nesbit+ 2004]
  - Feedback Directed Prefetch [Srinath+ 2007]
  - Focus Prefetching [Manikantan+ 2008]

19 Conclusion
- Access Map Pattern Matching Prefetch
  - Order-free prefetching
    - Optimization-friendly prefetching
  - Parallel pattern matching
    - Complexity-effective implementation
- The optimized AMPM prefetcher achieves good performance
  - Improves IPC by 53%
  - Reduces L2 cache misses by 76%

20 Q & A
[Figure: map of prior prefetching work. Category labels: Spatial, Sequence-Base (Order Sensitive), Adaptive, Hybrid, Software, HW/SW Integrate, Commercial Processors.
Works shown: Sequential (Smith), Buffer Block (Gindele 1977), Stride Prefetch (Fu), RPT (Chen), Markov Prefetch (Joseph), GHB (Nesbit), Tag Correlation (Hu), Spatial Pat. (Chen), Locality Detect (Johnson+, 1998), SMS (Somogyi 2006), Adaptive Seq. (Dahlgren), Adaptive Stream (Hur), AC/DC (Nesbit), FDP (Srinath), Feedback based (Honjo 2009), Hybrid (Hsu), Software Support (Mowry), HW/SW Integrate (Gornish+ 1994), AMPM Prefetch (Ishii).
Commercial processors shown: SuperSPARC, R10000, PA7200, Power4, Pentium 4.]

