1
December 5, 2001, MICRO-34, Austin, Texas
Cool-Cache for Hot Multimedia
Osman S. Unsal, Raksit Ashok, Israel Koren, C. Mani Krishna, Csaba Andras Moritz
Department of Electrical and Computer Engineering, University of Massachusetts, Amherst
2
Power Density
Source: Fred Pollack, Intel, Micro-32
3
Cool-* Project
A compiler-enabled power-aware architecture.
4
CPU Power Dissipation by Block
Concentrate on L1 data cache
Sources: IEEE Journal of SSC, Nov. 96; Proceedings of ISSCC 94; Cool Chips, Micro-32, 99
5
Cool-Cache Philosophy
Speculatively employ static information to simplify memory accesses
Leverage multimedia-sensitive compile-time partitioning of memory accesses
6
Conventional / Cool-Cache
Diagram labels: Data; Static and dynamic; Tag; Dynamic; SRAM; Buffer
Conventional: Non-adaptive; Tags; Single access mechanism
Cool-Cache: Statically Speculative; No Tags; Multiple access mechanism
7
Cool-Cache Framework
Minibuffer Scratchpad
– Scalars in media applications have low memory footprint, high access frequency
– Partition scalars from non-scalars
Hotlines
– Non-scalar locations in cache can be speculatively predicted
– Simplify memory accesses
8
Cool-Cache Architecture
12
Hotline Approach
for (i = 0; i < 100; i++) {
    a[i] = a[i+1];  /* both can be mapped to the same hotline */
    *p++ = b[i];    /* to separate hotlines without alias analysis */
}
Based on:
Type analysis
Control-flow and loop-structure analysis
Alias analysis
A fully predictable compile-time approach would require loop transformations to align accesses to cache line boundaries and would be limited in scope to simple loops.
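To make the speculation on this slide concrete, here is a minimal software model in C. It is only a sketch under stated assumptions, not the hardware described in the talk: the `hotline_t` type, the function names, and the 32-byte line size are all hypothetical, since the slides give neither a line size nor the register layout. The model keeps the cache line of the previous access in one register, counts an access as a hit when it falls in that same line, and retrains the register otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the slides do not state the cache line size. */
#define LINE_BYTES 32u

/* One hotline register: remembers the cache line touched by the
   previous access that was mapped to this register. */
typedef struct {
    bool      valid;
    uintptr_t line;    /* byte address divided by LINE_BYTES */
    unsigned  hits;    /* accesses for which the prediction held */
    unsigned  misses;  /* mispredictions, which retrain the register */
} hotline_t;

/* Send one access through hotline h; returns true if the prediction held. */
bool hotline_access(hotline_t *h, uintptr_t addr)
{
    assert(h != NULL);
    uintptr_t line = addr / LINE_BYTES;
    if (h->valid && h->line == line) {
        h->hits++;
        return true;
    }
    /* Misprediction: a real design would fall back to an ordinary
       cache access here; this model only retrains the register. */
    h->valid = true;
    h->line = line;
    h->misses++;
    return false;
}

/* Replay the address stream of a[i] = a[i+1] for i in [0, n), with
   4-byte elements starting at byte address base, through one hotline.
   Returns the hit count accumulated in h. */
unsigned replay_copy_loop(uintptr_t base, unsigned n, hotline_t *h)
{
    for (unsigned i = 0; i < n; i++) {
        hotline_access(h, base + 4u * (i + 1)); /* load  a[i+1] */
        hotline_access(h, base + 4u * i);       /* store a[i]   */
    }
    return h->hits;
}
```

With a zero-initialized register, `base = 0`, and `n = 100`, the model reports 163 hits and 37 misses out of 200 accesses: one cold miss plus three mispredictions at each of the twelve line boundaries the loop crosses. Giving the load and the store separate registers would remove the ping-pong at those boundaries, which illustrates why the granularity of the mapping matters.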
13
Hotlines Advantages
Speculative prediction does not require static correctness
Granularity of speculation is compiler controllable
Hotlines do not increase code size
14
Cool-Cache Compiler
Diagram labels: High-Level Analysis; Alias Analysis; Hotlines Analysis; Cool-Cache Specific Code Generation; Footprint Analysis; Annotations; High-Level Optimizations
15
Benchmarks
Benchmark   Description
ADPCM       Adaptive differential pulse code modulation audio coding
EPIC        Image compression coder based on wavelet decomposition
G721        Voice compression coder based on G.711, 721, 723 standards
GSM         Rate speech transcoding coder based on the GSM standard
JPEG        A lossy image compression coder
MESA        OpenGL clone: using Mipmap quadrilateral texture mapping
MPEG        Lossy motion video compression decoder
PEGWIT      Public key encryption coder generates a public key
RASTA       Speech recognition front-end processing
16
Experimental Setup
General Parameters   1 GHz, 0.35 μm, 2.5 V
Issue                In-order, single
L1 D-Cache           64K, 2-way
Minibuffer           1K
L1 I-Cache           32K, 2-way
L2 Cache             None
Main memory          100 cycles
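For readers who want to relate these cache sizes to a set count, the arithmetic can be sketched in C. The slide gives the capacity and associativity but not the line size, so the 32-byte line used in the usage note is purely an assumption, as is reading "64K" and "32K" as kilobytes; the function names are hypothetical.

```c
#include <assert.h>

/* Number of sets in a set-associative cache:
   capacity / (associativity * line size). */
unsigned num_sets(unsigned size_bytes, unsigned ways, unsigned line_bytes)
{
    assert(ways != 0 && line_bytes != 0);
    return size_bytes / (ways * line_bytes);
}

/* Base-2 logarithm of a power of two, e.g. to count index or offset bits. */
unsigned log2_exact(unsigned x)
{
    assert(x != 0 && (x & (x - 1)) == 0); /* x must be a power of two */
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}
```

Under the assumed 32-byte line, `num_sets(64u * 1024u, 2, 32)` gives 1024 sets for the L1 D-cache (10 index bits, 5 offset bits), and `num_sets(32u * 1024u, 2, 32)` gives 512 sets for the L1 I-cache. A different line size changes these figures proportionally.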
17
Minibuffer Footprint
Application   Size
Adpcm         0
Epic          203
G721 Enc.     32
Gsm Enc.      146
Jpeg Enc.     83
Mpeg Enc.     604
Pegwit        16
Rasta         152
PGP           358
Mesa          770
Scalar memory requirements are low!
Application   32 reg.   16 reg.
Epic          32.0      62.4
G721          4.5       38.8
Gsm           2.3       37.2
Jpeg          1.1       46.5
Rasta         16.0      36.0
Percentage of scalars in total memory accesses is high!
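The claim that scalar memory requirements are low can be checked against the 1K minibuffer from the experimental setup with a few lines of C. This is a sketch under two assumptions that the slides do not confirm: that the Size column is in bytes and that "1K" means 1024 bytes. The type and function names are hypothetical; the footprint values are copied from the slide.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: "1K" minibuffer means 1024 bytes. */
#define MINIBUFFER_BYTES 1024u

typedef struct {
    const char *app;
    unsigned    footprint; /* Size column from the slide; unit assumed to be bytes */
} footprint_t;

const footprint_t FOOTPRINTS[] = {
    {"Adpcm", 0},     {"Epic", 203},     {"G721 Enc.", 32},
    {"Gsm Enc.", 146}, {"Jpeg Enc.", 83}, {"Mpeg Enc.", 604},
    {"Pegwit", 16},   {"Rasta", 152},    {"PGP", 358},
    {"Mesa", 770},
};
const size_t NUM_FOOTPRINTS = sizeof FOOTPRINTS / sizeof FOOTPRINTS[0];

/* Largest footprint in the table (0 for an empty table). */
unsigned max_footprint(const footprint_t *t, size_t n)
{
    unsigned max = 0;
    for (size_t i = 0; i < n; i++) {
        if (t[i].footprint > max) {
            max = t[i].footprint;
        }
    }
    return max;
}

/* True if every application's scalars fit in a buffer of the given capacity. */
bool all_fit(const footprint_t *t, size_t n, unsigned capacity)
{
    return max_footprint(t, n) <= capacity;
}
```

Under those assumptions, the largest footprint is Mesa at 770, so `all_fit(FOOTPRINTS, NUM_FOOTPRINTS, MINIBUFFER_BYTES)` holds for all ten applications, while a 512-byte buffer would be too small for Mpeg Enc. and Mesa.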
18
Impact of Minibuffer
19
Minibuffer Energy Savings
20
Hotlines Hit Rate
21
Cool-Cache Relative Runtime
22
Cool-Cache Energy Savings (32 Registers)
23
Cool-Cache Energy Savings (16 Registers)
24
Conclusion
Cool-Cache: a compiler-enabled, power-aware data cache
Static speculative approach is powerful