Another Performance Evaluation of Memory Hierarchy in Embedded Systems

1 Another Performance Evaluation of Memory Hierarchy in Embedded Systems
Nelson Barnes CPE 631 04/14/03

2 Outline
Introduction
Related Work
Problem Statement
Proposed Solutions
Experimental Setup
Experimental Results
Conclusions

3 Introduction
Why is cache design so important in embedded systems?

4 Cache Design Parameters
Cache organization: unified vs. split (instruction + data) caches
Cache size
Cache block (line) size
Block placement policy: direct-mapped, fully-associative, set-associative (address decomposition sketched below)
Block replacement policy: random, Least-Recently Used (LRU), round-robin, pseudo-LRU, OPT (optimal)
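To make the placement policies above concrete, here is a minimal sketch of how a set-associative cache splits an address into tag, set index, and block offset. The capacity, block size, and associativity constants are hypothetical placeholders, not the configurations evaluated later in the talk.

```c
/* Illustrative address decomposition for a set-associative cache.
 * The parameter values below are hypothetical examples.            */
#include <stdio.h>

#define CACHE_SIZE  (8 * 1024)   /* total capacity in bytes          */
#define BLOCK_SIZE  32           /* block (line) size in bytes       */
#define ASSOC       4            /* ways per set; 1 = direct-mapped  */

#define NUM_SETS    (CACHE_SIZE / (BLOCK_SIZE * ASSOC))

int main(void)
{
    unsigned addr = 0x1234ABCDu;                       /* example address   */

    unsigned offset = addr % BLOCK_SIZE;               /* byte within block */
    unsigned index  = (addr / BLOCK_SIZE) % NUM_SETS;  /* which set         */
    unsigned tag    = addr / (BLOCK_SIZE * NUM_SETS);  /* identifies block  */

    printf("set %u, tag 0x%x, offset %u (%d-way, %d sets)\n",
           index, tag, offset, ASSOC, NUM_SETS);
    return 0;
}
```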

5 Related Work
MiBench vs. NetBench

6 Problem Statement
Comprehensive performance evaluation of cache design issues in embedded systems:
Split versus unified cache
Cache placement and size
Cache block size
Block replacement policy
Performance metrics (both sketched below):
Static measure: the number of cache misses per 1K instructions executed, measured at the end of application execution
Dynamic measure: the number of cache misses per 1K instructions executed, measured every 100K instructions executed
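As a rough illustration of these two metrics, the sketch below accumulates misses per 1K instructions over the whole run (static measure) and also per 100K-instruction window (dynamic measure). The counter names are hypothetical, not the simulator's own variables.

```c
/* Sketch of the static and dynamic miss metrics (hypothetical names). */
#include <stdio.h>

#define WINDOW 100000ULL   /* sampling interval for the dynamic measure */

static unsigned long long insn_count, miss_count;   /* whole run   */
static unsigned long long win_insn, win_miss;       /* per window  */

void on_instruction(int was_cache_miss)
{
    insn_count++; win_insn++;
    if (was_cache_miss) { miss_count++; win_miss++; }

    if (win_insn == WINDOW) {                        /* dynamic measure */
        printf("window MPKI = %.2f\n", 1000.0 * win_miss / win_insn);
        win_insn = win_miss = 0;
    }
}

void on_end_of_run(void)
{                                                    /* static measure  */
    printf("overall MPKI = %.2f\n", 1000.0 * miss_count / insn_count);
}
```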

7 Proposed Solution
Why use NetBench?

8 Experimental Setup
ARM version of the SimpleScalar toolset: sim-cache, sim-cheetah
NetBench applications include:
Micro-level programs: CRC – checksum calculation (illustrative routine below); TL – table lookup
IP-level programs: Route – IPv4 routing; DRR – deficit round robin
Application-level programs: DH – public-key encryption/decryption; MD5 – message digest algorithm (secure signature)
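For a sense of what a micro-level kernel such as CRC looks like, here is a generic bit-wise CRC-32 routine. It is an illustrative stand-in, not the NetBench source.

```c
/* Generic bit-wise CRC-32 (reflected polynomial 0xEDB88320).
 * Illustrative of a checksum kernel; not the NetBench CRC code. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

uint32_t crc32_bitwise(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int b = 0; b < 8; b++)     /* process one bit at a time */
            crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
    }
    return ~crc;
}

int main(void)
{
    const uint8_t msg[] = "123456789";
    printf("crc32 = 0x%08x\n", crc32_bitwise(msg, sizeof(msg) - 1));
    return 0;
}
```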

9 Experimental Setup
Cache memory setup:
Split first-level instruction and data caches
Unified first-level cache
Cache parameters:
Cache size: ranging from 0.5KB to 32KB
Cache associativity: direct-mapped, 2-way, 4-way, and 8-way set-associative
Cache replacement policies: FIFO, Random, LRU, pLRUt, pLRUm, and Optimal (tree pseudo-LRU sketched below)
Cache block size: 32B, 64B
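pLRUt and pLRUm commonly denote tree-based and MRU-bit pseudo-LRU approximations of true LRU. Below is a minimal sketch of tree-based pseudo-LRU (pLRUt) for a single 4-way set; the bit convention is one of several possible and is chosen only for illustration, not taken from the simulator.

```c
/* Minimal sketch of tree-based pseudo-LRU (pLRUt) for one 4-way set.
 * Three bits steer victim selection; on an access, the bits along the
 * accessed way's path are set to point away from it.                 */
#include <stdio.h>

static unsigned char plru;  /* bit0 = root, bit1 = pair {0,1}, bit2 = pair {2,3} */

static int plru_victim(void)
{
    if (!(plru & 1))                        /* root points at left pair  */
        return (plru & 2) ? 1 : 0;
    else                                    /* root points at right pair */
        return (plru & 4) ? 3 : 2;
}

static void plru_touch(int way)             /* record an access to 'way' */
{
    if (way < 2) {                          /* accessed left pair        */
        plru |= 1;                          /* next victim: right pair   */
        plru = (way == 0) ? (plru | 2) : (plru & ~2u);
    } else {                                /* accessed right pair       */
        plru &= ~1u;                        /* next victim: left pair    */
        plru = (way == 2) ? (plru | 4) : (plru & ~4u);
    }
}

int main(void)
{
    int refs[] = {0, 1, 2, 3};              /* toy access sequence       */
    for (int i = 0; i < 4; i++) plru_touch(refs[i]);
    printf("pseudo-LRU victim: way %d\n", plru_victim());
    return 0;
}
```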

10 Experimental Setup (cont’d)
[Block diagram: ARM core with split first-level caches (L1I$ for instructions, L1D$ for data) versus ARM core with a unified first-level cache (L1U$) holding both instructions and data]

11 MiBench Experimental Results

12 Data Cache Misses

13 Instruction Cache Misses

14 Unified Cache Misses

15 Dynamic Behavior

16 Dynamic Behavior

17 Replacement Policies

18 Experimental Results
NetBench Discussion

19 Conclusions
Split caches outperform an equivalent unified cache for relatively small direct-mapped caches
A unified cache almost always outperforms split caches when the caches are set-associative

20 Conclusions
Increasing cache associativity (up to 8-way) reduces the number of cache misses; this is more beneficial for data and unified caches than for instruction caches
Pseudo-LRU techniques perform as well as LRU for data caches
Random performs best for instruction caches
There is a relatively significant gap between the optimal replacement policy and the best non-optimal policy

