Computer Organization Rabie A. Ramadan Lecture 9. Cache Mapping Schemes.


1 Computer Organization Rabie A. Ramadan Lecture 9

2 Cache Mapping Schemes

3 Cache memory is smaller than the main memory, so only a few blocks can be loaded into the cache at a time. The cache does not use the same addresses as the main memory. Which block in the cache corresponds to which block in the memory? The processor uses the Memory Management Unit (MMU) to convert the requested memory address to a cache address.

4 Direct Mapping: Assigns cache mappings using a modular approach, j = i mod n, where j is the cache block number, i is the memory block number, and n is the number of cache blocks. (Figure: Memory and Cache.)

5 Example: Given M memory blocks to be mapped to 10 cache blocks, show the direct mapping scheme. How do you know which block is currently in the cache?
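
A minimal sketch of the j = i mod n rule for this example. The value M = 30 and the helper names `direct_map` and `mapping_table` are assumptions chosen for illustration; the slides leave M open. The sketch lists which memory blocks compete for each of the 10 cache blocks, which is why extra information (the quotient i // n) is needed to tell which one is currently resident.

```python
from collections import defaultdict

def direct_map(memory_block, num_cache_blocks):
    """Return the cache block j = i mod n for memory block i."""
    return memory_block % num_cache_blocks

def mapping_table(num_memory_blocks, num_cache_blocks):
    """Group memory blocks by the cache block they map to."""
    table = defaultdict(list)
    for i in range(num_memory_blocks):
        table[direct_map(i, num_cache_blocks)].append(i)
    return dict(table)

# M = 30 is an assumed value; n = 10 comes from the example.
table = mapping_table(30, 10)
for j in sorted(table):
    print(f"cache block {j}: memory blocks {table[j]}")
```

Memory blocks 0, 10, and 20 all land in cache block 0, so the cache must record which of them it currently holds.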

6 Direct Mapping (Cont.): Bits in the main memory address are divided into three fields. Word: identifies the specific word in the block. Block: identifies a unique block in the cache. Tag: identifies which block from the main memory is currently in the cache.

7 Example: Consider the case of a main memory consisting of 4K blocks, a cache memory consisting of 128 blocks, and a block size of 16 words. Show the direct mapping and the main memory address format.
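
The field widths for this example can be worked out as below. This is a sketch under the assumption that all sizes are powers of two and that the main memory address is the block number followed by the word offset; the function names are illustrative only.

```python
def log2_exact(x):
    """Return log2(x), requiring x to be a power of two."""
    if x <= 0 or x & (x - 1):
        raise ValueError(f"{x} is not a power of two")
    return x.bit_length() - 1

def direct_mapping_fields(memory_blocks, cache_blocks, words_per_block):
    """Return the Tag, Block, and Word field widths in bits."""
    word = log2_exact(words_per_block)          # selects a word in the block
    block = log2_exact(cache_blocks)            # selects a cache block
    tag = log2_exact(memory_blocks) - block     # remaining block-number bits
    return {"tag": tag, "block": block, "word": word}

fields = direct_mapping_fields(4 * 1024, 128, 16)
print(fields)                 # {'tag': 5, 'block': 7, 'word': 4}
print(sum(fields.values()))   # 16 address bits in total
```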

8 Example (Cont.)

9 Direct Mapping. Advantages: easy to implement; does not require any search technique to find a block in the cache; replacement is straightforward. Disadvantages: many blocks in the main memory (MM) are mapped to the same cache block while other cache blocks may remain empty, which gives poor cache utilization.

10 Group Activity: Consider the case of a main memory consisting of 4K blocks, a cache memory consisting of 8 blocks, and a block size of 4 words. Show the direct mapping and the main memory address format.

11 Given the following direct mapping chart, what is the cache and memory location required by the following addresses: 311263 4202

12 Fully Associative Mapping: Allows any memory block to be placed anywhere in the cache. A search technique is required to find the block number in the tag field.

13 Example: We have a main memory with 2^14 words, a cache with 16 blocks, and a block size of 8 words. How many bits do the tag and word fields need? The word field requires 3 bits (8 = 2^3 words per block). The tag field requires 11 bits, since 2^14 / 8 = 2048 = 2^11 blocks.

14 Which MM block is in the cache? Naive method: a tag field is associated with each cache block; compare the tag field of the address with each tag entry in the cache to check for a hit. CAM (Content Addressable Memory): words can be fetched on the basis of their contents, rather than on the basis of their addresses or locations. For example: find the addresses of all "Smiths" in Dallas.
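
A rough sketch of the naive tag comparison, using the 3-bit word field and 16 cache blocks from the previous example. The list-of-tags cache representation and the function names are assumptions for illustration. The loop checks entries one after another, whereas a CAM would compare all stored tags in parallel.

```python
WORD_BITS = 3           # 8 words per block
NUM_CACHE_BLOCKS = 16   # cache size in blocks

def split_address(address):
    """Split a 14-bit word address into (tag, word offset)."""
    tag = address >> WORD_BITS
    word = address & ((1 << WORD_BITS) - 1)
    return tag, word

def lookup(cache_tags, address):
    """Return (cache block index, word offset) on a hit, None on a miss."""
    tag, word = split_address(address)
    for index, stored_tag in enumerate(cache_tags):
        if stored_tag == tag:       # tag match means the block is resident
            return index, word
    return None

cache_tags = [None] * NUM_CACHE_BLOCKS
cache_tags[5] = 2047                # memory block 2047 sits in cache block 5
print(lookup(cache_tags, 0x3FFF))   # (5, 7): hit, last word of block 2047
print(lookup(cache_tags, 0))        # None: memory block 0 is not cached
```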

15 Fully Associative Mapping. Advantages: flexibility; good use of the cache. Disadvantages: requires a tag search (an associative search, i.e., a parallel search); might require an extra hardware unit to do the search; requires a replacement strategy if the cache is full; expensive.

16 N-way Set Associative Mapping: Combines direct and fully associative mapping. The cache is divided into sets of blocks, and all sets are the same size. Main memory blocks are mapped to a specific set based on s = i mod S, where s is the set to which block i is mapped, i is the memory block number, and S is the total number of sets. An incoming block may be assigned to any cache block inside its set.
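
The set selection rule s = i mod S might be sketched as follows. The cache size (8 blocks), the associativity (2-way), and the assumption that the blocks of a set sit next to each other in the cache are all illustrative choices, not taken from the slides.

```python
def set_index(memory_block, num_sets):
    """Return the set s = i mod S for memory block i."""
    return memory_block % num_sets

def candidate_blocks(memory_block, num_cache_blocks, ways):
    """Return the set number and the cache blocks the memory block may use.

    Assumes set s occupies cache blocks s*ways .. s*ways + ways - 1.
    """
    num_sets = num_cache_blocks // ways
    s = set_index(memory_block, num_sets)
    return s, list(range(s * ways, (s + 1) * ways))

# Assumed example: 8 cache blocks, 2-way, so S = 4 sets.
print(candidate_blocks(13, 8, 2))   # (1, [2, 3])
```

Memory block 13 maps to set 1 and may be placed in either of the two cache blocks of that set.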

17 N-way Set Associative Mapping. Tag field: uniquely identifies the targeted block within the determined set. Word field: identifies the element (word) within the block that is requested by the processor. Set field: identifies the set.

18 N-way Set Associative Mapping

19 Group Activity: Compute the three parameters (Word, Set, and Tag) for a memory system having the following specification: the size of the main memory is 4K blocks, the size of the cache is 128 blocks, and the block size is 16 words. Assume that the system uses 4-way set-associative mapping.

20 Answer
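
One way to check the three parameters for the specification above is the short computation below. It is a sketch that assumes all sizes are powers of two; `field_widths` is an illustrative name. Setting `ways=1` reduces it to direct mapping, and setting `ways` equal to the number of cache blocks reduces it to fully associative mapping.

```python
def field_widths(memory_blocks, cache_blocks, words_per_block, ways):
    """Return Tag, Set, and Word field widths in bits for N-way mapping."""
    def log2_exact(x):
        if x <= 0 or x & (x - 1):
            raise ValueError(f"{x} is not a power of two")
        return x.bit_length() - 1

    num_sets = cache_blocks // ways             # S = cache blocks / N
    word = log2_exact(words_per_block)
    set_bits = log2_exact(num_sets)
    tag = log2_exact(memory_blocks) - set_bits  # remaining block-number bits
    return {"tag": tag, "set": set_bits, "word": word}

print(field_widths(4 * 1024, 128, 16, ways=4))
# {'tag': 7, 'set': 5, 'word': 4}
```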

21 N-way Set Associative Mapping. Advantage: moderate cache utilization. Disadvantage: still needs a tag search inside the set.

22 If the cache is full and a block needs to be replaced, which one should be replaced?

23 Cache Replacement Policies. Random: simple; requires a random generator. First In First Out (FIFO): replace the block that has been in the cache the longest; requires keeping track of the block lifetime. Least Recently Used (LRU): replace the block that has gone unused for the longest time; requires keeping track of the block history.

24 Cache Replacement Policies (Cont.). Most Recently Used (MRU): replace the block that was used most recently; requires keeping track of the block history. Optimal: hypothetical; must know the future.
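
To make the difference between FIFO and LRU concrete, here is a small sketch that counts misses for a fully associative cache. The reference string and the capacity of 3 blocks are made-up values for illustration, and `count_misses` is a hypothetical helper.

```python
from collections import OrderedDict

def count_misses(references, capacity, policy):
    """Count misses for policy 'FIFO' or 'LRU' on a fully associative cache."""
    cache = OrderedDict()   # front of the dict = next victim
    misses = 0
    for block in references:
        if block in cache:
            if policy == "LRU":
                cache.move_to_end(block)   # a hit refreshes recency
            continue                        # FIFO ignores hits
        misses += 1
        if len(cache) >= capacity:
            cache.popitem(last=False)       # evict the block at the front
        cache[block] = True
    return misses

refs = [1, 2, 3, 1, 4, 1, 5]   # assumed reference string
print(count_misses(refs, 3, "FIFO"))   # 6
print(count_misses(refs, 3, "LRU"))    # 5
```

FIFO evicts block 1 even though it was just used, so the next reference to it misses; LRU keeps it because the hit moved it to the back of the eviction order.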

25 Example: Consider the case of a 4×8 two-dimensional array of numbers, A. Assume that each number in the array occupies one word and that the array elements are stored in column-major order in the main memory from location 1000 to location 1031. The cache consists of eight blocks, each consisting of just two words. Assume also that, whenever needed, the LRU replacement policy is used. We would like to examine the changes in the cache if the direct mapping technique is used as the following sequence of requests for the array elements is made by the processor:

26 Array elements in the main memory

27

28 Conclusion: 16 cache misses, no hits, 12 replacements, and only 4 cache blocks used.
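
The direct-mapped behaviour in this example can be reproduced with a short simulation. The request sequence is not preserved in this transcript, so the sketch assumes the processor reads row 0 of A and then row 1 (a hypothetical sequence); it also assumes that memory block numbers are the word address divided by the block size. Under those assumptions the counts match the conclusion above.

```python
ROWS, COLS = 4, 8        # A is a 4x8 array
BASE = 1000              # first memory location of A
WORDS_PER_BLOCK = 2
CACHE_BLOCKS = 8

def address(i, j):
    """Address of A[i][j] when A is stored in column-major order."""
    return BASE + j * ROWS + i

def simulate(requests):
    """Return (hits, misses, replacements, cache blocks used)."""
    cache = [None] * CACHE_BLOCKS
    hits = misses = replacements = 0
    for i, j in requests:
        memory_block = address(i, j) // WORDS_PER_BLOCK   # assumed numbering
        slot = memory_block % CACHE_BLOCKS                # direct mapping
        if cache[slot] == memory_block:
            hits += 1
        else:
            misses += 1
            if cache[slot] is not None:
                replacements += 1
            cache[slot] = memory_block
    used = sum(entry is not None for entry in cache)
    return hits, misses, replacements, used

# Assumed request sequence: all of row 0, then all of row 1.
requests = [(0, j) for j in range(COLS)] + [(1, j) for j in range(COLS)]
print(simulate(requests))   # (0, 16, 12, 4)
```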

29 Group Activity: Do the same for fully associative and 4-way set associative mapping.

30 Pipelining

31 Basic Idea: Assembly line. Divide the execution of a task among a number of stages. A task is divided into subtasks to be executed in sequence. Performance improves compared to sequential execution.

32 Pipeline (Figure: a job forms a stream of tasks 1, 2, ..., m that passes through pipeline stages 1, 2, ..., n.)

33 5 Tasks on a 4-stage pipeline (Figure: Tasks 1 to 5 plotted against time units 1 to 8.)
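
A space-time diagram like the one on this slide can be generated with the sketch below. It assumes every stage takes exactly one time unit and that there are no stalls; the labels T1..T5 and S1..S4 are illustrative.

```python
def space_time(num_tasks, num_stages):
    """Return a grid where grid[stage][time] names the task being processed."""
    total_time = num_stages + num_tasks - 1
    grid = [["--"] * total_time for _ in range(num_stages)]
    for task in range(num_tasks):
        for stage in range(num_stages):
            # Task k enters stage s one time unit after it entered stage s-1.
            grid[stage][task + stage] = f"T{task + 1}"
    return grid

grid = space_time(5, 4)
for stage, row in enumerate(grid, start=1):
    print(f"S{stage}: " + " ".join(row))
```

The last task leaves the last stage in time unit 8, which matches the time axis of the figure.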

34 Speedup: For a stream of m tasks passing through a pipeline of n stages, each stage taking time t: T(Seq) = n * m * t; T(Pipe) = n * t + (m - 1) * t; Speedup = T(Seq) / T(Pipe) = (n * m) / (n + m - 1).

35 Efficiency: With the same T(Seq) = n * m * t and T(Pipe) = n * t + (m - 1) * t: Efficiency = Speedup / n = m / (n + m - 1).

36 Throughput: With the same T(Seq) = n * m * t and T(Pipe) = n * t + (m - 1) * t: Throughput = number of tasks executed per unit of time = m / ((n + m - 1) * t).
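
The three formulas can be evaluated together. The sketch below applies them to the earlier case of 5 tasks on a 4-stage pipeline, with t = 1 time unit as an assumed stage time; `pipeline_metrics` is an illustrative name.

```python
def pipeline_metrics(n, m, t):
    """Return timing metrics for m tasks on an n-stage pipeline, stage time t."""
    t_seq = n * m * t                 # every task runs all n stages alone
    t_pipe = n * t + (m - 1) * t      # fill the pipe, then one task per t
    speedup = t_seq / t_pipe
    return {
        "t_seq": t_seq,
        "t_pipe": t_pipe,
        "speedup": speedup,
        "efficiency": speedup / n,
        "throughput": m / t_pipe,
    }

metrics = pipeline_metrics(n=4, m=5, t=1)
print(metrics)
# {'t_seq': 20, 't_pipe': 8, 'speedup': 2.5, 'efficiency': 0.625, 'throughput': 0.625}
```

As m grows with n fixed, the speedup approaches n and the efficiency approaches 1.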

37 Instruction Pipeline: Pipeline stall. Some of the stages might need more time to perform their function. E.g., the pipeline stalls after I2. This is called a "bubble" or "pipeline hazard".

38 Example: Show a Gantt chart for 10 instructions that enter a four-stage pipeline (IF, ID, IE, and IS). Assume that the fetching of I5 depends on the results of the I4 evaluation.

39 Answer

40 Example Delay due to branch

41 Pipeline and Instruction Dependency. Instruction dependency: the operation performed by a stage depends on the operation(s) performed by other stage(s). E.g., conditional branch: instruction I4 cannot be executed until the branch condition in I3 is evaluated and stored. The branch takes 3 units of time.

42 Pipeline and Data Dependency. Data dependency: a source operand of instruction Ii depends on the results of executing a preceding instruction Ij, i > j. E.g., Ii cannot be fetched unless the results of Ij are saved.

43 Example:
ADD R1, R2, R3 ; R3 ← R1 + R2 (Ii)
SL R3, R3 ; R3 ← SL(R3), where SL is Shift Left (Ii+1)
SUB R5, R6, R4 ; R4 ← R5 – R6 (Ii+2)
Assume that we have five stages in the pipeline: IF (Instruction Fetch), ID (Instruction Decode), OF (Operand Fetch), IE (Instruction Execute), and IS (Instruction Store). Show a Gantt chart for this code.

44 Answer: R3 needs to be written in both Ii and Ii+1. Therefore, the problem is a write-after-write data dependency.

45 Stalls Due to Data Dependency: write after write; read after write; write after read; read after read (does not cause a stall).
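
The dependency types above can be detected mechanically from each instruction's source and destination registers. The sketch below encodes the ADD, SL, and SUB instructions from the earlier example as dictionaries (an assumed representation) and compares them pairwise. For the ADD/SL pair it reports the write-after-write dependency named in the answer and also a read-after-write dependency, since SL reads R3.

```python
def dependencies(earlier, later):
    """List the data dependencies of `later` on `earlier`."""
    kinds = []
    if earlier["dest"] in later["srcs"]:
        kinds.append("read after write")    # later reads what earlier writes
    if earlier["dest"] == later["dest"]:
        kinds.append("write after write")   # both write the same register
    if later["dest"] in earlier["srcs"]:
        kinds.append("write after read")    # later overwrites an earlier source
    return kinds

add = {"srcs": {"R1", "R2"}, "dest": "R3"}   # R3 <- R1 + R2
sl = {"srcs": {"R3"}, "dest": "R3"}          # R3 <- SL(R3)
sub = {"srcs": {"R5", "R6"}, "dest": "R4"}   # R4 <- R5 - R6

print(dependencies(add, sl))    # ['read after write', 'write after write']
print(dependencies(sl, sub))    # []
print(dependencies(add, sub))   # []
```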

46 Read after write

47 Example: Consider the execution of the following sequence of instructions on a five-stage pipeline consisting of IF, ID, OF, IE, and IS. It is required to show the succession of these instructions in the pipeline. Show all types of data dependency. Show the speedup and efficiency.

48 Answer

