Cache Physical Implementation. Panayiotis Charalambous, Xi Research Group.


1 Cache Physical Implementation. Panayiotis Charalambous, Xi Research Group

2 Contents: Cache Logical View; Physical View; Case Study – Power 4 L2 Cache

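3 Logical Cache Structure
n-way associative cache: n elements per set, 2^m sets.
Address (32 bits): Tag (32 - m - k bits) | Index (m bits) | Offset (k bits)
[Figure: the index selects a set; each way's stored tag is compared (=) with the address tag; the comparator outputs are ORed to produce Hit and select Data.]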
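The address split on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `split_address` and the example values of m and k are assumptions, not taken from the slides.

```python
def split_address(addr: int, m: int, k: int, width: int = 32):
    """Split an address into (tag, index, offset).

    k low bits are the block offset, the next m bits are the set index,
    and the remaining width - m - k bits are the tag.
    """
    addr &= (1 << width) - 1              # keep only `width` address bits
    offset = addr & ((1 << k) - 1)        # lowest k bits
    index = (addr >> k) & ((1 << m) - 1)  # next m bits
    tag = addr >> (k + m)                 # everything above index and offset
    return tag, index, offset


# Hypothetical example: 2^8 sets (m = 8) and 32-byte blocks (k = 5).
tag, index, offset = split_address(0x12345678, m=8, k=5)
```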

4 Cache Structure

5 Cache Access
Steps:
1. Decode the address.
2. Enable the word line.
3. Raise the bit lines to high.
4. Get the tag value from the tag array.
5. Check for a tag match.
6. Select the data output.
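The access steps can be mirrored by a small behavioral model. This is a sketch under stated assumptions, not the circuit itself: the class `SetAssocCache` and its methods are hypothetical, the tag and data arrays are plain Python lists, and the bit-line precharge of step 3 has no software counterpart, so it appears only as a comment.

```python
class SetAssocCache:
    """Behavioral model of an n-way set-associative cache read (illustrative only)."""

    def __init__(self, num_sets: int, ways: int, block_size: int):
        self.num_sets, self.ways, self.block_size = num_sets, ways, block_size
        self.tags = [[None] * ways for _ in range(num_sets)]  # tag array
        self.data = [[None] * ways for _ in range(num_sets)]  # data array

    def _decode(self, addr: int):
        # Step 1: decode the address into tag, set index, and byte offset.
        block_addr, offset = divmod(addr, self.block_size)
        tag, index = divmod(block_addr, self.num_sets)
        return tag, index, offset

    def fill(self, addr: int, block: bytes, way: int = 0):
        tag, index, _ = self._decode(addr)
        self.tags[index][way] = tag
        self.data[index][way] = block

    def read(self, addr: int):
        tag, index, offset = self._decode(addr)
        # Steps 2-3: the word line for `index` is enabled and the bit lines
        # are precharged; here that is simply selecting one row of each array.
        row_tags, row_data = self.tags[index], self.data[index]
        # Steps 4-5: read every way's stored tag and compare with the address tag.
        for way in range(self.ways):
            if row_tags[way] == tag:
                # Step 6: select the data output (the matching way, then the byte).
                return row_data[way][offset]
        return None  # miss


cache = SetAssocCache(num_sets=4, ways=2, block_size=16)
cache.fill(0x120, bytes(range(16)), way=1)
```

A read of `0x123` then hits in way 1 of set 2, while an address with the same index but a different tag returns `None`.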

6 Conventional Cache Organization Memory Cell

7 Read: Set bit and bit´ high. If the value in the cell is 1, then bit´ is discharged. If the value is 0, then bit is discharged. Write: Set bit´ to 0. This forces a 1 into the latch.
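The read and write behavior can be mimicked with a purely logical model. The functions `read_cell` and `write_cell` are hypothetical names, and the write-0 case (driving bit low instead of bit´) is an assumption made by symmetry, since the slide only describes writing a 1.

```python
def read_cell(stored: int) -> int:
    """Model a read: precharge both bit lines, let the cell discharge one."""
    bit, bit_bar = 1, 1        # set bit and bit' high
    if stored == 1:
        bit_bar = 0            # a stored 1 discharges bit'
    else:
        bit = 0                # a stored 0 discharges bit
    # A sense amplifier would resolve the difference between the two lines.
    return 1 if bit > bit_bar else 0


def write_cell(value: int) -> int:
    """Model a write and return the value the latch ends up holding."""
    bit, bit_bar = 1, 1
    if value == 1:
        bit_bar = 0            # as on the slide: bit' = 0 forces a 1
    else:
        bit = 0                # assumed symmetric case for writing a 0
    return 1 if bit_bar == 0 else 0
```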

8 Decoder with Driver

9 Various Components
Comparator: XOR logic.
Multiplexer: a hierarchy driven by the offset. First select the block (from the output driver), then the word, then the byte.
Output driver: at most one of the input bits is high. If the input is 0, the output is high-impedance.
[Figure: driver inputs I0, I1, …, I7]
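A logical sketch of these three components follows. All names are hypothetical, the 4-byte word size is an assumption, and `None` stands in for a high-impedance output.

```python
def tags_match(stored_tag: int, addr_tag: int) -> bool:
    """XOR comparator: the tags match when every XORed bit is 0."""
    return (stored_tag ^ addr_tag) == 0


def output_driver(values, enables):
    """Drive the one enabled value; with no enable high the output floats."""
    assert sum(enables) <= 1, "at most one input may be high"
    for value, enable in zip(values, enables):
        if enable:
            return value
    return None  # models the high-impedance state


def select_byte(block: bytes, offset: int, word_size: int = 4) -> int:
    """Multiplexer hierarchy: block -> word -> byte."""
    word_index, byte_index = divmod(offset, word_size)
    word = block[word_index * word_size:(word_index + 1) * word_size]
    return word[byte_index]
```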

10 Banking
Idea: support multiple cache accesses.
Solutions: use multiporting on the bit cells (the cost is high), or divide the cache into independent banks.

11 Cache Search
Steps:
1. Find the bank (bank index).
2. Find the set in the bank (index).
3. Check whether the data is valid and in the cache (tag match).
4. If all is OK, return the data (block and byte offset); otherwise check the lower-level memory.
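The search steps can be sketched as a lookup over a list of banks. This is an illustration under assumptions: the helper names are hypothetical, each set is a dict from tag to block (an entry being present stands for the valid bit), and the bank index is assumed to sit directly above the set index in the address.

```python
def make_banks(num_banks: int, sets_per_bank: int):
    """Build empty banks: banks[bank][set] is a dict mapping tag -> block."""
    return [[{} for _ in range(sets_per_bank)] for _ in range(num_banks)]


def banked_lookup(banks, addr: int, bank_bits: int, set_bits: int, offset_bits: int):
    offset = addr & ((1 << offset_bits) - 1)
    set_index = (addr >> offset_bits) & ((1 << set_bits) - 1)
    bank_index = (addr >> (offset_bits + set_bits)) & ((1 << bank_bits) - 1)
    tag = addr >> (offset_bits + set_bits + bank_bits)

    bank = banks[bank_index]      # 1. find the bank
    cache_set = bank[set_index]   # 2. find the set in the bank
    block = cache_set.get(tag)    # 3. valid entry with a matching tag?
    if block is None:
        return None               # 4. miss: go to the lower-level memory
    return block[offset]          # 4. hit: return the addressed byte


banks = make_banks(num_banks=8, sets_per_bank=64)
banks[2][9][5] = bytes(range(128))                     # tag 5 in bank 2, set 9
hit_addr = (5 << 16) | (2 << 13) | (9 << 7) | 3        # offset 3 within the block
miss_addr = (6 << 16) | (2 << 13) | (9 << 7) | 3       # same set, different tag
```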

12 Case Study - Power 4
Dual-core, 64-bit processors
L1 D-cache (per processor): 32 KB, 2-way associative, 128-byte line
L1 I-cache (per processor): 64 KB, direct mapped, 128-byte line (4 sectors x 32 B)
L2 cache: ~1.5 MB, 8-way set associative, 128-byte line

13 Power4 Floorplan

14 Power4 L2 Logical View
Cache split into 3 parts, 0.5 MB each
Controlled by 4 coherency processors
One 64 B store queue per processor

15 Power4 L2U: ~512 KB, 8 banks, 128 B block size, 8-way associative
[Figure labels: word lines, bit lines, decoders, address bus]

16 Power4: L2 Cache Block
Size C = 512 KB = 2^19 B
Block size = 128 B = 2^7 B
8-way associative
8 banks per cache block
Therefore:
Set size is 2^3 * 2^7 B = 2^10 B
Sets in cache: 2^19 / 2^10 = 2^9 sets
Sets per bank: 2^9 / 2^3 = 2^6 sets
Address (64-bit): tag (remaining bits) | index (9 bits: bank index, 3 bits, and set index, 6 bits) | offset (7 bits)
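The arithmetic on this slide can be checked with a short script. The function `l2_geometry` is a hypothetical helper, and the tag width is an assumption: it is simply what remains of a 64-bit address after the index and offset fields, which the slide does not state explicitly.

```python
def l2_geometry(cache_bytes=512 * 1024, block_bytes=128, ways=8, banks=8, addr_bits=64):
    """Derive set counts and address field widths for one L2 cache block."""
    set_bytes = ways * block_bytes           # 2^3 * 2^7 = 2^10 B
    num_sets = cache_bytes // set_bytes      # 2^19 / 2^10 = 2^9
    sets_per_bank = num_sets // banks        # 2^9 / 2^3 = 2^6
    # For a power of two, bit_length() - 1 is its base-2 logarithm.
    offset_bits = block_bytes.bit_length() - 1
    bank_bits = banks.bit_length() - 1
    set_bits = sets_per_bank.bit_length() - 1
    tag_bits = addr_bits - offset_bits - bank_bits - set_bits  # assumed: the rest
    return {
        "set_bytes": set_bytes,
        "num_sets": num_sets,
        "sets_per_bank": sets_per_bank,
        "offset_bits": offset_bits,
        "bank_bits": bank_bits,
        "set_bits": set_bits,
        "tag_bits": tag_bits,
    }


geometry = l2_geometry()
```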

17 Power4: CACTI Results (CACTI version 3.2)
Commands: "cacti 524288 128 8 0.8um 8" and "cacti 524288 128 8 0.8um 16"

Parameter                            8 subbanks    16 subbanks
Number of Subbanks                   8             16
Total Cache Size                     524288        524288
Size in bytes of Subbank             65536         32768
Number of sets                       64            32
Associativity                        8             8
Block Size (bytes)                   128           128
Read/Write Ports                     1             1
Read Ports                           0             0
Write Ports                          0             0
Technology Size                      0.80um        0.80um
Vdd                                  4.5V          4.5V
Access Time (ns)                     12.3473       12.434
Cycle Time (wave pipelined) (ns)     4.97337       4.85483
Total Power all Banks (nJ)           418.337       793.381
Total Power Without Routing (nJ)     198.563       341.424
Total Routing Power (nJ)             219.774       451.957
Maximum Bank Power (nJ)              63.5175       63.1382
Best Ndwl (L1)                       16            16
Best Ndbl (L1)                       1             1
Best Nspd (L1)                       1             1
Best Ntwl (L1)                       1             1
Best Ntbl (L1)                       1             1
Best Ntspd (L1)                      1             1
Nor inputs (data)                    2             2
Nor inputs (tag)                     2             2
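A few of the reported figures can be cross-checked against the input parameters. The sketch below uses only numbers copied from the two runs; the helper name `subbank_geometry` is hypothetical, and the interpretation of "Number of sets" as sets per subbank is an inference from the fact that the arithmetic matches.

```python
def subbank_geometry(total_bytes: int, block_bytes: int, assoc: int, subbanks: int):
    """Return (bytes per subbank, sets per subbank)."""
    subbank_bytes = total_bytes // subbanks
    sets = subbank_bytes // (block_bytes * assoc)
    return subbank_bytes, sets


geom_8 = subbank_geometry(524288, 128, 8, 8)
geom_16 = subbank_geometry(524288, 128, 8, 16)

# Reported values (nJ and ns) copied from the two CACTI runs.
total_8, routing_8, access_8 = 418.337, 219.774, 12.3473
total_16, routing_16, access_16 = 793.381, 451.957, 12.434

power_ratio = total_16 / total_8          # growth in "Total Power all Banks"
access_ratio = access_16 / access_8       # change in access time
routing_share_8 = routing_8 / total_8     # fraction attributed to routing
routing_share_16 = routing_16 / total_16
```

With these inputs, going from 8 to 16 subbanks nearly doubles the reported total while the access time changes by under one percent, and routing accounts for more than half of the total in both runs.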

18 CACTI
Data Array
Ndwl: word line split factor
Ndbl: bit line split factor
Nspd: number of sets mapped to a single word line (sectors)
Tag Array
Ntwl: word line split factor
Ntbl: bit line split factor
Ntspd: number of sets mapped to a single word line (sectors)
Increasing Ndbl, Nspd, Ntbl, or Ntspd requires more sense amplifiers.
Increasing Ndwl or Ntwl increases the number of word line drivers.
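The effect of the data-array parameters can be illustrated with a rough count. This is a sketch under assumptions, not CACTI's own code: it uses the subarray dimensions commonly given for the CACTI model (rows = C / (B·A·Ndbl·Nspd), columns = 8·B·A·Nspd / Ndwl), applies them to one subbank, and assumes one sense amplifier per column and one word line driver per row in every subarray. The function name `data_array_counts` is hypothetical.

```python
def data_array_counts(c_bytes: int, block: int, assoc: int, ndwl: int, ndbl: int, nspd: int):
    """Estimate subarray shape and peripheral counts for a data array."""
    rows = c_bytes // (block * assoc * ndbl * nspd)   # rows per subarray
    cols = 8 * block * assoc * nspd // ndwl           # columns (bits) per subarray
    subarrays = ndwl * ndbl
    return {
        "rows": rows,
        "cols": cols,
        "subarrays": subarrays,
        "sense_amps": cols * subarrays,          # assumed: one per column
        "wordline_drivers": rows * subarrays,    # assumed: one per row
    }


# One 64 KB subbank with the "best" parameters reported for the 8-subbank run.
base = data_array_counts(65536, 128, 8, ndwl=16, ndbl=1, nspd=1)
more_ndbl = data_array_counts(65536, 128, 8, ndwl=16, ndbl=2, nspd=1)
more_ndwl = data_array_counts(65536, 128, 8, ndwl=32, ndbl=1, nspd=1)
```

Under these assumptions, doubling Ndbl doubles the sense amplifier count and leaves the word line drivers unchanged, while doubling Ndwl does the opposite, which agrees with the two statements on the slide.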

19 Thank You

