Tornado: Maximizing Locality and Concurrency in a SMMP OS.


1 Tornado: Maximizing Locality and Concurrency in a SMMP OS

2 Contents
- Types of Locality
- Locality: A closer look
- Requirements for locality
- Design Basics of Tornado
- Test Results
- Conclusion

3 Types of Locality*
- Temporal locality: “The concept that a resource that is referenced at one point in time will be referenced again sometime in the near future.”
- Spatial locality: “The concept that the likelihood of referencing a resource is higher if a resource near it has been referenced.”
- Sequential locality: “The concept that memory is accessed sequentially.”
*Source: Wikipedia
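The definitions above can be made concrete with a small traversal example. This is a minimal sketch, not taken from the slides: the function names `sum_row_major` and `sum_col_major` are hypothetical, and the matrix is assumed to be stored row by row in one contiguous vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a rows x cols matrix stored row by row in one contiguous vector.
// Consecutive iterations touch adjacent elements, so the access pattern
// has sequential (and therefore spatial) locality.
long long sum_row_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same result, but consecutive iterations are `cols` elements apart,
// so neighbouring accesses are far from each other in memory.
long long sum_col_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

Both functions return the same sum; only the order of memory accesses differs. On typical cached hardware the row-by-row version tends to be faster for large matrices, though the size of the gap depends on the machine.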

4-7 Locality: A closer look, Read-only case
Code:
    bool x = true;
    while (x) {
        // Do some work, reading but not writing x…
    }
Diagram (built up over four animation frames): Processor #1, Processor #2, their caches and memory, with copies of x in the caches and in memory.
Notes:
- No accesses on the bus once x is cached, because all accesses are reads that are satisfied in the local caches and no invalidations are sent.

8-13 Locality: A closer look, Read/Write case
Code run on both Processor #1 and Processor #2:
    bool x = true;
    while (x) {
        x = false;
        // Do other work…
    }
Diagram (built up over six animation frames): each processor holds a copy of x in its cache, and x also resides in memory. A write by one processor invalidates the block containing x in the other cache. The other processor's next access then goes through these steps:
1. Cache miss
2. Read request
3. Data
4. Write
5. Invalidate block containing x
Notes:
- x becomes a bottleneck: the valid copy keeps jumping from one cache to the other.
- Every write access causes an invalidation.
- Almost every read causes a read miss and a bus read.
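The cost of write sharing described above can be contrasted with a version that keeps writes private. The following is a rough sketch under stated assumptions, not code from the presentation: `count_shared` and `count_private` are hypothetical names, and `std::thread` stands in for the two processors.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Every increment writes the same shared variable. Under an
// invalidation-based coherence protocol, the line holding `shared`
// keeps moving between the caches of the participating processors.
long count_shared(int num_threads, long iters) {
    assert(num_threads > 0);
    std::atomic<long> shared{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t)
        workers.emplace_back([&shared, iters] {
            for (long i = 0; i < iters; ++i)
                shared.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& w : workers) w.join();
    return shared.load();
}

// Each thread accumulates in a private local and publishes once at the
// end, so the shared variable is written only num_threads times in total.
long count_private(int num_threads, long iters) {
    assert(num_threads > 0);
    std::atomic<long> shared{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t)
        workers.emplace_back([&shared, iters] {
            long local = 0;
            for (long i = 0; i < iters; ++i) ++local;
            shared.fetch_add(local, std::memory_order_relaxed);
        });
    for (auto& w : workers) w.join();
    return shared.load();
}
```

Both functions compute the same total. The difference is how often the shared cache line is written, which is the quantity the notes on this slide identify as the bottleneck.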

14-16 Locality: A closer look, Effect of Cache Line Length
Code on one processor:
    bool x = true;
    while (x) {
        x = false;
        // Do other work…
    }
Code on the other processor:
    bool y = true;
    while (y) {
        y = false;
        // Do other work…
    }
Diagram (built up over three animation frames): x and y sit at different addresses in memory (the diagram labels 0x0 and 0x4) but in the same block, so each processor's cache holds the block containing both x and y.
Notes:
- x and y have different addresses but fall into the same cache line (block).
- Reads do not cause any problem.
- Remember: invalidations are per cache line (block), not per word. A write to either variable invalidates the block containing x and y, so the behavior is much the same as in the read/write case on a single variable.
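One common way to avoid the effect shown on these slides is to give each variable its own cache line. The sketch below is illustrative only: the struct names `Packed` and `Padded` are hypothetical, and the 64-byte line size is an assumption that varies between processors.

```cpp
#include <cstddef>

// Assumed cache-line size; 64 bytes is common but hardware-specific.
constexpr std::size_t kCacheLine = 64;

// x and y are adjacent, so they will usually land in the same cache line
// and a write to one invalidates the cached copy of the other.
struct Packed {
    bool x;
    bool y;
};

// Each flag is aligned to the start of its own cache line, so a write
// to x does not invalidate the line holding y.
struct Padded {
    alignas(kCacheLine) bool x;
    alignas(kCacheLine) bool y;
};

// Compile-time checks of the layouts described above.
static_assert(offsetof(Packed, y) - offsetof(Packed, x) < kCacheLine,
              "Packed: x and y are less than one line apart");
static_assert(offsetof(Padded, y) - offsetof(Padded, x) >= kCacheLine,
              "Padded: x and y are at least one line apart");
```

The padded layout trades memory for isolation: `Padded` occupies two full lines where `Packed` needs only a couple of bytes.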

17 Requirements for Locality
- Spatial and temporal locality
- Minimize read/write and write sharing
- Minimize false sharing
- Minimize the distance between the accessing processor and the target memory module

18 Design Basics for Tornado
- Individual resources are individual objects
- Clustered objects
- Protected procedure calls (PPC)
- Semi-automatic garbage collection

19 Clustered Objects
- A clustered object appears as a single object from the outside but is internally split into representatives (reps)
- Each rep handles requests from one or more processors
- Advantages of this design include high locality of data and a diminished need for locking

20 Clustered Objects (cont.)
- Per-processor translation tables
- Partitioned global translation table
- Default “miss” handlers
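The clustered-object idea can be sketched with a counter that keeps one rep per processor. This is a heavily simplified illustration and not Tornado's implementation: `ClusteredCounter`, `Rep`, `rep_for` and `rep_count` are hypothetical names, processors are identified by an explicit index, a plain vector stands in for the per-processor translation tables, and the lazy creation in `rep_for` only loosely imitates a default "miss" handler.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical illustration only; names are not taken from Tornado.
class ClusteredCounter {
public:
    explicit ClusteredCounter(std::size_t num_cpus) : table_(num_cpus) {}

    // Fast path: a request from `cpu` is served by that processor's own rep.
    void inc(std::size_t cpu) { rep_for(cpu).count += 1; }

    // Slow path: a global read has to visit every rep that exists.
    long value() const {
        long total = 0;
        for (const auto& rep : table_)
            if (rep) total += rep->count;
        return total;
    }

    // Number of reps created so far (reps are created lazily).
    std::size_t rep_count() const {
        std::size_t n = 0;
        for (const auto& rep : table_)
            if (rep) ++n;
        return n;
    }

private:
    // One rep per processor; the 64-byte alignment is an assumed line size.
    struct alignas(64) Rep { long count = 0; };

    // Stand-in for a "miss" handler: the first access from a processor
    // finds an empty table entry and installs a rep there.
    Rep& rep_for(std::size_t cpu) {
        assert(cpu < table_.size());
        auto& slot = table_[cpu];
        if (!slot) slot = std::make_unique<Rep>();
        return *slot;
    }

    std::vector<std::unique_ptr<Rep>> table_;  // one entry per processor
};
```

From the outside the counter looks like one object, but increments from different processors touch different reps, so the common operation involves no shared writes. The cost moves to `value()`, which must combine all reps.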

21 Protected Procedure Calls
- Microkernel: relies on servers to carry out part of the OS's job
- As many server threads as there are clients
- A request is handled on the same processor on which it was issued
*Image source: Wikipedia

22 Garbage Collection
- Semi-automatic
- Distinguishes between temporary and persistent references to objects
- Eliminates the need for two locks to guarantee an object's existence, and eliminates locking altogether for read-only data

23 Test Results: Effect of rep Count (1)

24 Test Results: Effect of rep Count (2)

25 Test Results: Effect of Cache Associativity

26 Test Results: Tornado vs. Commercial OSes

27 Conclusion
- Tornado performs much better than many commercial OSes
- The concept of clustered objects gives it a significant advantage
- High locality of data
- Diminished need for locking
- Higher degree of sharing, concurrency and modularity

