Synchronization and Scheduling in Multiprocessor Operating Systems


1 Synchronization and Scheduling in Multiprocessor Operating Systems
Chapter 10 Synchronization and Scheduling in Multiprocessor Operating Systems Copyright © 2008

2 Introduction
- Architecture of Multiprocessor Systems
- Issues in Multiprocessor Operating Systems
- Kernel Structure
- Process Synchronization
- Process Scheduling
- Case Studies
Operating Systems, by Dhananjay Dhamdhere. Copyright © 2008

3 Architecture of Multiprocessor Systems
- Performance of a uniprocessor system depends on CPU performance, memory performance, and caches
- Further improvements in system performance can be obtained only by using multiple CPUs

4 Architecture of Multiprocessor Systems (continued)

5 Architecture of Multiprocessor Systems (continued)
- Use of a cache coherence protocol is crucial to ensure that caches do not contain stale copies of data
  - Snooping-based approach (bus interconnection)
    - CPU snoops on the bus to analyze traffic and eliminate stale copies
    - Write-invalidate variant: at a write, CPU updates memory and invalidates copies in other caches
  - Directory-based approach
    - Directory contains information about copies in caches
- TLB coherence is an analogous problem
  - Solution: TLB shootdown action
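
The write-invalidate variant can be illustrated with a small model. This is only a sketch under simplifying assumptions: the `Bus` and `Cache` classes are hypothetical names, each cache line holds a single value, and real protocols track per-line states that are omitted here.

```python
class Bus:
    """Shared bus that every cache snoops on (illustrative model only)."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def broadcast_invalidate(self, writer, addr):
        # Every cache other than the writer drops its copy of addr, if any.
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch the value from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.memory[addr] = value  # CPU updates memory at the write
        self.bus.broadcast_invalidate(self, addr)  # other copies become invalid


bus = Bus()
c0, c1 = Cache(bus), Cache(bus)
bus.memory[0x10] = 1
c0.read(0x10)
c1.read(0x10)      # both caches now hold the value 1
c0.write(0x10, 2)  # c1's copy is invalidated
print(c1.read(0x10))  # c1 misses and re-fetches, so this prints 2
```

Without the invalidation step, `c1` would keep returning the stale value 1 after the write by `c0`.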

6 Architecture of Multiprocessor Systems (continued)
- Multiprocessor systems are classified according to the manner of associating CPUs and memory units
  - Uniform memory access (UMA) architecture
    - Previously called tightly coupled multiprocessor
    - Also called symmetrical multiprocessor (SMP)
    - Examples: Balance system and VAX 8800
  - Nonuniform memory access (NUMA) architecture
    - Examples: HP AlphaServer and IBM NUMA-Q
  - No-remote-memory-access (NORMA) architecture
    - Example: Hypercube system by Intel
    - Is actually a distributed system (discussed later)

7 Architecture of Multiprocessor Systems (continued)

9 SMP Architecture
- SMP systems commonly use a bus or a cross-bar switch as the interconnection network
- With a bus, only one conversation can be in progress at any time; other conversations are delayed
  - CPUs face unpredictable delays in accessing memory
  - Bus may become a bottleneck
- With a cross-bar switch, performance is better
  - Switch delays are also more predictable
- Cache coherence protocols add to the delays
- SMP systems do not scale well beyond a small number of CPUs

10 NUMA Architecture
- Actual performance of a NUMA system depends on the nonlocal memory accesses made by processes
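
One simple way to see this dependence is a weighted-average model of memory access time. The model and the latency figures below are assumptions chosen for illustration, not measurements of any real NUMA system.

```python
def effective_access_time(local_ns, remote_ns, remote_fraction):
    """Average memory access time when a given fraction of accesses is nonlocal.

    Simplified model: ignores caches, contention, and interconnect topology.
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must lie between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical latencies: 100 ns for local memory, 400 ns for nonlocal memory.
for fraction in (0.0, 0.25, 0.5):
    print(fraction, effective_access_time(100, 400, fraction))
```

With these assumed numbers, a process that makes a quarter of its accesses to nonlocal memory sees an average of 175 ns, which is 75 percent slower than a purely local process.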

11 Issues in Multiprocessor Operating Systems
- Synchronization and scheduling algorithms should be scalable, so that system performance does not degrade with a growth in its size

12 Kernel Structure
- Kernel of a multiprocessor OS (SMP architecture) is called an SMP kernel
  - Any CPU can execute code in the kernel, and many CPUs could do so in parallel
- Based on two fundamental provisions:
  - Kernel is reentrant
  - CPUs coordinate their activities through synchronization and interprocessor interrupts

13 Kernel Structure: Synchronization
- Mutex locks for synchronization
  - Locking can be coarse-grained or fine-grained
    - Tradeoffs: simplicity vs. loss of parallelism
    - Deadlocks are an issue in fine-grained locking
- Parallelism can be ensured without substantial locking overhead:
  - Use of separate locks for kernel functionalities
  - Partitioning of the data structures of a kernel functionality
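
Partitioning a data structure so that each partition has its own lock can be sketched as follows. The `PartitionedTable` class is a hypothetical user-level illustration, not kernel code; each operation takes exactly one lock, so the deadlock risk of fine-grained locking does not arise in this particular design.

```python
import threading


class PartitionedTable:
    """Counter table split into partitions, each guarded by its own lock."""

    def __init__(self, num_partitions=8):
        self._locks = [threading.Lock() for _ in range(num_partitions)]
        self._parts = [{} for _ in range(num_partitions)]

    def _index(self, key):
        return hash(key) % len(self._parts)

    def increment(self, key):
        i = self._index(key)
        with self._locks[i]:  # only this partition is locked
            self._parts[i][key] = self._parts[i].get(key, 0) + 1

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._parts[i].get(key, 0)


def worker(table, keys, rounds):
    for _ in range(rounds):
        for key in keys:
            table.increment(key)


table = PartitionedTable()
threads = [
    threading.Thread(target=worker, args=(table, range(16), 100))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(table.get(0))  # 4 threads x 100 rounds = 400
```

A single coarse-grained lock over the whole table would be simpler, but threads updating unrelated keys would then wait on each other.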

14 Kernel Structure: Heap Management
- Parallelism in heap management can be provided by maintaining several free lists
  - Locking is unnecessary if each CPU has its own free list
    - However, this would degrade performance because allocation decisions would not be optimal
  - Alternative: separate free lists to hold free memory areas of different sizes
    - CPU locks an appropriate free list
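
The alternative of keeping a separate free list per size, each with its own lock, might look like the sketch below. The size classes, the `SizeClassHeap` name, and the use of integer offsets as addresses are all assumptions made for illustration.

```python
import threading


class SizeClassHeap:
    """One free list per block size; a request locks only the list it needs."""

    SIZE_CLASSES = (32, 64, 128, 256)  # hypothetical block sizes in bytes

    def __init__(self, blocks_per_class=4):
        self._lists = {}
        next_addr = 0
        for size in self.SIZE_CLASSES:
            blocks = []
            for _ in range(blocks_per_class):
                blocks.append(next_addr)
                next_addr += size
            self._lists[size] = (threading.Lock(), blocks)

    def _size_class(self, nbytes):
        for size in self.SIZE_CLASSES:
            if nbytes <= size:
                return size
        raise ValueError("request too large for any size class")

    def allocate(self, nbytes):
        """Return (size_class, address), or None if that free list is empty."""
        size = self._size_class(nbytes)
        lock, blocks = self._lists[size]
        with lock:  # requests for other sizes are not blocked
            if not blocks:
                return None
            return size, blocks.pop()

    def free(self, size, addr):
        lock, blocks = self._lists[size]
        with lock:
            blocks.append(addr)


heap = SizeClassHeap()
block = heap.allocate(40)  # served from the 64-byte list
print(block[0])            # 64
heap.free(*block)
```

Two CPUs contend only when they request blocks of the same size class at the same time.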

15 Kernel Structure: Scheduling
- Scheduling suffers from heavy contention for the mutex locks L_rq and L_awt, because every CPU needs to set and release these locks while scheduling
- Alternative: partition processes into subsets and entrust each subset to a CPU for scheduling
  - Fast scheduling but suboptimal performance
- An SMP kernel provides graceful degradation
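
The partitioning alternative can be sketched with one ready queue per CPU. The `PartitionedScheduler` class and its modulo assignment rule are hypothetical; the point is only that a CPU never touches another CPU's queue, so no shared lock is needed, at the cost of possible idling.

```python
from collections import deque


class PartitionedScheduler:
    """Each CPU schedules only from its own subset of processes."""

    def __init__(self, num_cpus):
        self.ready = [deque() for _ in range(num_cpus)]

    def admit(self, pid):
        cpu = pid % len(self.ready)  # hypothetical static partitioning rule
        self.ready[cpu].append(pid)
        return cpu

    def pick_next(self, cpu):
        queue = self.ready[cpu]
        # A CPU with an empty queue idles even if other queues hold work.
        return queue.popleft() if queue else None


sched = PartitionedScheduler(num_cpus=2)
for pid in (0, 2, 4):
    sched.admit(pid)        # all three land in CPU 0's subset
print(sched.pick_next(0))   # 0
print(sched.pick_next(1))   # None: CPU 1 idles while CPU 0 still has work
```

The last line shows the suboptimal behavior: scheduling is fast because it is lock-free, but CPU 1 stays idle while processes wait on CPU 0.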

16 Kernel Structure: NUMA Kernel
- CPUs in NUMA systems have different memory access times for local and nonlocal memory
- Each node in a NUMA system has its own separate kernel
  - Exclusively schedules processes whose address spaces are in local memory of the node
- Concept can be generalized: an application region ensures good performance of an application. It has
  - A resource partition with one or more CPUs
  - An instance of the kernel

17 Process Synchronization

18 Process Synchronization (continued)
- Queued locks may not be scalable
- In NUMA, spin locks may lead to lock starvation
- Sleep locks may be preferred to spin locks if the memory or network traffic densities are high

19 Special Hardware for Process Synchronization
- The Sequent Balance system uses a special bus called system link and interface controller (SLIC) for synchronization
  - Special 64-bit register in each CPU in the system
  - Each bit implements a spin lock using SLIC
  - Spinning doesn't generate memory/network traffic
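
The idea of one spin lock per register bit can be modeled in software. The sketch below is only an analogy: `LockRegister` is a hypothetical class, the model is single-threaded, and the atomicity that the SLIC hardware would provide is simply assumed.

```python
class LockRegister:
    """Software model of a 64-bit register in which each bit is one lock."""

    WIDTH = 64

    def __init__(self):
        self.bits = 0

    def _mask(self, n):
        if not 0 <= n < self.WIDTH:
            raise ValueError("lock number out of range")
        return 1 << n

    def test_and_set(self, n):
        """Set bit n and return its previous value (assumed atomic in hardware)."""
        mask = self._mask(n)
        was_set = bool(self.bits & mask)
        self.bits |= mask
        return was_set

    def clear(self, n):
        self.bits &= ~self._mask(n)


reg = LockRegister()
print(reg.test_and_set(5))  # False: lock 5 was free and is now held
print(reg.test_and_set(5))  # True: a second caller would keep spinning
reg.clear(5)                # release lock 5
```

A waiting CPU would repeat `test_and_set` until it returns `False`; because the bit lives in a CPU register rather than in shared memory, that loop would not load the memory bus.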

20 A Scalable Software Scheme for Process Synchronization
- For NUMA and NORMA architectures
- Scalable performance
- Minimizes synchronization traffic to nonlocal memory units (NUMA) and over network (NORMA)
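
The slide does not spell out the scheme, so the following is a sketch of one well-known way to get this property, assumed here for illustration: each waiter watches a flag in its own local memory, and the releasing CPU performs a single write to hand the lock over. The `LocalFlagQueueLock` class is a deterministic, single-threaded model, not a working lock.

```python
from collections import deque


class LocalFlagQueueLock:
    """Model of a lock whose waiters each spin on their own local flag."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()  # (cpu, flag) pairs in arrival order
        self.handoff_writes = 0  # writes made into another CPU's local memory

    def acquire(self, cpu):
        """Return the caller's local flag; the caller proceeds once flag['go'] is True."""
        flag = {"go": False}
        if self.holder is None:
            self.holder = cpu
            flag["go"] = True
        else:
            self.waiters.append((cpu, flag))  # the waiter spins locally on flag
        return flag

    def release(self, cpu):
        if self.holder != cpu:
            raise RuntimeError("only the holder may release the lock")
        if self.waiters:
            next_cpu, flag = self.waiters.popleft()
            flag["go"] = True  # one nonlocal write per handoff
            self.handoff_writes += 1
            self.holder = next_cpu
        else:
            self.holder = None


lock = LocalFlagQueueLock()
f0 = lock.acquire(0)
f1 = lock.acquire(1)
f2 = lock.acquire(2)
lock.release(0)
print(f1["go"], f2["go"], lock.handoff_writes)  # True False 1
```

In this model the nonlocal traffic per release is one write regardless of how many CPUs are waiting, which is what makes the approach scale.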

21 Process Synchronization (continued)
- Scheduling-aware synchronization
  - Adaptive lock
    - A process waiting for this lock spins if the holder of the lock is scheduled to run in parallel
    - Otherwise, the process is preempted and queued as in a queued lock
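
The decision rule of an adaptive lock can be sketched as below. The `AdaptiveLock` class, its return values, and the `running_pids` argument (the set of processes currently executing on some CPU) are hypothetical names introduced for this model.

```python
from collections import deque


class AdaptiveLock:
    """Model of the spin-or-queue decision made by an adaptive lock."""

    def __init__(self):
        self.holder = None
        self.queue = deque()

    def try_acquire(self, pid, running_pids):
        if self.holder is None:
            self.holder = pid
            return "acquired"
        if self.holder in running_pids:
            return "spin"       # holder is running, so the lock should free soon
        self.queue.append(pid)  # holder is not running: spinning would be wasted
        return "queued"

    def release(self):
        """Pass the lock to the first queued process, if any, and return its pid."""
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder


lock = AdaptiveLock()
print(lock.try_acquire(1, running_pids={1, 2}))  # acquired
print(lock.try_acquire(2, running_pids={1, 2}))  # spin: holder 1 is on a CPU
print(lock.try_acquire(3, running_pids={3}))     # queued: holder 1 was preempted
```

The caller that receives `"spin"` would simply retry; the caller that receives `"queued"` gives up its CPU until `release` hands it the lock.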

22 Process Scheduling
- CPU scheduling decisions affect performance
  - How, when and where to schedule processes
- Affinity scheduling: schedule a process on a CPU where it has executed in the past
  - Good cache hit ratio
  - Interferes with load balancing across CPUs
- In an SMP kernel, CPUs can perform their own scheduling
  - Prevents kernel from becoming a bottleneck
  - Leads to scheduling anomalies
    - Correcting them requires shuffling of processes
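
The tension between affinity and load balancing can be made concrete with a small placement function. The function name, the load metric (number of ready processes per CPU), and the `max_imbalance` threshold are assumptions for this sketch.

```python
def choose_cpu(last_cpu, loads, max_imbalance=2):
    """Prefer the CPU a process last ran on unless it is too heavily loaded.

    loads[i] is the number of ready processes on CPU i (hypothetical metric).
    """
    least_loaded = min(range(len(loads)), key=loads.__getitem__)
    if last_cpu is None:
        return least_loaded  # no cache state worth preserving
    if loads[last_cpu] - loads[least_loaded] > max_imbalance:
        return least_loaded  # affinity is overridden to balance load
    return last_cpu          # keep affinity for a better cache hit ratio


print(choose_cpu(1, [3, 4, 3]))     # 1: the imbalance is small, so stay put
print(choose_cpu(1, [0, 5, 1]))     # 0: CPU 1 is overloaded, so migrate
print(choose_cpu(None, [2, 1, 3]))  # 1: a new process goes to the least loaded CPU
```

A larger `max_imbalance` favors cache reuse; a smaller one favors even load.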

23 Example: Process Shuffling in an SMP Kernel
- Process shuffling can be implemented by using the assigned workload table (AWT) and the interprocessor interrupt (IPI)
- However, it leads to high scheduling overhead
  - Effect is more pronounced in a system containing a large number of CPUs
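
A rough model of shuffling is sketched below. The layout of the AWT (a mapping from CPU to the priority of the process it is running, with larger numbers meaning higher priority) and the `send_ipi` stand-in are assumptions; the source does not give these details.

```python
def shuffle_target(awt, new_priority):
    """Return the CPU that should be interrupted for a newly ready process.

    awt maps each CPU to the priority of the process it currently runs.
    Returns None when no CPU runs a lower-priority process.
    """
    cpu = min(awt, key=awt.get)  # CPU running the lowest-priority process
    return cpu if awt[cpu] < new_priority else None


ipi_log = []


def send_ipi(cpu):
    ipi_log.append(cpu)  # stand-in for a real interprocessor interrupt


awt = {0: 5, 1: 2, 2: 7}
target = shuffle_target(awt, new_priority=6)
if target is not None:
    send_ipi(target)  # the interrupted CPU would then reschedule
print(target)         # 1
```

Even in this toy form, every newly ready process triggers a scan of the table, which hints at why the overhead grows with the number of CPUs.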

24 Process Scheduling (continued)
- Processes of an application should be scheduled on different CPUs at the same time if they use spin locks for synchronization
  - Called coscheduling or gang scheduling
- A different approach is required when processes exchange messages by using a blocking protocol
  - In some situations, special efforts should be made not to schedule such processes in the same time slice
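
Coscheduling can be pictured as filling a table of time slices by CPUs so that all processes of an application share one slice. The first-fit placement policy and the `gang_schedule` function are assumptions for this sketch.

```python
def gang_schedule(apps, num_cpus):
    """Place all processes of each application in one time slice.

    apps maps an application name to its number of processes. The result is a
    list of time slices; each slice lists, per CPU, the application that runs
    there or None if the CPU is idle in that slice.
    """
    slices = []
    for name, nprocs in apps.items():
        if nprocs > num_cpus:
            raise ValueError(f"{name} needs more CPUs than are available")
        for row in slices:
            if row.count(None) >= nprocs:
                break  # first existing slice with enough free CPUs
        else:
            row = [None] * num_cpus  # no slice fits, so open a new one
            slices.append(row)
        placed = 0
        for cpu in range(num_cpus):
            if row[cpu] is None and placed < nprocs:
                row[cpu] = name
                placed += 1
    return slices


for time_slice in gang_schedule({"A": 3, "B": 2, "C": 1}, num_cpus=4):
    print(time_slice)
# ['A', 'A', 'A', 'C']
# ['B', 'B', None, None]
```

Because the three processes of A always run together, a process of A that spins on a lock held by a sibling knows the sibling is making progress on another CPU.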

25 Case Studies
- Mach
- Linux
- SMP Support in Windows

26 Mach
- Mach OS implements scheduling hints
  - Thread issues hint to influence processor scheduling
  - For example, a hands-off hint to relinquish CPU in favor of a specific thread

27 Linux
- Multiprocessing support introduced in 2.0 kernel
  - Coarse-grained locking was employed
  - Granularity of locks was made finer in later releases
  - Kernel was still nonpreemptible until 2.6 kernel
- Kernel provides:
  - Spin locks for locking of data structures
  - Reader–writer spin locks
  - Sequence lock
  - Per-CPU data structures to reduce lock contention
- Other features: hard and soft affinity, load balancing
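
The idea behind a sequence lock can be modeled in a few lines: a writer makes the sequence number odd while it updates the data, and a reader retries if the number was odd or changed during its read. The `SeqLock` class below is a user-level model only; it is not the Linux kernel API, and it omits the memory barriers a real implementation needs.

```python
import threading


class SeqLock:
    """Model of a sequence lock protecting a pair of values."""

    def __init__(self, x, y):
        self._seq = 0
        self._x = x
        self._y = y
        self._writer_lock = threading.Lock()  # serializes writers only

    def write(self, x, y):
        with self._writer_lock:
            self._seq += 1  # odd: an update is in progress
            self._x = x
            self._y = y
            self._seq += 1  # even: the update is complete

    def read(self):
        while True:
            start = self._seq
            if start % 2:
                continue  # a writer is active, so try again
            x, y = self._x, self._y
            if self._seq == start:
                return x, y  # no writer intervened: the pair is consistent


point = SeqLock(1, 2)
point.write(3, 4)
print(point.read())  # (3, 4)
```

Readers never block writers and take no lock themselves, which suits data that is read often and updated rarely.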

28 SMP Support in Windows
- A hyperthreaded CPU is considered to be several logical processors
- Spin locks provide mutual exclusion over kernel data
  - A thread holding a spinlock is never preempted
  - Queued spinlock uses a scalable software implementation scheme
- Uses many free lists of memory for parallel access
- Process default processor affinity and thread processor affinity together define the thread affinity set
  - Thread affinity set defines hard affinity for a thread
  - Ideal processor defines soft affinity for a thread
- Uses both hard and soft affinity

29 Summary
- Multiprocessor OS exploits multiple CPUs in a computer to provide high throughput (system), computation speedup (application), and graceful degradation (of OS, when faults occur)
- Classification of multiprocessors
  - Uniform memory access (UMA) architecture, also called symmetrical multiprocessor (SMP)
  - Nonuniform memory access (NUMA) architecture
- OS efficiently schedules user processes in parallel
  - Issues: kernel structure and synchronization delays

30 Summary (continued)
- Multiprocessor OS algorithms must be scalable
- Use of special kinds of locks: spin locks and sleep locks
- Important scheduling concepts in multiprocessor OSs:
  - Affinity scheduling
  - Coscheduling
  - Process shuffling

