July-10-2002 Terry Jones, Integrated Computing & Communications Dept Fast-OS.


1 July-10-2002 Terry Jones, Integrated Computing & Communications Dept trj@llnl.gov Fast-OS

2 Terry Jones / 2002.07.10 July 2002 FastOS Fast-OS
Introduction
– Today’s landscape
– Directions
Problem Areas Ripe for Investigation
– Parallel Aware Scaling
– Parallel Aware Memory Management
– Metrics for evaluating system software
Why would anyone want to muck with AIX
– Bottom-up and Top-down Approaches
– Why AIX?
– How AIX?
Conclusion

4
Parallel applications need to span thousands of nodes
Architectures are adding more processor state
Applications are not “mission critical”
Both interrupts and busy-waiting are bad
Cache effects (processor affinity) cannot be ignored
Two modes:
– Capability mode (jobs are dedicated)
– Capacity mode (jobs may space-share machine)

5
Continue to move from a monolithic operating system that communicates via shared memory to a decentralized design that communicates via efficient messages
Small kernel & process-level managers
– Modularity
– Fault-tolerance
– Extensibility
Question: How much should system software offer in terms of features?
Answer: Everything required, and as much of what is desired as possible

7
Add “parallel awareness”
– CPU resource (local/global program context, scheduling)
– Memory resource (demand paging, address space extent)
– Metrics
– Other possibilities: Fault tolerance/Membership services
Re-visit where we insert boundaries (e.g. boundary between kernel and user-level code)

8
Spatial Scheduling
– Assign processes to nodes
– For example, batch schedulers & gang-schedulers
– Coarse-grain view of work to be done
Temporal Scheduling
– For example, native operating system scheduling
– Fine-grain view of work to be done (e.g. efficient pthread-level scheduling)
– Lacks the necessary global view
Coscheduling

9
Even on the most bare-bones operating systems, there can be more runnable processes than processors
Many parallel algorithms are extremely sensitive to serializations
A first-order goal is to maximize the overlap of competing (interfering) processes during a parallel application.
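
The sensitivity to serialization described on slides 8 and 9 can be illustrated with a toy model. The sketch below is not from the presentation: it assumes a bulk-synchronous job in which every node computes for `compute` time units per step and then waits at a barrier, and each node is interrupted by a competing process for `noise` time units in `bursts` of the steps. The function name `job_time` and all parameters are hypothetical. With uncoordinated (purely local, temporal) scheduling each node is interrupted in different steps; with coscheduling all nodes take the interruption in the same steps, so the interference overlaps.

```python
import random


def job_time(nodes, iterations, compute, noise, bursts, coscheduled, seed=0):
    """Wall time of a toy bulk-synchronous job (illustrative model only).

    Each step ends with a barrier, so a step takes as long as its slowest node.
    """
    rng = random.Random(seed)
    # Steps in which the competing process runs when all nodes are coordinated.
    shared = set(rng.sample(range(iterations), bursts))
    per_node = []
    for _ in range(nodes):
        if coscheduled:
            per_node.append(shared)
        else:
            # Each node's local scheduler picks its own steps independently.
            per_node.append(set(rng.sample(range(iterations), bursts)))

    total = 0.0
    for step in range(iterations):
        # The barrier makes every node wait for the slowest one in this step.
        total += max(
            compute + (noise if step in slots else 0.0) for slots in per_node
        )
    return total


if __name__ == "__main__":
    args = dict(nodes=64, iterations=100, compute=1.0, noise=0.5, bursts=5)
    print("uncoordinated:", job_time(coscheduled=False, **args))
    print("coscheduled:  ", job_time(coscheduled=True, **args))
```

In this model the coscheduled run pays for the interference only `bursts` times, while the uncoordinated run pays in every step where at least one node happens to be interrupted, which approaches every step as the node count grows.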

10
Provide as much “memory” as possible with as little pain as possible
Memory systems are becoming more complex
Improved mechanisms to counter false-sharing.

11
External storage (secondary & networked) will continue to exceed local memory
Memory requirements for certain simulations are almost unbounded
Removing constraints on memory is very desirable, but the cost of a page fault is too high to be hidden from an application
A default process-level manager provides page-cache management, as in Stanford DASH.

12
Thought to preclude, or make more difficult, OS-bypass communications
An application cannot know the amount of physical memory it has available
An application cannot efficiently control the contents of the physical memory allocated to it
An application cannot control the read-ahead, writeback and discarding of pages within its physical memory.
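
To make the point about application control concrete, here is a minimal sketch of a page cache managed at process level, where the application supplies the replacement policy. It is an illustration under stated assumptions, not the design referenced in the slides and not an AIX interface: the class `PageCache`, the `choose_victim` callback and the helper `count_faults` are all hypothetical names. The demo uses a cyclic scan over four pages with three frames, a pattern where a fixed LRU policy faults on every access while an application that knows its access pattern can pick a better victim.

```python
from collections import OrderedDict


class PageCache:
    """Toy process-level page cache with an application-chosen eviction policy."""

    def __init__(self, frames, load_page, choose_victim=None):
        self.frames = frames            # number of page frames the app knows it has
        self.load_page = load_page      # callback that fetches a page from storage
        # Default policy: evict the least recently used page (first key).
        self.choose_victim = choose_victim or (lambda pages: next(iter(pages)))
        self.pages = OrderedDict()      # page number -> contents, in LRU-to-MRU order
        self.faults = 0

    def read(self, page_no):
        if page_no in self.pages:
            self.pages.move_to_end(page_no)   # mark as most recently used
            return self.pages[page_no]
        self.faults += 1
        if len(self.pages) >= self.frames:
            del self.pages[self.choose_victim(self.pages)]
        self.pages[page_no] = self.load_page(page_no)
        return self.pages[page_no]

    def discard(self, page_no):
        """Let the application drop a page it knows it no longer needs."""
        self.pages.pop(page_no, None)


def count_faults(choose_victim=None):
    """Run a cyclic scan over 4 pages, 3 times, with only 3 frames."""
    cache = PageCache(3, load_page=lambda n: f"page-{n}", choose_victim=choose_victim)
    for _ in range(3):
        for page_no in range(4):
            cache.read(page_no)
    return cache.faults


if __name__ == "__main__":
    print("LRU faults:", count_faults())
    # Application-supplied policy: evict the most recently used page (last key).
    print("MRU faults:", count_faults(lambda pages: next(reversed(pages))))
```

The policy is passed in as a callback so that the cache mechanism stays generic while the decision about which page to give up rests with the application, which is the kind of control the slide says a conventional virtual memory system withholds.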

13
An aid for reaching agreement on what we want
A quantitative measure of different approaches
Compared to the scheduler work and the virtual memory work, this may be the most difficult

15
Bottom-up
– Start with a clean slate
– Add features as the need arises
– Settle on a reasonable boundary
Top-down
– Start with a full-featured implementation
– Remove the unnecessary cruft
– Settle on a reasonable boundary

16
AIX is ubiquitous in supercomputer centers
AIX already has extensive capabilities
– Not required to build everything before we try anything
AIX is mature (read: is not in radical change mode)
AIX scalability (32-way with AIX 5.x)

17
In close conjunction with IBM
– Expect successes to pay off in IBM products
Done in an operating-system-independent manner
– Findings apropos and available to other operating systems
Evaluated with real applications on very large machines

19
New needs arising from today’s parallel machines pose new challenges for system software
Among the key needs which emerge...
– Parallel aware scheduling
– Improved memory management
– Metrics for evaluating operating systems
These can be investigated from a bottom-up approach, or a top-down approach, or both
AIX is a reasonable choice for a top-down approach
This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48.

