Technology Drivers Traditional HPC application drivers – OS noise, resource monitoring and management, memory footprint – Complexity of resources to be.


1 Technology Drivers
Traditional HPC application drivers
– OS noise, resource monitoring and management, memory footprint
– Complexity of resources to be managed
New and evolving programming models
– Shifting emphasis from managing cycles to managing data
– Programming models require more access to resource management decisions
– Hybrid/Mixed programming models (composing applications)
Node and Memory structures
– On-node RAM, DRAM, Flash
– Stacked memory (performance implications for different access patterns)
– Explicit cache/hierarchy management
– On-node interconnect
– Heterogeneous cores
– On-node power management
Global structures
– Global address space
– Integration of collectives, especially synchronization
Resilience (soft errors and damaged cores)
HPC OS Sustainability
Increasing importance and complexity of resource management

2 Alternate R&D Strategies
Evolve an existing OS
– Linux, Plan 9, IBM CNK, Kitten
Start with an empty emacs buffer
Steal components from existing operating systems
Partitioning resources
– Independent management within a partition
– Composability
Collective/Global OS
– Global address space?
It’s time to define the winner

3 Research Agenda
HPC Community OS
– Define basic structure
– Individual groups work on components
Expose management of critical resources
Simulation to evaluate scalability of resource management strategies
Enable co-design of hardware to support resource management
Define and implement OS mechanisms that will enable global, autonomic runtime systems

4 Priority Research Direction: Community OS Framework for HPC Systems
Key challenges
1. Develop an OS framework specific to the needs of HPC
2. Open system architecture that exposes the management of critical resources
3. Empower developers of libraries and runtime systems

1. HPC applications have unique resource management needs (e.g., memory layout)
2. Anticipated rapid evolution/revolution in architectures and programming models
3. Limited ability to innovate in existing commodity operating systems
4. Sustainability of HPC OS is difficult

1. Context for individual innovation and contribution
2. Common foundation for libraries and runtime environments

1. This will enable full access to hardware resources
2. Timeframe: 2-3 years

Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community

5 Priority Research Direction: Scalable System Simulation
Key challenges
1. Develop a scalable, full system simulation capability
2. Address multi-scale challenges
3. Adapt techniques that have been used in other branches of computational science
4. Develop common interfaces between simulators

1. Inability to conduct “apples to apples” comparisons in scalable resource management
2. Evolution/revolution in new systems
3. Wide variety of existing simulators

1. Ability to evaluate resource management mechanisms and policies at scale
2. Enable architecture/OS co-design

1. Critical for the OS research/development community
2. Important for runtime community
3. Timeframe: 2-4 years

Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community

6 Priority Research Direction: Open System APIs
Key challenges
1. Develop community-based APIs to expose critical resources
2. Develop prototype runtime environments for common programming models

1. Communication management
2. Thread management
3. Memory management
4. Power management
5. Resilience (fault/failure isolation/management)

1. Provides a fixed point for innovation in API implementation and innovation in the implementation of runtimes (hourglass principle)
2. Differentiation based on performance, not functionality

1. Critical for supporting the development of new programming models
2. Critical for enabling the development of new architectures
3. Timeframe: 3 to 8 years

Summary of research direction
Potential impact on software component
Potential impact on usability, capability, and breadth of community
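
The hourglass principle named on slide 6 can be illustrated with a minimal sketch: one narrow API in the middle, several interchangeable implementations below it, and runtime code above it that depends only on the API. Everything here is hypothetical and chosen for illustration; the names `MemoryManager`, `BumpAllocator`, `FreeListAllocator`, and `runtime_workload` do not come from the slides or from any real HPC OS.

```python
from typing import Dict, List, Protocol


class MemoryManager(Protocol):
    """Hypothetical narrow-waist API (assumption, not from the slides)."""

    def allocate(self, nbytes: int) -> int: ...

    def free(self, handle: int) -> None: ...


class BumpAllocator:
    """Hypothetical implementation A: always hands out fresh offsets."""

    def __init__(self) -> None:
        self._next = 0
        self._live: Dict[int, int] = {}  # handle -> size

    def allocate(self, nbytes: int) -> int:
        handle = self._next
        self._next += nbytes
        self._live[handle] = nbytes
        return handle

    def free(self, handle: int) -> None:
        # Freed space is forgotten, never reused.
        del self._live[handle]


class FreeListAllocator:
    """Hypothetical implementation B: reuses freed blocks of the same size."""

    def __init__(self) -> None:
        self._next = 0
        self._live: Dict[int, int] = {}        # handle -> size
        self._free: Dict[int, List[int]] = {}  # size -> reusable handles

    def allocate(self, nbytes: int) -> int:
        bucket = self._free.get(nbytes)
        if bucket:
            handle = bucket.pop()
        else:
            handle = self._next
            self._next += nbytes
        self._live[handle] = nbytes
        return handle

    def free(self, handle: int) -> None:
        nbytes = self._live.pop(handle)
        self._free.setdefault(nbytes, []).append(handle)


def runtime_workload(mm: MemoryManager) -> int:
    """Toy 'runtime' written only against the API.

    Allocates, frees, and allocates again, then returns the second handle.
    """
    first = mm.allocate(64)
    mm.free(first)
    second = mm.allocate(64)
    mm.free(second)
    return second
```

The same `runtime_workload` runs unchanged on both allocators. They offer identical functionality and differ only in how they use memory, which is the sense of "differentiation based on performance, not functionality".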

7 4.1 Operating Systems
A Community HPC OS
Next Generation Interconnect API
Community OS Framework
Robust, Scalable System Simulation
APIs for energy management
API for node resilience
Autonomic runtime systems
Runtime Environments enabled
Prototype implementation of OS Framework

