FAST-OS. Arthur B. (Barney) Maccabe, Computer Science Department, The University of New Mexico. SOS 7, Durango, Colorado, March 4, 2003.

Presentation transcript:

FAST-OS: Forum to Address Scalable Technology for runtime and Operating Systems
- Most applications show poor scaling on very large machines
- Scaling must be addressed at all levels
- FAST-OS: Consider OS and runtime
- Emphasis on near-term systems
[Figure: layer stack, top to bottom: Application, Runtime, Operating System, Hardware]

FAST-OS People
Oversight Committee
- Fred Johnson, DOE
- Paul Messina, Cal Tech
- Jose Munoz, DOE
- Thomas Sterling, Cal Tech
- Rick Stevens, ANL
Steering committee
- Barney Maccabe, UNM
- Ron Brightwell, SNL
- Al Geist, ORNL
- Terry Jones, LLNL
- Rusty Lusk, ANL
- Ron Minnich, LANL

Forum Activities
- Web page
- February 2002, WIMPS, Bodega Bay
  - ½ day FAST-OS workshop
  - Primary focus on organization
- July ½ days in Chicago
  - 30 people from labs, industry, and academics
  - Report focuses on issues
- October 2002, ½ day workshop in Santa Fe
  - Commodity versus Specialized approaches

Issues (1 of 4)
- Fault tolerance is critical
  - System size and application run times imply that apps will encounter faults
  - All levels (including apps) must deal with faults
- Programming models
  - Need to support a variety of models
  - Models for 100,000 processors
  - Breaking the legacy straitjacket
  - What comes after MPI?
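The claim that system size and run time together make faults unavoidable can be checked with a little arithmetic. The sketch below is an illustration, not part of the talk: it assumes node failures are independent and exponentially distributed, and the function names and the per-node MTBF figure are hypothetical. Only the 100,000-processor scale comes from the slides.

```python
import math


def system_mtbf_hours(node_mtbf_hours: float, nodes: int) -> float:
    """Mean time between failures for the whole machine.

    Assumes independent, exponentially distributed node failures,
    so the failure rates of the individual nodes simply add up.
    """
    if node_mtbf_hours <= 0 or nodes <= 0:
        raise ValueError("node_mtbf_hours and nodes must be positive")
    return node_mtbf_hours / nodes


def prob_at_least_one_fault(node_mtbf_hours: float, nodes: int,
                            run_hours: float) -> float:
    """Probability that a job using every node sees at least one fault."""
    if run_hours < 0:
        raise ValueError("run_hours must be non-negative")
    system_rate = nodes / node_mtbf_hours  # failures per hour, whole machine
    return 1.0 - math.exp(-system_rate * run_hours)


if __name__ == "__main__":
    # Hypothetical inputs: a node that fails once in 100,000 hours
    # (over 11 years) on a 100,000-processor machine.
    node_mtbf = 100_000.0
    nodes = 100_000
    print(system_mtbf_hours(node_mtbf, nodes))
    print(prob_at_least_one_fault(node_mtbf, nodes, run_hours=1.0))
```

Under these assumed numbers the machine as a whole fails about once an hour, and a one-hour job has roughly a 63% chance of encountering a fault, which is why the slide asks every level, including applications, to handle faults.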

Issues (2 of 4)
- OS Structure
  - Protection boundaries
  - Global/Local OS split
  - How far can we take lightweight approaches?
  - Split between OS and runtime system?
- APIs
  - App/Runtime; Runtime/OS
  - Tools: compilers, debuggers
- Specific functions
  - Process management and scheduling
  - Security, QoS, Invariants, etc.

Issues (3 of 4)
- What is Scalability?
  - Which runtime and OS services should scale?
  - Performance nearly independent of size?
  - Reliability nearly independent of size?
- Future hardware
  - Do we need different OS/runtime for different systems?
  - Hardware support for OS/runtime
    - protection
    - reliable networks
    - collective operations
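One way to make "performance nearly independent of size" testable is an efficiency ratio: the time a runtime or OS service takes at the smallest measured size, divided by the time it takes at a larger size. The sketch below is an illustration, not a definition from the talk. The function names, the 0.8 threshold, and the timings in the usage example are all hypothetical.

```python
from typing import Mapping


def scaling_efficiency(base_time_s: float, time_at_size_s: float) -> float:
    """Baseline service time divided by service time at a larger size.

    A value of 1.0 means the cost of the service did not grow with the machine.
    """
    if base_time_s <= 0 or time_at_size_s <= 0:
        raise ValueError("times must be positive")
    return base_time_s / time_at_size_s


def nearly_size_independent(times_by_nodes: Mapping[int, float],
                            threshold: float = 0.8) -> bool:
    """True if efficiency stays at or above the threshold at every size.

    times_by_nodes maps a node count to the measured time of one service
    (for example, job launch); the smallest node count is the baseline.
    """
    if not times_by_nodes:
        raise ValueError("need at least one measurement")
    base = times_by_nodes[min(times_by_nodes)]
    return all(scaling_efficiency(base, t) >= threshold
               for t in times_by_nodes.values())


if __name__ == "__main__":
    # Hypothetical measurements, in seconds, keyed by node count.
    print(nearly_size_independent({64: 1.0, 1024: 1.1, 16384: 1.2}))
    print(nearly_size_independent({64: 1.0, 16384: 4.0}))
```

The same ratio could be applied to reliability by substituting a per-size success rate for the timing, though the threshold that counts as "nearly independent" remains a policy choice.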

Issues (4 of 4)
- Application requirements
  - Identify critical apps
  - Measure needs
- Metrics
  - How do we measure success?
- Programmatic
  - Roles for academics, vendors, and labs
  - Support testbeds (e.g., Chiba City)
  - Don't choose a winner too early in the process

Next Steps
- Next general meeting will be May or June
- Goal is to define the research agenda