Presentation is loading. Please wait.

Presentation is loading. Please wait.

Presented by SciDAC-2 Petascale Data Storage Institute Philip C. Roth Computer Science and Mathematics Future Technologies Group.

Similar presentations


Presentation on theme: "Presented by SciDAC-2 Petascale Data Storage Institute Philip C. Roth Computer Science and Mathematics Future Technologies Group."— Presentation transcript:

1 Presented by SciDAC-2 Petascale Data Storage Institute Philip C. Roth Computer Science and Mathematics Future Technologies Group

2 2 Roth_PDSI_0611  Petascale computing makes petascale demands on storage  Performance  Capacity  Concurrency  Reliability  Availability  Manageability  Parallel file systems are barely keeping pace at terascale; the challenges will be much greater at petascale Cray XT3 Cray X1E The petascale storage problem at ORNL

3 3 Roth_PDSI_0611 Petascale Data Storage Institute  The PDSI is an institute in the Department of Energy (DOE) Office of Science’s Scientific Discovery through Advanced Computing (SciDAC-2) program  Using diverse expertise with applications and file and storage systems, members will collaborate on requirements, standards, algorithms, analysis tools Led by Dr. Garth Gibson, Carnegie Mellon University http://www.pdl.cmu.edu/PDSI

4 4 Roth_PDSI_0611 Carnegie Mellon University Participating Institutions Lawrence Berkeley National Laboratory/NERSC Los Alamos National Laboratory Pacific Northwest National Laboratory Sandia National Laboratories Oak Ridge National Laboratory University of California at Santa Cruz University of Michigan at Ann Arbor

5 5 Roth_PDSI_0611 Novel Storage Mechanisms IT Automation Standards and APIs Community Building Failure Data Collection Performance Data Collection Petascale Data Storage Institute agenda Collection Dissemination Innovation Main ThrustsProjects

6 6 Roth_PDSI_0611 Collection: Performance analysis  Performance data collection and analysis  Workload characterization  Benchmark collection and publication 6 Roth_PDSI_0611 Led by William Kramer, National Energy Research Scientific Computing Center (NERSC)

7 7 Roth_PDSI_0611 0102030405060 0 10 30 40 50 60 70 80 20 Months in production use Failures per month Unknown Human Environment Network Software Hardware http://institutes.lanl.gov/data http://www.pdl.cmu.edu/FailureData Collection: Failure analysis  Capture and analyze failure, error, and usage data from high-end computing systems  Initial example: Los Alamos failure data available for 22 systems over 9 years with extensive analysis by Bianca Schroeder, Carnegie Mellon University Led by Gary Grider, Los Alamos National Laboratory

8 8 Roth_PDSI_0611 Dissemination: Outreach  Our approach  Workshops (SC06 PDSI workshop, November 17!)  Tutorials and course materials  Online, open repository with documents, tools, performance and failure data  Target audience  Computational scientists  Academia (professors and students)  Industry (storage researchers and developers) Led by Dr. Garth Gibson, Carnegie Mellon University Goal: to disseminate information about techniques, mechanisms, best practices, and available tools

9 9 Roth_PDSI_0611 Dissemination: Standards and APIs Some work underway  POSIX extensions  E.g., support for weak data and metadata consistency  http://www.pdl.cmu.edu/POSIX  Parallel Network File System (pNFS)  In IETF NFSv4.1 standard draft  University of Michigan Center for Information Technology Integration producing reference implementation  http://www.pdl.cmu.edu/pNFS Led by Gary Grider, Los Alamos National Laboratory Goals: to facilitate standards development and deployment and to validate and demonstrate new extensions and protocols

10 10 Roth_PDSI_0611 Innovation IT automation applied to high-end computing systems and problems Novel mechanisms for core high-end computing storage problems  Storage system instrumentation for machine learning  Data layout and access planning  Automated diagnosis, tuning, failure recovery  WAN/global storage access  High performance collective operations  Rich metadata at scale  Integration with system virtualization technology Led by Dr. Garth Gibson Carnegie Mellon University Led by Darrell Long University of California at Santa Cruz

11 11 Roth_PDSI_0611 Summary  The Petascale Data Storage Institute brings together individuals with expertise in file and storage systems, applications, and performance analysis  PDSI will be a focal point for computational scientists, academia, and industry for storage-related information and tools, both within and outside SciDAC http://www.pdl.cmu.edu/PDSI

12 12 Roth_PDSI_0611 ORNL contact Philip C. Roth Future Technologies Group Computer Science and Mathematics (865) 241-1543 rothpc@ornl.gov 12 Roth_PDSI_0611


Download ppt "Presented by SciDAC-2 Petascale Data Storage Institute Philip C. Roth Computer Science and Mathematics Future Technologies Group."

Similar presentations


Ads by Google