Review of NCAR Al Kellie SCD Director November 01, 2001
Outline of Presentation Introduction to UCAR NCAR SCD Overview of divisional activities Research data sets (Worley) Mass Storage System (Harano) Extracting model performance (Hammond) Visualization & Earth System GRiD (Middleton) Computing RFP (ARCS)
Outline of Presentation INTRODUCTION Overview of three divisional aspects Computing RFP (ARCS)
University Corporation for Atmospheric Research NCAR Tim Killeen, Director Scientific Computing Division (SCD) President Richard Anthes Al Kellie Member Institutions Board of Trustees Finance & Administration Katy Schmoll, VP Corporate Affairs Jack Fellows, VP UCAR Programs Jack Fellows, Director Constellation Observing System for Meteorology Ionosphere Climate (COSMIC) Cooperative Program for Operational Meteorology Education and Training (COMET) GPS Science and Technology Program (GST) Unidata Visiting Scientists Programs (VSP) Environmental & Societal Impacts Group (ESIG) Mesoscale & Microscale Meteorological Division (MMM) Research Applications Programs (RAP) Joint Office for Science Support (JOSS) Information Infrastructure Technology & Applications (IITA) Timothy Spangler Bill Kuo Mary Marlino Robert Harriss Robert Gall Brant Foote Randolph Ware David Fulker Meg Austin Karyn Sawyer Atmospheric Chemistry Division (ACD) Atmospheric Technology Division (ATD) Advanced Study Program (ASP) Climate & Global Dynamics Division (CGD) Maurice Blackmon Al Cooper David Carlson Daniel McKenna Richard Chinman Denotes President’s Office 12/07/98 Digital Library for Earth System Science (DLESE) Michael Knölker High Altitude Observatory (HAO)
Atmospheric Chemistry Dan McKenna Atmospheric Technology Dave Carlson Climate & Global Dynamics Maurice Blackmon Mesoscale & Microscale Meteorology Bob Gall High Altitude Observatory Michael Knölker Research Applications Brant Foote Scientific Computing Al Kellie NCAR Tim Killeen UCAR Rick Anthes UCAR Board of Trustees ESIG Bob Harriss ASP Al Cooper Associate Director Steve Dickson ISS K. Kelly B&P R. Brasher NCAR Organization
NCAR at a Glance 41 years; 850 Staff – 135 Scientists $128M budget for FY2001 9 divisions and programs Research tools, facilities, and visitor programs for the NSF and university communities
Where did SCD come from? “Blue Book” 1959: “There are four compelling reasons for establishing a National Institute for Atmospheric Research” 2. The requirement for facilities and technological assistance beyond those that can properly be made available at individual universities
SCD Mission Enable the best atmospheric & related research, no matter where the investigator is located, through the provision of high performance computing technologies and related services
Supercomputer Systems Mass Storage Systems High Performance Systems Gene Harano (13) SCIENTIFIC COMPUTING DIVISION Computational Science Steve Hammond (8) Algorithmic Software Development Model performance Research Science Collaboration Frameworks Standards & Benchmarking Data Support Roy Jenne (9) Data Archives Data Catalogs User Assistance Operations and Infrastructure Support Aaron Andersen (18) Operations Room Facility Management & Reporting Database Applications Site Licenses LAN MAN WAN Dial-up Access Network Infrastructure Ginger Caldwell (21) Training/Outreach/Consulting Digital Information Distributed Servers & Workstations Allocations & Account Management User Support Section Networking Engineering & Telecommunications Marla Meehl (25) DIRECTOR’S OFFICE Al Kellie, Director (12) Visualization & Enabling Technologies Don Middleton (12) Data Access Data Analysis Visualization Base $24,874 UCAR $4,027 Outside $2,020 Overhead $1,063
Computing Services for Research Operates two distinct computational facilities. –Climate simulations –University community Governance of these SCD resources in the hands of the users - two external allocation committees. Computing leverages a common infrastructure for access, networking, data storage & analysis, research data sets, and support services including software development, and consulting.
Climate Simulation Laboratory (CSL) The CSL is a national, multi-agency, special-use, computing facility for climate system modeling for the U.S. Global Change Research Program (USGCRP). – Priority projects that require very large amounts of computer time. CSL resources are available to U.S. individual researchers with a preference for research teams regardless of sponsorship. An inter-agency panel selects the projects that use the CSL.
Community Facility The Community Facility is used primarily by university-based NSF grantees and NCAR Scientists. – Community resources are allocated evenly between NCAR and the university community. NCAR resources are allocated by the NCAR Director to the various NCAR divisions. University resources are allocated by the SCD Advisory Panel. Open to areas of atmospheric and related sciences.
History of Supercomputing at NCAR
Timeline axis: 1960 1970 1980 1990 1995 1999 2000 2001
Machines: CDC 3600, CDC 6600, CDC 7600, Cray 1-A S/N 3, Cray Y-MP/2, Cray 1-A S/N 14, TMC CM2/8192, Cray X-MP/4, Cray Y-MP/8, Cray C90/16, Cray T3D/64, TMC CM5/32, IBM RS/6000 Cluster, IBM SP1/8, CCC Cray 3/4, Cray Y-MP/8I, Cray T3D/128, Cray J90/16, Cray J90/20, Cray J90se/24, HP SPP-2000/64, SGI Origin2000/128, Beowulf/16, IBM SP/64, IBM SP/604, Compaq ES40/36 Cluster, IBM SP/32, IBM SP/296, IBM SP/1308
Legend: Non-Production Machines, Production Machines, Currently in Production
NCAR Wide Area Connectivity OC3 (155Mbps) to the Front Range GigaPop - OC12 (622Mbps) on 1/1/2002 –OC3 to AT&T Commodity Internet –OC3 to C&W Commodity Internet –OC3 to Abilene (OC12 on 1/1/2002) OC3 to the vBNS+ OC12 (622Mbps) to University of Colorado at Boulder –intra-site research and back-up link to FRGP OC12 to NOAA/NIST in Boulder –Intra-site research and UUNET Commodity Internet Dark fiber metropolitan area network at GigE (1000Mbps) to other NCAR campus sites
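To put the link speeds on this slide in perspective, the following is a minimal sketch that estimates how long a bulk data transfer would take over each link class. The line rates (155, 622, 1000 Mbps) come from the slide; the function name, the `efficiency` parameter, and the example data size are illustrative assumptions, and real throughput would be lower than the nominal line rate.

```python
# Nominal line rates quoted on the slide, in megabits per second.
LINK_MBPS = {"OC3": 155, "OC12": 622, "GigE": 1000}


def transfer_hours(size_gb, link, efficiency=1.0):
    """Estimate hours needed to move size_gb gigabytes (10^9 bytes) over a link.

    efficiency is a hypothetical fraction (0 to 1] of the nominal rate that
    is actually achieved; it is an assumption, not a figure from the slide.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_gb * 8e9                      # gigabytes -> bits
    bits_per_second = LINK_MBPS[link] * 1e6 * efficiency
    return bits / bits_per_second / 3600.0    # seconds -> hours


# Example: a 100 GB data set over each link at the nominal rate.
for name in LINK_MBPS:
    print(f"{name}: {transfer_hours(100, name):.2f} h")
```

Under these assumptions the move from OC3 to OC12 cuts the transfer time by roughly a factor of four, which is the practical meaning of the upgrades dated 1/1/2002.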
TeraGrid Wide Area Network NCSA/UIUC ANL UIC Multiple Carrier Hubs Starlight / NW Univ Ill Inst of Tech Univ of Chicago Indianapolis (Abilene NOC) I-WIRE StarLight International Optical Peering Point (see www.startap.net) Los Angeles San Diego DTF Backbone Abilene Chicago Indianapolis Urbana OC-48 (2.5 Gb/s, Abilene) Multiple 10 GbE (Qwest) Multiple 10 GbE (I-WIRE Dark Fiber) Solid lines in place and/or available by October 2001 Dashed I-WIRE lines planned for summer 2002 * DENVER
ARCS RFP Overview BEST VALUE PROCUREMENT –Technical evaluation –Delivery schedule –Production disruption –Allocation ready state –Infrastructure –Maintenance –Cost impact – i.e. existing equipment –Past performance of bidders –Business proposal review –Other considerations - invitation to partner
ARCS Procurement Production-level –Availability, robust batch capacity, operational sustainability and support –Integrated software engineering and development environment High performance execution of existing applications Additionally – environment conducive to development of next-generation models
Workload profile context Jobs using > 32 nodes –0.4 % of workload –Average 44 nodes or 176 pes Jobs using < 32 nodes –99.6 % of workload –Average 6 nodes or 24 pes
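The figures on this slide can be cross-checked with a little arithmetic. The sketch below derives the processing elements (PEs) per node implied by each average and a workload-wide average job size. The numbers are from the slide; treating the percentages as shares of job count is an assumption, since the slide does not say whether "workload" means jobs or node-hours.

```python
# Figures quoted on the slide. "share" is the fraction of the workload;
# interpreting it as a fraction of job count is an assumption.
LARGE_JOBS = {"share": 0.004, "avg_nodes": 44, "avg_pes": 176}
SMALL_JOBS = {"share": 0.996, "avg_nodes": 6, "avg_pes": 24}


def pes_per_node(profile):
    """PEs per node implied by a profile's average node and PE counts."""
    return profile["avg_pes"] / profile["avg_nodes"]


def weighted_average_nodes(profiles):
    """Average job size in nodes, weighting each class by its share."""
    return sum(p["share"] * p["avg_nodes"] for p in profiles)


print("PEs per node (large jobs):", pes_per_node(LARGE_JOBS))
print("PEs per node (small jobs):", pes_per_node(SMALL_JOBS))
print("Weighted average nodes per job:",
      round(weighted_average_nodes([LARGE_JOBS, SMALL_JOBS]), 3))
```

Both classes imply 4 PEs per node, and under the job-count assumption the typical job is only about 6 nodes wide, which is why the procurement stresses capacity as well as capability.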
ARCS – The Goal A production-level, high-performance computing system providing for both capability and capacity computing A stable and upwardly compatible system architecture, user environment, and software engineering & development environments Initial equipment: At least double current capacity at NCAR Long Term: Achieve 1 TFLOPs sustained by 2005
ARCS – The Process SCD began technical requirements draft Feb 2000 RFP process (including scientific reps from NCAR divisions, UCAR Contracts, & external review panel) formally began Mar 2000; RFP released Nov 2000 Offeror proposal reviews, BAFOs, & Supplemental proposals Jan-May 2001 Technical Evaluations, Performance projections, Risk Assessment, etc. Feb-Jun 2001 SCD Recommendation for Negotiations 21 Jun; NCAR/ UCAR acceptance of recommendation 25 Jun Negotiations 24-26 Jul; tech. Ts&Cs completed 14 Aug Contract submitted to the NSF 01 Oct NSF Approval 5 Oct … Joint Press Release week SC01
ARCS RFP Technical Attributes Hardware (processors, nodes, memory, disk, interconnect, network, HIPPI) Software (OS, user environment, filesystems, batch subsystem) System admin., resource mgmt., user limits, accounting, network/HIPPI, security Documentation & training System maintenance & support services Facilities (power, cooling, space)
Major Requirements Critical Resource ratios: –Disk: 6 Bytes/peak-FLOP; 64+ MB/sec single-stream & 2+ GB/sec bandwidth - sustainable –Memory: 0.4 Bytes/peak-FLOP “Full-featured” product set (cluster-aware compilers, debuggers, performance tools, administrative tools, monitoring) Hardware & Software stability Hardware & Software vendor support & responsiveness (on-site, call center, development organization, escalation procedures) Resource allocation (processor(s), node(s), memory, disk; user limits & disk quotas) Batch Subsystem and NCAR job scheduler (BPS)
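The critical resource ratios translate directly into minimum disk and memory sizes for a given peak rating. The following is a minimal sketch of that conversion, assuming decimal units (1 TFLOP/s = 10^12 FLOP/s, 1 TB = 10^12 bytes); the function and constant names are illustrative and not part of the RFP.

```python
# Ratios from the slide, in bytes per peak FLOP/s.
DISK_BYTES_PER_PEAK_FLOP = 6.0
MEMORY_BYTES_PER_PEAK_FLOP = 0.4


def required_capacity_tb(peak_tflops):
    """Disk and memory (in TB) implied by the ratios for a given peak TFLOP/s.

    With decimal units the 10^12 factors cancel, so TB = TFLOP/s * ratio.
    """
    if peak_tflops < 0:
        raise ValueError("peak_tflops must be non-negative")
    return {
        "disk_tb": peak_tflops * DISK_BYTES_PER_PEAK_FLOP,
        "memory_tb": peak_tflops * MEMORY_BYTES_PER_PEAK_FLOP,
    }


# Example: a hypothetical system rated at 2.0 peak TFLOP/s.
print(required_capacity_tb(2.0))
```

For a 2.0 TFLOP/s system this works out to 12 TB of disk and 0.8 TB of memory under the stated unit assumptions.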
Past Performance Hardware & Software –SCD/NCAR experience –Other customers’ experience “Missed Promises” –Vendor X ~ 2 yr slip, product line changes –Vendor Y ~ on target –Vendor Z ~ 1.5 yr slip, product line changes
Other Considerations “Blue Light” project invitation to develop models for an exploratory supercomputer –Invitation to a partnership development. –Offer for an industrial partnership 256 Tflops peak, 8TB mem, 200TB disk on 64k nodes. True MPP with Torus interconnect. Node-64 Gflops, 128 MB mem, 32 kB L1 cache, 4MB L2 cache –Columbia, LLNL, SDSC, Oak Ridge
ARCS Award IBM was chosen to supply the NCAR Advanced Research Computing System (ARCS) … … will exceed the articulated purpose and goals A world-class system to provide reliable production supercomputing to the NCAR Community and Climate Simulation Laboratory A phased introduction of new, state-of-the-art computational, storage and communications technologies through the life of the contract (3-5 years) First equipment delivered Friday, 5 October
ARCS Capacities
Date          System                      Total Disk Capacity (TB)  Total Memory (TB)  Peak TFLOPs New (Total)
3-Year Contract
Oct 2001      blackforest upgrade         10.5                      0.75               1.1 (2.0)
Sep 2002      bluesky with Colony Switch  33                        2.8                5.81+ (6.81+)
Sep-Dec 2003  Federation Switch Upgrade
2-Year Extension Option
Sep-Dec 2004  bluesky upgrade             65                        3.8                8.75+ (8.75+)
+ Negotiated capability commitments may require installation of additional capacity.
Minimum
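As a rough cross-check, the capacities on this slide can be compared with the RFP's critical resource ratios (6 bytes of disk and 0.4 bytes of memory per peak FLOP). The sketch below assumes the flattened row values read as disk / memory / total peak of 10.5 / 0.75 / 2.0, 33 / 2.8 / 6.81 and 65 / 3.8 / 8.75; that reading, the use of total rather than new peak TFLOPs, and decimal units are all assumptions. The "+" entries are minimums, so the computed ratios are indicative only.

```python
# (system, total disk TB, total memory TB, total peak TFLOPs).
# Values are one reading of the flattened slide table (an assumption).
SYSTEMS = [
    ("blackforest upgrade", 10.5, 0.75, 2.0),
    ("bluesky with Colony Switch", 33.0, 2.8, 6.81),
    ("bluesky upgrade", 65.0, 3.8, 8.75),
]


def ratios(disk_tb, memory_tb, peak_tflops):
    """Return (disk, memory) in bytes per peak FLOP/s, using decimal units."""
    return disk_tb / peak_tflops, memory_tb / peak_tflops


for name, disk_tb, memory_tb, peak in SYSTEMS:
    disk_ratio, memory_ratio = ratios(disk_tb, memory_tb, peak)
    print(f"{name}: disk {disk_ratio:.2f} B/FLOP, memory {memory_ratio:.2f} B/FLOP")
```

Under this reading the memory ratio stays close to the 0.4 target across all three phases, while the disk ratio varies more from phase to phase.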
ARCS Commitments Minimum Model Capability Commitments –blackforest upgrade: 1.0x (defines ‘x’) –bluesky: 3.1x –bluesky upgrade: 4.6x Failure to meet these commitments will result in IBM installing additional computational capacity Improved user environment functionality, support and problem resolution response Early access to new hardware & software technologies NCAR’s participation in IBM’s “Blue Light” exploratory supercomputer project (PFLOPs)