UCAR CONFIDENTIAL NCAR’s Response to upcoming OCI Solicitations Richard Loft SCD Deputy Director for R&D.

1 UCAR CONFIDENTIAL NCAR’s Response to upcoming OCI Solicitations Richard Loft SCD Deputy Director for R&D

2 UCAR CONFIDENTIAL Outline
• NSF Cyberinfrastructure Strategy (Track-1 & Track-2)
• NCAR generic strategy for NSFXX-625’s (Track-2)
• NCAR response to NSF05-625
• NSF Petascale Initiative Strategy
• NCAR response to NSF Petascale Initiative

3 UCAR CONFIDENTIAL NSF’s Cyberinfrastructure Strategy
• NSF’s HPC acquisition strategy (through FY10) has three tracks:
  – Track 1: High end, O(1 PFLOPS sustained)
  – Track 2: Mid-level systems, O(100 TFLOPS) (NSFXX-625)
    ▪ First instance (NSF05-625) submitted Feb 10, 2006
    ▪ Next instances due:
      – November 30, 2006
      – November 30, 2007
      – November 30, 2008
  – Track 3: Typical university HPC, O(1-10 TFLOPS)
• The purpose of the Track-1 system will be to achieve revolutionary advancement and breakthroughs in science and engineering.

4 UCAR CONFIDENTIAL Solicitation NSF05-625: Towards a Petascale Computing Environment for Science and Engineering
• Award: September 2006
• System in production by May 31, 2007
• $30,000,000 or $15,000,000.
• Operating costs funded under separate action.
• RP serves the broad science community - open access.
• Allocations by LRAC/MRAC or “their successors”
• Two 10 Gb/s TeraGrid links

5 UCAR CONFIDENTIAL NCAR’s Overall NSFXX-625 Strategy
• Leverage NCAR/SCD expertise in production HPC.
• Get a production system:
  – No white-box Linux solutions.
  – Stay on path to usable petascale systems.
• NCAR is a TeraGrid outsider - must address two areas:
  – Leverage experience with general scientific users
  – Lack of Grid consulting experience
  – Emphasize, but don’t overemphasize, geosciences.
• In proposing, NCAR has a facility problem:
  – Minimize costs - power, administrative staff, level of support.
• Creative plan for remote user support and education.

6 UCAR CONFIDENTIAL NSF05-625 Partners
• Facility Partner
• End-to-End System Supplier
• User Support Network:
  – NCAR Consulting Service Group
  – University partners

7 UCAR CONFIDENTIAL NSF05-625 Facility Partner
• NCAR ML Facility after ICESS is FULL.
• Key points:
  – A new datacenter is needed whether NCAR wins the NSF05-625 solicitation or not.
  – Because of the short timeline, a new datacenter never factors into the strategy for NSFXX-625.
• Identified a colocation facility.
• Facility features:
  – Local (Denver-Boulder area)
  – State-of-the-art, high-availability center
  – Currently 4 x 2 MW generators available
  – Familiar with large-scale deployments
  – Dark fibre readily available (good connectivity)

8 UCAR CONFIDENTIAL NSF05-625 Supercomputer System Details
• Two systems: capability + capacity
• ~80 TFLOPS combined
• Robotic tape storage system, ~12 PB

9 UCAR CONFIDENTIAL NCAR NSF05-625 User Support Plan
• Largest potential differentiator in proposal - let’s do something unique!
• System will be used by the generic scientist - support plan must:
  – Be extensible to domains other than geoscience
  – Address grid user support
• Strategy leverages OSCER-led IGERT proposal:
  – Combine teaching of computational science with user support
  – Embed application support expertise in key institutions
  – Build education and training materials through university partnerships.

10 UCAR CONFIDENTIAL Track-1 System Background
• Source of funds: Presidential Innovation Initiative announced in SOTU.
• Performance goal: 1 PFLOPS sustained on “interesting problems”.
• Science goal: breakthroughs
• Use model: 12 research teams per year using whole system for days or weeks at a time.
• Capability system - large everything & fault tolerant.
• Single system in one location.
• Not a requirement that machine be upgradable.

11 UCAR CONFIDENTIAL Track-1 Project Parameters
• Funds: $200M over 4 years, starting FY07
  – Single award
  – Money is for end-to-end system (as in 625)
  – Not intended to fund facility.
  – Release of funds tied to meeting hardware and software milestones.
• Deployment stages:
  – Simulator
  – Prototype
  – Petascale system operates: FY10-FY15
• Operations for FY10-FY15 funded separately.

12 UCAR CONFIDENTIAL Two Stage Award Process Timeline
• Solicitation out: May, 2006 (???)
• [ HPCS down-select: June, 2006 ]
• Preliminary Proposal due: August, 2006
  – Down selection (invitation to 3-4 to write Full Proposal)
• Full Proposal due: January, 2007
• Site visits: Spring, 2007
• Award: Sep, 2007

13 UCAR CONFIDENTIAL NSF’s view of the problem
• NSF recognizes the facility (power, cooling, space) challenge of this system.
• Therefore NSF welcomes collaborative approaches:
  – University & Federal Lab
  – University & commercial data center
  – University & State Government
  – University consortium
• NSF recognizes that applications will need significant modification to run on this system.
  – User support plan
  – Expects proposer to discuss needs in this area with experts in key applications areas.

14 UCAR CONFIDENTIAL The Cards in NCAR’s Hand
• NCAR …
  – Is a leader in making the case that geoscience grand challenge problems need petascale computing.
  – Has many grand challenge problems to offer itself.
  – Has experience at large processor counts.
  – Has recently connected to the TeraGrid, and is moving towards becoming a full-fledged Resource Provider.

15 UCAR CONFIDENTIAL NCAR Response Options
• Do Nothing
• Focus on Petascale Geoscience Applications
• Partner with a lead institution or consortium
• Lead a Track-1 proposal

17 UCAR CONFIDENTIAL Questions, Comments?

18 UCAR CONFIDENTIAL The Relationship Between OCI’s Roadmap and NCAR’s Datacenter project Richard Loft SCD Deputy Director for R&D

19 UCAR CONFIDENTIAL Projected CCSM Computing Requirements Exceed Moore’s Law
(Thanks to Jeff Kiehl/Bill Collins)

20 UCAR CONFIDENTIAL NSF’s Cyberinfrastructure Strategy
• NSF’s HPC acquisition strategy (through FY10) has three tracks:
  – Track 1: High end, O(1 PFLOPS sustained)
  – Track 2: Mid-level systems, O(100 TFLOPS) (NSFXX-625)
    ▪ First instance (NSF05-625) submitted Feb 10, 2006
    ▪ Next instances due:
      – November 30, 2006
      – November 30, 2007
      – November 30, 2008
  – Track 3: Typical university HPC, O(1-10 TFLOPS)
• The purpose of the Track-1 system will be to achieve revolutionary advancement and breakthroughs in science and engineering.

21 UCAR CONFIDENTIAL NCAR strategic goals:
• NCAR will stay in the top echelon of geoscience computing centers.
• NCAR’s immediate strategic goal is to be a Track-2 center.
• To do this, NCAR must be integrated with NSF’s cyberinfrastructure plans.
• This means both connecting to and ultimately operating within the TeraGrid framework.
• The TeraGrid is evolving, so this is a moving target.

22 UCAR CONFIDENTIAL NCAR new-facility
• NCAR ML Facility after ICESS is FULL.
• Key points:
  – A new datacenter is needed whether NCAR wins the NSF05-625 solicitation or not.
  – Because of the short timeline, a new datacenter never factors into the strategy for NSFXX-625.
  – Right now, we can’t handle a modest budget augmentation for computing with the current facility.

23 UCAR CONFIDENTIAL Mesa Lab is full after the ICESS procurement
• ICESS = Integrated Computing Environment for Scientific Simulation
• We’re sitting at 980 kW right now.
• Deinstall of bluesky will give us back 450 kW.
• This leaves about 600 kW of headroom.
• The ICESS procurement is expected to deliver a system with a maximum power requirement of 500-600 kW.
• This is not enough to house $15M-$30M of equipment from NSF05-625, for example.

24 UCAR CONFIDENTIAL Max power at the Mesa Lab is 1.2 MW! We’re fast running out of power…
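
The power figures on slides 23 and 24 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are made up here, and it assumes the 1.2 MW maximum, the 980 kW current load, the 450 kW bluesky deinstall, and the 500-600 kW ICESS range can simply be added and subtracted. On those raw numbers the headroom comes out near 670 kW, somewhat above the slide's "about 600 kW", which presumably includes some operating margin (an assumption, not stated in the slides).

```python
# Back-of-the-envelope Mesa Lab power budget, using only figures quoted on the slides.
# All names are illustrative; no safety margin is modelled.
MAX_POWER_KW = 1200           # "Max power at the Mesa Lab is 1.2 MW"
CURRENT_LOAD_KW = 980         # "We're sitting at 980 kW right now"
BLUESKY_KW = 450              # power returned by deinstalling bluesky
ICESS_KW_RANGE = (500, 600)   # expected maximum ICESS requirement


def headroom_kw(max_kw: int, load_kw: int, freed_kw: int) -> int:
    """Power left once freed capacity is subtracted from the current load."""
    return max_kw - (load_kw - freed_kw)


headroom = headroom_kw(MAX_POWER_KW, CURRENT_LOAD_KW, BLUESKY_KW)

# What would remain after ICESS at the low and high ends of its power range.
remaining_after_icess = tuple(headroom - need for need in ICESS_KW_RANGE)

print(f"Headroom after bluesky deinstall: {headroom} kW")
print(f"Left after ICESS (low, high need): {remaining_after_icess} kW")
```

Either way, the remainder after ICESS is a small fraction of the facility's 1.2 MW, consistent with the slides' point that the Mesa Lab cannot also house an NSF05-625-class system.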

25 UCAR CONFIDENTIAL Preparing for the Petascale Richard Loft SCD Deputy Director for R&D

26 UCAR CONFIDENTIAL What to expect in HEC?
• Much more parallelism.
• A good deal of uncertainty regarding node architectures.
  – Many threads per node.
• Continued ubiquity of Linux/Intel systems.
• There will be vector systems.
• Emergence of exotic architectures.
• Largest (petascale) system likely to have special features:
  – Power-aware design (small memory?)
  – Fault-tolerant design features
  – Light-weight compute node kernels
  – Custom networks

27 UCAR CONFIDENTIAL Top 500: Speed of Supercomputers vs Time

28 UCAR CONFIDENTIAL Top 500: Number of Processors vs Time

29 UCAR CONFIDENTIAL HEC in 2010
• Based on history, should expect 4K-8K CPU systems to be commonplace by the end of the decade.
• The largest systems on the Top500 list should be 1-10 PFLOPS.
• Parallelism in largest system - estimate (2010):
  – Assume a clock speed of 5 GHz: a double-FMA CPU delivers 20 GFLOPS peak (5 GHz x 2 FMA units x 2 flops per FMA).
  – 1 PFLOPS peak = 50K CPUs.
  – 10 PFLOPS peak = 500K CPUs.
  – Large vector systems (if they exist) will still be highly parallel.
  – To justify using the largest systems, must use a sizable fraction of the resource.
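
The CPU-count estimate on this slide follows from simple peak-rate arithmetic. The sketch below reproduces it under the slide's stated assumptions (5 GHz clock, two FMA units per CPU) plus the standard convention that one fused multiply-add counts as two floating-point operations. The function names are illustrative, not from the presentation.

```python
# Peak-FLOPS arithmetic behind the "50K / 500K CPUs" estimate.
# Function names are illustrative; the inputs come from the slide.

def peak_gflops_per_cpu(clock_ghz: float, fma_units: int) -> float:
    """Peak GFLOPS of one CPU: each FMA unit retires 2 flops per cycle."""
    return clock_ghz * fma_units * 2


def cpus_needed(target_pflops: float, gflops_per_cpu: float) -> int:
    """Number of CPUs required to reach a peak target (1 PFLOPS = 1e6 GFLOPS)."""
    return round(target_pflops * 1e6 / gflops_per_cpu)


per_cpu = peak_gflops_per_cpu(clock_ghz=5.0, fma_units=2)

for target in (1, 10):
    print(f"{target} PFLOPS peak needs {cpus_needed(target, per_cpu)} CPUs "
          f"at {per_cpu} GFLOPS each")
```

These are peak figures. Reaching 1 PFLOPS sustained, the Track-1 goal, would require more CPUs than this, by a factor that depends on application efficiency.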

30 UCAR CONFIDENTIAL Range of Plausible Architectures: 2010
• Power issues will slow the rate of increase in clock frequency.
• This will drive the trend towards massive parallelism.
• All scalar systems will have multiple CPUs per socket (chip).
• Currently 2 CPUs per socket; by 2008, 4 CPUs per socket will be commonplace.
• 2010 scalar architectures will likely continue this trend.
• 8 CPUs are possible - the Cell chip already has 8 synergistic processors.
• The key unknown is which architecture for a cluster on a chip will be most effective.
• Vector systems will be around, but at what price?
• Wildcards:
  – Impact of DARPA HPCS program
  – Exotics: FPGAs, PIMs, GPUs.

31 UCAR CONFIDENTIAL How to make science staff aware of coming changes?
• NCAR must develop a science-driven plan for exploiting petascale systems at the end of the decade.
• Briefed NCAR Director, DD, CISL and ESSL Directors
• Meetings (SEWG at CCSM Breckenridge)
• Organizing NSF workshops on petascale geoscience benchmarking, scheduled in DC (June 1-2) and at NCAR (TBD)
• Have initiated internal petascale discussions:
  – CGD-SCD joint meetings
  – Peta_ccsm mail list
  – Peta_ccsm Swiki site
• Through activities like this, NSA should take a leadership role.

32 UCAR CONFIDENTIAL What must be done to secure resources to improve scalability?
• Must help ourselves.
  – Invest judiciously in computational science where possible.
  – Leverage application development partnerships (SciDAC, etc.)
• Write proposals.
  – Support for applications development for the Track-1 system can be built into an NCAR partnership deal.
  – NSF has indicated an independent funding track for applications. NCAR should aggressively pursue those funding sources.
• New ideas can help - e.g. POP

33 UCAR CONFIDENTIAL POP Space Filling Curves: partition for 8 processors
(Credit: John Dennis, SCD)
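
The figure for this slide is not part of the transcript, so the general idea of a space-filling-curve partition is sketched here instead. This is not the POP implementation: the choice of a Hilbert curve, the 8 x 8 grid of blocks, and all function names are assumptions made for illustration. Blocks are ordered along the curve and the ordering is cut into equal contiguous segments, one per processor, so each processor owns a spatially compact set of blocks.

```python
# Illustrative space-filling-curve partition (assumed: Hilbert curve, 8x8 blocks).
# Not taken from POP; names and sizes are hypothetical.

def hilbert_d2xy(n: int, d: int) -> tuple[int, int]:
    """Map distance d along a Hilbert curve to (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate the quadrant
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sfc_partition(n: int, nprocs: int) -> dict[tuple[int, int], int]:
    """Assign each block of an n x n grid to a processor by cutting the curve into equal runs."""
    total = n * n
    return {hilbert_d2xy(n, d): d * nprocs // total for d in range(total)}


owner = sfc_partition(n=8, nprocs=8)

# Print the ownership map row by row; each digit is the owning processor's rank.
for y in range(8):
    print(" ".join(str(owner[(x, y)]) for x in range(8)))
```

Because consecutive points on the curve are always neighbouring blocks, each contiguous run stays spatially local, which is the property that makes this kind of partition attractive for nearest-neighbour communication.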

34 UCAR CONFIDENTIAL POP 1/10 Degree BG/L Improvements

35 UCAR CONFIDENTIAL POP 1/10 Degree performance BG/L SFC improvement

36 UCAR CONFIDENTIAL Questions, Comments?

37 UCAR CONFIDENTIAL Top 500 Processor Types: Intel taking over
Today Intel is inside 2/3 of the Top500 machines.

38 UCAR CONFIDENTIAL

39 The commodity onslaught …
• The Linux/Intel cluster is taking over the Top500.
• Linux has not yet penetrated the major weather, ocean, and climate centers. Reasons:
  – System maturity (SCD experience)
  – Scalability of dominant commodity interconnects
  – Combinatorics (Linux flavor, processor, interconnect, compiler)
• But it affects NCAR indirectly because…
  – Ubiquity = Opportunity
  – Universities are deploying them.
  – NCAR must rethink services provided to the universities.
  – Puts strain on all community software development activities.

