Cyberinfrastructure for the 21st Century (CIF21): Data MRI and STCI


Cyberinfrastructure for the 21st Century (CIF21): Data MRI and STCI EarthCube
CASC, Sept 9, 2011
Rob Pennington, Office of Cyberinfrastructure (OCI), National Science Foundation
rpenning@nsf.gov

Framing the Challenge: Science and Society Transformed by Data
- Modern science: data- and compute-intensive; integrative, multiscale
- Multi-disciplinary collaborations for complexity: individuals, groups, teams, communities
- Sea of data: age of observation; distributed, central repositories, sensor-driven, diverse, etc.

Advisory Committee for Cyberinfrastructure Task Force Reports
- More than 25 workshops and Birds of a Feather sessions, with more than 1300 people involved
- Final recommendations presented to the NSF Advisory Committee for Cyberinfrastructure (ACCI), Dec 2010
- Final reports on-line at: http://www.nsf.gov/od/oci/taskforces/
- Task forces: Campus Bridging; Data and Viz; HPC (High Performance Computing); Grand Challenges; Cyberlearning; Software

Data Task Force Recommendations
- Infrastructure: Recognize data infrastructure and services (including visualization) as essential long-term research assets fundamental to today's science
- Economic sustainability: Develop realistic cost models to underpin institutional/national business plans for research repositories/data services
- Culture change: Emphasize expectations for data sharing; support the establishment of new citation models in which data and software tool providers and developers are recognized and credited for their contributions
- Data management guidelines: Identify and share best practices for the critical areas of data management
- Ethics and IP: Train researchers in privacy-preserving data access

Evolution of Cyberinfrastructure for the 21st Century (CIF21) and Data
[Diagram. Labels: National Science Board (NSB); On-going input; Science & Engineering Research + Cyberinfrastructure; ACCI Data Task Force; NSF CIF21 Data Programs; DataNet Program; Community Input]

Cyberinfrastructure Ecosystem (CIF21)
- Organizations: Universities, schools; Government labs, agencies; Research and Medical Centers; Libraries, Museums; Virtual Organizations; Communities
- Expertise: Research and Scholarship; Education; Learning and Workforce Development; Interoperability and operations; Cyberscience
- Scientific Instruments: Large Facilities, MREFCs, telescopes; Colliders, shake tables; Sensor Arrays (ocean, environment, weather, buildings, climate, etc.)
- Discovery, Collaboration, Education
- Data: Databases, Data repositories; Collections and Libraries; Data Access; storage, navigation management, mining tools, curation, privacy
- Computational Resources: Supercomputers; Clouds, Grids, Clusters; Visualization; Compute services; Data Centers
- Networking: Campus, national, international networks; Research and experimental networks; End-to-end throughput; Cybersecurity
- Software: Applications, middleware; Software development and support; Cybersecurity: access, authorization, authentication; Maintainability, sustainability, and extensibility

CIF21: Four Major Thrust Areas
Thrust area labels: Community Research Networks; Data-Enabled Science; New Computational Resources; Access and Connections to CI Resources. Education: integral and embedded.
- Organizations: Universities, schools; Government labs, agencies; Research and Medical Centers; Libraries, Museums; Virtual Organizations; Communities
- Expertise: Research and Scholarship; Education; Learning and Workforce Development; Interoperability and operations; Cyberscience
- Scientific Instruments: Large Facilities, MREFCs, telescopes; Colliders, shake tables; Sensor Arrays (ocean, environment, weather, buildings, climate, etc.)
- Discovery, Collaboration, Education
- Data: Databases, Data repositories; Collections and Libraries; Data Access; storage, navigation management, mining tools, curation, privacy
- Computational Resources: Supercomputers; Clouds, Grids, Clusters; Visualization; Compute services; Data Centers
- Networking: Campus, national, international networks; Research and experimental networks; End-to-end throughput; Cybersecurity
- Software: Applications, middleware; Software development and support; Cybersecurity: access, authorization, authentication

Scientific Data Challenges
[Chart. Axis and scale labels: Volume (Giga, Tera, Peta, Exa Bytes); Bytes per day; Useful Lifetime; Distribution; Data Access; 2012; 2020. Plotted examples: Square Kilometer Array; Climate, Environment; Genomics; TeraGrid, Blue Waters; LHC; LSST; DataNet; Many smaller datasets…]

CIF21 Data Goals
- Support data-intensive and multi-disciplinary science
- Provide reliable digital access, integration, management and preservation capabilities for science and engineering data over a decades-long timeline
- Develop innovative data analysis and mining tools to support data manipulation, modeling, and discovery
- Engage at the frontiers of technological innovation and transformative science to drive the leading edge forward

DataNet Role in CIF21
- DataNet is a strategic part of Foundation-wide investments in data in CIF21
- Focus on center-scale awards
- DataNet efforts effectively balance:
  - Production infrastructure to provide operational services
  - Research to create next-generation infrastructure
- DataNet awards are partnerships:
  - Responsive to user communities to define their meaningful and useful scope
  - Form a coordinated network to provide national, interdisciplinary data models and infrastructure

DataNet: A Multi-tiered and Multi-Disciplinary Landscape
[Diagram. Labels: Modeling and Simulation Communities; Population, Climate, Environment Communities; Genomics Communities; Data-enabled Science; Data Curation; Data Storage; DataNet supported]

Data Storage
- National storage infrastructure for scientific data
- Accommodate scale and heterogeneity of scientific data through robust, open, and broadly accepted standards
- Sustainable cost model that can be implemented with governmental, academic, non-profit, and commercial stakeholders
- Make strategic investments that:
  - Leverage existing resources in TeraGrid, commercial clouds, federal data centers
  - Meet growing capacity needs at optimum cost
  - Provide coordinating and integrative functions for integrity, access control, availability, persistence
- Catalyze a national data infrastructure, in a role similar to the one NSFNet played for the Internet

Data Curation
- Sustainable, community-based networks for management of critical scientific data resources in a life-cycle context
- Overcome challenges of culture change, policy development and implementation, sustainable operations, quality and usability control
- Strategic awards that address heterogeneity in formats, complexity, and semantics of data collections that are valued by science communities of significant breadth
- Operate as a network of data services that promote interoperability, multidisciplinarity, and scalability

Data Enabled Science
- Provide critical tools and services for data mining, integration, analysis, modeling and visualization
- Overcome barriers to scaling, synthesis, and interoperability to promote effective use of large-scale, shared data resources
- Strategic investments that concentrate tools, resources and expertise in support of compelling grand challenge science questions

Cross Cutting Challenges
- Balancing research into next generations of infrastructure with operation & maintenance of current capacity: stimulate innovation and manage transitions
- Sustainable, long-term programs: technical design, development of business models, and integration with the research cycle
- Integration:
  - Vertical: linking low-level bit storage infrastructure to data collections, and finally to applications
  - Horizontal: achieving connectivity and interoperability between activities that vary in scale, disciplinarity, and funding source

DataNet Program Management
- Life cycle perspective covering the use of the data: research, development, implementation, operations, sustainability, close-out
- Apply project management methods: WBS, risk management, change control, schedule, milestones, deliverables
- Standardized process:
  - Evaluate science merit, conceptual design
  - Develop draft PEP, design and reporting metrics
  - Critical review: prototype, finalize baseline (approval/mid-course correction/off-ramp)
  - Implementation & operations: subject to change control, oversight based on milestones & metrics
  - Final operational review: informs decision for renewal, termination

DataNet Federation Consortium
Data Driven Science
- Implement national data grid
  - Federate existing discipline-specific data management systems to enable national research collaborations
- Enable collaborative research on shared data collections
  - Manage collection life cycle as the user community broadens
- Integrate "live" research data into education initiatives
  - Enable student research participation through control policies
Collection Life Cycle [diagram labels]: Project Shared Collection; Processing Pipeline; Digital Library; Reference Collection; Federation
Science and Engineering Initiatives: Ocean Observatories Initiative; the iPlant Collaborative; CUAHSI; CIBER-U; Odum Social Science Institute; Temporal Dynamics of Learning Center
Cyber-infrastructure Partners: Univ. of North Carolina, Chapel Hill; Univ. of California, San Diego; Arizona State University; Drexel University; Duke University; University of Arizona; University of South Carolina
Policy-based data management
National Science Foundation Cooperative Agreement: OCI-0940841

MRI 2011
- CUNY SI: Instrumentation for Enabling Data Analysis, Sharing, Storage, and Preservation
- UC Boulder: Acquisition of a Scalable Petascale Storage Infrastructure for Data-Collections and Data-Intensive Discovery
- RPI: Acquisition of a Balanced Environment for Simulation
- NCA&T: Acquisition of a Complete High-Performance Modeling and Visualization System for Research in Mathematical Biology and Mathematical Geosciences
- OSU: Acquisition of a High Performance Compute Cluster for Multidisciplinary Research

What is EarthCube?

A Call to Action
Over the next decade, the geosciences community commits to developing a framework to understand and predict responses of the Earth as a system, from the space-atmosphere boundary to the Earth's core, including the influences of humans and ecosystems.
- Transitions and Tipping Points in Complex Environmental Systems, NSF AC for Environmental Research and Education, 2009
- Earth Science and Applications from Space: National Imperatives for the Next Decade and Beyond, 2007
- High-Performance Computing Requirements for the Computational Solid Earth Sciences, 2005

Goal of EarthCube
To transform the conduct of research in geosciences by supporting community-based cyberinfrastructure to integrate data and information for knowledge management across the Geosciences.

What Needs To Be Done?
- Integrate data, tools and communities through cyberinfrastructure
- Establish a governance mechanism that is inclusive and adopted by the community
- Utilize current and emerging technologies to create transparent infrastructure for the geosciences community

Convergence to a Unifying Architecture
[Diagram. Labels: Modes of Support; Well-Connected through EarthCube; Loosely or Not Connected]
This is an iterative process

EARTHCUBE ASSUMPTIONS
- The geosciences community is ready to take on the EarthCube challenge
- Community will start self-organizing prior to EarthCube activities, like the Nov 1-4 Charrette
- Current and emerging technology will help achieve the convergence envisioned for EarthCube
- A broad range of expertise and resources must be engaged to shape EarthCube

[Timeline diagram. Labels: DCL Released; Two WebEx events; Charrette; Proposed Framework Approaches Developed through EAGERs; Sandpit/IdeasLab to determine 18 mo. prototype award(s); ROB. Dates: Jun 2011; Jul-Sept 2011; Nov 1-4 2011; Nov/11-Apr/12; May 2012]

EARTHCUBE TIMELINE
- On-line Community Information: August to November, 2011
- EarthCube Charrette: Early November, 2011
- EarthCube Ideas/Lab: Tentatively Early May, 2012
- Prototype Development: May to December 2013
- Fully integrated geosciences infrastructure: 2014-2022

Pre-Charrette Organization (August – September)
- Second WebEx on Aug. 22
- NSF seeks input from a wide range of sources:
  - Individuals, inst./org., representatives of scientific groups or communities
  - Facilities and managers of CI endeavors
  - Industry, Federal Labs., Federal Agencies, and International Partners
- NSF will establish on-line resources and forums to:
  - Gather community inputs/requirements
  - Facilitate partnerships and collaborations
  - Encourage submission of approaches to the EarthCube design

Charrette Process
- Stakeholders focus EarthCube Ideas and Activities
- Plenary Sessions to:
  - discuss user requirements
  - refine approaches and designs for EarthCube
  - develop partnerships and new collaborations
- Remote participation and real-time comments system will be available
- Summary Session:
  - Comments from NSF, facilitators, and participants on process
  - NSF provides guidance on post-Charrette activities

Questions?