Cyberinfrastructure and California
Dr. Francine Berman
Director, San Diego Supercomputer Center
Professor and High Performance Computing Endowed Chair, UC San Diego
University of California, San Diego

The Digital World: Commerce, Entertainment, Information, Science

Today's Technology is a Team Sport
Today's computer is a coordinated set of hardware, software, data, and services providing an end-to-end resource.
[Diagram: networks, computers, data, storage, visualization, sensors, field instruments, and wireless links joined into an integrated set of resources]
Cyberinfrastructure captures the integrated character of today's IT environment.

Cyberinfrastructure -- An Integrating Concept
Cyberinfrastructure = Resources (computers, data storage, networks, scientific instruments, experts, etc.) + Glue (integrating software, systems, and organizations)

How Does Cyberinfrastructure Work? Cyberinfrastructure-Enabled Neurosurgery
PROBLEM: Neurosurgeons seek to remove as much tumor tissue as possible while minimizing removal of healthy brain tissue. The brain deforms during surgery, so preoperative brain images must be aligned with intra-operative images to give surgeons the best opportunity for intra-surgical navigation.
Radiologists and neurosurgeons at Brigham and Women's Hospital, Harvard Medical School are exploring transmission of 30-40 MB brain images (generated during surgery) to SDSC for analysis and alignment.
A finite element simulation on a biomechanical model of volumetric deformation is performed at SDSC; the output results are sent back to BWH, where updated images are shown to surgeons.
The transmission is repeated every hour during a 6-8 hour surgery, so transmission and output must take on the order of minutes.
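To make the timing requirement concrete, here is a minimal back-of-the-envelope sketch of the hourly round trip. The link speeds and the simulation time below are illustrative assumptions, not measured values from the BWH/SDSC setup.

```python
# Rough latency budget for one hourly image round trip (illustrative only).
# IMAGE_MB matches the 30-40 MB figure above; link speeds and the simulation
# time are assumed values, not measurements from the actual deployment.

IMAGE_MB = 40                     # intra-operative brain image size
LINK_MBPS = [10, 100, 1000]       # assumed network speeds, megabits per second
SIM_MINUTES = 5.0                 # assumed finite-element deformation run time

for mbps in LINK_MBPS:
    transfer_s = IMAGE_MB * 8 / mbps               # one-way transfer time
    total_min = 2 * transfer_s / 60 + SIM_MINUTES  # send + return + compute
    print(f"{mbps:>5} Mb/s link: round trip + simulation ~ {total_min:.1f} minutes")
```

Even on a modest link, data movement adds only a minute or two to the assumed simulation time, which is what makes the hourly update cycle feasible.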

SDSC
SDSC is a national facility funded by NSF, NIH, DOE, the Library of Congress, NARA, and others, and employs nearly 400 researchers, staff, and students. It is both a national facility and a UCSD Organized Research Unit, and is home to many associated activities including the Protein Data Bank, the Biomedical Informatics Research Network (BIRN) Coordinating Center, the Geosciences Network (GEON), and the NEES IT Center.
SDSC is a national cyberinfrastructure center whose focus areas include grid and cluster computing, data-oriented science and engineering, networking, high performance computing, data and knowledge systems, computational science and engineering, community databases and data collections, and software tools, workbenches, and toolkits.

SDSC Resources Are Available to the Community
COMPUTE SYSTEMS
DataStar: 2,528 Power4+ processors; IBM p655 8-way and p690 32-way nodes; 7 TB total memory; up to 3 GB/s I/O to disk
TeraGrid Cluster: 512 Itanium2 IA-64 processors; 1 TB total memory; plus additional data nodes
Blue Gene Data: first academic IBM Blue Gene system; 2,048 PowerPC processors; 128 I/O nodes
SCIENCE AND TECHNOLOGY STAFF, SOFTWARE, SERVICES
User services; application/community collaborations; education and training; SDSC Synthesis Center; community software, toolkits, portals, and codes
DATA ENVIRONMENT
1.4 PB storage-area network (SAN); 6 PB StorageTek tape library; HPSS and SAM-QFS archival systems; DB2, Oracle, MySQL; Storage Resource Broker; 72-CPU Sun Fire 15K; IBM p690s for HPSS, DB2, etc.; support for community data collections and databases; data management, mining, analysis, and preservation

Cyberinfrastructure Can Help Harness Today's Deluge of Data
Over the next decade, data will come from everywhere: scientific instruments, experiments, sensors and sensornets, and new devices (personal digital devices, computer-enabled clothing, cars, ...).
And it will be used by everyone: scientists, consumers, educators, and the general public.
Cyberinfrastructure must support unprecedented diversity, globalization, integration, scale, and use of data from sensors, simulations, instruments, analysis, and volunteers.

How Much Data Is There?* (*rough/average estimates)
Kilo = 10^3, Mega = 10^6, Giga = 10^9, Tera = 10^12, Peta = 10^15, Exa = 10^18
1 low-resolution photo = 100 kilobytes
1 novel = 1 megabyte
iPod Shuffle (up to 120 songs) = 512 megabytes
Printed materials in the Library of Congress = 10 terabytes
Human brain imaged at the micron level = 1 petabyte
SDSC HPSS tape archive = 6 petabytes
All worldwide information produced in one year = 2 exabytes
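The prefix table above is easy to operationalize; the short sketch below converts raw byte counts to the largest decimal (SI) prefix, using the slide's rough example sizes purely for illustration.

```python
# Convert a byte count to the largest decimal (SI) prefix that keeps it >= 1.
PREFIXES = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_readable(n_bytes: float) -> str:
    value, unit = float(n_bytes), PREFIXES[0]
    for prefix in PREFIXES[1:]:
        if value < 1000:
            break
        value /= 1000.0
        unit = prefix
    return f"{value:g} {unit}"

print(human_readable(100e3))   # low-resolution photo              -> 100 KB
print(human_readable(10e12))   # printed Library of Congress       -> 10 TB
print(human_readable(2e18))    # one year of worldwide information -> 2 EB
```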

Cyberinfrastructure and Data: Using Data for Analysis and Simulation

Cyberinfrastructure-Enabled Disaster Preparedness
How dangerous is the southern San Andreas Fault?
[Map: major earthquakes on the San Andreas Fault, 1680 to present]
The SCEC TeraShake simulation is the result of more than ten years of effort by the geoscience community. The focus is on understanding big earthquakes and how they will impact sediment-filled basins. The simulation combines massive amounts of data, high-resolution models, and large-scale supercomputer runs.
TeraShake results provide new information enabling better estimation of seismic risk; emergency preparation, response, and planning; and design of the next generation of earthquake-resistant structures. Such simulations offer potentially immense benefits, saving many lives and billions of dollars in economic losses.

Domain: 600 km x 300 km x 80 km
Mesh dimensions: 3000 x 1500 x 400
Spatial resolution: 200 m
Simulated time: 200 s
Number of time steps: 20,000
What you're looking at: L.A. experiences strong ground motion in the south-to-north rupture scenario. The north-to-south rupture generates strong reverberations in the Imperial Valley, ultimately hitting Mexicali and other northern Mexico cities, with large local peaks in ground motion near Palm Springs, resulting in immense damage.
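A quick, hedged calculation shows the scale these parameters imply; the "three velocity components stored as 4-byte floats" figure is an assumption for illustration, since the slide does not specify the output layout.

```python
# Scale implied by the TeraShake mesh parameters above (illustrative only).
nx, ny, nz = 3000, 1500, 400       # mesh dimensions
grid_points = nx * ny * nz         # 1.8 billion grid points

bytes_per_point = 3 * 4            # assumed: 3 velocity components x 4-byte floats
snapshot_gb = grid_points * bytes_per_point / 1e9

print(f"grid points per snapshot: {grid_points:,}")
print(f"one full-volume output:   ~{snapshot_gb:.1f} GB under the assumption above")
```

Even a single full-volume snapshot runs to tens of gigabytes under these assumptions, consistent with the 47 TB archival figure on the resource slide that follows.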

Making TeraShake Work: Resources
Computers and systems: 80,000 hours on 240 processors of DataStar; a 256 GB memory p690 used for testing, p655s used for the production run, and TeraGrid used for porting; 30 TB GPFS global parallel file system
Run time: 100 MB/s data transfer from GPFS to SAM-QFS; 27,000 hours of post-processing for high-resolution rendering
Data storage: 47 TB archival tape storage on Sun StorEdge SAM-QFS; 47 TB backup on the High Performance Storage System (HPSS); SRB collection with 1,000,000 files
People: 20+ people involved in information technology support; 20+ people involved in geoscience modeling and simulation
Funding: SDSC cyberinfrastructure resources for TeraShake funded by NSF; the Southern California Earthquake Center is an NSF-funded geoscience research and development center
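As a sanity check on these figures, the sketch below estimates how long moving the 47 TB archive takes at the quoted 100 MB/s GPFS-to-SAM-QFS rate; it ignores protocol overhead and any overlap with computation.

```python
# Time to move the 47 TB TeraShake archive at the quoted sustained rate.
archive_bytes = 47e12          # 47 TB (decimal)
rate_bytes_per_s = 100e6       # 100 MB/s, GPFS -> SAM-QFS

seconds = archive_bytes / rate_bytes_per_s
print(f"~{seconds / 3600:.0f} hours (~{seconds / 86400:.1f} days) of sustained transfer")
```

About 130 hours, or roughly five and a half days, of sustained transfer, which helps explain why data movement is treated as a first-class resource alongside compute hours.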

Cyberinfrastructure and Data: Preserving Our Scientific and Cultural Heritage

Data Preservation
Many science, cultural, and official collections must be sustained for the foreseeable future.
Critical collections must be preserved: community reference data collections (e.g., the Protein Data Bank), irreplaceable collections (e.g., the Shoah collection), and longitudinal data (e.g., PSID, the Panel Study of Income Dynamics).
Without a plan for preservation, data is often lost or damaged.
"…the progress of science and useful arts … depends on the reliable preservation of knowledge and information for generations to come." (Preserving Our Digital Heritage, Library of Congress)

Key Challenges for Digital Preservation
What should we preserve? What materials must be rescued? How do we plan for preservation of materials by design?
How should we preserve it? Which formats and storage media?
Stewardship: who is responsible? Who should pay for preservation: the content generators, the government, or the users? Who should have access?
Print media provides easy access over long periods of time but is hard to data-mine; digital media is easier to data-mine but requires managing the evolution of media and planning resources over time.

Planning Ahead for Preservation
A comprehensive approach to infrastructure for long-term preservation requires the integration of collection ingestion, access and services, and research and development for new functionality and adaptation to evolving technologies.
Business model, data policies, and management issues are critical to the success of the infrastructure.
[Diagram: a consortium linking ingestion, services, policy, and R&D]

Cyberinfrastructure Resources at SDSC

SDSC Data Central
The first program of its kind to support research and community data collections and databases.
Comprehensive resources:
Disk: 400 TB, accessible via HPC systems, Web, SRB, GridFTP
Databases: DB2, Oracle, MySQL
SRB: collection management
Tape: 6 PB, accessible via file system, HPSS, Web, SRB, GridFTP
Services: data collection and database hosting; batch-oriented access; collection management services; collaboration opportunities in long-term preservation and in data technologies and tools
Newly allocated data collections include: Bee Behavior (behavioral science), C5 Landscape DB (art), Molecular Recognition Database (pharmaceutical sciences), LIDAR (geoscience), LUSciD (astronomy), NEXRAD-IOWA (earth science), AMANDA (physics), SIO_Explorer (oceanography), Tsunami and Landsat Data (earthquake engineering), UC Merced Library Japanese Art Collection (art), and Terabridge (structural engineering)

SDSC Cyberinfrastructure Resources Are Heavily Used by UC Faculty and Students
UC PIs account for more than 329 trillion bytes of data stored at SDSC.
In FY05, UC faculty and students across all campuses used over 5 million CPU hours on HPC machines at SDSC.
UCSD faculty make up 40% of the top users of SDSC compute resources.
The SDSC/UC Academic Associates Program targets enabling cyberinfrastructure collaborations and seeding activities: targeted workshops; priority software installation and support; priority participation in the Cyberinfrastructure Summer Institute; focused assistance with developing successful proposals for national allocation programs; targeted user services; special UC compute and data allocations; and priority for early usage of new national resources.

Cyberinfrastructure Is Fundamental for California
Cyberinfrastructure captures the practice and potential of modern science and engineering.
Cyberinfrastructure is the focus of an increasing number of federal programs: NSF (all directorates), NIH (BISTI, bioinformatics, computational biology, etc.), DOE (Science Grid), and others.
Cyberinfrastructure is critical for success in modern research and education initiatives, including stem cell research, grid computing, and multi-disciplinary science and engineering.
Leadership in cyberinfrastructure provides a competitive edge to California researchers, educators, practitioners, and business leaders.

Thank You