SAN DIEGO SUPERCOMPUTER CENTER, UCSD NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE Introduction to SDSC Fran Berman Director, SDSC and.


1 SAN DIEGO SUPERCOMPUTER CENTER, UCSD
NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE
Introduction to SDSC
Fran Berman
Director, SDSC and NPACI
Professor and High Performance Computing Endowed Chair, CSE Department, UCSD

2 SDSC projects target the interface of science and technology
Astronomy; Physics; Life Sciences; Modeling and Simulation; Data Management and Mining; GAMESS; QCD; Geosciences
Some images courtesy of Larry Smarr

3 An Integrated Approach: MCell
Biological modeling from physical samples. Models provide data for simulation.
Simulation executed on available resources: supercomputers to lab clusters. MCell code developed to target a wide variety of technologies.
Feedback improves model accuracy.
Figure labels: Electron Microscope; Tomographic Reconstruction; MCell as biologists see it; MCell as computer scientists see it (Shared Input data, Tasks, Raw Output, Post-processing, Final Output); ETF, NPACI Grid

4 Access to Critical Data: The Protein Data Bank
Largest repository on the planet for protein information
Provides free worldwide public access 24/7 to accurate protein data
150,000 web hits per day to DB
Supported at SDSC by networking, web services, IT support, data services, and other technologies
Usability engineering is a focus of efforts to minimize barriers to access
Proteins form the basis of many bioscience activities at SDSC: Protein Interactions; Electrostatics; Systematic Protein Annotation; Modeling; PlantsP; Biomedical Informatics Research Network; Joint Center for Structural Genomics; Protein Data Bank; Alliance for Cell Signaling; Modeling Proteins

5 SDSC
Employs nearly 400 researchers, staff and students
Leading edge site for NSF’s National Partnership for Advanced Computational Infrastructure (NPACI)
One of 5 sites of NSF’s TeraGrid/ETF project
Home of many associated activities including: Protein Data Bank; Alliance for Cell Signaling; High Performance Wireless Research and Education Network (HPWREN); Geosciences Network (GEON); Joint Center for Structural Genomics; Protein Kinase Resource, etc.
Targeting the Interface of Science and Technology

6 SDSC Programs
Programs: Networking; High End Computing; Integrative Biosciences; Grids and Clusters; Data and Knowledge Systems; Integrative Computational Sciences
The Slammer worm took over 75,000 computers worldwide in 10 minutes, interrupting a Canadian election and disrupting internet trading on the South Korean stock exchange.
New IBM “DataStar” will run at 7+ TFLOPS and support computing with terabyte data sets.
Protein Data Bank provides structure information on proteins and gets over 60 million hits/year.
SDSC is concentrating international experience with Grids and Clusters through NPACI Grid, TeraGrid, and PRAGMA Grid.
SDSC and Library of Congress to preserve 8 TB of “American Memory” and other historically important national collections.
Recommendations from SDSC and collaborators developing a DB of undersea mountains (Seamounts) are providing the United Nations with critical information on marine biodiversity.
Figure label: PSC

7 Focus on Data
DATA -- Killer App for the next generation
Over the next decade, data will come from everywhere: Scientific instruments; Experiments; Sensors and sensornets; New devices (personal digital devices, computer-enabled clothing, cars, …)
And be used by everyone: Scientists; Consumers; Educators; General public
SW environment will need to support unprecedented diversity, globalization, integration, scale, and use
Figure labels: Data from sensors; Data from simulations; Data from instruments; Data from analysis

8 Biology in 2000-2010: Integrating Data from Atoms to Organisms
Organisms; Organs; Cells; Organelles; Biopolymers; Atoms
Anatomy; Physiology; Cell Biology; Proteomics; Genomics; Medicinal Chemistry
Disciplinary Databases
Middleware federates data across disciplinary vocabularies
Portals, Domain Specific APIs provide access to data
Users
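The slide's point that middleware federates data across disciplinary vocabularies can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two in-memory "databases", the field names, the record values, and the `to_shared` and `federated_query` helpers are hypothetical and are not taken from any SDSC system.

```python
# Hypothetical disciplinary sources. Each uses its own field names
# (its own "vocabulary") for the same underlying concepts.
GENOMICS = [{"gene_symbol": "GENE1", "organism": "mouse"}]
PROTEOMICS = [{"protein_gene": "GENE1", "species": "mouse", "mass_kda": 50.0}]

# Per-source mapping from local field names to shared terms (assumed).
MAPPINGS = {
    "genomics": {"gene_symbol": "gene", "organism": "species"},
    "proteomics": {
        "protein_gene": "gene",
        "species": "species",
        "mass_kda": "mass_kda",
    },
}

SOURCES = (("genomics", GENOMICS), ("proteomics", PROTEOMICS))


def to_shared(source, record):
    """Rename a record's fields from the source vocabulary to shared terms."""
    mapping = MAPPINGS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def federated_query(gene):
    """Collect everything known about one gene across all sources."""
    merged = {}
    for source, records in SOURCES:
        for record in records:
            shared = to_shared(source, record)
            if shared.get("gene") == gene:
                merged.update(shared)
    return merged


print(federated_query("GENE1"))
```

A portal or domain-specific API would sit on top of a function like `federated_query`, so a user asks one question in shared terms without needing to know each database's field names.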

9 Data Integration in the Geosciences
Infrastructure to manage, analyze and mine complex, heterogeneous, and bio-diverse geophysical data sets
Situating 4D data in context – spatial, temporal, topic, process
Supporting “deep” modeling of 4D data
Providing for semantic integration of geosciences data
Example questions: What are the distribution and U/Pb zircon ages of A-type plutons in VA? How about their 3-D geometry? How does it relate to host rock structures?
Data integration through complex “multiple-worlds” mediation across: Geologic Map (Virginia); GeoChemical; GeoPhysical (gravity contours); GeoChronologic (Concordia); Foliation Map (structure DB)

10 From Data to Information to Knowledge
SDSC Data and Knowledge Systems Program
Figure labels: instruments; sensornets; High speed networking; Storage hardware; Networked Storage (SAN); Grid Storage; Filesystems, Database Systems; Data Mining, Simulation Modeling, Analysis, Data Fusion; Knowledge-Based Integration; Advanced Query Processing; Visualization; Applications: Medical informatics, Biosciences, Ecoinformatics,…
How do we configure computer architectures to optimally support data-oriented computing?
How do we collect, access and organize data?
How do we obtain usable information from data?
How do we detect trends and relationships in data?
How do we represent data, information and knowledge to the user?
How do we combine data, knowledge and information management with simulation and modeling?

11 GEON: Developing a national cyberinfrastructure for the Geosciences
Core GEON IT Components
Access: Development of GEON Portal
Modeling, analysis, simulation, data management: Implementation of a Geosciences-focused GEON Grid (Linux clusters at some nodes). Will leverage NSF investment in TeraGrid
Standards for information integration: Data, metadata; ontology standards for database; Web services
Unified Geosciences Language System: Will provide a unifying framework for understanding of Geoscience data
Collaboration: Close collaboration between geoscientists and technologists
Visualization: Advanced visualization of semantic structures for knowledge discovery
Framework for data “transparency” on the Grid
Location/Name: data can be anywhere on the Grid
Distribution: data can be distributed across multiple locations
Replication: multiple copies of data
Heterogeneity: data can be in more than one format (file or DBMS, points or polygons, etc…); the difficult one
Ownership & Costing: should not have to negotiate with each source individually
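The "transparency" properties listed on this slide (location, replication, heterogeneity) can be sketched as a lookup from a logical data name to whichever physical copy is reachable, decoded into one common shape. This is a minimal illustration under stated assumptions, not GEON's design: the `Replica` class, the `CATALOG` contents, the node names, and the sample values are all invented for the example.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str     # hypothetical Grid node holding this copy
    fmt: str      # storage format of this copy: "csv" or "json"
    payload: str  # stands in for the bytes stored at that site


# Hypothetical catalog: one logical name, two copies in different formats.
CATALOG = {
    "geon/sample-dataset": [
        Replica("node-a", "csv", "id,value\ns1,1\ns2,2\n"),
        Replica("node-b", "json", '[{"id": "s1", "value": 1}, {"id": "s2", "value": 2}]'),
    ],
}


def decode(replica):
    """Convert either storage format into the same list-of-dicts shape."""
    if replica.fmt == "csv":
        rows = csv.DictReader(io.StringIO(replica.payload))
        return [{"id": row["id"], "value": int(row["value"])} for row in rows]
    if replica.fmt == "json":
        return json.loads(replica.payload)
    raise ValueError(f"unsupported format: {replica.fmt}")


def read(logical_name, unavailable=frozenset()):
    """Resolve a logical name to the first reachable replica and decode it."""
    for replica in CATALOG[logical_name]:
        if replica.site not in unavailable:
            return decode(replica)
    raise LookupError(f"no reachable replica for {logical_name}")


# The caller names the data, not the site or the format.
print(read("geon/sample-dataset"))
print(read("geon/sample-dataset", unavailable={"node-a"}))
```

Both calls return the same records even though the second is served from a different node and a different storage format. The sketch leaves out the slide's "Ownership & Costing" item, which would need an access policy layer rather than a lookup.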

12 Core leverage for other projects
GEON and BIRN working to have common grid software stack
GEON and SEEK will have common semantic integration services
GEON is an application driver for OptIPuter. Will field a common GIS and Viz center
GEON is linked to Earthscope. IRIS and UNAVCO are members of GEON. Both are primary data archives for Earthscope
GEON will partner with DOE EarthSystem Grid (ES Grid). ES Grid has deployed grid data services across 5 sites
USGS is a major GEON partner. Perhaps a GEON node(s) at USGS
GEON is a partner of Chronos (chronostratigraphy)
GEON is a partner of CUAHSI (hydrologic information systems)

13 Developing an enabling technology for the Tree of Life
Establish requirements: What do we need the IT to enable?
Prototype IT environment: Feedback from users critical
Harden into “production-ready” SW: Lower boundaries for access, focus on reliability, scalability, ease-of-use
Evolve and scale as user needs grow more sophisticated: Design must allow for extensibility, user feedback and modification

