“Building an Information Infrastructure to Support Genetic Sciences”

Presentation transcript:

“Building an Information Infrastructure to Support Genetic Sciences” Invited Talk, Celebrating a Decade of Genome Sequencing, UCSD, La Jolla, CA, December 6, 2005. Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology; Harry E. Gruber Professor, Dept. of Computer Science and Engineering, Jacobs School of Engineering, UCSD

The Sargasso Sea Experiment: The Power of Environmental Metagenomics. Yielded a Total of Over 1 Billion Base Pairs of Non-Redundant Sequence. Displayed the Gene Content, Diversity, & Relative Abundance of the Organisms. Sequences from at Least 1800 Genomic Species, Including 148 Previously Unknown. Identified Over 1.2 Million Unknown Genes. Source: J. Craig Venter, et al., Science, 2 April 2004, Vol. 304, pp. 66-74. Image: MODIS-Aqua satellite image of ocean chlorophyll in the Sargasso Sea grid about the BATS site from 22 February 2003.

Genomic Data Is Growing Rapidly, But Metagenomics Will Vastly Increase the Scale… GenBank: 100 Billion Bases! (www.ncbi.nlm.nih.gov/Genbank). Protein Data Bank: 35,000 Structures (www.rcsb.org/pdb/holdings.html). Total Data < 1TB.

Metagenomics Will Couple to Earth Observations Which Add Several TBs/Day Source: Glenn Iona, EOSDIS Element Evolution Technical Working Group January 6-7, 2005

Challenge: Average Throughput of NASA Data Products to End User is < 50 Mbps (Tested October 2005). Internet2 Backbone is 10,000 Mbps! Throughput is < 0.5% to End User. http://ensight.eos.nasa.gov/Missions/icesat/index.shtml
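The arithmetic behind the slide above can be sketched as follows. This is a minimal illustration, not part of the talk: the 1 TB data set size is a hypothetical example, decimal units are assumed (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second), and the function names are made up for this sketch.

```python
# Rough arithmetic for the end-user throughput gap described on the slide.
# Assumptions (not from the talk): a hypothetical 1 TB data set and decimal units.

def end_user_fraction(end_user_mbps: float, backbone_mbps: float) -> float:
    """Fraction of the backbone rate that the end user actually sees."""
    return end_user_mbps / backbone_mbps


def transfer_hours(size_terabytes: float, rate_mbps: float) -> float:
    """Hours needed to move size_terabytes at a sustained rate of rate_mbps."""
    bits = size_terabytes * 1e12 * 8          # terabytes -> bits
    seconds = bits / (rate_mbps * 1e6)        # Mbps -> bits per second
    return seconds / 3600.0


fraction = end_user_fraction(50, 10_000)      # slide figures: 50 Mbps vs 10,000 Mbps
hours_at_50_mbps = transfer_hours(1, 50)      # hypothetical 1 TB at the measured rate
hours_at_10_gbps = transfer_hours(1, 10_000)  # same data at the full backbone rate

print(f"End user sees {fraction:.1%} of the backbone rate")
print(f"1 TB at 50 Mbps: {hours_at_50_mbps:.1f} hours")
print(f"1 TB at 10,000 Mbps: {hours_at_10_gbps * 60:.1f} minutes")
```

Under these assumptions the same terabyte takes nearly two days at the measured end-user rate and under a quarter of an hour at the backbone rate.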

Why Optical Networks Will Become the 21st Century Driver. Chart of Performance per Dollar Spent versus Number of Years: Optical Fiber (bits per second), Doubling Time 9 Months; Data Storage (bits per square inch), Doubling Time 12 Months; Silicon Computer Chips (Number of Transistors), Doubling Time 18 Months. Source: Scientific American, January 2001
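To see what these doubling times imply, the growth factor after t months with doubling time d is 2^(t/d). The sketch below applies that formula to the three doubling times quoted on the slide. The five-year horizon is an assumption taken from the chart's axis labels, and the function and variable names are illustrative only.

```python
# Compound growth implied by the doubling times quoted on the slide.
# Assumption: a five-year horizon, chosen to match the chart's year axis.

def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative improvement after `years`, doubling every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


DOUBLING_MONTHS = {
    "optical fiber": 9,    # bits per second
    "data storage": 12,    # bits per square inch
    "silicon chips": 18,   # number of transistors
}
YEARS = 5

factors = {name: growth_factor(YEARS, months)
           for name, months in DOUBLING_MONTHS.items()}

for name, factor in factors.items():
    print(f"{name}: about {factor:.0f}x after {YEARS} years")
```

The gap between the three curves widens every year, which is the slide's argument for why network capacity becomes the driver.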

Solution: Individual 1 or 10 Gbps Lightpaths, “Lambdas on Demand”. Figure label: (WDM) “Lambdas”. Source: Steve Wallach, Chiaro Networks

National Lambda Rail (NLR) and TeraGrid Provide Cyberinfrastructure Backbone for U.S. Researchers. NSF’s TeraGrid Has 4 x 10Gb Lambda Backbone. NLR: 4 x 10Gb Lambdas Initially, Capable of 40 x 10Gb Wavelengths at Buildout. Links Two Dozen State and Regional Optical Networks. DOE, NSF, & NASA Using NLR. Map labels: Seattle, Portland, Boise, Ogden/Salt Lake City, San Francisco, Los Angeles, San Diego, Phoenix, Albuquerque, Las Cruces/El Paso, Denver, Kansas City, Tulsa, Dallas, San Antonio, Houston, Baton Rouge, Pensacola, Jacksonville, Atlanta, Raleigh, Washington, DC, Pittsburgh, Cleveland, Chicago, New York City, UC-TeraGrid, UIC/NW-Starlight, International Collaborators.

Calit2@UCSD Is Connected to the World at 10,000 Mbps. iGrid 2005: The Global Lambda Integrated Facility, September 26-30, 2005, Calit2 @ University of California, San Diego (California Institute for Telecommunications and Information Technology). Maxine Brown, Tom DeFanti, Co-Chairs. www.igrid2005.org. 50 Demonstrations, 20 Countries, 10 Gbps/Demo

Prototyping Cabled Ocean Observatories Enabling High Definition Video Exploration of Deep Sea Vents. Canadian-U.S. Collaboration. Source: John Delaney & Deborah Kelley, UWash

A Near Future Metagenomics Fiber Optic Cable Observatory. Source: John Delaney, UWash

Calit2 Brings Computer Scientists and Engineers Together with Biomedical Researchers. 1200 Researchers in Two Buildings (UC Irvine, UC San Diego). Some Areas of Concentration: Metagenomics; Genomic Analysis of Organisms; Evolution of Genomes; Cancer Genomics; Human Genomic Variation and Disease; Mitochondrial Evolution; Proteomics; Computational Biology; Information Theory and Biological Systems.

Driving Cyberinfrastructure with Environmental Metagenomics Samples Collected by Sorcerer II Approved Yesterday!

Marine Microbial Metagenomics: From Species Genomes to Ecological Genomes. Each Sequence is a Part of an Entire Biological Community. Complex Data Set Including Sequences, Genes and Gene Families, Coupled With Environmental Metadata. Tremendous Potential to Better Understand the Functioning of Natural Ecosystems. Challenge: Powerful Information Infrastructure Required to Support Metagenomics and to Create Co-laboratories. Scripps Genome Center

Metagenomics “Extreme Assembly” Requires Large Amount of Pixel Real Estate. Figure labels: Prochlorococcus, Microbacterium, Burkholderia, Rhodobacter, SAR-86, unknown. Source: Karin Remington, J. Craig Venter Institute

Metagenomics Requires a Global View of Data and the Ability to Zoom Into Detail Interactively. Overlay of Metagenomics Data onto Sequenced Reference Genomes (This Image: Prochlorococcus marinus MED4). Source: Karin Remington, J. Craig Venter Institute

The OptIPuter: Creating High Resolution Portals Over Dedicated Optical Channels to Global Science Data. 300 MPixel Image! Green: Purkinje Cells; Red: Glial Cells; Light Blue: Nuclear DNA. Source: Mark Ellisman, David Lee, Jason Leigh. Calit2 (UCSD, UCI) and UIC Lead Campuses; Larry Smarr PI. Partners: SDSC, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST

Scalable Displays Allow Both Global Content and Fine Detail. 30 MPixel SunScreen Display Driven by a 20-node Sun Opteron Visualization Cluster. Source: Mark Ellisman, David Lee, Jason Leigh

Allows for Interactive Zooming from Cerebellum to Individual Neurons. Source: Mark Ellisman, David Lee, Jason Leigh

Calit2 Intends to Jump Beyond Traditional Web-Accessible Databases. Diagram labels: Request; Response; Web Portal (pre-filtered, queries metadata); Data Backend (DB, Files). Examples: BIRN, PDB, NCBI Genbank, + many others. Source: Phil Papadopoulos, SDSC, Calit2

Calit2’s Direct Access Core Architecture Will Create Next Generation Metagenomics Server. Data sources: Sargasso Sea Data; Sorcerer II Expedition (GOS); JGI Community Sequencing Project; Moore Marine Microbial Project; NASA Goddard Satellite Data. Server components: Dedicated Compute Farm (100s of CPUs); Flat File Server Farm; Database Farm; 10 GigE Fabric. Access paths: Traditional User via Web Portal (Request, Response); + Web Services to Web (other service); Direct Access Lambda Cnxns to Local Cluster Environment. TeraGrid: Cyberinfrastructure Backplane (scheduled activities, e.g. all by all comparison) (10000s of CPUs). Source: Phil Papadopoulos, SDSC, Calit2

Analysis Data Sets, Data Services, Tools, and Workflows. Analysis Data Sets: Assemblies of Metagenomic Data (e.g., GOS, JGI CSP); Annotations (Genomic and Metagenomic Data); “All-against-all” Alignments of ORFs, Updated Periodically; Gene Clusters and Associated Data (Profiles, Multiple-Sequence Alignments, HMMs, Phylogenies, Peptide Sequences). Data Services: ‘Raw’ and Specialized Analysis Data; Rich Query Facilities. Tools and Workflows: Navigate and Sift Raw and Analysis Data; Publish Workflows and Develop New Ones; Prioritize Features via Dialogue with Community. Source: Saul Kravitz, Director of Software Engineering, J. Craig Venter Institute

The OptIPuter Enabled Collaboratory: Remote Researchers Jointly Exploring Complex Data. Source: Mark Ellisman, NCMIR. Calit2/EVL/NCMIR Tiled Displays with HD Video. New Home of SDSC/Calit2 Synthesis Center. Source: Chaitan Baru, SDSC

Eliminating Distance to Unify Remote Laboratories. August 8, 2005. Diagram labels: Venter Institute; SIO/UCSD; NASA Goddard; 25 Miles; OptIPuter Visualized Data; HDTV Over Lambda. www.calit2.net/articles/article.php?id=660

Looking Back Nearly 4 Billion Years In the Evolution of Microbe Genomics. Source: Science, Falkowski and Vargas, 304 (5667): 58