Building a Community Cyberinfrastructure to Support Marine Microbial Ecology Metagenomics Invited Talk 2006 Synthetic Biology Symposium Aliso Creek Inn.

Presentation transcript:

Building a Community Cyberinfrastructure to Support Marine Microbial Ecology Metagenomics Invited Talk 2006 Synthetic Biology Symposium Aliso Creek Inn Laguna Beach, CA September 15, 2006 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD

Calit2 Brings Computer Scientists and Engineers Together with Biomedical Researchers
Some Areas of Concentration:
–Metagenomics
–Genomic Analysis of Organisms
–Evolution of Genomes
–Cancer Genomics
–Human Genomic Variation & Disease
–Proteomics
–Mitochondrial Evolution
–Computational Biology & Bioinformatics
–Information Theory & Biological Systems
UC San Diego
UC Irvine
1200 Researchers in Two Buildings

Most of Evolutionary Time Was in the Microbial World
[Figure: Tree of Life Derived from 16S rRNA Sequences, with a "You Are Here" marker]
Source: Carl Woese, et al.

Microbial Genomics Lets Us Look Back Nearly 4 Billion Years In the Evolution of Life
Falkowski and Vargas, Science 304 (5667) 2004

Moore Microbial Genome Sequencing Project Selected Microbes Throughout the World’s Oceans Microbes Nominated by Leading Ocean Microbial Biologists

Moore Foundation Funded the Venter Institute to Provide the Full Genome Sequence of 150 Marine Microbes

Moore Microbial Genome Sequencing Project: Cyanobacteria Being Sequenced by Venter Institute

Full Genome Sequencing is Exploding: Most Sequenced Genomes are Bacterial
Total 422 Completed Genomes
Total 1665 Ongoing Genomes
55 Metagenomes
[Chart labels: First Genome; Genomes/Year; 2000; Moore 155 In Here]

Microbial Metagenomics is a Rapidly Emerging Field of Research
“Despite their ubiquity, relatively little is known about the majority of environmental microorganisms, largely because of their resistance to culture under standard laboratory conditions.”
“The application of high-throughput shotgun sequencing environmental samples has recently provided global views of those communities not obtainable from 16S rRNA or BAC clone–sequencing surveys.”
Comparative Metagenomics of Microbial Communities
Susannah Green Tringe, Christian von Mering, Arthur Kobayashi, Asaf A. Salamov, Kevin Chen, Hwai W. Chang, Mircea Podar, Jay M. Short, Eric J. Mathur, John C. Detter, Peer Bork, Philip Hugenholtz, Edward M. Rubin
Science 22 April 2005

The Sargasso Sea Experiment
The Power of Environmental Metagenomics
Yielded a Total of Over 1 billion Base Pairs of Non-Redundant Sequence
Displayed the Gene Content, Diversity, & Relative Abundance of the Organisms
Sequences from at Least 1800 Genomic Species, including 148 Previously Unknown
Identified over 1.2 Million Unknown Genes
[Image: MODIS-Aqua satellite image of ocean chlorophyll in the Sargasso Sea grid about the BATS site from 22 February 2003]
J. Craig Venter, et al. Science 2 April 2004: Vol pp

Marine Genome Sequencing Project – Measuring the Genetic Diversity of Ocean Microbes Sorcerer II Data Will Double Number of Proteins in GenBank!

GOS Sequences are Largely Bacterial Source: Shibu Yooseph, et al. (PLOS Biology in press 2006) ~3 Million Previously Known Sequences ~5.6 Million GOS Sequences

GOS Analysis -- Protein Families in Nature Have Been Poorly Explored Thus Far
Novel Sequence Similarity Clustering Process Predicts Proteins and Groups Related Sequences Into Clusters (Families)
GOS Proteins Increase Size / Diversity of Many Protein Families
1,700 Novel GOS-Only Clusters Identified (>20 per Cluster)
–10% of 17,000 Clusters
Source: Shibu Yooseph, Granger Sutton, JCVI
[Figure labels: NCBI_nr; GOS + NCBI_nr + Ensembl + TIGR Gene Indices + Prokaryotic Genomes]
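
The clustering idea on this slide (group related sequences into families, then flag families that contain only GOS sequences) can be illustrated with a toy example. This is a minimal sketch, not the JCVI pipeline: it assumes similarity has already been reduced to a list of "similar" pairs and uses simple single-linkage grouping via union-find. The function names, the sequence IDs, and the `source_of` mapping are all hypothetical. Only the size cutoff (more than 20 members per cluster) comes from the slide.

```python
from collections import defaultdict


def cluster_sequences(seq_ids, similar_pairs):
    """Single-linkage clustering: sequences joined by any chain of
    'similar' pairs end up in the same cluster (union-find)."""
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for s in seq_ids:
        clusters[find(s)].add(s)
    return list(clusters.values())


def gos_only_clusters(clusters, source_of, min_size=20):
    """Keep clusters with more than min_size members, all from GOS."""
    return [
        c for c in clusters
        if len(c) > min_size and all(source_of[s] == "GOS" for s in c)
    ]


# Hypothetical toy data: four GOS sequences and two previously known ones.
seq_ids = ["gos1", "gos2", "gos3", "nr1", "nr2", "gos4"]
similar_pairs = [("gos1", "gos2"), ("gos2", "gos3"),
                 ("nr1", "nr2"), ("nr2", "gos4")]
source_of = {"gos1": "GOS", "gos2": "GOS", "gos3": "GOS",
             "gos4": "GOS", "nr1": "NCBI_nr", "nr2": "NCBI_nr"}

clusters = cluster_sequences(seq_ids, similar_pairs)
# A tiny cutoff is used here only so the toy data produces a result.
novel = gos_only_clusters(clusters, source_of, min_size=2)
print(novel)
```

In the toy data, gos4 joins a family that already has known members, so only the gos1/gos2/gos3 family counts as GOS-only.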

Current Universe of Medium/ Large Protein Families Source: Shibu Yooseph, et al. (PLOS Biology in press 2006) Protein Families Conserved Across Tree of Life Protein Families Unique to GOS 17,067 Protein Family Clusters

Metagenomic Data Sets Are Rapidly Being Accumulated “A majority of the bacterial sequences corresponded to uncultivated species and novel microorganisms.” “We discovered significant inter-subject variability.” “Characterization of this immensely diverse ecosystem is the first step in elucidating its role in health and disease.” “Diversity of the Human Intestinal Microbial Flora” Paul B. Eckburg, et al Science (10 June 2005) 395 Phylotypes

Microbes Form the Base of the Living World White Filamentous Bacteria on 'Pill Bug' Outer Carapace 1 cm. Source: John Delaney and Research Channel, U Washington High Definition Still Frame of Hydrothermal Vent Ecology 2.3 Km Deep

PI Larry Smarr Announced January 17, 2006 $24.5M Over Seven Years

Paul Gilna Has Been Recruited from Los Alamos to Become Calit2’s Executive Director of CAMERA
Formerly
–Director of the Department of Energy’s Joint Genome Institute (JGI) Operations at Los Alamos National Laboratory (LANL)
–Group Leader of Genomic Science and Computational Biology in LANL’s Bioscience Division
JGI
–A $70-million-per-Year Collaboration: Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Pacific Northwest, and the Stanford Human Genome Center
–Working at The Frontiers of Genome Sequencing and Biosciences

National Lambda Rail (NLR) and TeraGrid Provide Cyberinfrastructure Backbone for U.S. Researchers
NLR 4 x 10Gb Lambdas Initially, Capable of 40 x 10Gb Wavelengths at Buildout
NSF’s TeraGrid Has 4 x 10Gb Lambda Backbone
Links Two Dozen State and Regional Optical Networks
DOE, NSF, & NASA Using NLR
[Map labels: San Francisco, Pittsburgh, Cleveland, San Diego, Los Angeles, Portland, Seattle, Pensacola, Baton Rouge, Houston, San Antonio, Las Cruces / El Paso, Phoenix, New York City, Washington, DC, Raleigh, Jacksonville, Dallas, Tulsa, Atlanta, Kansas City, Denver, Ogden / Salt Lake City, Boise, Albuquerque, UC-TeraGrid, UIC/NW-Starlight Chicago, International Collaborators]
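
The lambda counts on this slide translate directly into aggregate capacity. The sketch below works through that arithmetic and adds an idealized transfer-time estimate. The 1 TB dataset size is an assumption chosen for illustration, and the estimate ignores protocol overhead and contention, so real transfers would take longer. The function names are hypothetical.

```python
GBPS_PER_LAMBDA = 10  # each lambda (wavelength) carries 10 Gb/s, per the slide


def aggregate_gbps(num_lambdas, gbps_per_lambda=GBPS_PER_LAMBDA):
    """Total capacity in gigabits per second across all lambdas."""
    return num_lambdas * gbps_per_lambda


def transfer_seconds(dataset_gigabytes, link_gbps):
    """Idealized transfer time: gigabytes -> gigabits, divided by link rate.
    Ignores protocol overhead, so this is a lower bound."""
    dataset_gigabits = dataset_gigabytes * 8
    return dataset_gigabits / link_gbps


initial_gbps = aggregate_gbps(4)    # NLR initially: 4 x 10Gb
buildout_gbps = aggregate_gbps(40)  # NLR at buildout: 40 x 10Gb

# Assumed dataset size for illustration only (not from the talk).
hypothetical_dataset_gb = 1000

print("Initial NLR capacity (Gb/s):", initial_gbps)
print("Buildout NLR capacity (Gb/s):", buildout_gbps)
print("Seconds to move the dataset over one 10 Gb/s lambda:",
      transfer_seconds(hypothetical_dataset_gb, GBPS_PER_LAMBDA))
```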

Calit2’s Direct Access Core Architecture Will Create Next Generation Metagenomics Server
Source: Phil Papadopoulos, SDSC, Calit2
[Diagram data sets: Sargasso Sea Data; Sorcerer II Expedition (GOS); JGI Community Sequencing Project; Moore Marine Microbial Project; NASA and NOAA Satellite Data; Community Microbial Metagenomics Data]
[Diagram components: Flat File Server Farm; Data-Base Farm; 10 GigE Fabric; Web Portal + Web Services; Traditional User (Request / Response); Dedicated Compute Farm (100s of CPUs); TeraGrid: Cyberinfrastructure Backplane (scheduled activities, e.g. all by all comparison) (10000s of CPUs); Web (other service); Local Cluster; Local Environment; Direct Access Lambda Cnxns]

The Future Home of the Moore Foundation Funded Marine Microbial Ecology Metagenomics Complex First Implementation of the CAMERA Complex Photo Courtesy Joe Keefe, Calit2 Major Buildout of Calit2 Server Room Underway

Analysis Data Sets, Data Services, Tools, and Workflows
Assemblies of Metagenomic Data
–e.g., GOS, JGI CSP
Annotations
–Genomic and Metagenomic Data
“All-against-all” Alignments of ORFs
–Updated Periodically
Gene Clusters and Associated Data
–Profiles, Multiple-Sequence Alignments, HMMs, Phylogenies, Peptide Sequences
Data Services
–‘Raw’ and Specialized Analysis Data
–Rich Query Facilities
Tools and Workflows
–Navigate and Sift Raw and Analysis Data
–Publish Workflows and Develop New Ones
–Prioritize Features via Dialogue with Community
Source: Saul Kravitz, Director of Software Engineering, J. Craig Venter Institute
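
An "all-against-all" alignment compares every sequence with every other one, so the work grows quadratically with the number of sequences. The sketch below counts the unordered pairs and divides them across a pool of CPUs. It borrows the sequence counts quoted earlier in the talk (~3 million previously known, ~5.6 million GOS) as a stand-in for the number of ORFs, which is an assumption, and it uses 10,000 CPUs as a round figure suggested by the "10000s of CPUs" label on the architecture slide. The function names are hypothetical.

```python
def all_against_all_pairs(n):
    """Number of unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def pairs_per_cpu(total_pairs, num_cpus):
    """Pairs each CPU must handle if work is split evenly (rounded up)."""
    return -(-total_pairs // num_cpus)  # ceiling division


# Counts quoted in the talk; treating them as the ORF count is an assumption.
known_sequences = 3_000_000
gos_sequences = 5_600_000
total_sequences = known_sequences + gos_sequences

total_pairs = all_against_all_pairs(total_sequences)
print("Pairwise comparisons:", total_pairs)
print("Per CPU on 10,000 CPUs:", pairs_per_cpu(total_pairs, 10_000))
```

Real pipelines typically prune most of these pairs with indexing or seeding heuristics rather than aligning every pair in full. The count here is the upper bound that motivates scheduling such runs on a large shared resource.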

OptIPortal–Termination Device for the Dedicated Gigabit/sec Lightpaths Photo Source: David Lee, Mark Ellisman NCMIR, UCSD Collaborative Analysis of Large Scale Images of Cancer Cells Integration of High Definition Video Streams with Large Scale Image Display Walls

Dedicated 10 Gbps CAVEWave Connects San Diego to Seattle to Chicago to Washington D.C. NEW! SunLight CICESE UW JCVI MIT SIO UCSD SDSU UIC EVL UCI OptIPortals Emerging OptIPortal Sites on the National LambdaRail

CAMERA Outreach Modes
Scientific Advisory Board
–Early Adopters – OptIPortal End Points
Targeted Workshops
–User Forums
–User Software Testing
–Viz Tool Brainstorming
Presentations at Scientific Meetings
–e.g. Demonstration Booth at JCVI Genomes, Medicine, and the Environment Conference October 2006
Partnerships With Metagenomics Projects
–e.g. DoE’s Joint Genome Institute (JGI)
Training and User Services Team

Timeline: Sprint and Marathon
Sprint
–Release 0.0: April 2006
  –Test Cluster for UCSD/JCVI Collaboration
–Release 1.0: Late Fall 2006
  –Initial Data and Core Tools Release
  –Supports Publication of GOS Papers
Marathon
–Release 2.0: Fall 2007
  –Additional/Improved Tools & Better Usability
–Beyond 2.0
  –Move Towards Semantic DB
  –Additional Tools Based on Community Feedback