Genomes to Fields 2014 Workshop
Maize Phenotypic Information Platform
Carolyn Lawrence
Chicago, IL, December 10

Presentation transcript:


Phenotype
- A phenotype (from Greek phainein, 'to show' + typos, 'type') is the composite of an organism's observable characteristics or traits, such as its morphology, development, biochemical or physiological properties, phenology, behavior, and products of behavior.
- A phenotype results from the expression of an organism's genes as well as the influence of environmental factors and the interactions between the two.
- Phenotype is EVERYTHING

Tools for genotype and phenotype (imbalanced)
[Figure: tools for studying phenotypes vs. tools for studying genomes]
Slide credit: Edgar Spalding

Why is managing 'phenotype' hard?
- Extremely diverse data type
  - Associated with individuals, populations, or species
  - Data documented at different levels (summary vs. measurement)
  - Comparative (mutant vs. wild type) or absolute (plant height)
  - Different terms between disciplines (stacking a trait)
- Data integration: needs extensive connections to other types of data (seed stocks, genes, experimental methods, environment data)
- Data representation: how to represent the data in a consistent way across experiments, research groups, and organizations
- Data accessibility: we must get data into others' hands
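To make the representation problem concrete, here is a minimal sketch of a single record type that carries the distinctions listed above: summary vs. measurement, comparative vs. absolute, and links to other data types. Every class, field, and function name is hypothetical, chosen for illustration only. None of it reflects a schema used by G2F, BMS, MaizeGDB, or iPlant.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Optional


@dataclass
class PhenotypeObservation:
    """One phenotype record. All field names are illustrative assumptions."""
    trait: str                          # e.g. "plant height"
    value: float
    unit: str
    subject_id: str                     # identifier of the thing observed
    subject_level: str                  # "individual", "population", or "species"
    record_level: str                   # "measurement" or "summary"
    reference_id: Optional[str] = None  # set only for comparative data (e.g. wild type)
    links: dict = field(default_factory=dict)  # seed stock, gene, method, environment

    @property
    def is_comparative(self) -> bool:
        # Absolute observations carry no reference to compare against.
        return self.reference_id is not None


def summarize(observations, population_id):
    """Collapse measurement-level records into one summary-level record."""
    measurements = [o for o in observations if o.record_level == "measurement"]
    if not measurements:
        raise ValueError("no measurement-level records to summarize")
    kinds = {(o.trait, o.unit) for o in measurements}
    if len(kinds) != 1:
        raise ValueError("cannot summarize mixed traits or units")
    trait, unit = kinds.pop()
    return PhenotypeObservation(
        trait=trait,
        value=mean(o.value for o in measurements),
        unit=unit,
        subject_id=population_id,
        subject_level="population",
        record_level="summary",
    )
```

Keeping the level of documentation as an explicit field, rather than implying it by which file or table a value sits in, is one way to let summary and measurement data coexist without being confused with each other.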

Phenotypes are Big Data
Big Data is characterized as having extreme or variable values of one or more of the following characteristics:
- Volume [1] (size): images, sequence, expression data
- Velocity [1] (acquisition rate): images, sequence
- Variety [1] (structure): data formats, alternative standards
- Variability [2] (in meaning): nomenclature, ontologies
- Complexity [3] (in relationships): mutation to genotype to phenotype…
- Veracity (quality or provenance): gold standard datasets, low quality ones
- Volatility (changes over time): versions of data
[1] Doug Laney, "3-D Data Management: Controlling Data Volume, Velocity, and Variety"
[2] Brian Hopkins, "Blogging From the IBM Big Data Symposium - Big Is More Than Just Big"
[3] Valentin T Sribar, et al., "'Big Data' Is Only the Beginning of Extreme Information Management"

Information management food chains
Plant Biology Databases: A Needs Assessment (2005)
/people/faculty/fancher/FoodChain.htm

Our approach to information management
- Information must be communicated …or it is effectively lost
- Results should be reproducible …or we're not doing science

Balancing act: enforcing standards AND allowing flexibility
- "For Big-Data Scientists, 'Janitor Work' Is Key Hurdle to Insights": "Data scientists spend from 50 percent to 80 percent of their time mired in the mundane labor of collecting and preparing unruly digital data before it can be explored for useful nuggets." (New York Times, 17 August 2014)
- For well-understood datatypes, standards and adapters are required to compute across multiple datasets effectively
- For emerging datatypes like high-throughput phenotyping, imposing standards could impede the development of novel and revolutionary data documentation and analysis techniques
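As a rough illustration of what "standards and adapters" can mean for a well-understood datatype, the sketch below maps two differently shaped plant-height tables onto one common record layout so that a statistic can be computed across both. The source layouts, column names, and site labels are invented for the example and do not correspond to any real G2F, BMS, or iPlant format.

```python
# Two hypothetical source layouts for the same trait (plant height).
SITE_A_ROWS = [{"plot": "A-101", "PlantHt_cm": "212"}]
SITE_B_ROWS = [{"plot_id": "B-7", "height_in": "80"}]

CM_PER_INCH = 2.54


def adapt_site_a(row):
    """Site A already reports centimeters; only rename and parse."""
    return {
        "plot_id": row["plot"],
        "trait": "plant height",
        "value_cm": float(row["PlantHt_cm"]),
    }


def adapt_site_b(row):
    """Site B reports inches; convert to the common unit."""
    return {
        "plot_id": row["plot_id"],
        "trait": "plant height",
        "value_cm": float(row["height_in"]) * CM_PER_INCH,
    }


# One adapter per provider; the common layout is the "standard".
ADAPTERS = {"site_a": adapt_site_a, "site_b": adapt_site_b}


def to_common(source, rows):
    """Apply the adapter registered for a source to each of its rows."""
    adapter = ADAPTERS[source]
    return [adapter(row) for row in rows]


combined = to_common("site_a", SITE_A_ROWS) + to_common("site_b", SITE_B_ROWS)
mean_height_cm = sum(r["value_cm"] for r in combined) / len(combined)
```

A new provider then costs one extra adapter function rather than a change to every downstream analysis. That is the flexibility side of the balance: providers keep their native formats while computation happens on the shared layout.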

How we build information platforms
Engage with Providers
- iPlant
- MaizeGDB
- IBP/BMS
- Private Industry
- Others…
Explore the Landscape
Agile Development & Spiral Design

Two simultaneous activities
- Assembling the platform
- Building the ship as we sail

Building the ship as we sail
- Genotype
  - GBS, etc.
- Environment
  - Weather stations
  - GIS (soil, interpolated weather, model)
- Phenotype

Assembling the platform
July 25: SOUGHT COMMUNITY INPUT
- Convened more than 20 researchers, data providers, and industry colleagues
- Described the needs and asked for input
- Followed up with many; identified IBP/BMS, CGBackOffice and MaizeGDB, and iPlant as leading groups to investigate for partnerships
August: BEGAN DEVELOPER CALLS WITH IBP
September: REQUESTED IPLANT INVOLVEMENT
November 9-13: COORDINATED MEETING AT CIMMYT
- Visited CIMMYT to meet with IBP/BMS curators (Kate Dreher, Julian Pietragalla, and Clarissa Pimentel)
- Invited iPlant personnel (Ramona Walls)
- Invited BMS developers from Indiana (Jan Erik Backlund) and New Zealand (Rebecca Berringer)
December 1 and 2: COORDINATED MEETING AT ISU
- Invited iPlant personnel to visit and describe the platform (Nicole Hopkins and Jeremy DeBarry)
- Brought in G2F leaders (Pat Schnable, David Ertl) and GxE data coordinator (Jode Edwards)
- Brought in ISU BMS outreach personnel (Walter Suza, Assibi Mahama)
- Brought in CGBackOffice team (Ed Buckler and Cinta Romay)

Assembling the platform
CG Back Office, Buckler Lab

BMS
- Assets:
  - Funded into the future
  - Manage all well-described datatypes now
  - Community investment worldwide
- Initial Concerns:
  - Deployed on local machine
  - Pay model
  - Modules specific to PCs
  - Service and collaborative development

iPlant
- Assets:
  - Funded into the future
  - Documentation
  - HT image handling
  - Desire to make G2F a success story of their own
- Initial Concerns:
  - Disappointment of many researchers early on
  - Will they invest time/effort in G2F?
  - Can their systems be adapted to work together?

iPlant (figure slides)

What will make this successful?
- Making data logically accessible
- Working together: Beware of solitude!
- Discuss the problems in diverse group settings, iteratively
- Create manuals, example usage cases, and outreach materials and opportunities

Timeline
- Training/Outreach/Feedback: Using BMS & iPlant for data access & analysis
  - January 26 (?) in Ames
  - March 12 at Maize Genetics Conference
- Anticipated milestones:
  - December: BMS deployed on iPlant
  - January:
    - Project management and methodologies listed at iPlant via wiki
    - BMS, iDrop, Datastore, and Bisque for current datasets
  - May: Project Coordination functionality new at iPlant
  - June (?): CGBackOffice coming online
Agile Development & Spiral Design
Our goal is to enable the process, not to deploy a specific system

Acknowledgements
- Lawrence Lab: Jack Gardiner, Darwin Campbell
- GxE subgroup of G2F: Jode Edwards, Martin Bohn, Natalia de Leon
- CIMMYT, BMS, Leafnode
- MaizeGDB: Carson Andorf
- iPlant: Nicole Hopkins, Jeremy DeBarry, Ramona Walls

Questions?