Introduction to Data Management Arllet M. Portugal Integrated Breeding Platform Breeding Management System Intensive Workshop on Data Management Jan. 26,

Presentation transcript:

Introduction to Data Management
Arllet M. Portugal, Integrated Breeding Platform
Breeding Management System Intensive Workshop on Data Management
Jan. 26, 2015, Rice Gene Discovery Unit, Kasetsart University, Kamphaeng Saen

Importance of Data Management
Data that are properly managed are:
• Shareable: more accessible to research partners; national and global sharing and linking of data
• Available: enables reliable analysis and conclusions; leads to better science and more sophisticated research
• Re-usable: more likely to be used again for different purposes
• Preservable: important for historically significant data; scientific method changed to: hypothesize, then look up answer in database

3 Types of Data
1) Genealogy or pedigree data: parents; names; unique identification of germplasm through Germplasm IDs (GIDs)
2) Phenotypic data: observable characteristics or traits; environmental data; data across studies are linked via controlled trait vocabularies / standard terms
3) Genotypic data: genetic composition, usually with reference to a specific trait under consideration

Principles of DM for Integrated Breeding (IB)
• IB requires high standards of sample and pedigree identification.
• It requires integration of field and lab data, and quality is of paramount importance.
• Data collected during breeding processes have immediate value for breeders and also cumulative value over years and populations.

The Crop Databases

The Crop Databases …
• The germplasm has a unique identifier, which is the key link among the different data in the database
• The evaluation data allow across-study queries and time-series analysis
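The role of the unique identifier can be illustrated with a minimal Python sketch. The record layout, GIDs, study names, and trait names below are made up for illustration and are not the BMS schema; the point is only that a shared GID lets values from different studies be lined up over time.

```python
from collections import defaultdict

# Hypothetical evaluation records: (GID, study name, year, trait, value).
observations = [
    (1001, "Trial B", 2013, "grain_yield", 4.8),
    (1001, "Trial A", 2012, "grain_yield", 4.2),
    (1002, "Trial A", 2012, "grain_yield", 3.9),
    (1001, "Trial B", 2013, "plant_height", 98.0),
]


def time_series(records, trait):
    """Group one trait's values by GID, ordered by year, across all studies."""
    series = defaultdict(list)
    for gid, study, year, obs_trait, value in sorted(records, key=lambda r: r[2]):
        if obs_trait == trait:
            series[gid].append((year, study, value))
    return dict(series)


yield_by_gid = time_series(observations, "grain_yield")
```

Here `yield_by_gid[1001]` holds the yield of one germplasm entry across both trials, which is only possible because both trials recorded the same GID.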

Curation of Germplasm Information
• The germplasm is described by the following:
  - The breeding method by which the germplasm was developed
  - The parents of the germplasm, if it was developed by a generative method such as a single cross
  - The cross and the source, if it was developed by a derivative method such as single plant selection
  - The date and location of its development
  - The different names given to it, with one as the preferred name
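As a rough sketch of the descriptors listed above, a germplasm record could be modelled as follows. The class, field names, and checks are assumptions made for illustration; they are not the BMS or ICIS data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class GermplasmRecord:
    gid: int                          # unique germplasm identifier
    method: str                       # e.g. "single cross", "single plant selection"
    method_type: str                  # "generative" or "derivative"
    created_on: date                  # date of development
    location: str                     # location of development
    names: List[str]                  # all names given to the germplasm
    preferred_name: str               # must be one of `names`
    parent_gids: Tuple[int, ...] = () # parents, for generative methods
    cross_gid: Optional[int] = None   # the cross, for derivative methods
    source_gid: Optional[int] = None  # the source, for derivative methods

    def problems(self) -> List[str]:
        """Return a list of curation problems; empty if the record is complete."""
        issues = []
        if self.preferred_name not in self.names:
            issues.append("preferred name is not among the recorded names")
        if self.method_type == "generative" and len(self.parent_gids) < 2:
            issues.append("generative method needs its parents")
        if self.method_type == "derivative" and (
            self.cross_gid is None or self.source_gid is None
        ):
            issues.append("derivative method needs its cross and source")
        return issues


# Hypothetical records for illustration only.
cross = GermplasmRecord(1, "single cross", "generative", date(2010, 11, 5),
                        "Station A", ["Cross 1"], "Cross 1", parent_gids=(10, 11))
selection = GermplasmRecord(2, "single plant selection", "derivative",
                            date(2012, 6, 1), "Station A", ["Cross 1-1"],
                            "Cross 1-1", cross_gid=1, source_gid=1)
```

Calling `problems()` on each record before loading it is one simple way to catch a missing parent or a preferred name that was never registered.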

Curation of Germplasm Information …
• A protocol should be established for naming:
  - The nursery or trial list, e.g. RYT2012W
  - The cross, e.g. CAAS 1001
  - The selection or population line, e.g. - CAAS

Curation of Evaluation Data
• Describe the study or trial
• Provide a clear specification of the traits to be measured, including the measurement unit and the method of measurement
• Provide the field design
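A small sketch of what a "clear specification" check might look like in Python. The `TraitSpec` class, the helper functions, and the example traits are hypothetical and only illustrate the idea that every measured trait should carry a unit and a method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraitSpec:
    name: str    # trait name as used in the study
    unit: str    # measurement unit, e.g. "cm"
    method: str  # how the trait is measured


def incomplete_specs(specs):
    """Names of traits whose unit or method of measurement is missing."""
    return [s.name for s in specs if not s.unit.strip() or not s.method.strip()]


def unspecified_traits(measured, specs):
    """Measured trait names that have no specification at all."""
    known = {s.name for s in specs}
    return sorted(set(measured) - known)


# Hypothetical trait specifications for one trial.
specs = [
    TraitSpec("plant_height", "cm", "ruler, soil surface to tip of main stem"),
    TraitSpec("grain_yield", "t/ha", ""),  # method not yet documented
]
measured_columns = ["plant_height", "grain_yield", "days_to_flowering"]
```

Running both checks before data collection flags `grain_yield` as incompletely specified and `days_to_flowering` as not specified at all.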

Description of trait

Trait Dictionary
• A trait dictionary contains a list of traits with a full description of each
• GCP established trait dictionaries for ten crops; each consists of traits regularly measured in a breeding program
• They are established through a set of processes: validation of the initial list of traits, ranking of the traits, analysis of the ranking, and complete documentation of the traits important for breeding. This is done in consultation with 5 to 10 breeders within the crop community.

Crop Ontology
• The Crop Ontology (CO) provides validated trait names used by the crop communities of practice for harmonizing the annotation of phenotypic and genotypic data, thus supporting data accessibility and discovery through web queries.
• An important feature is the cross-referencing of CO terms with the crop database trait ID and with their synonyms in Plant Ontology and Trait Ontology. Web links between cross-referenced terms in CO provide online access to data annotated with similar ontological terms.
• The established Trait Dictionaries are uploaded to the Crop Ontology
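The cross-referencing idea can be sketched as a lookup table in Python. The term IDs, database trait IDs, and synonyms below are placeholders invented for illustration, not real Crop Ontology entries, and the functions are not part of any Crop Ontology API.

```python
# Hypothetical cross-reference table: CO term -> database trait ID and synonyms.
CROSS_REFS = {
    "CO_XXX:0000001": {
        "name": "Plant height",
        "db_trait_id": 101,
        "synonyms": ["PH", "PLTHT"],
    },
    "CO_XXX:0000002": {
        "name": "Grain yield",
        "db_trait_id": 102,
        "synonyms": ["GY", "Yield"],
    },
}


def build_index(cross_refs):
    """Map every name and synonym (case-insensitive) to its CO term ID."""
    index = {}
    for term_id, entry in cross_refs.items():
        for label in [entry["name"], *entry["synonyms"]]:
            index[label.strip().lower()] = term_id
    return index


def annotate(label, index):
    """Return the CO term ID for a local trait label, or None if unknown."""
    return index.get(label.strip().lower())


index = build_index(CROSS_REFS)
```

With such an index, data sets that label the same trait as "PH", "PLTHT", or "Plant height" all resolve to one term, which is what makes them discoverable together.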

Goal
Use BMS to keep data, namely germplasm and phenotype data (nursery, trial), of several crops (rice, cassava, maize, vegetables, etc.) generated in breeding programs, and explore the possibility of establishing a central database at station and institute levels.

Objectives
• To provide training and support, as part of the capacity building component of IBP, for a broader adoption and use of BMS in the overall breeding programmes of NARS and companies working on plant breeding.
• To use actual data in training to identify the specific component(s) of IBP that need customization to properly accommodate crop-specific data generated in a breeding program.