Matrix file excerpt (marker, marker, pairwise score):
...
GM01 GM07 0.36
GM01 GM08 0.40
GM01 GM09 0.48
GM01 GM10 0.52
GM01 GM11 0.60
GM01 GM12 0.68
GM02 GM01 0.04
GM02 GM02 0.00
GM02 GM03 0.08



MadMapper And CheckMatrix: Python Scripts To Infer Orders Of Genetic Markers And For Visualization And Validation Of Genetic Maps And Haplotypes

Alexander Kozik and Richard Michelmore
The Genome Center, University of California Davis, CA

Contemporary molecular marker techniques can generate mapping data for thousands of molecular markers simultaneously. Constructing and validating high-density genetic maps is a challenge that requires robust, high-throughput approaches. As part of the Compositae Genome Project, we developed a suite of Python scripts for quality control of genetic markers, grouping, and inference of the linear order of markers in linkage groups. These scripts can be used in conjunction with other mapping programs or as a stand-alone package. The suite consists of three programs: MadMapper_RECBIT, MadMapper_XDELTA and CheckMatrix.

MadMapper_RECBIT analyzes raw marker scores for recombinant inbred lines (RILs). It generates pairwise distance scores for all markers, clusters markers based on pairwise distances, identifies genetic bins, assigns new markers to known linkage groups, validates allele calls, and assigns a quality class to each marker based on several criteria and cutoff values.

MadMapper_XDELTA uses a new algorithm, Minimum Entropy Approach and Best-Fit Extension, to infer the linear order of markers. It analyzes two-dimensional matrices of all pairwise scores and finds the best map, defined as the one with the minimal total sum of differences between adjacent cells (the map with the lowest 'entropy'). This approach scales well and can accommodate large numbers of markers, unlike some commonly used mapping programs.

CheckMatrix serves as a visualization tool to validate constructed genetic maps. It generates graphical genotypes and two-dimensional heat plots of pairwise scores. Visualizing regions with positive and negative linkage, as well as the allele fraction per marker, simplifies genetic map validation without applying statistical approaches. Scripts are freely available.

BRIEF DESCRIPTION OF RIL MAPPING PIPELINE:

1. Processing of raw marker scores and grouping: MadMapper_RECBIT generates multiple text files for further analysis.
2. Construction of genetic map (ordering of markers) per linkage group: MadMapper_XDELTA (or any other mapping program).
3. Visualization and validation of genetic maps: CheckMatrix generates heat plots of recombination scores and graphical genotyping.

MadMapper and CheckMatrix are Python scripts and can be used on any computer platform: UNIX, Windows, Mac OS-X. Grouping can be done on a set of ~2,000 markers; map construction works in a reasonable timeframe with up to ~500 markers.

Example of group analysis by MadMapper_RECBIT:

[Figure: example of group analysis by MadMapper_RECBIT. Labels: grouping cutoff stringency; distinct linkage group #4.]

MINIMUM ENTROPY APPROACH TO INFER LINEAR ORDER OF MARKERS:

MadMapper_XDELTA analyzes two-dimensional matrices of all pairwise scores and finds the best map, the one with the minimal total sum of differences between adjacent cells (the map with the lowest 'entropy').

[Figure: two-dimensional matrix of pairwise recombination scores, with adjacent cells (values) marked and the CheckMatrix color scheme. Panels: numerical data generated by MadMapper; visualization of numerical data using CheckMatrix.]

[Figure: CheckMatrix 2D plots for three marker orders: random order (high 'entropy'), partially wrong order, and right order (low 'entropy').]

LINEAR ORDER OF MARKERS INFERRED BY THREE DIFFERENT METHODS:

Side-by-side comparison of the linear order of markers on the Arabidopsis genome inferred by three different approaches (mapping programs), and comparison with the physical order of markers (Col-0 genomic sequence): MadMapper_XDELTA (minimum entropy approach), JoinMap (maximum likelihood) and RECORD (minimum number of recombination events).

[Figure: diagonal dot plots for MadMapper_XDELTA, JoinMap and RECORD. Axes: physical coordinates of markers on the Arabidopsis genome versus inferred order of markers. The dot plot was created using GenoPix_2D_Plotter.]

VISUALIZATION OF ARABIDOPSIS GENETIC MAP (DEAN AND LISTER, ) USING CHECKMATRIX [ MAP WAS RE-CONSTRUCTED USING MADMAPPER ]

[Figure: 2-D diagonal CheckMatrix heat plot, all markers versus all markers; the color gradient reflects linkage scores between markers. Both axes: linkage groups I to V. Annotations: main diagonal with linked markers; regions with negative linkage; regions with quasi linkage; high density of markers; low density of markers; allele composition per marker.]

[Figure: CheckMatrix graphical genotyping for linkage groups I to V. Haplotypes per RIL (inbred line); red: Columbia, blue: Landsberg erecta.]

CHECKMATRIX USAGE:

Three input files are required:

Map file:

LG GM01 0
LG GM02 1
LG GM03 2
LG GM04 3
LG GM05 4
LG GM06 5
LG GM07 6
LG GM08 7
LG GM09 8
LG GM10 9
LG GM11 10
LG GM12 11

Matrix file: pairwise scores, one marker pair per line in the form marker, marker, score (for example, GM02 GM01 0.04).

Locus file:

; ; | | | |
GM01 A A A A A A A A A A A A A A A A B B B B B B B B B
GM02 A A A A A A A A A A A A A A A B B B B B B B B B B
GM03 A A A A A A A A A A A A A B B B B B B B B B B B B
GM04 A A A A A A A A A A A B B B B B B B B B B B B B B
GM05 A A A A A A A A A A B B B B B B B B B B B B B B B
GM06 A A A A A A A A A B B B B B B B B B B B B B B B B
GM07 A A A A A A A A A B B B B B B B B B B B B B B A A
GM08 A A A A A A A A A B B B B B B B B B B B B B A A A
GM09 A A A A A A A A A B B B B B B B B B B B A A A A A
GM10 B A A A A A A A A A B B B B B B B B B A A A A A A
GM11 B B A A A A A A A A B B B B B B B B A A A A A A A
GM12 B B B A A A A A A A B B B B B B B A A A A A A A A

Upon program execution three output files will be generated:

1. HEAT PLOT: helps validate the quality of the constructed genetic map and identify markers in the wrong position.
2. GRAPHICAL GENOTYPING: visualization of haplotypes per recombinant line (suspicious double crossovers are highlighted).
3. CIRCULAR GRAPH: helps validate the genetic map and identify markers with spurious linkage.

REFERENCES AND DATA SOURCES:

1. Dean and Lister Arabidopsis Genetic Map and Raw Data
2. MadMapper
3. JoinMap
4. RECORD
5. GenoPix_2D_Plotter

PAG-14 POSTERS WITH EXAMPLES OF MADMAPPER USAGE:

#P751 High-Density Haplotyping With Microarray-Based Single Feature Polymorphism Markers In Arabidopsis
#P761 Gene Expression Markers: Using Transcript Levels Obtained From Microarrays To Genotype A Segregating Population

CREDITS:

This work was funded by an NSF grant to the Compositae Genome Consortium.
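The pairwise scores and the 'entropy' criterion described above can be illustrated with a short Python sketch. It is not the MadMapper code. It assumes that a pairwise score is simply the fraction of RILs whose alleles differ between two markers (this happens to reproduce values such as GM02 GM01 0.04 from the matrix file excerpt) and that 'entropy' is the sum of absolute differences between horizontally and vertically adjacent cells of the score matrix arranged in a candidate marker order. The function names are hypothetical, and missing allele calls are not handled.

```python
from itertools import combinations

# Four rows of the example locus file (25 RILs, alleles scored A or B).
LOCUS_TEXT = """\
GM01 A A A A A A A A A A A A A A A A B B B B B B B B B
GM02 A A A A A A A A A A A A A A A B B B B B B B B B B
GM03 A A A A A A A A A A A A A B B B B B B B B B B B B
GM04 A A A A A A A A A A A B B B B B B B B B B B B B B
"""


def parse_locus(text):
    """Return {marker: [allele, ...]}, skipping blank and ';' header lines."""
    loci = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith(";"):
            continue
        loci[fields[0]] = fields[1:]
    return loci


def pairwise_score(a, b):
    """Assumed score: fraction of lines whose alleles differ at two markers."""
    if len(a) != len(b):
        raise ValueError("markers must be scored on the same set of lines")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def score_matrix(loci):
    """Symmetric table of pairwise scores keyed by (marker, marker)."""
    scores = {(m, m): 0.0 for m in loci}
    for m1, m2 in combinations(loci, 2):
        s = pairwise_score(loci[m1], loci[m2])
        scores[(m1, m2)] = scores[(m2, m1)] = s
    return scores


def entropy(order, scores):
    """Assumed 'entropy': sum of |differences| between adjacent matrix cells
    (left-right and up-down) when rows and columns follow `order`."""
    total = 0.0
    n = len(order)
    for i in range(n):
        for j in range(n - 1):
            # Neighbouring cells within row i.
            total += abs(scores[(order[i], order[j])]
                         - scores[(order[i], order[j + 1])])
            # Neighbouring cells within column i.
            total += abs(scores[(order[j], order[i])]
                         - scores[(order[j + 1], order[i])])
    return total


loci = parse_locus(LOCUS_TEXT)
scores = score_matrix(loci)
right = ["GM01", "GM02", "GM03", "GM04"]
wrong = ["GM01", "GM03", "GM02", "GM04"]
print("right order:", entropy(right, scores))
print("wrong order:", entropy(wrong, scores))
```

With these four markers the correct order gives the lower score, which is the property the minimum entropy approach relies on. How MadMapper_XDELTA searches the space of orders (its Best-Fit Extension step) is not covered by this sketch.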