Computational Challenges in Whole-Genome Association Studies
Ion Mandoiu
Computer Science and Engineering Department, University of Connecticut


Approaches to Disease Gene Mapping
Linkage analysis: LOD := log10(L(θ) / L(1/2)). Very successful for Mendelian diseases (cystic fibrosis, Huntington's, ...), but low power to detect genes with small relative risk in complex diseases [RischMerikangas'96].
Association analysis: cases are compared with controls using a χ²-test. Genome-wide scans have been made possible by recent progress in SNP genotyping technologies.
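
As a hedged illustration of the two statistics named on this slide, the sketch below computes a LOD score and a 2x2 χ² statistic. It assumes the simplest textbook setting, which the slide itself does not specify: phase-known meioses with a binomial likelihood L(θ) = θ^k (1-θ)^(n-k), where θ is the recombination fraction, and a Pearson χ² test on allele counts in cases and controls. All function names and inputs are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10(L(theta) / L(1/2)) under a binomial likelihood.

    Assumes phase-known meioses, so L(theta) = theta^k * (1 - theta)^(n - k)
    with k recombinants observed among n meioses.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)  # no linkage: theta = 1/2
    return log_l_theta - log_l_null


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom) for the table

                  allele A   allele a
        cases        a          b
        controls     c          d
    """
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column must have a non-zero total")
    return n * (a * d - b * c) ** 2 / math.prod(margins)


def chi_square_pvalue_1df(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))
```

In a genome-wide scan such a test would be repeated for every typed SNP, so the resulting p-values need a multiple-testing correction before they are interpreted.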

Computational Challenges
Detecting genotyping errors
Imputation of missing genotypes
Imputation of untyped genotypes based on a reference population (e.g., HapMap)
Haplotype inference and haplotype-based association tests
Modeling gene-gene interactions
Handling structural variation data provided by new sequencing technologies
Optimal multi-stage study design

Genotype Error Detection
Genotyping errors remain a real problem despite advances in technology.
In [KMP07] we proposed efficient methods for error detection in trio data, based on an LLR approach combined with an HMM of haplotype diversity.
In ongoing work we seek to improve error detection accuracy by using low-level data such as typing confidence scores.
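
The general shape of a likelihood-ratio error check can be sketched as follows. This is not the method of [KMP07]: it assumes a single unrelated individual rather than a trio, and it replaces the HMM with a small, hypothetical table of haplotype frequencies. Each genotype call is flagged when changing it alone would raise the likelihood of the multi-locus genotype by more than a threshold; the threshold and the likelihood floor are arbitrary illustrative choices.

```python
import math
from itertools import product


def genotype_likelihood(genotype, hap_freqs):
    """Probability of an unphased multi-locus genotype (entries 0/1/2)
    as the sum over ordered haplotype pairs that explain it."""
    total = 0.0
    for (h1, f1), (h2, f2) in product(hap_freqs.items(), repeat=2):
        if all(a + b == g for a, b, g in zip(h1, h2, genotype)):
            total += f1 * f2
    return total


def flag_errors(genotype, hap_freqs, threshold=math.log(100), floor=1e-12):
    """Return (position, log-likelihood ratio) for every call whose best
    single-site alternative is more likely by more than `threshold`.

    `floor` keeps the ratio finite when a genotype has zero likelihood
    under the (toy) haplotype table.
    """
    genotype = tuple(genotype)
    base = max(genotype_likelihood(genotype, hap_freqs), floor)
    flagged = []
    for i, called in enumerate(genotype):
        best_alt = max(
            genotype_likelihood(genotype[:i] + (alt,) + genotype[i + 1:], hap_freqs)
            for alt in (0, 1, 2)
            if alt != called
        )
        llr = math.log(max(best_alt, floor) / base)
        if llr > threshold:
            flagged.append((i, llr))
    return flagged


# Hypothetical haplotype frequencies over three SNPs (alleles coded 0/1).
toy_haplotypes = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
suspect_calls = flag_errors((1, 0, 1), toy_haplotypes)
```

With the toy table above, the middle call of `(1, 0, 1)` is the only one flagged, because no pair of haplotypes in the table can produce that genotype unless the middle call is changed.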

Genotype Imputation
Current genotyping platforms cover <1 million SNPs of ~10 million SNPs, so the causal variant is unlikely to be assayed directly.
Untyped SNPs can be imputed based on linkage disequilibrium information inferred from high-density datasets such as HapMap.
Maximum likelihood approach: probabilities are computed using an HMM.
[Figure: allele frequency from imputed genotypes vs. allele frequency from typed genotypes]
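
To make the HMM idea concrete, here is a minimal sketch of imputing one untyped allele with the forward-backward algorithm. It is an illustration under stated assumptions, not the model used in the talk: the target is a single phased haplotype, the hidden state is which reference haplotype is being copied, and the `switch` and `err` parameters are hypothetical fixed values rather than estimated ones.

```python
def impute_allele(target, panel, site, switch=0.05, err=0.01):
    """Posterior probability that the allele at `site` is 1.

    target : sequence of 0/1 alleles, with None at untyped sites
    panel  : list of reference haplotypes (0/1 sequences, same length)
    switch : per-site probability of re-drawing the copied haplotype
    err    : probability that an observed allele differs from the copied one
    """
    n_haps, n_sites = len(panel), len(target)

    def emit(k, j):
        # Untyped sites carry no information about the hidden state.
        if target[j] is None:
            return 1.0
        return 1.0 - err if panel[k][j] == target[j] else err

    # Forward pass: fwd[j][k] = P(observations up to j, state k at j).
    fwd = [[emit(k, 0) / n_haps for k in range(n_haps)]]
    for j in range(1, n_sites):
        jump = switch / n_haps * sum(fwd[-1])
        fwd.append([emit(k, j) * ((1.0 - switch) * fwd[-1][k] + jump)
                    for k in range(n_haps)])

    # Backward pass: bwd[j][k] = P(observations after j | state k at j).
    bwd = [[1.0] * n_haps for _ in range(n_sites)]
    for j in range(n_sites - 2, -1, -1):
        jump = switch / n_haps * sum(emit(l, j + 1) * bwd[j + 1][l]
                                     for l in range(n_haps))
        bwd[j] = [(1.0 - switch) * emit(k, j + 1) * bwd[j + 1][k] + jump
                  for k in range(n_haps)]

    # Combine both passes into a posterior over copied haplotypes at `site`.
    weights = [fwd[site][k] * bwd[site][k] for k in range(n_haps)]
    total = sum(weights)
    return sum(w for k, w in enumerate(weights) if panel[k][site] == 1) / total
```

For long chromosomes the forward and backward values would need rescaling or log-space arithmetic to avoid underflow, and unphased genotypes would require a state space over pairs of reference haplotypes; both are omitted here for brevity.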

Acknowledgements & Advertisement
Justin Kennedy, Bogdan Pasaniuc
NSF funding (Awards and )
DIMACS Workshop on Computational Issues in Genetic Epidemiology
August , 2008
DIMACS Center, CoRE Building, Rutgers University
Presented under the auspices of the DIMACS/BioMaPS/MB Center Special Focus on Information Processing in Biology.
Organizers: Andrew Scott Allen, Duke University; Ion Mandoiu, University of Connecticut; Dan Nicolae, University of Chicago; Yi Pan, Georgia State University; Alex Zelikovsky, Georgia State University