Stratification Lon Cardon University of Oxford

Presentation transcript:

Stratification Lon Cardon University of Oxford F:\lon\2001\stratification\stratification.ppt

Population Stratification: Consider the trait distribution only. Mean differences exist between population substrata. These differences alone do not influence genetic association under H0.

Single sample, unequal marker allele frequencies f(‘2’) > f(‘1’)

Single sample, unequal marker allele frequencies f(‘1’) > f(‘2’)

Stratified sample, equal marker allele frequencies, f(‘1’) = f(‘2’): more ‘1’ in the high end, more ‘2’ in the low end. Association evidence.

A Simple Model of Stratification
Consider:
  Two subsamples of equal sample size but opposite allele frequencies (e.g., sample 1, 52:48; sample 2, 48:52);
  Within-sample variance of the usual form Va + Vq(w) + Ve, which is the same for both subsamples;
  Mean effects arising from a true QTL and from stratification: m = mq + ms.
Then:
  The total variance in the combined sample has additional ‘between-strata’ effects due to the QTL and stratification, Vq(b) and Vs, so Vq = Vq(b) + Vq(w) and the total variance is Va + Vq + Vs + Ve.
  Let Vq = Vs = 0.05 and let p1, p2 vary from 0.1 to 0.9.
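The two-subsample model above can be sketched numerically. The Python function below is a hypothetical illustration, not code from the talk: it assumes Hardy-Weinberg proportions within each subsample, a 50:50 mix of the two subsamples, an additive per-allele QTL effect `beta`, and stratum baseline means `mu1`, `mu2` standing in for the stratification effect ms. It returns the regression slope of trait on allele count expected in the pooled sample, which shows both a spurious signal when `beta` is zero and masking when the stratification term has the opposite sign to `beta`.

```python
def pooled_slope(p1, p2, mu1, mu2, beta=0.0):
    """Expected slope of trait on allele count in a 50:50 mix of two strata.

    Assumptions (not from the slides): Hardy-Weinberg proportions within
    each stratum, additive per-allele effect `beta`, and stratum baseline
    trait means mu1 and mu2.
    """
    # Covariance between stratum genotype mean (2p) and stratum baseline mean.
    cov_between = 0.5 * (p1 - p2) * (mu1 - mu2)
    # Genotype variance: average within-stratum part plus between-strata part.
    var_g = p1 * (1 - p1) + p2 * (1 - p2) + (p1 - p2) ** 2
    # The true effect plus the bias induced by stratification.
    return beta + cov_between / var_g


if __name__ == "__main__":
    # Stratification only: no QTL effect, yet the pooled slope is non-zero.
    bias = pooled_slope(0.52, 0.48, mu1=0.5, mu2=-0.5)
    print("stratification only:", bias)
    # QTL only: equal baseline means, so the slope equals beta.
    print("QTL only:", pooled_slope(0.52, 0.48, 0.0, 0.0, beta=0.1))
    # Masking: a real effect exactly cancelled by the stratification bias.
    print("masked QTL:", pooled_slope(0.52, 0.48, 0.5, -0.5, beta=-bias))
```

The closed form is used instead of a simulation so that the result is deterministic; the allele frequencies 0.52 and 0.48 follow the example in the slide, while the mean values are arbitrary.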

Stratification only

QTL effect only

QTL and stratification effects

Stratification Summary: Stratification not only increases Type I error; it can also mask real effects. Could see ‘true’ case/control results but no TDT result. Difficult area of research.

Stratification detection/correction using the Genome
Idea: Family-based controls are not strictly necessary to detect or control for stratification; other markers in ‘cases’ can be used. Pritchard & Rosenberg (1999) Am J Hum Genet.
Procedure: For a candidate marker of interest, C1, genotype ~40 other anonymous (unlinked) markers, M1 .. M40. Calculate the association χ² for M1 .. M40. The test is on the sum of the χ² values. If there is evidence of association in the background, worry about stratification; otherwise, do not.
Extensions: Use the same idea to estimate a background ‘inflation factor’ for the test statistic, then use this factor to correct the candidate gene test. Pritchard et al. (2000) Am J Hum Genet; Devlin & Roeder (1999) Biometrics (‘Genomic Control’); Bacanu, Devlin & Roeder (2000) Am J Hum Genet.
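The procedure and its extension can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact method of the cited papers: each marker is summarised as a 2x2 table of allele counts (cases vs controls), the per-marker statistic is a 1-df Pearson chi-square, and the inflation factor is estimated as the median background statistic divided by the median of a 1-df chi-square (about 0.4549), floored at 1. The function names and the toy counts are hypothetical.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549  # median of a chi-square distribution with 1 df


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts."""
    n = case_a + case_b + ctrl_a + ctrl_b
    num = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    den = ((case_a + case_b) * (ctrl_a + ctrl_b)
           * (case_a + ctrl_a) * (case_b + ctrl_b))
    return num / den if den else 0.0


def background_summary(null_tables):
    """Return (sum of chi-squares, degrees of freedom, inflation factor)."""
    chi2s = [allelic_chi2(*t) for t in null_tables]
    # Under no stratification the sum is chi-square with len(chi2s) df.
    total = sum(chi2s)
    # Assumed median-based estimate of the inflation factor, never below 1.
    lam = max(1.0, median(chi2s) / CHI2_1DF_MEDIAN)
    return total, len(chi2s), lam


def corrected_test(candidate_table, lam):
    """Candidate chi-square divided by the inflation factor, with 1-df p-value."""
    chi2 = allelic_chi2(*candidate_table) / lam
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


if __name__ == "__main__":
    # Toy allele counts (made up for illustration): three unlinked markers.
    background = [(55, 45, 45, 55), (52, 48, 50, 50), (60, 40, 50, 50)]
    total, df, lam = background_summary(background)
    print("sum of chi-squares:", total, "on", df, "df; lambda:", lam)
    print("corrected candidate test:", corrected_test((60, 40, 40, 60), lam))
```

A large sum relative to its degrees of freedom is the warning sign described in the procedure; dividing the candidate statistic by the inflation factor is the correction described under the extensions.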

How bad is the stratification problem?

When is Stratification Tricky?

Stratification Detection Using the Genome: Promising idea to allow large studies of population cohorts. Appears to detect large stratification differences easily. Small frequency differences are much more difficult to detect, yet can still produce large (> 2-fold) increases in Type I error rate. Unfortunately, these differences may be precisely what we seek in complex traits. Tough cases: many sub-strata, uninformative markers, effects of linked background markers. Watch this area: very active.