ANALYSIS OF SELECTIVE DNA POOLING DATA IN FOX Joanna Szyda, Magdalena Zatoń-Dobrowolska, Heliodor Wierzbicki, Anna Rząsa.



MAIN OBJECTIVES:
  ASSESS POLYMORPHISM OF MICROSATELLITES
  IDENTIFY MARKER-TRAIT ASSOCIATIONS
METHODOLOGICAL OBJECTIVES:
  TOOLS FOR THE ANALYSIS OF SPARSE DATA

SELECTIVE (INDIVIDUAL) GENOTYPING
MATERIAL | METHODS | RESULTS | CONCLUSIONS
[Figure labels: qq, QQ]
MORE POWER
STANDARD (LINEAR) MODELS NOT VALID

SELECTIVE DNA POOLING
[Figure: pools qq and QQ; markers M1, M2, M3, M4 and a QTL; marker genotypes of the pooled individuals (alleles m1/M1 to m4/M4)]

SELECTIVE DNA POOLING
CHEAP: ~18%-60% more efficient (Barratt et al. 02)
MORE POWERFUL: ~10%-70% fewer individuals
HIGH TECHNICAL ERROR:
  DNA pool formation (DNA quantification)
  DNA amplification (differential amplification, shadow bands)
POOLING POPULATIONS: no relationship information, testing for association
POOLING HALFSIBS: partial relationship information, testing for linkage

ANIMALS
POLAR FOX (Alopex lagopus)
NORWEGIAN TYPE “LARGE”    FINNISH TYPE “SMALL”
63                        77

MARKERS
MARKER       DOG GENOME   FOX GENOME   HET.
REN112I02    01           ?            0.76
C                         ?            0.77
C                         ?            0.76
FH                        ?            0.86
C                         ?            0.81
FH                        ?            0.82
C                         ?            0.79
G                         ?            0.64
REN153O12    12           ?            0.76
REN227M12    13           ?            0.74
FH                        ?            0.70
REN275L19    16           ?            0.82
FH                        ?            0.77
REN100J13    20           ?            0.83
REN128E21    22           ?            0.70
LEI002       27           ?            0.70
REN248F14    30           ?            0.70
REN43H24     31           ?            0.66
REN106I07    36           ?            0.78
REN67C18     37           ?            0.83

MARKERS
MARKER SELECTION CRITERIA:
  POLYMORPHISM
    number of alleles
    allele lengths
  AMPLIFICATION PROPERTIES
    temperature
  ?

MARKER ALLELE FREQUENCY IN POOLS

MARKER ALLELE FREQUENCY IN POOLS
LOW POLYMORPHISM WITHIN EACH POOL
“POOL-SPECIFIC” ALLELES
POOR CORRESPONDENCE BETWEEN REPLICATES

BINOMIAL DISTRIBUTION
allele x pool table, BINOMIAL DISTRIBUTION: Odds Ratio, Logistic Regression

allele / pool
1    n_11   n_12
2    n_21   n_22
3    n_31   n_32
4    n_41   n_42

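ODDS RATIO
\ln(OR) = \ln \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}}
distribution (under H0: OR = 1): \ln(OR) / \sqrt{\mathrm{var}[\ln(OR)]} \sim N(0,1)
variance: \mathrm{var}[\ln(OR)] = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}
confidence intervals: \ln(OR) \pm z_{1-\alpha/2} \sqrt{\mathrm{var}[\ln(OR)]}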
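The odds-ratio quantities named on this slide can be sketched in a few lines of Python. This is a minimal illustration using the textbook large-sample formulas, not the authors' code; the function names and the example counts are hypothetical, and the default alpha = 0.01 is chosen only because the later slides show 0.01 confidence intervals.

```python
import math
from statistics import NormalDist


def log_odds_ratio_ci(n11, n12, n21, n22, alpha=0.01):
    """Return (ln OR, standard error, lower, upper) for a 2x2 table of counts.

    Uses the large-sample variance 1/n11 + 1/n12 + 1/n21 + 1/n22,
    so every cell must be non-zero.
    """
    log_or = math.log((n11 * n22) / (n12 * n21))
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    return log_or, se, log_or - z * se, log_or + z * se


def two_sided_p_value(log_or, se):
    """Two-sided p-value for H0: ln OR = 0 from the normal approximation."""
    z = abs(log_or) / se
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical allele counts in two pools (not taken from the slides).
log_or, se, lower, upper = log_odds_ratio_ci(20, 10, 8, 22)
p = two_sided_p_value(log_or, se)
```

The interval is symmetric on the log scale; exponentiating `lower` and `upper` gives the interval for the odds ratio itself.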

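ODDS RATIO IN SPARSE DATA
SPARSE DATA PROBLEM: with empty cells in the allele x pool table, \ln(OR) = \ln \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}} is undefined
\ln(OR) = \ln \frac{(n_{11}+c_{11})(n_{22}+c_{22})}{(n_{12}+c_{12})(n_{21}+c_{21})}
c = 0: standard
c = 0.5: Haldane (55)
c_{ij} = 2 (n_{i.} n_{.j} / n^2): Bishop (75)
Agresti (99):
  c = 0.5 not valid for ln(OR) > 4
  c_ij not valid for ln(OR) > 8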
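The three choices of the cell correction c listed on this slide can be compared with a short Python sketch. It assumes the correction is added to each of the four cells before the ratio is formed; the function name, the `method` labels and the example table are hypothetical.

```python
import math


def corrected_log_or(table, method="haldane"):
    """ln(OR) for a 2x2 table [[n11, n12], [n21, n22]] with a cell correction.

    method = "standard": c = 0 (fails when a cell is empty)
    method = "haldane":  c = 0.5 in every cell
    method = "bishop":   c_ij = 2 * n_i. * n_.j / n**2
    """
    (n11, n12), (n21, n22) = table
    n = n11 + n12 + n21 + n22
    row_totals = (n11 + n12, n21 + n22)
    col_totals = (n11 + n21, n12 + n22)
    if method == "standard":
        c = [[0.0, 0.0], [0.0, 0.0]]
    elif method == "haldane":
        c = [[0.5, 0.5], [0.5, 0.5]]
    elif method == "bishop":
        c = [[2 * row_totals[i] * col_totals[j] / n ** 2 for j in range(2)]
             for i in range(2)]
    else:
        raise ValueError("unknown method: " + method)
    numerator = (n11 + c[0][0]) * (n22 + c[1][1])
    denominator = (n12 + c[0][1]) * (n21 + c[1][0])
    return math.log(numerator / denominator)


# Hypothetical sparse table with one empty cell (not from the slides).
sparse = [[12, 0], [5, 9]]
haldane = corrected_log_or(sparse, "haldane")
bishop = corrected_log_or(sparse, "bishop")
```

With `method="standard"` the same table raises an error because of the empty cell, which is the sparse data problem the slide refers to.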

ODDS RATIO: P-values

ODDS RATIO - CI
[Figure panels: 0.01 CI FOR “CONCORDANT” POOLS; 0.01 CI FOR “DISCORDANT” POOLS]

ODDS RATIO - REMARKS
many 2x2 comparisons (theoretically) possible: 18 (m4) to 60 (m1, m6)
significance pattern often inconsistent between alleles (sparse data)
difficult to summarize ORs with a single value

marker       result
C            association
C            association
C            ?
REN227M12    no association
REN275L19    ? (sparse data)
LEI002       ? (sparse data)
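The counts of 18 and 60 possible 2x2 comparisons can be reproduced by choosing two pools and two alleles from the full table. The sketch below assumes four pools, three alleles at m4 and five alleles at m1 and m6; these dimensions are not stated on the slide and are used only because they give the quoted numbers.

```python
from math import comb


def n_two_by_two_tables(n_pools, n_alleles):
    """Number of 2x2 sub-tables: choose 2 pools and 2 alleles."""
    return comb(n_pools, 2) * comb(n_alleles, 2)


# Assumed dimensions (hypothetical): 4 pools; 3 alleles at m4, 5 at m1 and m6.
tables_m4 = n_two_by_two_tables(4, 3)
tables_m1 = n_two_by_two_tables(4, 5)
```

Each of these sub-tables yields its own odds ratio, which is why a single summary value per marker is hard to give.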

FURTHER WORK
use all table cells
account for sparseness in testing
multivariate logistic models

MULTINOMIAL DISTRIBUTION
allele x pool table, MULTINOMIAL DISTRIBUTION: Multivariate Logistic Regression

allele / pool (binomial case)
1    n_11   n_12
2    n_21   n_22
3    n_31   n_32
4    n_41   n_42

allele / pool (multinomial case)
1    n_11   n_12   n_13   n_14   n_15
2    n_21   n_22   n_23   n_24   n_25
3    n_31   n_32   n_33   n_34   n_35
4    n_41   n_42   n_43   n_44   n_45

MODEL
GENERAL LOGISTIC MODEL
CONSIDERED MODELS FOR ALLELE FREQUENCIES

TEST STATISTIC
MODEL SELECTION
POWER DIVERGENCE FAMILY, Cressie, Read (1984)
  Pearson’s X^2
  Likelihood Ratio Test
DATA: observed frequencies
MODEL: estimated frequencies
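The power divergence family of Cressie and Read (1984) compares observed with model-estimated frequencies through a single index, written here as `lam`. The following sketch uses the usual textbook form of the statistic; the function name and the example frequencies are hypothetical, and the observed and estimated totals are assumed to be equal.

```python
import math


def power_divergence(observed, expected, lam=1.0):
    """Cressie-Read power divergence statistic D(lam).

    lam = 1 gives Pearson's X^2 and lam = 0 gives the likelihood ratio
    statistic, provided sum(observed) == sum(expected).
    """
    if lam == -1:
        raise ValueError("lam = -1 needs a separate limiting form")
    if lam == 0:
        # Limit lam -> 0: 2 * sum(o * ln(o / e)); empty cells contribute 0.
        return 2 * sum(o * math.log(o / e)
                       for o, e in zip(observed, expected) if o > 0)
    total = sum(o * ((o / e) ** lam - 1) for o, e in zip(observed, expected))
    return 2 / (lam * (lam + 1)) * total


# Hypothetical observed and model-estimated frequencies (equal totals).
obs = [10, 20, 30]
est = [20.0, 20.0, 20.0]
pearson = power_divergence(obs, est, lam=1.0)
lrt = power_divergence(obs, est, lam=0.0)
```

Intermediate values such as `lam=2/3` are also allowed by this form and give other members of the same family.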

TEST STATISTIC
NORMALISATION
SPARSE DATA !
INCREASING CELLS ASYMPTOTICS !
?

TEST STATISTIC
ANALYTICAL
  Osius, Rojek (1989): D(λ = 1)
  Farrington (1996): D(λ = 1) + …
  Copas (1989): a * D(λ = 1)
EMPIRICAL - Bootstrap, Jackknife
EVALUATION OF REAL DATA
NORMAL PROPERTIES - simulation
… D ?  … D ?
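The slide names bootstrap and jackknife as empirical alternatives without giving details. One possible reading is a parametric bootstrap of Pearson's statistic under the fitted cell probabilities, sketched below; the resampling scheme, the function names and all numbers are assumptions rather than the authors' procedure.

```python
import random
from collections import Counter


def pearson_x2(observed, expected):
    """Pearson's X^2 between observed and expected cell counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def bootstrap_p_value(observed, probs, n_boot=2000, seed=1):
    """Parametric bootstrap p-value for X^2 under fitted cell probabilities.

    Tables of the same total size are simulated from `probs`, and the
    p-value is the share of simulated statistics at least as large as
    the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    n = sum(observed)
    expected = [n * p for p in probs]
    d_obs = pearson_x2(observed, expected)
    cells = list(range(len(probs)))
    exceed = 0
    for _ in range(n_boot):
        draws = Counter(rng.choices(cells, weights=probs, k=n))
        simulated = [draws.get(c, 0) for c in cells]
        if pearson_x2(simulated, expected) >= d_obs:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


# Hypothetical sparse table flattened to four cells, tested against
# equal cell probabilities (not data from the slides).
p_sparse = bootstrap_p_value([20, 0, 0, 0], [0.25] * 4, n_boot=200)
```

Because the reference distribution is simulated at the actual sample size, this approach does not rely on large-sample asymptotics, which is the concern raised for sparse tables.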

LITERATURE
Agresti, A. (1990) Categorical data analysis. New York, Chichester, Brisbane, Toronto, Singapore: John Wiley & Sons.
Agresti, A. (1999) On logit confidence intervals for the odds ratio with small samples. Biometrics 55:
Barratt, B.J., Payne, F., Rance, H.E., Nutland, S., Todd, J.A., Clayton, D.G. (2002) Identification of the sources of error in allele frequency estimations from pooled DNA indicates an optimal experimental design. Annals of Human Genetics 66:
Bishop, Y.M.M., Fienberg, S.E., Holland, P. (1975) Discrete multivariate analysis. Cambridge, Massachusetts: MIT Press.
Copas, J.B. (1989) Unweighted sum of squares test for proportions. Applied Statistics 38:
Cressie, N.A.C., Read, T.R.C. (1984) Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society Ser. B 46:
Farrington, C.P. (1996) On assessing goodness of fit of generalized linear models to sparse data. Journal of the Royal Statistical Society Ser. B 58:
Haldane, J.B.S. (1956) The estimation and significance of the logarithm of a ratio of frequencies. Annals of Human Genetics 20:
Osius, G., Rojek, D. (1989) Normal goodness-of-fit tests for parametric multinomial models with large degrees of freedom. Fachbereich Mathematik/Informatik, Universitaet Bremen. Mathematik Arbeitspapiere 36: