Dimension Reduction for Mapping mRNA Abundance as Quantitative Traits
Brian S. Yandell, University of Wisconsin-Madison
www.stat.wisc.edu/~yandell/statgen
Yandell © 2003, JSM 2003


1 Dimension Reduction for Mapping mRNA Abundance as Quantitative Traits
Brian S. Yandell, University of Wisconsin-Madison
www.stat.wisc.edu/~yandell/statgen
[Lan et al. 2003 Genetics 164(4): August 2003]

2 what is the goal of QTL study?
uncover underlying biochemistry
  – identify how networks function, break down
  – find useful candidates for (medical) intervention
  – epistasis may play key role
  – statistical goal: maximize number of correctly identified QTL
basic science/evolution
  – how is the genome organized?
  – identify units of natural selection
  – additive effects may be most important (Wright/Fisher debate)
  – statistical goal: maximize number of correctly identified QTL
select “elite” individuals
  – predict phenotype (breeding value) using suite of characteristics (phenotypes) translated into a few QTL
  – statistical goal: minimize prediction error

3 what is a QTL?
QTL = quantitative trait locus (or loci)
  – trait = phenotype = characteristic of interest
  – quantitative = measured somehow
      glucose, insulin, gene expression level
Mendelian genetics
  – allelic effect + environmental variation
  – locus = location in genome affecting trait
      gene or collection of tightly linked genes
      some physical feature of genome

4 typical phenotype assumptions
normal "bell-shaped" environmental variation
genotypic value G_Q is composite of m QTL
genetic uncorrelated with environment
[figure: data histogram]
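The assumptions on this slide can be made concrete with a small simulation. This is only a sketch: the additive coding, the unlinked loci, the effect sizes, and the function name `simulate_f2_phenotypes` are all illustrative assumptions, not taken from the talk.

```python
import random

def simulate_f2_phenotypes(n, effects, env_sd=1.0, seed=0):
    """Simulate n F2 individuals under a purely additive model (assumed).

    effects: hypothetical additive effect per QTL; loci are treated as unlinked.
    Returns (genotypes, phenotypes).
    """
    rng = random.Random(seed)
    genotypes, phenotypes = [], []
    for _ in range(n):
        # Allele copies per locus follow the F2 ratio 1:2:1 for 0, 1, 2.
        q = [rng.choice((0, 1, 1, 2)) for _ in effects]
        # Genotypic value G_Q is a composite of the m QTL.
        g = sum(a * (copies - 1) for a, copies in zip(effects, q))
        # Normal environmental variation, drawn independently of genotype.
        e = rng.gauss(0.0, env_sd)
        genotypes.append(q)
        phenotypes.append(g + e)
    return genotypes, phenotypes

genotypes, phenotypes = simulate_f2_phenotypes(2000, (1.0, 0.5), env_sd=1.0, seed=1)
```

A histogram of `phenotypes` would play the role of the data histogram on the slide.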

5 why worry about multiple QTL?
so many possible genetic architectures!
  – number and positions of loci
  – gene action: additive, dominance, epistasis
  – how to efficiently search the model space?
how to select “best” or “better” model(s)?
  – what criteria to use? where to draw the line?
  – shades of gray: exploratory vs. confirmatory study
  – how to balance false positives, false negatives?
what are the key “features” of model?
  – means, variances & covariances, confidence regions
  – marginal or conditional distributions

6 Pareto diagram of QTL effects
[figure: QTL numbered 1 to 5; labels: major QTL on linkage map, major QTL, minor QTL, polygenes (modifiers)]

7 advantages of multiple QTL approach
improve statistical power, precision
  – increase number of QTL detected
  – better estimates of loci: less bias, smaller intervals
improve inference of complex genetic architecture
  – patterns and individual elements of epistasis
  – appropriate estimates of means, variances, covariances
      asymptotically unbiased, efficient
  – assess relative contributions of different QTL
improve estimates of genotypic values
  – less bias (more accurate) and smaller variance (more precise)
  – mean squared error = MSE = $(\text{bias})^2 + \text{variance}$
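The MSE identity in the last bullet can be checked numerically. The sketch below assumes a hypothetical shrunken estimator (0.8 times a sample mean) of a known value; the estimator, the constants, and the name `mse_decomposition` are illustrative and do not come from the talk.

```python
import random
import statistics

def mse_decomposition(estimates, truth):
    """Return (mse, bias, variance) for repeated estimates of a known value."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - truth
    # Population variance (divide by n) so that mse == bias**2 + variance.
    variance = statistics.pvariance(estimates, mu=mean_est)
    mse = statistics.fmean([(e - truth) ** 2 for e in estimates])
    return mse, bias, variance

rng = random.Random(2003)
truth = 1.0
estimates = []
for _ in range(2000):
    sample = [rng.gauss(truth, 1.0) for _ in range(10)]
    # Hypothetical estimator: shrinking the mean adds bias but cuts variance.
    estimates.append(0.8 * statistics.fmean(sample))

mse, bias, variance = mse_decomposition(estimates, truth)
```

Here the bias is about -0.2 by construction, so the estimator trades a little accuracy for precision, which is the trade-off the slide summarizes.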

8 epistasis in parallel pathways (Gary Churchill)
Z keeps trait value low
Neither E1 nor E2 is rate limiting
Loss of function alleles are segregating from parent A at E1 and from parent B at E2
[figure: pathway diagram with X, Y, Z and E1, E2]

9 epistasis in a serial pathway (Gary Churchill)
Z keeps trait value high
Neither E1 nor E2 is rate limiting
Loss of function alleles are segregating from parent B at E1 and from parent A at E2
[figure: pathway diagram with X, Y, Z and E1, E2]

10 epistasis examples (Doebley Stec Gustus 1995; Zeng pers. comm.)
traits 1, 4, 9
  1: dom-dom interaction
  4: add-add interaction
  9: rec-rec interaction
(Fisher-Cockerham effects)
[figure: panels for traits 1, 4, 9]

11 why map gene expression as a quantitative trait?
cis- or trans-action?
  – does gene control its own expression?
  – evidence for both modes (Brem et al. 2002 Science)
mechanics of gene expression mapping
  – measure gene expression in intercross (F2) population
  – map expression as quantitative trait (QTL technology)
  – adjust for multiple testing via false discovery rate
research groups working on expression QTLs
  – review by Cheung and Spielman (2002 Nat Gen Suppl)
  – Kruglyak (Brem et al. 2002 Science)
  – Doerge et al. (Purdue); Jansen et al. (Wageningen)
  – Williams et al. (U KY); Lusis et al. (UCLA)
  – Dumas et al. (2000 J Hypertension)
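The slide does not say which false discovery rate procedure is used when many expression traits are mapped. As one common choice, the sketch below applies the Benjamini-Hochberg step-up rule to a list of p-values; the choice of procedure and the example p-values are assumptions for illustration only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return sorted indices of tests rejected by the Benjamini-Hochberg rule.

    q is the target false discovery rate.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every test ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])

# Hypothetical p-values, one per expression trait tested at a locus.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p, q=0.05)
```

With these made-up values only the two smallest p-values pass, even though five fall below 0.05 individually.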

12 idea of mapping microarrays (Jansen Nap 2001)

13 goal: unravel biochemical pathways (Jansen Nap 2001)

14 central dogma via microarrays (Bochner 2003)

15 coordinated expression in mouse genome (Schadt et al. 2003 Nature)
expression pleiotropy in yeast genome (Brem et al. 2002 Science)

16 glucose, insulin (courtesy AD Attie)

17 Insulin Requirement
from Unger & Orci FASEB J. (2001) 15,312
[figure label: decompensation]

18 Type 2 Diabetes Mellitus
from Unger & Orci FASEB J. (2001) 15,312

19 studying diabetes in an F2
segregating cross of inbred lines
  – B6.ob x BTBR.ob → F1 → F2
  – selected mice with ob/ob alleles at leptin gene (chr 6)
  – measured and mapped body weight, insulin, glucose at various ages (Stoehr et al. 2000 Diabetes)
  – sacrificed at 14 weeks, tissues preserved
gene expression data
  – Affymetrix microarrays on parental strains, F1
      key tissues: adipose, liver, muscle, β-cells
      novel discoveries of differential expression (Nadler et al. 2000 PNAS; Lan et al. 2002 in review; Ntambi et al. 2002 PNAS)
  – RT-PCR on 108 F2 mice liver tissues
      15 genes, selected as important in diabetes pathways
      SCD1, PEPCK, ACO, FAS, GPAT, PPARgamma, PPARalpha, G6Pase, PDI,…

20 LOD map for PDI: cis-regulation
Lan et al. (2003 submitted)

21 Multiple Interval Mapping
SCD1: multiple QTL plus epistasis!

22 2-D scan: assumes only 2 QTL!
[figure labels: epistasis LOD, joint LOD]

23 trans-acting QTL for SCD1
(no epistasis yet: see Yi, Xu, Allison 2003)
[figure label: dominance?]

24 Bayesian model assessment: chromosome QTL pattern for SCD1

25 high throughput dilemma
want to focus on gene expression network
  – hundreds or thousands of genes/proteins to monitor
  – ideally capture networks in a few dimensions
      multivariate summaries of multiple traits
  – elicit biochemical pathways (Henderson et al. Hoeschele 2001; Ong Page 2002)
may have multiple controlling loci
  – allow for complicated genetic architecture
  – could affect many genes in coordinated fashion
  – could show evidence of epistasis
  – quick assessment via interval mapping may be misleading

26 why study multiple traits together?
environmental correlation
  – non-genetic, controllable by design
  – historical correlation (learned behavior)
  – physiological correlation (same body)
genetic correlation
  – pleiotropy
      one gene, many functions
      common biochemical pathway, splicing variants
  – close linkage
      two tightly linked genes
      genotypes Q are collinear

27 high throughput: which genes are the key players?
one approach: clustering of expression seeded by insulin, glucose
advantage: subset relevant to trait
disadvantage: still many genes to study

28 PC simply rotates & rescales to find major axes of variation
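The rotation can be sketched for the simplest case of two traits, where the principal axes follow in closed form from the 2x2 covariance matrix. The data values and the function names `pca_2d` and `pc_scores` are made up for illustration; a real analysis of many expression traits would use a general eigen or singular value decomposition.

```python
import math

def pca_2d(xs, ys):
    """Principal axes of paired data from the 2x2 sample covariance matrix.

    Returns (angle, var1, var2): the rotation angle of PC1 in radians and
    the variances along PC1 and PC2.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigen decomposition of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    return angle, half_trace + radius, half_trace - radius

def pc_scores(xs, ys, angle):
    """Center the data and rotate it onto the principal axes."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    c, s = math.cos(angle), math.sin(angle)
    pc1 = [c * (x - mx) + s * (y - my) for x, y in zip(xs, ys)]
    pc2 = [-s * (x - mx) + c * (y - my) for x, y in zip(xs, ys)]
    return pc1, pc2

# Hypothetical measurements of two correlated traits.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
angle, var1, var2 = pca_2d(xs, ys)
pc1, pc2 = pc_scores(xs, ys, angle)
```

The scores `pc1` and `pc2` are uncorrelated, and `var1 / (var1 + var2)` is the share of variation carried by the first axis. Dividing each score by the square root of its variance would supply the rescaling step.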

29 multivariate screen for gene expression mapping
principal components
[figure labels: PC1 (red) and SCD (black); PC1 (42%), PC2 (22%)]

30 mapping first diabetes PC as a trait

31 pFDR for PC1 analysis
[figure labels: prior probability; fraction of posterior found in tails]

32 false detection rates and thresholds
multiple comparisons: test QTL across genome
  – size = $\mathrm{pr}(\mathrm{LOD}(\lambda) > \text{threshold} \mid \text{no QTL at } \lambda)$
  – threshold guards against a single false detection
      very conservative on genome-wide basis
  – difficult to extend to multiple QTL
positive false discovery rate (Storey 2001)
  – pFDR = $\mathrm{pr}(\text{no QTL at } \lambda \mid \mathrm{LOD}(\lambda) > \text{threshold})$
  – Bayesian posterior HPD region based on threshold
      $\Lambda = \{\lambda \mid \mathrm{LOD}(\lambda) > \text{threshold}\} \approx \{\lambda \mid \mathrm{pr}(\lambda \mid Y,X,m) \text{ large}\}$
  – extends naturally to multiple QTL

33 pFDR and QTL posterior
positive false detection rate
  – pFDR = $\mathrm{pr}(\text{no QTL at } \lambda \mid Y,X, \lambda \in \Lambda)$
  – pFDR = $\dfrac{\mathrm{pr}(H=0) \cdot \text{size}}{\mathrm{pr}(m=0) \cdot \text{size} + \mathrm{pr}(m>0) \cdot \text{power}}$
  – power = posterior = $\mathrm{pr}(\text{QTL in } \Lambda \mid Y,X, m>0)$
  – size = (length of $\Lambda$) / (length of genome)
extends to other model comparisons
  – m = 1 vs. m = 2 or more QTL
  – pattern = ch1,ch2,ch3 vs. pattern > 2*ch1,ch2,ch3
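The pFDR ratio on this slide is simple arithmetic once its three ingredients are available. The sketch below assumes that pr(H=0) in the numerator denotes the same prior probability of no QTL as pr(m=0) in the denominator; the function names and the numbers (a 50 cM region on a 1500 cM genome, prior 0.5, power 0.9) are hypothetical.

```python
def region_size(region_length, genome_length):
    """size = (length of region) / (length of genome), in matching units."""
    return region_length / genome_length

def pfdr(prior_no_qtl, size, power):
    """pFDR = pr(m=0)*size / (pr(m=0)*size + pr(m>0)*power).

    prior_no_qtl: prior probability of no QTL, pr(m=0) (assumed equal to pr(H=0)).
    size:  chance the region is flagged when there is no QTL.
    power: posterior probability that the QTL lies in the region, given m > 0.
    """
    if not 0.0 <= prior_no_qtl <= 1.0:
        raise ValueError("prior_no_qtl must lie in [0, 1]")
    false_pos = prior_no_qtl * size
    true_pos = (1.0 - prior_no_qtl) * power
    if false_pos + true_pos == 0.0:
        raise ValueError("pFDR is undefined when nothing can be detected")
    return false_pos / (false_pos + true_pos)

# Hypothetical inputs: a 50 cM region on a 1500 cM genome.
size = region_size(50.0, 1500.0)
value = pfdr(prior_no_qtl=0.5, size=size, power=0.9)
```

With these inputs the ratio works out to 1/28, roughly 0.036: a narrow region that holds most of the posterior gives a small pFDR.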

