Haplotype Discovery and Modeling. Identification of genes: Identify the Phenotype, Map, Clone.

1 Haplotype Discovery and Modeling

2 Identification of genes: Identify the Phenotype, Map, Clone

3 QTL Mapping
[Figure: Marker 1, Marker 2, Marker 3, …, Marker k and a QTL on a chromosome]
- A QTL (quantitative trait locus) is a gene that affects a quantitative trait.
- The QTL detected by the markers linked with it is a chromosomal segment.
- The DNA structure of a QTL is unknown.

4 QTL Mapping Based on Linkage
[Figure: three-generation pedigree (generations I, II, III) with numbered individuals and two-locus genotypes such as AaBb, aaBb, AABb, Aabb, AAbb and aabb]

5 Mapping and sequencing
[Figure: markers and DNA clones, with scale labels 10000 Kb and 100 Kb]

6 SNPs (‘snips’) A SNP (single nucleotide polymorphism) is a site in the DNA at which different chromosomes carry different bases.

7 SNPs
Paternal allele: CCCGCCTTCTTGGCTTTACA
Maternal allele: CCCGCCTTCTCGGCTTTACA

Paternal allele: CCCGCCTTCTTGGCTTTACA
Maternal allele: CCCGCCTTCTTGGCTTTACA
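
The two sequence pairs on this slide can be compared base by base. The following is a minimal sketch, assuming the sequences are already aligned and of equal length; the function name `snp_positions` is a hypothetical helper, not something from the presentation.

```python
def snp_positions(seq_a, seq_b):
    """Return 1-based positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Sequences copied from the slide.
paternal = "CCCGCCTTCTTGGCTTTACA"
maternal_first_pair = "CCCGCCTTCTCGGCTTTACA"
maternal_second_pair = "CCCGCCTTCTTGGCTTTACA"

# First pair: the two chromosomes differ at one site (T versus C).
het_sites = snp_positions(paternal, maternal_first_pair)
# Second pair: the two chromosomes are identical over this stretch.
hom_sites = snp_positions(paternal, maternal_second_pair)
```

In the first pair the individual is heterozygous at the differing site; in the second pair no difference is found over these 20 bases.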

8 HapMap
Single Nucleotide Polymorphisms (SNPs)
[Figure: sequence variants labeled "Insensitive to drug" and "Sensitive to drug"]
Detecting specific DNA sequence variants that determine complex traits
The International HapMap Consortium (Nature, 2003, 2005)

9 Allele, Haplotype, and Diplotype Basic concepts

10 Haplotyping a Phenotype Basic concepts Quantitative Trait Nucleotide (QTN)

11 Risk Haplotype and Composite Diplotype (Basic concepts)
Consider a QTN composed of two SNPs, with alleles A/a and B/b:
- Risk haplotype: [AB] = R
- Non-risk haplotypes: [Ab], [aB], [ab] = r
- Composite diplotypes: RR (2), Rr (1), rr (0), where the number counts the copies of the risk haplotype
[Figure: illustrations of diplotypes formed from the A and B sites]
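
The coding of a diplotype into a composite diplotype can be sketched in a few lines. This is an illustration only: the function name `composite_diplotype` and the representation of a diplotype as a pair of haplotype strings are assumptions made here.

```python
def composite_diplotype(diplotype, risk="AB"):
    """Map a diplotype (pair of haplotypes) to RR, Rr or rr.

    `risk` names the haplotype assumed to be the risk haplotype R;
    every other haplotype is treated as non-risk (r).
    """
    copies = sum(1 for hap in diplotype if hap == risk)
    return {2: "RR", 1: "Rr", 0: "rr"}[copies]


# With [AB] as the risk haplotype:
both_risk = composite_diplotype(("AB", "AB"))   # two copies of R
one_risk = composite_diplotype(("AB", "ab"))    # one copy of R
no_risk = composite_diplotype(("Ab", "aB"))     # no copy of R
```

Changing the `risk` argument gives the coding under each of the other three candidate risk haplotypes.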

12 Study design
A random sample of unrelated individuals from a natural population

Group  SNP 1  SNP 2  Diplotype   Obs.          Drug response trait
1      AA     BB     [AB][AB]    $n_{11/11}$   $y_1 = (y_{11}, \ldots, y_{1 n_{11/11}})^T$
2      AA     Bb     [AB][Ab]    $n_{11/10}$   $y_2 = (y_{21}, \ldots, y_{2 n_{11/10}})^T$
3      AA     bb     [Ab][Ab]    $n_{11/00}$   $y_3 = (y_{31}, \ldots, y_{3 n_{11/00}})^T$
4      Aa     BB     [AB][aB]    $n_{10/11}$   $y_4 = (y_{41}, \ldots, y_{4 n_{10/11}})^T$
5      Aa     Bb     [AB][ab]    $n_{10/10}$   $y_5 = (y_{51}, \ldots, y_{5 n_{10/10}})^T$
                     [Ab][aB]
6      Aa     bb     [Ab][ab]    $n_{10/00}$   $y_6 = (y_{61}, \ldots, y_{6 n_{10/00}})^T$
7      aa     BB     [aB][aB]    $n_{00/11}$   $y_7 = (y_{71}, \ldots, y_{7 n_{00/11}})^T$
8      aa     Bb     [aB][ab]    $n_{00/10}$   $y_8 = (y_{81}, \ldots, y_{8 n_{00/10}})^T$
9      aa     bb     [ab][ab]    $n_{00/00}$   $y_9 = (y_{91}, \ldots, y_{9 n_{00/00}})^T$

13 Unifying Likelihood based on marker (S) and phenotype (y) data
There are two types of parameters:
- Haplotype frequencies (population genetic parameters)
  [AB]: $p_{11} = pq + D$
  [Ab]: $p_{10} = p(1-q) - D$
  [aB]: $p_{01} = (1-p)q - D$
  [ab]: $p_{00} = (1-p)(1-q) + D$
  where $p$ is the frequency of allele A at SNP 1, $q$ is the frequency of allele B at SNP 2, and $D$ is the linkage disequilibrium between the two SNPs
- Haplotype effects and variation (quantitative genetic parameters)
  RR: $\mu_2 = \mu + a$, where $a$ is the additive effect
  Rr: $\mu_1 = \mu + d$, where $d$ is the dominance effect
  rr: $\mu_0 = \mu - a$
Liu, Johnson, Casella and Wu, 2004, Genetics
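
The haplotype-frequency parameterization on this slide can be checked numerically. The sketch below is illustrative: the function names and the example values of p, q and D are assumptions chosen here, not data from the presentation.

```python
def haplotype_frequencies(p, q, D):
    """Frequencies of [AB], [Ab], [aB], [ab] from allele frequencies and LD.

    p: frequency of allele A at SNP 1
    q: frequency of allele B at SNP 2
    D: linkage disequilibrium between the two SNPs
    """
    freqs = {
        "AB": p * q + D,
        "Ab": p * (1 - q) - D,
        "aB": (1 - p) * q - D,
        "ab": (1 - p) * (1 - q) + D,
    }
    if min(freqs.values()) < 0:
        raise ValueError("D is outside the range allowed by p and q")
    return freqs


def linkage_disequilibrium(freqs):
    """Recover D from the four haplotype frequencies: D = p11*p00 - p10*p01."""
    return freqs["AB"] * freqs["ab"] - freqs["Ab"] * freqs["aB"]


# Hypothetical parameter values, for illustration only.
example = haplotype_frequencies(p=0.6, q=0.5, D=0.05)
```

The four frequencies always sum to one, and D is recovered as the difference between the products of the coupling and repulsion haplotype frequencies.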

14 Modeling Haplotype Frequencies

Group  SNP 1  SNP 2  Diplotype   Frequency          Obs.
1      AA     BB     [AB][AB]    $p_{11}^2$         $n_{11/11}$
2      AA     Bb     [AB][Ab]    $2p_{11}p_{10}$    $n_{11/10}$
3      AA     bb     [Ab][Ab]    $p_{10}^2$         $n_{11/00}$
4      Aa     BB     [AB][aB]    $2p_{11}p_{01}$    $n_{10/11}$
5      Aa     Bb     [AB][ab]    $2p_{11}p_{00}$    $n_{10/10}$
                     [Ab][aB]    $2p_{10}p_{01}$
6      Aa     bb     [Ab][ab]    $2p_{10}p_{00}$    $n_{10/00}$
7      aa     BB     [aB][aB]    $p_{01}^2$         $n_{00/11}$
8      aa     Bb     [aB][ab]    $2p_{01}p_{00}$    $n_{00/10}$
9      aa     bb     [ab][ab]    $p_{00}^2$         $n_{00/00}$

15 EM algorithm E step M step
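
The E-step and M-step formulas of this slide are not preserved in the transcript. As a stand-in, here is a sketch of the standard EM iteration for two-SNP haplotype frequencies from the nine genotype counts. It assumes Hardy-Weinberg proportions, and the function name, dictionary keys and example counts are hypothetical choices made for this illustration.

```python
def em_haplotype_frequencies(counts, tol=1e-10, max_iter=1000):
    """Estimate the frequencies of [AB], [Ab], [aB], [ab] by EM.

    counts: dict with the nine two-SNP genotype counts, keyed
            "AABB", "AABb", "AAbb", "AaBB", "AaBb", "Aabb",
            "aaBB", "aaBb", "aabb".
    """
    n = counts
    total = sum(n.values())
    p11 = p10 = p01 = p00 = 0.25  # starting values
    for _ in range(max_iter):
        # E step: expected share of double heterozygotes that are [AB][ab]
        denom = p11 * p00 + p10 * p01
        phi = p11 * p00 / denom if denom > 0 else 0.5
        cis = phi * n["AaBb"]        # expected number of [AB][ab]
        trans = n["AaBb"] - cis      # expected number of [Ab][aB]
        # M step: count haplotypes and divide by the 2N chromosomes
        new = (
            (2 * n["AABB"] + n["AABb"] + n["AaBB"] + cis) / (2 * total),
            (2 * n["AAbb"] + n["AABb"] + n["Aabb"] + trans) / (2 * total),
            (2 * n["aaBB"] + n["AaBB"] + n["aaBb"] + trans) / (2 * total),
            (2 * n["aabb"] + n["Aabb"] + n["aaBb"] + cis) / (2 * total),
        )
        change = max(abs(x - y) for x, y in zip(new, (p11, p10, p01, p00)))
        p11, p10, p01, p00 = new
        if change < tol:
            break
    return {"AB": p11, "Ab": p10, "aB": p01, "ab": p00}


# Hypothetical genotype counts, for illustration only.
example_counts = {"AABB": 20, "AABb": 25, "AAbb": 8, "AaBB": 15, "AaBb": 40,
                  "Aabb": 18, "aaBB": 4, "aaBb": 12, "aabb": 13}
estimates = em_haplotype_frequencies(example_counts)
```

Only the double heterozygote AaBb is ambiguous in phase, so it is the only genotype whose count is split between two diplotypes in the E step.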

16 Modeling Haplotype Effects

                                 Composite diplotype if the risk haplotype is
Group  SNP 1  SNP 2  Diplotype   [AB]   [Ab]   [aB]   [ab]
1      AA     BB     [AB][AB]    RR     rr     rr     rr
2      AA     Bb     [AB][Ab]    Rr     Rr     rr     rr
3      AA     bb     [Ab][Ab]    rr     RR     rr     rr
4      Aa     BB     [AB][aB]    Rr     rr     Rr     rr
5      Aa     Bb     [AB][ab]    Rr     rr     rr     Rr
                     [Ab][aB]    rr     Rr     Rr     rr
6      Aa     bb     [Ab][ab]    rr     Rr     rr     Rr
7      aa     BB     [aB][aB]    rr     rr     RR     rr
8      aa     Bb     [aB][ab]    rr     rr     Rr     Rr
9      aa     bb     [ab][ab]    rr     rr     rr     RR
Likelihood                       L1     L2     L3     L4

Genotypic values of composite diplotypes: RR → $\mu_2$, Rr → $\mu_1$, rr → $\mu_0$

17 Mixture Model assuming that [AB] is the risk haplotype
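
The mixture-model formula of this slide is not preserved in the transcript. The sketch below shows one way such a likelihood can be written when [AB] is taken as the risk haplotype. It assumes, as is not stated in the transcript, that the phenotype within each composite diplotype is normal with a common variance `sigma2`; all function and variable names are hypothetical.

```python
import math


def normal_pdf(y, mean, sigma2):
    """Normal density with the given mean and variance."""
    return math.exp(-(y - mean) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


# Composite diplotype of each phase-known genotype when [AB] is the risk haplotype.
COMPOSITE_AB = {"AABB": "RR", "AABb": "Rr", "AaBB": "Rr", "AAbb": "rr",
                "Aabb": "rr", "aaBB": "rr", "aaBb": "rr", "aabb": "rr"}


def mixture_log_likelihood(data, hap_freq, means, sigma2):
    """Log-likelihood of phenotypes given two-SNP genotypes.

    data:     dict genotype -> list of phenotype values
    hap_freq: dict with keys "AB", "Ab", "aB", "ab"
    means:    dict with keys "RR", "Rr", "rr"
    sigma2:   common residual variance (an assumption of this sketch)
    """
    p11, p10, p01, p00 = (hap_freq[h] for h in ("AB", "Ab", "aB", "ab"))
    denom = p11 * p00 + p10 * p01
    # Probability that a double heterozygote carries [AB][ab] rather than [Ab][aB]
    phi = p11 * p00 / denom if denom > 0 else 0.5
    loglik = 0.0
    for genotype, ys in data.items():
        for y in ys:
            if genotype == "AaBb":
                # Two possible diplotypes: [AB][ab] is Rr, [Ab][aB] is rr
                dens = (phi * normal_pdf(y, means["Rr"], sigma2)
                        + (1 - phi) * normal_pdf(y, means["rr"], sigma2))
            else:
                dens = normal_pdf(y, means[COMPOSITE_AB[genotype]], sigma2)
            loglik += math.log(dens)
    return loglik
```

Only the double heterozygote contributes a true two-component mixture; every other genotype has a known composite diplotype and contributes a single normal density.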

18 EM Algorithm E step M step

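19 Hypothesis Testing
H0: $\mu_2 = \mu_1 = \mu_0$ (RR = Rr = rr)
H1: At least one of the equalities in H0 does not hold
$\mathrm{LR} = -2\,[\ln L_0(\cdot \mid y) - \ln L_1(\cdot \mid y, S)]$, where $L_0$ and $L_1$ are the likelihoods evaluated at the parameter estimates under H0 and H1, respectively
The threshold is determined empirically by permutation tests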
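
The test statistic and the permutation threshold can be sketched as follows. This is an outline under assumptions: `statistic` stands for any user-supplied function that refits the model to a phenotype vector (with the marker data held fixed) and returns the LR; the function names, the number of permutations and the 5% level are choices made here.

```python
import math
import random


def likelihood_ratio(loglik_h0, loglik_h1):
    """LR = -2 (ln L0 - ln L1), from the two maximized log-likelihoods."""
    return -2.0 * (loglik_h0 - loglik_h1)


def permutation_threshold(phenotypes, statistic, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of `statistic` under shuffled phenotypes.

    Shuffling the phenotypes breaks any association with the markers,
    so the recomputed statistics approximate the null distribution.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_stats.append(statistic(shuffled))
    null_stats.sort()
    index = min(n_perm - 1, int(math.ceil((1 - alpha) * n_perm)) - 1)
    return null_stats[index]
```

For a genome-wide scan, `statistic` would typically return the maximum LR over all SNP pairs for each permuted data set, so that the threshold accounts for multiple testing.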

20 Genome-wide Scan
[Figure: LR plotted against SNPs on the genome, with the threshold marked]

21 Structural Variation in the Human Genome
- Haplotype blocks: nearby SNPs are often distributed in block-like patterns.
- Hotspots and coldspots: recombination rates between SNPs in different blocks are higher than between SNPs within the same block.
- Tag SNPs: haplotype diversity within each block can be well explained by a small portion of the SNPs.
[Figure: Block 1, Block 2, Block 3, Block 4, … separated by recombination hot spots]

22 A Genetic Study
A candidate gene for human obesity
- SNP A: alleles A, G
- SNP B: alleles C, G
- Four haplotypes: [AC], [AG], [GC], [GG]
A total of 155 patients selected from a population
- Typed for the two SNPs
- Measured for body mass index (BMI)
Question: Which haplotype triggers an effect on BMI?

23 Testing Risk Haplotype

Haplotype  LR                Risk (R) or non-risk (r)
[AC]       2.32              r
[AG]       1.52              r
[GC]       3.11              r
[GG]       10.35 (p<0.01)    R

RR: $\mu_2 = \mu + a = 30.83 - 1.77 = 29.06$ ($a$ = additive effect)
Rr: $\mu_1 = \mu + d = 30.83 - 3.05 = 27.78$ ($d$ = dominance effect)
rr: $\mu_0 = \mu - a = 30.83 + 1.77 = 32.60$

- A patient who combines haplotype [GG] with any other haplotype is normal weight.
- A patient who combines any two haplotypes from [AC], [AG] and [GC] is obese.
- A patient who carries two copies of haplotype [GG] is overweight.
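
The arithmetic on this slide can be reproduced, and inverted, with a short sketch. The numbers are those reported on the slide (with $a = -1.77$ and $d = -3.05$ read off the equations); the function names are hypothetical.

```python
def genotypic_values(mu, a, d):
    """Genotypic values of RR, Rr and rr from the overall mean and effects."""
    return {"RR": mu + a, "Rr": mu + d, "rr": mu - a}


def effects_from_values(values):
    """Recover (mu, a, d) from the three genotypic values."""
    mu = (values["RR"] + values["rr"]) / 2   # midpoint of the two homozygotes
    a = (values["RR"] - values["rr"]) / 2    # additive effect
    d = values["Rr"] - mu                    # dominance effect
    return mu, a, d


# Estimates reported on the slide for risk haplotype [GG] and BMI.
bmi = genotypic_values(mu=30.83, a=-1.77, d=-3.05)
mu_hat, a_hat, d_hat = effects_from_values(bmi)
```

Because both effects are negative here, carrying the risk haplotype [GG] lowers the expected BMI relative to carrying none.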

24 Model Extensions
- Block-Block Interactions (Lin et al. 2007, Bioinformatics)
- Haplotype-Environment Interactions (Wang et al. 2008, Molecular Pain)
- Haplotype Imprinting Effects (Cheng et al., to be submitted)
- Multivariate high-dimensional drug response (PK-PD link, efficacy and toxicity…): a systems approach

25 1000 Genomes Project
- This sequencing effort will produce the most detailed map of human genetic variation to support disease studies.
- Results will help to design personalized medication that can optimize drug therapy.

