Statwww.epfl.ch/davison/teaching/Microarrays/snp.ppt SNP Applications.

1 statwww.epfl.ch/davison/teaching/Microarrays/snp.ppt SNP Applications

2 Human Genome and SNPs
Now that the human genome is (mostly) sequenced, attention is turning to the evaluation of variation
Alterations in DNA involving a single base pair are called single nucleotide polymorphisms, or SNPs
Map of ~1.4 million SNPs (Feb 2001)
It is estimated that ~60,000 SNPs occur within exons; 85% of exons are within 5 kb of the nearest SNP

3 SNP Initiatives
Industrial
–Genset
–Incyte
–Celera
–CuraGen
Academic – Industry Consortium
Governmental
–US
–Japan
Non-industrial scale academic programs

4 Goals of SNP Initiatives
Immediate goals:
–Detection/identification of …
–The hundreds of thousands of SNPs estimated to be present in the human genome
–Interest also in other organisms, e.g. potatoes(!)
–Establishment of SNP Database(s)

5 Longer term goals: Areas of SNP Application
Gene discovery and mapping
Association-based candidate polymorphism testing
Diagnostics/risk profiling
Response prediction
Homogeneity testing/study design
Gene function identification
…etc.
See Schork, Fallin, Lanchbury 2000

6 Polymorphism
Technical definition: most common variant (allele) occurs with less than 99% frequency in the population
Also used as a general term for variation
Many types of DNA polymorphisms, including RFLPs, VNTRs, microsatellites
‘Highly polymorphic’ = many variants
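
The technical definition above amounts to a simple threshold check. A minimal sketch, assuming allele frequencies are supplied as a dictionary that sums to 1 (the function name and the example frequencies are hypothetical, not from the slides):

```python
def is_polymorphic(allele_freqs, threshold=0.99):
    """Return True if the most common allele has frequency below the threshold.

    allele_freqs: hypothetical mapping from allele label to population frequency.
    """
    total = sum(allele_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return max(allele_freqs.values()) < threshold


# Illustrative (made-up) frequencies for a biallelic site.
print(is_polymorphic({"A": 0.95, "G": 0.05}))    # most common allele at 95%
print(is_polymorphic({"A": 0.995, "G": 0.005}))  # most common allele at 99.5%
```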

7 Use of Polymorphism in Gene Mapping
1980s – RFLP marker maps
1990s – microsatellite marker maps

8 SNPs in Genetic Analysis
Abundance – lots
Position – throughout genome
Haplotype patterns – groups of SNPs may provide exploitable diversity
Rapid and efficient to genotype
Increased stability over other types of mutation
Recombination patterns – e.g. ‘hot spots’

9 Gene Discovery and Mapping
Linkage Analysis
–Within-family associations between marker and putative trait loci
Linkage Disequilibrium (LD)
–Across-family associations

10 One locus: Founder genotype probabilities
Founder: individual whose parents are not in the pedigree
Usually obtain genotype probs. assuming Hardy-Weinberg Equilibrium (HWE):
Say P(D) = p, P(d) = 1-p; then P(DD) = p^2, P(Dd) = 2p(1-p), P(dd) = (1-p)^2
Genotypes of founder couples treated as independent: P(Father Dd and Mother DD) = 2p(1-p) x p^2 = 2p^3(1-p)
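
The founder calculation can be checked numerically. This is a minimal sketch, assuming a biallelic locus with alleles D and d; the function names are hypothetical, and p = 0.01 is used only as an illustrative value:

```python
def hwe_genotype_probs(p):
    """Genotype probabilities under Hardy-Weinberg equilibrium for P(D) = p."""
    q = 1.0 - p
    return {"DD": p * p, "Dd": 2 * p * q, "dd": q * q}


def founder_couple_prob(p, father, mother):
    """Joint probability of a founder couple, treating their genotypes as independent."""
    probs = hwe_genotype_probs(p)
    return probs[father] * probs[mother]


p = 0.01  # illustrative allele frequency
print(hwe_genotype_probs(p))
# P(Father Dd and Mother DD) should agree with 2 * p**3 * (1 - p).
print(founder_couple_prob(p, "Dd", "DD"), 2 * p**3 * (1 - p))
```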

11 One locus: Transmission probabilities (I)
Offspring get their genes according to Mendel’s rules…
Independently for different offspring
[Pedigree figure: parents 1 (Dd) and 2 (Dd); offspring 3 (dd)]
P(3 dd | 1 Dd & 2 Dd) = ½ x ½

12 One locus: Transmission probabilities (II)
[Pedigree figure: parents 1 (Dd) and 2 (Dd); offspring 3 (dd), 4 (Dd), 5 (DD)]
P(3 dd & 4 Dd & 5 DD | 1 Dd & 2 Dd) = (½ x ½) x (2 x ½ x ½) x (½ x ½)
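
The transmission probabilities on these two slides follow from enumerating the four equally likely allele pairs a couple can pass on. A minimal sketch, assuming genotypes are written as two-character strings such as "Dd" (the function names are hypothetical):

```python
from itertools import product


def transmission_probs(father, mother):
    """P(offspring genotype | parental genotypes) under Mendel's rules."""
    probs = {}
    # Each parent transmits one of its two alleles with probability 1/2,
    # so each (paternal, maternal) allele pair has probability 1/4.
    for a, b in product(father, mother):
        genotype = "".join(sorted(a + b))  # unordered genotype, e.g. "Dd"
        probs[genotype] = probs.get(genotype, 0.0) + 0.25
    return probs


def offspring_prob(father, mother, children):
    """Joint probability of several offspring, transmitted independently."""
    table = transmission_probs(father, mother)
    result = 1.0
    for child in children:
        result *= table.get("".join(sorted(child)), 0.0)
    return result


print(transmission_probs("Dd", "Dd"))
# Offspring 3 (dd), 4 (Dd) and 5 (DD) from a Dd x Dd mating.
print(offspring_prob("Dd", "Dd", ["dd", "Dd", "DD"]))
```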

13 One locus: Penetrance
Usual to assume that the chance of having a particular phenotype (being affected with a disease, say) depends only on the genotype at one locus
Complete penetrance: P(affected|DD) = 1
Incomplete penetrance: P(affected|DD) = p (<1)

14 One locus: putting it all together
[Pedigree figure: parents 1 (Dd) and 2 (Dd); offspring 3 (dd), 4 (Dd), 5 (DD)]
Assume: P(Aff|dd) = .1, P(Aff|Dd) = .3, P(Aff|DD) = .8, P(D) = .01
P(pedigree) = (2 x .01 x .99 x .7) x (2 x .01 x .99 x .3) x (½ x ½ x .9) x (2 x ½ x ½ x .7) x (½ x ½ x .8)
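
The product on this slide combines founder (HWE) probabilities, transmission probabilities and penetrances. The sketch below reproduces it under one stated assumption: the affection statuses are inferred from the penetrance factors in the product (individuals 2 and 5 affected; 1, 3 and 4 unaffected), since the pedigree drawing itself is not recoverable. All names are hypothetical.

```python
PENETRANCE = {"dd": 0.1, "Dd": 0.3, "DD": 0.8}  # P(Aff | genotype), from the slide
P_D = 0.01                                       # P(D), from the slide

# Offspring genotype probabilities for a Dd x Dd mating.
TRANSMISSION_HET_X_HET = {"DD": 0.25, "Dd": 0.5, "dd": 0.25}


def founder_prob(genotype, p=P_D):
    """Hardy-Weinberg probability of a founder genotype."""
    q = 1.0 - p
    return {"DD": p * p, "Dd": 2 * p * q, "dd": q * q}[genotype]


def phenotype_prob(genotype, affected):
    """P(observed affection status | genotype)."""
    f = PENETRANCE[genotype]
    return f if affected else 1.0 - f


def pedigree_likelihood(founders, children):
    """Multiply founder, transmission and penetrance terms over the pedigree."""
    likelihood = 1.0
    for genotype, affected in founders:
        likelihood *= founder_prob(genotype) * phenotype_prob(genotype, affected)
    for genotype, affected in children:
        likelihood *= TRANSMISSION_HET_X_HET[genotype] * phenotype_prob(genotype, affected)
    return likelihood


# Affection statuses below are inferred from the slide's factors (an assumption).
founders = [("Dd", False), ("Dd", True)]                  # individuals 1 and 2
children = [("dd", False), ("Dd", False), ("DD", True)]   # individuals 3, 4 and 5
likelihood = pedigree_likelihood(founders, children)
print(likelihood)
```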

15 Crossing over and Recombination

16 Two loci: Linkage and Recombination
[Pedigree figure: parents 1 and 2 and their offspring 3, who is heterozygous at both loci (Dd, Tt)]
3 produces gametes in proportions:
         T          t          Total
D        (1-θ)/2    θ/2        ½
d        θ/2        (1-θ)/2    ½
Total    ½          ½

17 Recombination Fraction
θ = ½ : independent assortment (Mendel)
θ < ½ : linked loci
θ = 0 : tightly linked loci (no recombination)
In 3, if the loci are linked then D-T and d-t are parental haplotypes, D-t and d-T are recombinant haplotypes
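
The gamete proportions produced by a double heterozygote with parental haplotypes D-T and d-t can be tabulated as a function of the recombination fraction (written theta in the code). A minimal sketch with a hypothetical function name:

```python
def gamete_proportions(theta):
    """Gamete proportions for a D-T / d-t double heterozygote.

    theta is the recombination fraction, assumed to lie in [0, 0.5].
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    parental = (1.0 - theta) / 2.0
    recombinant = theta / 2.0
    return {"D-T": parental, "d-t": parental, "D-t": recombinant, "d-T": recombinant}


for theta in (0.0, 0.1, 0.5):
    print(theta, gamete_proportions(theta))
```

At theta = ½ all four gametes are equally frequent (independent assortment); at theta = 0 only the parental haplotypes are produced.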

18 LOD-score Linkage Analysis
LOD(θ*) = log10 of the odds ratio L: L = P(data|θ*)/P(data|½)
LOD(θ*) measures the relative strength of the data for θ = θ* rather than θ = ½
Can compute LOD(θ) at several values
Can find the value θ maximizing the LOD
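
As a rough illustration of the LOD computation, the sketch below assumes the simplest possible data: n phase-known informative meioses, of which r are recombinant, so that P(data | theta) is proportional to theta^r (1 - theta)^(n - r). That likelihood, the counts r = 2 and n = 10, and the function names are assumptions made for illustration; real pedigree data generally require a full likelihood calculation.

```python
import math


def log10_likelihood(theta, recombinants, meioses):
    """log10 of theta^r * (1 - theta)^(n - r) for phase-known meioses."""
    r, n = recombinants, meioses
    if theta == 0.0:
        # With theta = 0, any observed recombinant has probability zero.
        return 0.0 if r == 0 else float("-inf")
    return r * math.log10(theta) + (n - r) * math.log10(1.0 - theta)


def lod(theta, recombinants, meioses):
    """LOD(theta) = log10 [ P(data | theta) / P(data | 1/2) ]."""
    return (log10_likelihood(theta, recombinants, meioses)
            - log10_likelihood(0.5, recombinants, meioses))


# Hypothetical data: 2 recombinants among 10 informative meioses.
r, n = 2, 10
grid = [i / 100 for i in range(0, 51)]  # theta = 0.00, 0.01, ..., 0.50
best_theta = max(grid, key=lambda t: lod(t, r, n))
print(best_theta, lod(best_theta, r, n))
```

Evaluating the LOD on a grid mirrors the slide's "compute LOD(θ) at several values"; under this simple likelihood the maximum falls at the observed recombinant proportion r/n.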

19 IBD Allele Sharing

20 Allele-sharing Methods
Based on number (or proportion) of alleles shared identical by descent (IBD) of related individuals
Can be done either assuming (likelihood-based) or not assuming (nonparametric) a genetic mode of inheritance for a trait
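
One simple nonparametric allele-sharing statistic can be sketched for affected sib pairs. The slides do not specify a particular test; this example assumes sib pairs whose IBD status at the marker is known without ambiguity, and compares the mean proportion of alleles shared IBD with its null expectation of ½ (sharing 0, 1 or 2 alleles with probabilities ¼, ½, ¼). The counts and the function name are hypothetical.

```python
import math


def mean_ibd_test(counts):
    """Mean IBD-sharing statistic for sib pairs.

    counts: (n0, n1, n2), the numbers of pairs sharing 0, 1 and 2 alleles IBD.
    Returns the mean proportion shared and a z-score against the null mean 1/2.
    """
    n0, n1, n2 = counts
    n = n0 + n1 + n2
    mean_prop = (0.5 * n1 + 1.0 * n2) / n
    # Under the null, the proportion shared per pair has variance 1/8,
    # so the mean over n independent pairs has variance 1/(8n).
    z = (mean_prop - 0.5) / math.sqrt(1.0 / (8.0 * n))
    return mean_prop, z


# Hypothetical counts for 100 affected sib pairs.
print(mean_ibd_test((10, 50, 40)))
```

A large positive z-score suggests that affected relatives share more alleles IBD at the marker than expected by chance, which is the signal allele-sharing methods look for.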

21 Errors
Genotyping errors can result in false positive or false negative findings
Data checking/cleaning necessary (although there are approaches which model error)
Must be especially careful with SNP genotypes, because errors often pass simple Mendelian checks
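
A small example shows why errors at a biallelic SNP can slip through a Mendelian check. This is a minimal sketch of a trio consistency test, assuming genotypes are two-character strings over hypothetical allele labels A and B; it is not taken from the slides.

```python
from itertools import product


def mendelian_consistent(father, mother, child):
    """True if the child's genotype can arise from one allele of each parent."""
    possible = {"".join(sorted(a + b)) for a, b in product(father, mother)}
    return "".join(sorted(child)) in possible


# With two heterozygous parents every child genotype is possible, so a child
# who is truly AA but miscalled as AB still passes the check.
print(mendelian_consistent("AB", "AB", "AB"))
# The same miscall is caught only when the parents make it impossible.
print(mendelian_consistent("AA", "AA", "AB"))
```

With only two alleles, many parental combinations are compatible with several offspring genotypes, so a miscall frequently remains Mendelian-consistent.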

22 Disease-Marker Association
A marker locus is associated with a disease if the distribution of genotypes at the marker locus in disease-affected individuals differs from the distribution in the general population
A specific allele may be positively associated (over-represented in affecteds) or negatively associated (under-represented)

23 Examples: Alzheimer’s
Alzheimer’s disease and ApoE
            E4 present   E4 absent
Patients    58           33
Controls    16           55
The E4 allele appears to be positively associated with Alzheimer’s disease:
Odds Ratio = (58/16)/(33/55) ≈ 6
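
The odds ratio for this 2 x 2 table can be reproduced in a few lines. The approximate 95% confidence interval (Woolf's log-based method) is an addition for illustration and does not appear on the slide; the function names are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]] = (a/c)/(b/d) = ad/(bc)."""
    return (a * d) / (b * c)


def woolf_ci(a, b, c, d, z=1.96):
    """Approximate confidence interval for the odds ratio on the log scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Rows: patients, controls; columns: E4 present, E4 absent (counts from the slide).
a, b, c, d = 58, 33, 16, 55
print(odds_ratio(a, b, c, d))
print(woolf_ci(a, b, c, d))
```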

24 Examples: HLA
Disease                         Allele   RR
Ankylosing spondylitis          B27      87
Myasthenia gravis               B8       4.1
Systemic lupus erythematosus    B8       2.1
Hemochromatosis                 A3       8.2
(and many more…)

25 Linkage Disequilibrium
[Diagram labels: disease locus (alleles D, d); marker locus (alleles M, m); LD; disease; penetrance]

26 Linkage Disequilibrium
Concept of the ‘historical recombinant’
Explanations for observed association between marker and disease:
–Marker locus may be a disease susceptibility locus
–Marker locus may be linked to disease susceptibility locus
–Spurious result due to, e.g., admixture, population stratification, heterogeneity

27 Linkage and LD
Mutation occurs: allele D is created
Nearby marker: allele M was nearby
D and M subsequently transmitted together

28 Candidate Polymorphism Testing
Linkage and LD assume markers have indirect association with the trait
Large SNP collections may allow testing for direct, physiologically relevant associations with trait

29 Diagnostics/Risk Profiling
Identified SNP associations can potentially be used to develop diagnostic tools
Applicability will require large-scale studies, since most diseases of interest now are influenced by many genetic and nongenetic factors

30 Response Prediction
Related to diagnosis/risk assessment
Strategy: stratify populations to improve effectiveness of interventions
Pharmaceutical companies especially interested in this:
–Aim to identify those likely to respond
–Predict toxicity reactions in susceptible individuals
Response to any kind of substance; creation of ‘functional foods’

31 Homogeneity Testing
Test to protect against false inferences about the relationship between endpoints (e.g. disease) and risk factors
Assess generalizability of results
Can assess the homogeneity of the genetic background of study participants using a panel of randomly distributed SNPs

32 Gene Function Identification
Alternative to other experimental procedures (e.g. knock-outs, which cannot be used in humans)
Studies to compare individuals with and without naturally occurring disease predisposing genetic profiles

33 Haplotype Variation
The large databases already available (and increasing in size) should allow characterization of haplotype variation across the genome in different populations
Can help population geneticists trace evolution and reveal connections between populations/ethnic groups

