1 Linkage analysis and eQTL studies. Tom Price, MRC SGDP Centre, Institute of Psychiatry. Systems Biomedicine Graduate Programme 2008/9.
2 Genetic Linkage Studies. Use the inheritance of markers within families to identify chromosomal regions where disease genes may lie. [Figure: genetic markers M1 to M7 and a disease susceptibility gene.]
3 Linkage. [Figure: pedigree with disease cases marked. Genotypes shown: 2 2, 1 1; 2 1, 3 3; 1 3, 1 3, 1 3, 2 3, 2 3, 2 3.] Random chance? Or linkage between marker and disease locus?
4 The Possibilities. Traits are arranged from SIMPLE to COMPLEX and from CONTINUOUS to DISCRETE.
Simple, continuous: multiple alleles of a single gene; different alleles, different effects (trinucleotide repeat diseases).
Simple, discrete: Mendelian; one gene = one trait (cystic fibrosis).
Complex, continuous: quantitative traits; multiple genes and environment (height).
Complex, discrete: non-Mendelian; multiple genes and environment (epilepsy, liability to stroke).
5 One Gene, One Trait? Laws of heredity discovered by Mendel, 1865. Three laws of heredity.
6 Mendel's Laws. Dominance: when two contrasting characters are crossed, only one appears in the next generation. Segregation: for each trait, a gamete carries only one of the two parental alleles. Independent assortment: alleles for different traits are inherited independently of each other.
11 Independent Assortment. Eye colour IS NOT predictable from hair colour: blonde hair and brown or blue eyes; brown hair and blue or brown eyes.
12 Mendel's Laws. Dominance and segregation stand, but the third law, independent assortment, is crossed out on the slide: it does not always hold. Failure of the third law makes gene-finding possible.
13 Independent Assortment. Eye colour IS often predictable from hair colour: blonde hair and blue eyes; brown hair and dark eyes.
14 What is Linkage? A method to map the relative positions of two or more loci using genetic markers. It works because loci that lie close together do not obey Mendel's third law.
15-17 Breaking the Third Law. [Figure: pedigree adapted from Phillip McLean. A, B, O = blood group alleles; individuals are marked affected or unaffected.] The ABO locus predicts the D locus.
18 Genetics for Card Players. We can think of genetic information as a deck of cards. The closer two cards are, the less likely it is that they will separate during shuffling. If not much shuffling has occurred, more distant cards can act as markers.
19 Linkage Groups. If inheritance of two loci is independent, they are unlinked. If inheritance of two loci is dependent, they are in the same linkage group. Linkage groups correspond to the physical structures called chromosomes.
20 Chromosomes. Chromosomes are NOT inherited as a single block. Recombination occurs at meiosis and affects co-inheritance of alleles.
21 Recombination and Meiosis. Nearby loci A and B are likely to co-segregate during meiosis. Distant loci B and C are less likely to co-segregate during meiosis.
22-24 Recombination. For any pair of markers: parental pattern = NR (non-recombinant), mixed pattern = R (recombinant).
Parents: AaBb and ccdd.
Non-recombinant gametes: AB, ab.
Recombinant gametes: Ab, aB.
Offspring: AcBd (NR), AcBd (NR), Acbd (R), acBd (R).
25 Recombination Fraction = the proportion of offspring that are recombinant between two loci. RF = 0.5 between unlinked loci (e.g. on different chromosomes).
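The definition above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that phase is known, so each transmitted gamete can be labelled directly as parental (NR) or mixed (R).

```python
def classify_gamete(gamete, parental_haplotypes):
    """Label a transmitted gamete 'NR' if it matches a parental haplotype, else 'R'."""
    return "NR" if gamete in parental_haplotypes else "R"


def recombination_fraction(labels):
    """Proportion of gametes labelled 'R' among a list of 'R'/'NR' labels."""
    if not labels:
        raise ValueError("need at least one offspring")
    unknown = set(labels) - {"R", "NR"}
    if unknown:
        raise ValueError(f"unexpected labels: {unknown}")
    return labels.count("R") / len(labels)


# Gametes transmitted by the AaBb parent (haplotypes AB and ab) to the four
# offspring on the Recombination slide: AcBd, AcBd, Acbd, acBd.
parental = ("AB", "ab")
gametes = ["AB", "AB", "Ab", "aB"]
labels = [classify_gamete(g, parental) for g in gametes]
print(labels, recombination_fraction(labels))
```

With only four offspring the estimate is very imprecise; real studies pool many informative meioses.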
26 Parametric Linkage Analysis. Uses pedigree information to estimate the recombination fraction between markers and disease. Assumes a particular model of inheritance (additive, dominant, recessive). Useful for Mendelian disorders (single gene).
27 Allele Sharing. People with rare diseases are more highly related to each other near the disease-causing gene than you would typically expect. This is because nearby markers tend to be inherited together with the disease locus. We can look for excess allele sharing as a signal that a disease locus is nearby.
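As a rough illustration of how a recombination fraction is estimated and tested, here is a minimal sketch of the classical two-point LOD score. It assumes the simplest possible case: phase-known, fully informative meioses that can be counted as recombinant or non-recombinant. Real pedigree software handles unknown phase, missing genotypes and the disease model; the function names here are hypothetical.

```python
import math


def lod_score(n_recomb, n_nonrecomb, theta):
    """log10 likelihood ratio of recombination fraction theta versus 0.5 (no linkage)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    n = n_recomb + n_nonrecomb
    log_l_theta = n_recomb * math.log10(theta) + n_nonrecomb * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(n_recomb, n_nonrecomb, grid_steps=500):
    """Grid search over theta in (0, 0.5]; returns (best LOD, best theta)."""
    thetas = [0.5 * i / grid_steps for i in range(1, grid_steps + 1)]
    return max((lod_score(n_recomb, n_nonrecomb, t), t) for t in thetas)


# Example: 1 recombinant among 10 informative meioses.
best_lod, best_theta = max_lod(1, 9)
print(round(best_lod, 2), best_theta)
```

The grid search is a deliberate simplification; for counted meioses the maximum is simply at the observed recombinant proportion, capped at 0.5.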
28 Identity By State. When two individuals possess the same alleles at a locus, they are said to be identical by state (IBS). For example, affected sibs with genotypes ac and ad share one allele IBS, the allele a.
29 Identity By State. But if the parental genotypes are unknown (??), we do not know whether the offspring (ac and ad) have inherited the a allele from the same parent or from different parents. We can't establish shared inheritance, so IBS allele sharing alone is of little use for linkage analysis.
30 Identity By Descent. Individuals who share copies of a common ancestral allele are said to be identical by descent (IBD). For example, with parents ab and cd, affected sibs ac and ad share one allele IBD: the paternal allele a has been transmitted to both offspring.
32-33 Allele Sharing in Affected Sib Pair. Parents ab and cd; the first sib is ac and the second sib is unknown (??). Probabilities are those expected under random transmission of marker alleles.
Sibling genotypes ac, ac: 2 alleles shared IBD, expected probability 1/4.
Sibling genotypes ac, ad: 1 allele shared IBD, expected probability 1/4.
Sibling genotypes ac, bc: 1 allele shared IBD, expected probability 1/4.
Sibling genotypes ac, bd: 0 alleles shared IBD, expected probability 1/4.
So a sib pair shares 2, 1 or 0 alleles IBD with probability 1/4, 1/2 and 1/4. But what if the marker lies near a disease gene? Affected siblings are then more likely to share marker alleles IBD.
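The expected probabilities under random transmission can be checked by brute-force enumeration. The sketch below assumes a fully informative mating (four distinct parental alleles), so IBD status depends only on which copy each parent transmitted; the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def ibd_distribution():
    """Enumerate all equally likely transmissions to two sibs and count alleles shared IBD."""
    # Each sib receives copy 0 or 1 from the father and copy 0 or 1 from the mother.
    transmissions = list(product(range(2), repeat=2))
    counts = Counter()
    for sib1, sib2 in product(transmissions, repeat=2):
        shared = int(sib1[0] == sib2[0]) + int(sib1[1] == sib2[1])
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}


print(ibd_distribution())
```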
34 Non-parametric Linkage Analysis. Uses information on IBD allele sharing, usually between affected sibs. There is no need to specify the model of inheritance at any locus. Useful for complex traits (multiple genes, different modes of inheritance).
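One simple way to test for excess sharing is the so-called mean test, sketched below. It is not necessarily the statistic used by the programs named in this talk. It assumes every affected sib pair is fully informative, so the number of alleles shared IBD (0, 1 or 2) is known for each pair; under no linkage the proportion shared has mean 1/2 and variance 1/8 per pair. The function name and example counts are hypothetical.

```python
import math


def asp_mean_test(n_ibd0, n_ibd1, n_ibd2):
    """Return (mean proportion of alleles shared IBD, z statistic against 0.5)."""
    n = n_ibd0 + n_ibd1 + n_ibd2
    if n == 0:
        raise ValueError("need at least one sib pair")
    mean_sharing = (0.5 * n_ibd1 + 1.0 * n_ibd2) / n
    # Null variance of the per-pair sharing proportion is 1/8.
    z = (mean_sharing - 0.5) / math.sqrt(1.0 / (8.0 * n))
    return mean_sharing, z


# Made-up counts for 100 affected sib pairs sharing 0, 1 and 2 alleles IBD.
mean_sharing, z = asp_mean_test(15, 50, 35)
print(round(mean_sharing, 2), round(z, 2))
```

A large positive z indicates more sharing than expected by chance at that marker.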
40 Complications of Linkage Analysis. [Figure: pedigree with untyped individuals marked ??.] With unknown parental genotypes, allele sharing must be estimated using population allele frequencies. Families with fewer than four distinct parental alleles may give unclear sharing. Multipoint linkage analysis, using information from adjacent markers, will increase power to detect genes. It is computationally intensive: computer programs are used to calculate LOD scores. Other problems arise from non-paternity, genotyping errors, sample mix-ups and poor phenotype definition.
41 Software. Several programs are available, including: parametric: LINKAGE, MLINK; non-parametric: MERLIN, GENEHUNTER.
42 Linkage Study Design. Candidate gene search: dense marker genotyping within a region of positional or functional interest. Genome search: aims to identify several susceptibility genes. Families are genotyped on polymorphic markers across all chromosomes: microsatellite markers across the genome, separated by 10 cM (or, more recently, 10,000 SNP markers). Tighter marker spacing gives more information. Having few markers makes it difficult to reconstruct haplotypes, particularly without parental genotypes.
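To relate a marker spacing in centimorgans to a recombination fraction, one option is Haldane's map function, sketched here. Haldane's function assumes no crossover interference; other map functions (for example Kosambi's) give slightly different values. The function name is hypothetical.

```python
import math


def haldane_rf(distance_cm):
    """Recombination fraction for a map distance in centimorgans (Haldane, no interference)."""
    if distance_cm < 0:
        raise ValueError("distance must be non-negative")
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


# Adjacent markers on a 10 cM grid, and two very distant loci.
print(round(haldane_rf(10), 4), round(haldane_rf(500), 4))
```

At short distances the recombination fraction is close to the map distance itself; at long distances it levels off at 0.5, the value for unlinked loci.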
43 Significance Level. Lander and Kruglyak (1995) suggested criteria for affected sibling pair studies in complex diseases: LOD score > 2.2, suggestive linkage; LOD score > 3.6, significant linkage. These LOD scores are expected to occur by chance once per genome search and once per 20 genome searches, respectively. Many studies of complex disease do not reach these cut-offs. Another approach is to report the highest LOD scores even if they are below these thresholds and look for replication across studies.
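For orientation, a LOD score can be converted to an approximate pointwise p-value. The sketch below assumes the common one-degree-of-freedom, one-sided approximation, in which 2 ln(10) x LOD is compared with a chi-square distribution and the tail is halved. The exact null distribution depends on the statistic used, so treat the numbers as rough guides. The function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """Approximate one-sided pointwise p-value for a LOD score (1 df chi-square mixture)."""
    if lod < 0:
        raise ValueError("LOD must be non-negative")
    chi2 = 2.0 * math.log(10.0) * lod
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    upper_tail = math.erfc(math.sqrt(chi2 / 2.0))
    return 0.5 * upper_tail


for threshold in (2.2, 3.6):
    print(threshold, f"{lod_to_pointwise_p(threshold):.1e}")
```

The pointwise values are far smaller than 0.05 because the genome-wide thresholds already allow for the many positions tested in a genome search.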
45 Does It Work? Very powerful for mapping single gene disorders, e.g. early-onset Alzheimer's Disease, many forms of mental retardation... but many non-replications for complex traits.
46 Linkage v Association.
Usual sample. Linkage: families. Association: unrelated individuals (e.g. case-control).
Good for finding. Linkage: rare variants with large effects. Association: common variants with small effects.
Identifies. Linkage: broad chromosomal region. Association: narrow region, usually within a single gene.
47 Break. Next up: application of linkage analysis to gene expression phenotypes.
49 Finding Disease Pathways. Conduct a linkage/association study to find a candidate. Determine candidate gene function experimentally. Problems: markers only give regional information, so the identity of the causal variants remains obscure; many GWAS hits are nowhere near any genes; reliance on animal and in vitro models to probe function.
50 Genetics of Gene Expression. Linkage study or GWAS using mRNA abundance as the phenotype. Motivation: mRNA abundance as 'endophenotype'; it lies on the causal path between genetic variation and disease. Hits ('eQTLs') may have less complex inheritance: larger effect sizes, fewer causal variants? We may already know which transcripts are dysregulated in diseased tissues. eQTLs can provide a link to finding susceptibility genes.
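To make the idea of using mRNA abundance as the phenotype concrete, here is a minimal sketch of a single-SNP association test: a simple linear regression of expression on allele dosage (0, 1 or 2 copies), returning the slope and the proportion of expression variance explained. It assumes unrelated individuals and an additive model. The function name and data are invented for illustration, and the family-based linkage studies described in this talk use variance-component methods instead.

```python
def eqtl_regression(dosages, expression):
    """Least-squares slope and R^2 for expression regressed on allele dosage."""
    if len(dosages) != len(expression) or len(dosages) < 3:
        raise ValueError("need matching lists with at least 3 individuals")
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has no variation")
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Made-up data: expression rises by about one unit per copy of the allele.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, r_squared = eqtl_regression(dosages, expression)
print(round(slope, 3), round(r_squared, 3))
```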
55 Human eQTL Studies (selected list). Columns: 1st author, year, journal, population, sample, tissue, measure, genotyping. Cells merged on the slide are listed once.
Morley, 2004, Nature, CEPH, 14 pedigrees, LCL, 8K Affy, linkage scan.
Monks, AJHG, 15 pedigrees, 25K oligo.
Dixon, 2007, Nat Gen, MRC-A, 206 families, 54K Affy.
Göring, SAFHS, 1,240 individuals, lymphocytes, 47K Illumina, Illumina 100K.
Stranger, Science, HapMap, 270 individuals, 2M SNPs + 7K CNVs.
Emilsson, 2008, IFB/IFA, 1,002/673 individuals, blood/adipose, Illumina 370K + linkage scan.
56 Largest human eQTL study to date. 1,240 subjects from extended pedigrees. Blood lymphocytes, not lymphocyte cell lines. 47K Illumina WG-6 Series I microarray. Expression adjusted for age and sex.
57 Heritability. 85% of the 19,648 transcripts detected were heritable (FDR 5%).
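Results in these slides are reported at a false discovery rate (FDR) of 5%. The slides do not say which FDR procedure was used; the sketch below shows the widely used Benjamini-Hochberg step-up procedure as one possibility, with a hypothetical function name and made-up p-values.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is declared significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Step-up: find the largest rank k with p_(k) <= fdr * k / m.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# Made-up p-values for eight transcripts.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values))
```

Because the procedure is step-up, a p-value that fails its own rank threshold is still declared significant when a larger p-value passes a later threshold.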
58 Cis Regulation. A single LOD score was calculated at the gene locus to identify cis-regulated transcripts. 1,345 (6.8%) cis-regulated transcripts detected (FDR 5%). eQTL effect size overall: median 1.8%, mean 5.0%. eQTL effect size in significant loci: median 24.6%, mean 29.1%.
59 Trans Regulation. Much lower power. No evidence of master regulators: only 58 transcripts had 2+ peaks with LOD > 3.
60 Gene Discovery Using eQTLs. Promoter variants in VNN1 are associated with transcript abundance and HDL-C concentration.
61 Consistency of Results. Morley cis eQTLs were confirmed by Göring, but not trans eQTLs. This is consistent with tissue specificity of trans regulation, but also with lower power to detect trans effects.
62 Linkage & association study. Icelandic subjects. Blood and adipose tissue samples. Expression adjusted for age, sex and BMI.
63 Tissue Specificity. Linkage eQTLs (FDR 5%) for 20,877 expression traits, 10,364 of them heritable (FDR 5%).
Blood: 2,529 cis, 52 trans.
Adipose: 1,489 cis, 25 trans.
Both: 762 cis, ? trans.
64 Proximity of Cis Acting Variants. Association eSNPs were within 100 kb of the probe for 96% of expression traits with strong cis-acting effects.
65 Potential Problem. Microarray probes overlapping SNPs can give rise to spurious cis eQTLs. Older studies did not have as much resequencing data available to identify probes containing SNPs.
66 Further Directions. Animal models (e.g. mouse F2 crosses). Other tissues (e.g. mouse brain). Evoked phenotypes: genetics of expression response to e.g. ionizing radiation, drug/hormone treatment. Causal modelling: genotype data can establish whether expression changes cause disease or are a consequence of it. Schadt et al. (2005) Nature Genetics 37: