Genotyping and Genetic Maps Bas Heijmans Leiden University Medical Centre The Netherlands.


1 Genotyping and Genetic Maps Bas Heijmans Leiden University Medical Centre The Netherlands

2 Pedigree file in linkage format
1 1 0 0 1 1 1 2
1 2 0 0 2 1 3 4
1 3 1 2 2 1 2 3
1 4 1 2 1 1 2 4
2 1 0 0 1 1 0 0
2 2 0 0 2 1 0 0
2 3 1 2 1 1 5 8
2 4 1 2 2 1 6 7
2 5 1 2 1 1 5 7

3 Pedigree file in linkage format: columns
family id  person id  father  mother  sex  disease status  marker data (1 marker)
1          1          0       0       1    1               1 2
1          2          0       0       2    1               3 4
1          3          1       2       2    1               2 3
1          4          1       2       1    1               2 4
2          1          0       0       1    1               0 0
2          2          0       0       2    1               0 0
2          3          1       2       1    1               5 8
2          4          1       2       2    1               6 7
2          5          1       2       1    1               5 7
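As a minimal sketch of how the columns named on slide 3 map onto a record, the snippet below splits one whitespace-separated line into its fields. The names `PedRecord` and `parse_ped_line` are illustrative and not part of any linkage package. The sex coding (1 = male, 2 = female) and the use of 0 for an unknown parent or allele follow the usual linkage-format convention, and the sample line is one reading of the slide's columns.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PedRecord:
    family: str
    person: str
    father: str          # "0" means the parent is not in the file (founder)
    mother: str
    sex: int             # usual convention: 1 = male, 2 = female
    status: int          # disease status code as given in the file
    genotypes: List[Tuple[str, str]]  # one (allele1, allele2) pair per marker


def parse_ped_line(line: str) -> PedRecord:
    """Split one linkage-format line into its six fixed columns plus allele pairs."""
    fields = line.split()
    if len(fields) < 6 or (len(fields) - 6) % 2 != 0:
        raise ValueError("expected 6 columns followed by an even number of alleles")
    alleles = fields[6:]
    genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return PedRecord(fields[0], fields[1], fields[2], fields[3],
                     int(fields[4]), int(fields[5]), genotypes)


# Third person of family 1: child of persons 1 and 2, genotype 2/3 at the marker.
record = parse_ped_line("1 3 1 2 2 1 2 3")
print(record)
```

Keeping the identifiers as strings avoids assuming that family and person ids are always numeric.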

4 Marker choice for genome-wide linkage scans
Short tandem repeats (STRs, a.k.a. microsatellites), because:
- High heterozygosity (1 STR ~ 5 SNPs)
- There are more than enough (1 per 30 kb, thus >> 1 per cM)
- Reliable genetic maps (Marshfield, Decode)
- Optimized marker sets, spacing down to 5 cM (Marshfield/Applied Biosystems)
- Reasonably automated measurement (2 persons → 40,000 checked genotypes in database per week)
- Low cost per genotype (<$0.15 for consumables)
- Reasonable success and error rates (>92% and <0.8%)

5 Short tandem repeats
Tetranucleotide repeat:
Paternal allele (4 repeats): AACTAACTAACTAACT
                             TTGATTGATTGATTGA
Maternal allele (2 repeats): AACTAACT
                             TTGATTGA

6 Short tandem repeats
Tetranucleotide repeat:
Paternal allele (4 repeats): AACTAACTAACTAACT
                             TTGATTGATTGATTGA
Maternal allele (2 repeats): AACTAACT
                             TTGATTGA
Dinucleotide repeat:
Paternal allele (8 repeats): CACACACACACACACA
                             GTGTGTGTGTGTGTGT
Maternal allele (3 repeats): CACACA
                             GTGTGT
There are also tri- and pentanucleotide repeats.
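The repeat counts on the slide (4 and 2 for the tetranucleotide alleles, 8 and 3 for the dinucleotide alleles) can be checked with a short sketch that finds the longest run of consecutive copies of a motif. The function name `count_tandem_repeats` is illustrative; real STR alleles are called from fragment length rather than from sequence, so this only illustrates what "number of repeats" means.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Walk forward one motif at a time for as long as the motif keeps matching.
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best


print(count_tandem_repeats("AACTAACTAACTAACT", "AACT"))  # paternal tetranucleotide allele
print(count_tandem_repeats("CACACA", "CA"))              # maternal dinucleotide allele
```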

7 Principle of genotyping methods
Short tandem repeats → length differences (CACACACACACACACA/GTGTGTGTGTGTGTGT versus CACACA/GTGTGT)
SNPs → only a sequence difference (GCGC ATAT)
- Destruction of a restriction site (RFLP)
- Hybridization differences (TaqMan)
- Single-base sequencing reaction, i.e. primer extension (Sequenom, Orchid)
- Ligation assay (Illumina)
VNTR, insertion/deletion polymorphisms (1 bp to ~300 bp for an Alu repeat)

8 Genotyping STRs – step 1: PCR

9 CACA     / GTGT:     20 + 35 + 25 + 4 + 20 = 104 bp
  CACACACA / GTGTGTGT: 20 + 35 + 25 + 8 + 20 = 108 bp
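The arithmetic on this slide can be written as a small function: the product length is the sum of the fixed segments plus the length of the repeat stretch. This is a sketch under assumptions. The slide gives only the numbers 20, 35, 25 and 20, so reading the two 20 bp segments as primers and the 35 and 25 bp segments as flanking sequence is an interpretation, and `pcr_product_length` is an illustrative name.

```python
from typing import Sequence


def pcr_product_length(repeat_units: int,
                       motif_length: int = 2,
                       fixed_segments: Sequence[int] = (20, 35, 25, 20)) -> int:
    """Length in bp of a PCR product spanning a repeat with `repeat_units` copies.

    fixed_segments: the non-repeat parts of the product (assumed here to be
    primers and flanking sequence); only the repeat stretch varies per allele.
    """
    if repeat_units < 0:
        raise ValueError("repeat_units must be non-negative")
    return sum(fixed_segments) + repeat_units * motif_length


# A CA repeat with 2 copies (4 bp) versus 4 copies (8 bp).
print(pcr_product_length(2), pcr_product_length(4))
```

Because the fixed segments are identical for both alleles, the 4 bp difference between the products comes entirely from the two extra CA units.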

10 Genotyping STRs – step 1: PCR in practice
genomic DNA + primers + Taq DNA polymerase + dNTPs (A, C, G, T) + buffer

11 Genotyping STRs – step 2: electrophoresis
Detect length differences
- Agarose or polyacrylamide slab gel
- DNA is negatively charged, so it migrates from the negative (−) electrode towards the positive (+) electrode
- Longer fragments migrate more slowly than shorter ones through the polymer network

12 To scan the whole human genome…
- 1 short tandem repeat every 10 cM makes 400 markers per individual
- Assuming 1000 individuals (preferably 1000s)
- One whole genome scan = 400 × 1000 = 400,000 genotypings

13 Not like this…….

14 Not like this……. but like this 96-well plates 384-well plates

15 Not like this…….

16 Not like this……. but like this

17 Not like this…….

18 Not like this……. but like this

19 Electrophoresis using automated sequencer
- 96 capillaries (no lanes) (ABI3700)
- Primers are labelled with fluorescent dye
- Machine detects PCR products through a laser
- Put in machine and all goes automatically
- Typically 15 markers in one capillary: 2.5 h
[Figure: labelled fragments (TCAG, TGTGTG/ACACAC, GTCA, CAGT) migrating from the − electrode to the + electrode past the laser and detector, shown at the start and a bit later]

20 Throughput
- A 384-well plate takes about one night
- 384 samples minus 16 controls = 368
- 15 markers per sample makes 368 × 15 = 5520 genotypes (if the success rate is 100%)
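The throughput figures can be combined with the 400,000 genotypings of slide 12 to get a rough number of plates per genome scan. The sketch below is only an estimate: the function names are illustrative, and simply dividing the total by the per-plate yield ignores repeats of failed samples and how markers are grouped into panels. The 92% success rate is the lower bound quoted on slide 4.

```python
import math


def genotypes_per_plate(wells: int = 384, controls: int = 16,
                        markers_per_sample: int = 15,
                        success_rate: float = 1.0) -> float:
    """Expected number of successful genotypes from one plate."""
    samples = wells - controls
    return samples * markers_per_sample * success_rate


def plates_needed(total_genotypes: int, per_plate: float) -> int:
    """Number of whole plates needed to reach `total_genotypes`."""
    return math.ceil(total_genotypes / per_plate)


scan = 400 * 1000  # markers per individual x individuals, as on slide 12
print(genotypes_per_plate())
print(plates_needed(scan, genotypes_per_plate()))
print(plates_needed(scan, genotypes_per_plate(success_rate=0.92)))
```

At 100% success this comes to 73 plates, so at about one plate per night a single machine would need roughly two and a half months for one scan.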

21 Tetranucleotide repeat marker (e.g. multiples of AACT)

22 Detected length of PCR product depends on the machine
- Standards are used to correct this (CEPH DNA samples)
- Take this into account when analysing data from different machines/labs
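One simple way to picture the correction is a constant offset estimated from a control sample whose allele length is known. This is an assumption made for illustration: the slide does not say how the CEPH standards are applied, real size calling may need more than an additive shift, and all numbers and function names below are hypothetical.

```python
from typing import List


def calibration_offset(observed_control: float, reference_control: float) -> float:
    """Shift that maps this machine's reading of the control onto its reference length."""
    return reference_control - observed_control


def correct_lengths(observed: List[float], offset: float) -> List[int]:
    """Apply the offset and round to whole base pairs; 0 (missing) stays 0."""
    return [round(length + offset) if length else 0 for length in observed]


# Hypothetical run: the control allele with reference length 104 bp reads as 104.6 bp.
offset = calibration_offset(observed_control=104.6, reference_control=104.0)
print(correct_lengths([102.7, 106.5, 0], offset))
```

Applying the same procedure on every machine puts the allele lengths on a common scale before data from different machines or labs are pooled.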

23 Dinucleotide repeat marker (e.g. multiples of CA)

24 Dinucleotide repeats give less clean pictures, but in practice this is no problem as long as the pattern is always the same.
However, markers not in the standard 10 cM screening sets are often more problematic (different stutter patterns for different samples, non-constant ratio of 'real peak' to plus-A peak) → increased error rates?

25 The result: allele lengths
CACA     / GTGT:     20 + 35 + 25 + 4 + 20 = 104 bp
CACACACA / GTGTGTGT: 20 + 35 + 25 + 8 + 20 = 108 bp

26 Pedigree file in linkage format: raw marker data and renumbered data
family  person  father  mother  sex  status  raw data  renumbered
1       1       0       0       1    1       102 104   1 2
1       2       0       0       2    1       106 110   3 4
1       3       1       2       2    1       104 106   2 3
1       4       1       2       1    1       104 110   2 4
2       1       0       0       1    1       0   0     0 0
2       2       0       0       2    1       0   0     0 0
2       3       1       2       1    1       111 118   5 8
2       4       1       2       2    1       112 114   6 7
2       5       1       2       1    1       111 114   5 7
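The step from raw allele lengths to renumbered alleles can be sketched as follows: collect the distinct lengths seen at a marker, sort them, and number them from 1, keeping 0 for missing data. The function name `renumber_alleles` is illustrative, and the pairing of the slide's raw lengths into genotypes is inferred from the renumbered columns, so treat it as an assumption.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]


def renumber_alleles(genotypes: List[Genotype]) -> Tuple[List[Genotype], Dict[int, int]]:
    """Map raw allele lengths to consecutive codes 1..n in order of length; 0 stays 0."""
    observed = sorted({allele for pair in genotypes for allele in pair if allele != 0})
    code = {length: number for number, length in enumerate(observed, start=1)}
    code[0] = 0  # missing genotypes remain missing
    recoded = [(code[a], code[b]) for a, b in genotypes]
    return recoded, code


raw = [(102, 104), (106, 110), (104, 106), (104, 110), (0, 0),
       (0, 0), (111, 118), (112, 114), (111, 114)]
recoded, code = renumber_alleles(raw)
print(recoded)
print(code)
```

Numbering in order of length keeps the coding reproducible, but the codes are only comparable between data sets that were renumbered together.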

27 Genetic map of measured markers
For IBD estimation using Merlin or other software:
- Pedigree file
- Genetic map

28 Markers measured on chromosome 19 (16 markers)
d19s247, d19s1034, d19s391, d19s865, d19s394, d19s588, d19s49, d19s433, d19s47, d19s420, d19s178, apoc2, d19s246, d19s180, d19s210, d19s254

29 Genetic maps
Available from:
- Marshfield Center for Medical Genetics: http://research.marshfieldclinic.org/genetics/
- Decode Genetics (most accurate): supplemental data to Kong et al. Nat Genet 2002;31:241-7; see F:\Bas\Genotyping&Maps\DecodeMap.xls


38 Merlin Map File
CHROMOSOME  MARKER    LOCATION
19          d19s247     9.84
19          d19s1034   20.75
19          d19s391    28.83
19          d19s865    32.39
19          d19s394    34.25
19          d19s588    42.28
19          d19s49     50.81
19          d19s433    51.88
19          d19s47     63.10
19          d19s420    66.30
19          d19s178    68.08
19          apoc2      69.50
19          d19s246    78.08
19          d19s180    87.66
19          d19s210   100.01
19          d19s254   100.61
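A map file like this is plain whitespace-separated text, so it is easy to check before running a linkage analysis. The sketch below parses the three columns and reports the largest gap between adjacent markers. The helper names `parse_map` and `largest_gap` are illustrative and not part of Merlin, and the marker positions are those of the slide.

```python
from typing import List, Tuple

Marker = Tuple[str, str, float]  # (chromosome, marker name, position in cM)

MAP_TEXT = """\
CHROMOSOME MARKER LOCATION
19 d19s247 9.84
19 d19s1034 20.75
19 d19s391 28.83
19 d19s865 32.39
19 d19s394 34.25
19 d19s588 42.28
19 d19s49 50.81
19 d19s433 51.88
19 d19s47 63.10
19 d19s420 66.30
19 d19s178 68.08
19 apoc2 69.50
19 d19s246 78.08
19 d19s180 87.66
19 d19s210 100.01
19 d19s254 100.61
"""


def parse_map(text: str) -> List[Marker]:
    """Read chromosome, marker and position columns, skipping the header and blank lines."""
    markers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[0].upper() == "CHROMOSOME":
            continue
        markers.append((fields[0], fields[1], float(fields[2])))
    return markers


def largest_gap(markers: List[Marker]) -> Tuple[str, str, float]:
    """Return (left marker, right marker, distance in cM) for the widest adjacent interval."""
    gaps = [(a[1], b[1], b[2] - a[2])
            for a, b in zip(markers, markers[1:]) if a[0] == b[0]]
    return max(gaps, key=lambda gap: gap[2])


markers = parse_map(MAP_TEXT)
left, right, distance = largest_gap(markers)
print(len(markers), left, right, round(distance, 2))
```

For these 16 markers the widest interval lies between d19s180 and d19s210, a little over 12 cM.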

