1
Analyzing human population genetic history through the study of genetic variation
Mark Mata
Mentor: Eleazar Eskin
UCLA ZarLab
SoCalBSI 2009
2
Background
To study human population genetic history is to study part of human evolution
How humans evolved is one of the fundamental questions in science
We ask ourselves many questions, such as:
  Where do we come from?
  Why are we all different?
  How are we all different?
3
Background
The ZarLab studies the most recent events in human evolution:
  Now that we have modern humans, what variations have occurred in our genes since our ancient African ancestors?
To answer this question, our group is looking at human variation to produce a genetic history of these changes
4
Why do we care?
Many diseases are caused by variations that have occurred in our genetic history
A better understanding of our genetic history and human variation may eventually lead to better treatment plans
Personalized medicine: “The right drug, in the right dose, to the right person, at the right time.”
PerkinElmer website: http://las.perkinelmer.com/content/snps/genotyping.asp#snps
5
Human Variation
Modern humans share 99.9% of their DNA
The remaining 0.1% accounts for the variation between humans
  80% of this variation is the result of SNPs
    SNP (single-nucleotide polymorphism): a position in the genome where two different bases are present in the population
    The base at a SNP on a chromosome is referred to as the “allele”
    A haplotype is the sequence of alleles on a chromosome
  The other 20% comes from deletions or insertions in the genome
PerkinElmer website: http://las.perkinelmer.com/content/snps/genotyping.asp#snps
6
Human Variation
We are studying the 80% of variation that comes in the form of SNPs
The alleles at these SNPs are compiled into sequences, which are called haplotypes
Deletions and insertions are “ignored” because of the limitations of the microarrays from which the data are generated
7
International HapMap Project
Study done by the International HapMap Consortium
  “…create a public, genome-wide database of common human sequence variation…”
  Identified SNPs and compiled the SNP alleles into a database of haplotypes for four different populations (Phase 1)
The population used was a group of 60 Mormons in Utah
  Have been widely studied in the past
  Western and Northern European descent
  Have very detailed records
  Used their chromosome 19
“A haplotype map of the human genome” by The International HapMap Consortium. Nature. Published 27 October 2005
8
My Project Goals
Reconstruct human genetic history
  This is a very difficult problem
Sub-problem: identify recent genetic events
  Assume that these new genetic events are rare, or very few in number
  They are easier to classify and relate to one another than older, more common haplotypes
  These new events are important because they identify shared recent ancestry
  Disease-causing variations could be from recent events
9
Identifying Recent Genetic Events
1. Select a region in a haplotype and find the frequency of variation
2. Group variations into common and rare
3. Find recent point mutations
4. Find recent recombination events
10
Workflow
Individual’s haplotypes -> Frequency of variation -> Identify events
Region      Frequency  Group   Event
AAAAAAAAAA  49%        Common
TTTTTTTTTT  48%        Common
AAAAAAAAAT  1%         Rare    AAAAAAAAAT*  (point mutation of AAAAAAAAAA)
AATTTTTTTT  1%         Rare    AA|TTTTTTTT  (recombination of AAAAAAAAAA and TTTTTTTTTT)
TTTTTTATTT  1%         Rare    TTTTTTA*TTT  (point mutation of TTTTTTTTTT)
11
Choosing a region size
The region must be large enough to pick up many different variations but small enough to see what caused them
Testing started with a region of 20 nucleotides and used progressively smaller regions; a region size of 10 nucleotides worked best
12
Region Size 20
13
Region Size 10
14
Frequency of Variation
Individual’s haplotype  Region      How many
AAAAAAAAAAAAAAA         AAAAAAAAAA  59
TTTTTTTTTTTTTTT         TTTTTTTTTT  58
AAAAAAAAATTTTTT         AAAAAAAAAT  1
AATTTTTTTTTTTTT         AATTTTTTTT  1
TTTTTTATTTTTTTT         TTTTTTATTT  1
15
Frequency of Variation
Individual’s haplotype  Region      How many  Frequency of variation
AAAAAAAAAA|AAAAA        AAAAAAAAAA  59/120    ~49%
TTTTTTTTTT|TTTTT        TTTTTTTTTT  58/120    ~48%
AAAAAAAAAT|TTTTT        AAAAAAAAAT  1/120     ~1%
AATTTTTTTT|TTTTT        AATTTTTTTT  1/120     ~1%
TTTTTTATTT|TTTTT        TTTTTTATTT  1/120     ~1%
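The counting step on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the lab's program: the function name `region_frequencies` and the toy list of 120 haplotypes are assumptions made up to match the counts shown on the slide.

```python
from collections import Counter


def region_frequencies(haplotypes, start, size):
    """Count each distinct subsequence found in haplotype[start:start+size].

    Returns {subsequence: (count, frequency)}.
    """
    counts = Counter(h[start:start + size] for h in haplotypes)
    total = len(haplotypes)
    return {region: (n, n / total) for region, n in counts.items()}


# Hypothetical toy data mirroring the slide: 120 haplotypes of 15 alleles.
haplotypes = (
    ["AAAAAAAAAAAAAAA"] * 59
    + ["TTTTTTTTTTTTTTT"] * 58
    + ["AAAAAAAAATTTTTT", "AATTTTTTTTTTTTT", "TTTTTTATTTTTTTT"]
)

freqs = region_frequencies(haplotypes, start=0, size=10)
for region, (n, f) in freqs.items():
    print(f"{region}  {n}/{len(haplotypes)}  ~{f:.0%}")
```

Using a `Counter` keeps the tally to a single pass over the haplotypes, so the same function can be reused for any region start and size.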
16
Grouping Variations
Subsequences are classified as either common or rare haplotypes
  Assume that new genetic events are rare, or very few in number
  Subsequences with a frequency of 5% or higher are common; those below 5% are rare
  The 5% cutoff comes from the International HapMap Consortium study
“A haplotype map of the human genome” by The International HapMap Consortium. Nature. Published 27 October 2005
17
Grouping Variations
Region      Frequency  Group
AAAAAAAAAA  49%        Common
TTTTTTTTTT  48%        Common
AAAAAAAAAT  1%         Rare
AATTTTTTTT  1%         Rare
TTTTTTATTT  1%         Rare
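The grouping step is a simple threshold test. The sketch below assumes the frequencies have already been computed; the function name `group_by_frequency` is hypothetical, and the 5% cutoff is the one quoted on the slides.

```python
def group_by_frequency(frequencies, cutoff=0.05):
    """Split {subsequence: frequency} into (common, rare) lists.

    Subsequences at or above the cutoff are common; the rest are rare.
    """
    common = [s for s, f in frequencies.items() if f >= cutoff]
    rare = [s for s, f in frequencies.items() if f < cutoff]
    return common, rare


# Frequencies taken from the slide example (counts out of 120 haplotypes).
frequencies = {
    "AAAAAAAAAA": 59 / 120,
    "TTTTTTTTTT": 58 / 120,
    "AAAAAAAAAT": 1 / 120,
    "AATTTTTTTT": 1 / 120,
    "TTTTTTATTT": 1 / 120,
}

common, rare = group_by_frequency(frequencies)
print("Common:", common)
print("Rare:", rare)
```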
18
Recent Events
Make comparisons to identify two forms of variation:
  Point mutations
  Recombination events
Common: AAAAAAAAAA, TTTTTTTTTT
Rare: AAAAAAAAAT, AATTTTTTTT, TTTTTTATTT
19
Point Mutations
[Workflow diagram from slide 10, repeated]
20
Point Mutations
[Workflow diagram from slide 10, showing only the point mutation of TTTTTTTTTT: TTTTTTA*TTT]
21
Recent Events
Point mutations
  Found by comparing a common haplotype with a rare haplotype
  A difference at exactly one position shows that the rare haplotype is a point mutation of the common haplotype
  Marked by a “*” next to the point mutation
Common: TTTTTTTTTT
Rare:   TTTTTTATTT
Marked: TTTTTTA*TTT
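The comparison described above amounts to looking for a Hamming distance of exactly one between a rare and a common subsequence. The following is a minimal sketch under that reading; the helper names `hamming`, `find_point_mutations` and `mark` are hypothetical and not taken from the original program.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_point_mutations(common, rare):
    """Return (rare, common, position) for each rare subsequence that
    differs from a common subsequence at exactly one position."""
    events = []
    for r in rare:
        for c in common:
            if len(r) == len(c) and hamming(r, c) == 1:
                pos = next(i for i, (x, y) in enumerate(zip(r, c)) if x != y)
                events.append((r, c, pos))
    return events


def mark(seq, pos, symbol="*"):
    """Place the marker right after the mutated allele, as on the slide."""
    return seq[:pos + 1] + symbol + seq[pos + 1:]


common = ["AAAAAAAAAA", "TTTTTTTTTT"]
rare = ["AAAAAAAAAT", "AATTTTTTTT", "TTTTTTATTT"]

events = find_point_mutations(common, rare)
for r, c, pos in events:
    print(f"{c} -> {mark(r, pos)}")
```

Note that `AAAAAAAAAT` is reported here as a point mutation when only this one region is examined; the overlapping-windows slides later revisit that case.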
22
Recombination
[Workflow diagram from slide 10, repeated]
23
Recombination
[Workflow diagram from slide 10, showing only the recombination of AAAAAAAAAA and TTTTTTTTTT: AA|TTTTTTTT]
24
Recent Events
Recombination
  Combine portions of two common haplotypes and see if they form a rare haplotype
Common:
  AAAAAAAAAA
  TTTTTTTTTT
Possible recombinations:
  AA|TTTTTTTT
  AAA|TTTTTTT
  AAAA|TTTTTT
  AAAAA|TTTTT
  AAAAAA|TTTT
  AAAAAAA|TTT
  AAAAAAAA|TT
25
Rare Mutations
Marked by a “|” at the border between one haplotype and another
Possible recombinations:
  AA|TTTTTTTT
  AAA|TTTTTTT
  AAAA|TTTTTT
  AAAAA|TTTTT
  AAAAAA|TTTT
  AAAAAAA|TTT
  AAAAAAAA|TT
Actual recombinations:
  AA|TTTTTTTT
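The recombination search can be sketched as trying every breakpoint between two common subsequences and checking whether the result equals a rare one. The function name `find_recombinations` and the `min_segment` parameter are assumptions; `min_segment=2` is chosen only because the candidates listed on the slide run from `AA|TTTTTTTT` to `AAAAAAAA|TT`, and the original program may have handled the end positions differently.

```python
def find_recombinations(common, rare, min_segment=2):
    """Return (rare, left_parent, right_parent, breakpoint) whenever
    left_parent[:k] + right_parent[k:] reproduces a rare subsequence."""
    events = []
    for r in rare:
        for left in common:
            for right in common:
                if left == right or len(left) != len(r) or len(right) != len(r):
                    continue
                # Keep at least min_segment alleles from each parent.
                for k in range(min_segment, len(r) - min_segment + 1):
                    if left[:k] + right[k:] == r:
                        events.append((r, left, right, k))
    return events


common = ["AAAAAAAAAA", "TTTTTTTTTT"]
rare = ["AAAAAAAAAT", "AATTTTTTTT", "TTTTTTATTT"]

events = find_recombinations(common, rare)
for r, left, right, k in events:
    # Mark the border between the two parental segments with "|".
    print(r[:k] + "|" + r[k:])
```

With `min_segment=1` the same search would also match `AAAAAAAAAT` as `AAAAAAAAA|T`, which is the ambiguity between point mutations and recombinations that the overlapping-windows slides address.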
26
Sample input and output
        chr-haplotypes.txt   new_chr-haplotypes.txt
Indv1   TTTTTTTTTTTTTTT      T T T T T T T T T T
Indv1   AAAAAAAAATTTTTT      A A A A A A A A A T*
Indv2   AATTTTTTTTTTTTT      A A|T T T T T T T T
Indv2   TTTTTTATTTTTTTT      T T T T T T A*T T T
27
Visualization Tool
28
Expanding to the Whole Chromosome
Now that we have a way to look for variations in regions of a chromosome, we can expand the technique to look for variations in a whole chromosome
We used a technique of overlapping windows
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|AAAAAAAAAA|
29
Overlapping Windows
[Workflow diagram from slide 10, repeated]
30
Overlapping Windows
[Workflow diagram from slide 10, showing only the point mutation of AAAAAAAAAA: AAAAAAAAAT*]
31
Overlapping
Recombination events that looked like point mutations
Common: AAAAAAAAAAAAAAA, TTTTTTTTTTTTTTT
Rare:   AAAAAAAAATTTTTT
First 10:
  Common: AAAAAAAAAA
  Rare:   AAAAAAAAAT*
Slide over 5 and next 10:
  Common: AAAAAAAAAA, TTTTTTTTTT
  Rare:   AAAA|TTTTTT
Whole haplotype: AAAAAAAAA|T*TTTTT -> AAAAAAAAA|TTTTTT
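The window scheme on this slide (a window of 10, then slide over 5) can be sketched as a small generator. The name `overlapping_windows` is hypothetical, and this sketch simply ignores any trailing alleles that do not fill a complete window, which the slides do not discuss.

```python
def overlapping_windows(length, size=10, step=5):
    """Yield (start, end) index pairs for windows of `size` alleles
    whose starts are `step` alleles apart."""
    for start in range(0, length - size + 1, step):
        yield start, start + size


# The rare haplotype from the slide, cut into its two overlapping windows.
rare = "AAAAAAAAATTTTTT"
windows = [rare[s:e] for s, e in overlapping_windows(len(rare))]
print(windows)
```

In the first window the haplotype differs from `AAAAAAAAAA` at a single position, while the second window contains the `AAAA|TTTTTT` border, which is why looking at both windows lets the event be read as a recombination.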
32
Applying to a Population’s Chromosome
Now that we have a technique to look for new variations in a whole chromosome, we can apply it to a population and identify regions where recent genetic events took place
33
Identified Recent Genetic Events
In chromosome 19:
  Unique point mutations = 13723
  Unique recombination events = 4065
  Total unique events = 15697
  Total point mutations = 46072
  Total recombination events = 11381
  Total number of events = 57453
  Average point mutations per individual = 383
  Average recombination events per individual = 94
  Average events per individual = 478
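As a quick arithmetic check, the reported averages appear consistent with dividing each total by 120 and dropping the fraction. The divisor of 120 is an inference (60 individuals with two haplotypes each, matching the 59/120 style counts on the earlier slides); the slides do not state how the averages were computed.

```python
# Assumption: averages were taken over 120 haplotypes (60 individuals x 2).
HAPLOTYPES = 120

total_point_mutations = 46072
total_recombinations = 11381
total_events = total_point_mutations + total_recombinations

averages = {
    "point mutations": total_point_mutations // HAPLOTYPES,
    "recombination events": total_recombinations // HAPLOTYPES,
    "all events": total_events // HAPLOTYPES,
}

print("Total number of events:", total_events)
for label, value in averages.items():
    print(f"Average {label}: {value}")
```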
34
Point Mutations
[Figure; axes: Number of Events, SNP Position in the Haplotype]
35
Recombination Events
[Figure; axes: Haplotype, Number of Events, SNP Position in the Haplotype]
36
Point Mutations and Recombination Events
[Figure; axes: Number of Events, Haplotype, SNP Position in the Haplotype]
37
Conclusion
We have developed an algorithm for identifying recent genetic events in an individual
More point mutations were identified than recombination events
Some regions of the genome have many recent genetic events, and others have few
38
Future Work
Run the algorithm over the whole genome
Extend the algorithm to multiple populations
  Identify recent events that are unique to a population vs. ones that are shared
Identify genetic relations between common haplotypes
Create a chronological order of recent events in an individual
Adapt the algorithm for high-throughput sequencing data
39
UCLA ZarLab
  Dr. Eleazar Eskin
  All the lab people
SoCalBSI
  Dr. Jamil Momand
  Dr. Sandra Sharp
  Dr. Nancy Warter-Perez
  Dr. Wendie Johnston
  Dr. Beverly Krilowicz
  Dr. Silvia Heubach
  Dr. Jennifer Faust
  Ronnie Cheng
Funded By:
SoCalBSI 2009 Interns
40
Determining ancestors
The other ancestors are determined through SNP differences of 2 or more
41
My Project
Red line: point mutation
Blue line: ancestor-to-common relationship
Black dashed line: haplotype resulting from a crossover (recombination)
42
Graph
The graph is generated by Graphviz, a graph visualization program
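As an illustration of how such a graph could be described for Graphviz, the sketch below writes DOT text using the line styles from the legend slide (red for point mutations, blue for ancestor relationships, black dashed for recombinations). The function name `to_dot`, the edge directions and the example edges are assumptions; the slides do not show the actual DOT input.

```python
def to_dot(point_mutations, ancestors, recombinations):
    """Build a DOT digraph from lists of (source, target) haplotype pairs."""
    lines = ["digraph haplotypes {"]
    for common, rare in point_mutations:
        lines.append(f'  "{common}" -> "{rare}" [color=red];')
    for ancestor, common in ancestors:
        lines.append(f'  "{ancestor}" -> "{common}" [color=blue];')
    for parent, rare in recombinations:
        lines.append(f'  "{parent}" -> "{rare}" [color=black, style=dashed];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edges built from the running example on the earlier slides.
dot = to_dot(
    point_mutations=[("TTTTTTTTTT", "TTTTTTATTT")],
    ancestors=[],
    recombinations=[
        ("AAAAAAAAAA", "AATTTTTTTT"),
        ("TTTTTTTTTT", "AATTTTTTTT"),
    ],
)
print(dot)
```

Saving the printed text to a file such as `graph.dot` and running `dot -Tpng graph.dot -o graph.png` would render it, assuming Graphviz is installed.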
43
Graph