Polymorphism Structure of the Human Genome Gabor T. Marth Department of Biology Boston College Chestnut Hill, MA 02467.

Presentation transcript:

Polymorphism Structure of the Human Genome Gabor T. Marth Department of Biology Boston College Chestnut Hill, MA 02467

Human variation structure is heterogeneous: chromosomal averages; polymorphism density along chromosomes

Heterogeneity at the level of distributions: marker density (“sparse” to “dense”); allele frequency (“rare” to “common”)

What explains nucleotide diversity?
Candidate factors: G+C nucleotide content; CpG di-nucleotide content; recombination rate; functional constraints.
Diversity by functional region: 3’ UTR 5.00 x; ’ UTR 4.95 x; Exon, overall 4.20 x; Exon, coding 3.77 x (synonymous 366 / 653; non-synonymous 287 / 653).
Variance is so high that these quantities are poor predictors of nucleotide diversity in local regions; hence random processes are likely to govern the basic shape of the genome variation landscape -> (random) genetic drift

Components of drift: Genealogy. Traced back from the present generation of a randomly mating population, the genealogy evolves in a non-deterministic fashion.

Components of drift: Mutation. Mutations randomly “drift”: they die out, go to higher frequency, or get fixed.
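The drift of a single mutation can be illustrated with a minimal Wright-Fisher style sketch. This is an assumption-laden illustration, not the model used in the presentation: the function names, the fixed number of gene copies, and the absence of selection and of new mutation are all choices made here for brevity.

```python
import random


def wright_fisher(n_copies, start_count, generations, seed=0):
    """Track the count of one allele among n_copies gene copies over time.

    Each generation, every gene copy picks its parent copy uniformly at random
    from the previous generation (binomial resampling). No selection and no
    new mutation are modeled (assumptions of this sketch).
    """
    rng = random.Random(seed)
    count = start_count
    trajectory = [count]
    for _ in range(generations):
        freq = count / n_copies
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        trajectory.append(count)
        if count == 0 or count == n_copies:
            break  # allele lost or fixed; drift has nothing left to change
    return trajectory


def outcome(trajectory, n_copies):
    """Classify the end state of a trajectory."""
    last = trajectory[-1]
    if last == 0:
        return "lost"
    if last == n_copies:
        return "fixed"
    return "segregating"


if __name__ == "__main__":
    n = 100
    # Follow 200 independent new mutations, each starting as a single copy.
    results = [outcome(wright_fisher(n, 1, 2000, seed=s), n) for s in range(200)]
    for label in ("lost", "fixed", "segregating"):
        print(label, results.count(label))
```

Running the script tallies how many of the simulated mutations were lost, fixed, or still segregating, which are the three fates named above.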

Modulators: Changing population size. Mutations randomly “drift”: they die out, go to higher frequency, or get fixed. Example of a size change: genetic bottleneck.

Modulators: Population subdivision. Subdivision promotes private polymorphisms and skews allele frequencies.

Modulators: Recombination. Haplotypes: accgttatgcaga, acagttatgtaga, acagttatgcaga, accgttatgtaga. Recombining pair: accgttatgcaga, acagttatgtaga. With recombination, different nucleotide sites within the same DNA segment no longer share the same genealogy.
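A small sketch can make the recombination example concrete. It assumes a single crossover at a chosen breakpoint; the function names and the breakpoint position are hypothetical and chosen only for illustration. Applied to two of the sequences from the transcript, the crossover yields the other two.

```python
def crossover(hap_a, hap_b, breakpoint):
    """Return the two recombinant haplotypes from a single crossover.

    Sites before `breakpoint` (0-based) come from one parent, the rest from
    the other.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have equal length")
    return (hap_a[:breakpoint] + hap_b[breakpoint:],
            hap_b[:breakpoint] + hap_a[breakpoint:])


def variable_sites(haplotypes):
    """0-based positions at which the haplotypes are not all identical."""
    return [i for i, column in enumerate(zip(*haplotypes)) if len(set(column)) > 1]


parent_a = "accgttatgcaga"
parent_b = "acagttatgtaga"

print(variable_sites([parent_a, parent_b]))  # [2, 9]
# Any breakpoint between the two variable sites gives the same recombinants.
print(crossover(parent_a, parent_b, 5))      # ('accgttatgtaga', 'acagttatgcaga')
```

After the crossover, the two variable sites on each recombinant chromosome descend from different parental chromosomes, which is why they no longer share one genealogy.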

Modulators: Natural selection. Types: negative (purifying) selection; positive selection. Under selection, the genealogy is no longer independent of (and hence cannot be decoupled from) the mutation process.

Modeling ancestral processes: “forward simulations” and the “Coalescent” process. By focusing on a small sample, complexity of the relevant part of the ancestral process is greatly reduced. There are, however, limitations.
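To show why a small sample keeps the coalescent cheap, here is a minimal sketch of the standard neutral coalescent for a constant-size, randomly mating population without recombination. These are assumptions of the sketch, not statements about the presenter's software, and the function names are hypothetical. Only the waiting times between coalescence events in the sample are simulated; the rest of the population is never represented.

```python
import random


def coalescence_times(sample_size, rng):
    """Waiting times while k = n, n-1, ..., 2 lineages remain.

    Under the standard neutral coalescent, the time during which k lineages
    persist is exponential with rate k*(k-1)/2, in coalescent time units.
    """
    return [rng.expovariate(k * (k - 1) / 2) for k in range(sample_size, 1, -1)]


def expected_tmrca(sample_size):
    """Expected time to the most recent common ancestor: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / sample_size)


if __name__ == "__main__":
    rng = random.Random(42)
    n, replicates = 10, 20000
    mean_tmrca = sum(sum(coalescence_times(n, rng)) for _ in range(replicates)) / replicates
    print(f"simulated mean TMRCA: {mean_tmrca:.3f}")
    print(f"expected  mean TMRCA: {expected_tmrca(n):.3f}")
```

A sample of n chromosomes needs only n - 1 random draws per replicate, regardless of population size, whereas a forward simulation has to track every individual in every generation.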

Inferences from variation data:
larger population size (N) -> more mutations -> higher diversity (θ);
larger mutation rate (μ) -> more mutations -> higher diversity (θ);
higher diversity -> larger population size OR higher mutation rate (θ = 4Nμ)
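The relation θ = 4Nμ can be sketched in a few lines. Note that Watterson's estimator, used here to obtain θ from a count of segregating sites, is a standard technique brought in for illustration and is not named in the transcript. The numeric values of N and μ are likewise hypothetical.

```python
import math


def theta(pop_size, mutation_rate):
    """Population mutation parameter, theta = 4 * N * mu."""
    return 4.0 * pop_size * mutation_rate


def watterson_theta(segregating_sites, sample_size):
    """Watterson's estimator: S / (1 + 1/2 + ... + 1/(n-1))."""
    harmonic = sum(1.0 / i for i in range(1, sample_size))
    return segregating_sites / harmonic


def implied_pop_size(theta_value, mutation_rate):
    """Solve theta = 4 * N * mu for N, given an assumed mu."""
    return theta_value / (4.0 * mutation_rate)


# Two different (N, mu) pairs (hypothetical values) give the same theta,
# so diversity alone cannot tell them apart.
print(math.isclose(theta(10_000, 2e-8), theta(20_000, 1e-8)))  # True

# An estimate of N from data therefore depends on the mu that is assumed.
theta_hat = watterson_theta(segregating_sites=3, sample_size=3)
print(theta_hat)  # 2.0
```

Because N and μ enter only through their product, an inferred population size is only as reliable as the mutation rate plugged into `implied_pop_size`.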

Ancestral inference: modeling. Figure labels: past; present; history: stationary, expansion, collapse, bottleneck; MD (simulation); AFS (direct form).
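As a rough illustration of what an allele frequency spectrum is, the sketch below tabulates a folded spectrum from a set of haplotypes and computes the textbook expectation for a stationary (constant-size, neutral) history, θ / i. It is not the presenter's "direct form" calculation for other histories; the function names are hypothetical and the input sequences are reused from elsewhere in the transcript.

```python
from collections import Counter


def folded_afs(haplotypes):
    """Count variable sites by minor-allele count (1 .. n // 2).

    Only biallelic sites are counted.
    """
    n = len(haplotypes)
    spectrum = Counter()
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) == 2:
            spectrum[min(counts.values())] += 1
    return [spectrum[k] for k in range(1, n // 2 + 1)]


def expected_unfolded_afs(theta, sample_size):
    """Neutral, constant-size expectation: theta / i sites with i derived copies."""
    return [theta / i for i in range(1, sample_size)]


haps = ["accgttatgcaga", "acagttatgtaga", "acagttatgcaga", "accgttatgtaga"]
print(folded_afs(haps))               # [0, 2]
print(expected_unfolded_afs(1.0, 4))
```

Comparing an observed spectrum with the expectation under each candidate history is one way to ask which history fits the data better.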

Ancestral inference: model fitting. Fitted histories: bottleneck; modest but uninterrupted expansion.

Allelic association. Haplotypes: accgttatgcaga, acagttatgtaga, acagttatgcaga, accgttatgtaga. Possible allele combinations (2-marker haplotypes); higher recombination rate (r).

Allelic association: LD. Measure of allelic association: “linkage disequilibrium (LD)”.
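The transcript names LD without giving a formula. One common way to quantify it, assumed here, is D = p(AB) - p(A)p(B) together with the normalized r². The sketch below computes both for two biallelic markers; the function name is hypothetical, and the example input is the pair of variable sites (0-based positions 2 and 9) in the four haplotypes listed under "Allelic association".

```python
def ld_stats(two_site_haplotypes):
    """Compute D and r^2 for two biallelic sites.

    `two_site_haplotypes` is a list of 2-character strings, one per sampled
    chromosome, e.g. "cc" or "at".
    """
    n = len(two_site_haplotypes)
    first = sorted({h[0] for h in two_site_haplotypes})
    second = sorted({h[1] for h in two_site_haplotypes})
    if len(first) != 2 or len(second) != 2:
        raise ValueError("both sites must be biallelic in the sample")
    a, b = first[0], second[0]  # reference alleles; the choice does not affect r^2
    p_a = sum(h[0] == a for h in two_site_haplotypes) / n
    p_b = sum(h[1] == b for h in two_site_haplotypes) / n
    p_ab = sum(h == a + b for h in two_site_haplotypes) / n
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


haps = ["accgttatgcaga", "acagttatgtaga", "acagttatgcaga", "accgttatgtaga"]
pairs = [h[2] + h[9] for h in haps]           # ['cc', 'at', 'ac', 'ct']
print(ld_stats(pairs))                        # (0.0, 0.0): all four combinations, no association
print(ld_stats(["cc", "cc", "at", "at"]))     # (-0.25, 1.0): only two combinations, complete association
```

When all four 2-marker haplotypes are present at equal frequency, the markers are independent. When only two are present, knowing the allele at one marker fully predicts the other.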

Haplotype structure: “haplotype block”