1
Biology and Bioinformatics
Gabor T. Marth
Department of Biology, Boston College
marth@bc.edu
BI820 – Seminar in Quantitative and Computational Problems in Genomics
2
The animal cell
3
DNA – the carrier of the genetic code
4
DNA organization – chromosomes
5
Translation of genetic information
6
DNA sequencing informatics
7
DNA organization
8
Genome annotation
9
De novo gene prediction
10
Similarity-based gene prediction
11
Gene localization
12
Genetic mapping
13
Gene function
14
Expression analysis
15
Protein structure
16
RNA structure
17
Protein structure prediction
18
RNA structure prediction
19
DNA evolution
20
Evolution of chromosome organization
21
Evolution of gene structure
22
Evolution of DNA sequence
23
Comparative genomics
24
Phylogenetics
25
Mechanisms of molecular evolution
26
Sequence variations
The Human Genome Project produced a reference genome sequence that is 99.9% common to every human being
Sequence variations make our genetic makeup unique
Single-nucleotide polymorphisms (SNPs) are the most abundant, but other types of variation exist and are important
27
Why do we care about variations?
phenotypic differences
demographic history
inherited diseases
28
How do we find polymorphisms?
Look at multiple sequences from the same genome region
Diverse sequence resources can be used: EST (expressed sequence tag), WGS (whole-genome shotgun), BAC (bacterial artificial chromosome)
Diversion: sequencing informatics
29
SNP discovery – Methods
Sequence clustering
Cluster refinement
Multiple alignment
SNP detection
30
SNP discovery – Computer tools
31
SNP discovery – Mining Projects
[Figure: overlapping clone sequences]
>CloneX ACGTTGCAACGT GTCAATGCTGCA
>CloneY ACGTTGCAACGT GTCAATGCTGCA
Aligned overlap fragments differing at one base:
ACCTAGGAGACTGAACTTACTG
ACCTAGGAGACCGAACTTACTG
~30,000 clones
25,901 clones (7,122 finished, 18,779 draft with base quality values)
21,020 clone overlaps (124,356 fragment overlaps)
507,152 high-quality candidate SNPs (validation rate 83-96%)
Marth et al., Nature Genetics 2001
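The single-base difference between the two aligned fragments on this slide (T versus C) is the kind of alignment column a SNP detector flags. A minimal sketch in Python, assuming the reads are already aligned to equal length. The function name `find_candidate_snps` is hypothetical, and unlike the mining project described here, the sketch ignores base quality values:

```python
def find_candidate_snps(aligned_reads):
    """Return (position, alleles) for each alignment column with more than one base.

    Assumes all reads are pre-aligned and the same length. '-' marks a gap
    and 'N' an unknown base; both are ignored. Positions are 0-based.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    candidates = []
    for position in range(length):
        # Collect the distinct bases seen in this column.
        alleles = {read[position].upper() for read in aligned_reads}
        alleles -= {"-", "N"}
        if len(alleles) > 1:
            candidates.append((position, tuple(sorted(alleles))))
    return candidates


# The two overlap fragments from the slide.
reads = [
    "ACCTAGGAGACTGAACTTACTG",
    "ACCTAGGAGACCGAACTTACTG",
]
print(find_candidate_snps(reads))  # prints [(11, ('C', 'T'))]
```

A column scan like this treats every mismatch as a candidate. Separating true polymorphisms from sequencing errors is where base quality values come in.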
32
SNP databases and characteristics
access to variation data
SNP properties
reliability of information
characterizing known polymorphic sites in sample collections – genotyping
33
Where do variations come from?
Sequence variations are the result of mutation events
Mutations are propagated down through generations
[Figure: genealogy descending from the most recent common ancestor (MRCA), with sequences TAAAAAT and TAACAAT]
34
Mutation rate
A higher mutation rate (µ) gives rise to more SNPs
[Figure: two genealogies from the MRCA, with sequences accgttatgtaga, accgctatgtaga, actgttatgtaga, accgctatataga]
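One way to make the link between µ and the number of SNPs concrete is the standard neutral expectation for the number of segregating sites in a sample of n sequences: E[S] = θ (1 + 1/2 + ... + 1/(n-1)), with θ = 4Nµ. This formula does not appear on the slide. The sketch below assumes a diploid population of constant effective size N, neutral mutations, the infinite-sites model, and µ given per region per generation. The function name is illustrative:

```python
def expected_segregating_sites(n_samples, effective_size, mutation_rate):
    """Expected number of segregating sites under the standard neutral model.

    Assumptions (not from the slide): diploid population of constant
    effective size, infinite-sites mutation, mutation_rate per region
    per generation.
    """
    if n_samples < 2:
        return 0.0
    theta = 4 * effective_size * mutation_rate
    # Harmonic sum 1 + 1/2 + ... + 1/(n-1)
    harmonic = sum(1.0 / i for i in range(1, n_samples))
    return theta * harmonic


low = expected_segregating_sites(10, 10_000, 1e-4)
high = expected_segregating_sites(10, 10_000, 2e-4)
print(low, high)  # doubling the mutation rate doubles the expectation
```

Because θ is the product of N and µ, a larger effective population size raises the expected SNP count in the same way a higher mutation rate does.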
35
Recombination
[Figure: sequence accgttatgtaga]
36
Demographic history
small (effective) population size N
large (effective) population size N
Different world populations have varying long-term effective population sizes (e.g. African N is larger than European)
37
Modeling
[Figure: demographic histories from past to present (stationary, expansion, collapse, bottleneck); panels labelled MD (simulation) and AFS (direct form)]
38
Ancestral inference
bottleneck
modest but uninterrupted expansion
39
The signatures of selection
Selective mutations influence the genealogy itself; in the case of neutral mutations, the processes of mutation and genealogy are decoupled
40
Association and haplotype structure
“linkage disequilibrium”
“haplotype blocks”
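Linkage disequilibrium between two biallelic sites is commonly summarised by D = p_AB - p_A p_B and by r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). A minimal sketch in Python, assuming phased haplotypes with alleles coded 0 and 1. The function name and input layout are illustrative and do not come from the slides:

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    `haplotypes` is a list of (allele_at_site_1, allele_at_site_2) pairs,
    with alleles coded 0 or 1 (an assumed encoding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at site 1
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at site 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return d, float("nan")  # r^2 is undefined when a site is monomorphic
    return d, d * d / denominator


# Two sites that always travel together: complete association.
print(linkage_disequilibrium([(0, 0), (0, 0), (1, 1), (1, 1)]))  # prints (0.25, 1.0)
```

Runs of neighbouring sites with high pairwise r² are what the slide refers to as haplotype blocks.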
41
Computer simulations: the Coalescent
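A coalescent simulation works backward in time: with k lineages remaining, the waiting time to the next coalescence is exponential with rate k(k-1)/2, and mutations are then dropped on the resulting branches. A minimal sketch in Python, assuming a constant-size population, no recombination, time measured in units of 2N generations, θ = 4Nµ, and the infinite-sites model (every mutation creates a new segregating site). The function name is hypothetical, and only the total branch length is tracked, not the tree topology:

```python
import random


def simulate_segregating_sites(n_samples, theta, rng):
    """Simulate one coalescent genealogy and return its number of segregating sites."""
    if n_samples < 2 or theta <= 0:
        return 0

    # Walk back in time until all lineages have merged into the MRCA.
    total_branch_length = 0.0
    k = n_samples
    while k > 1:
        waiting_time = rng.expovariate(k * (k - 1) / 2.0)
        total_branch_length += k * waiting_time  # k branches persist for this interval
        k -= 1

    # Mutations form a Poisson process along the branches, rate theta/2 per unit length.
    mutations = 0
    position = rng.expovariate(theta / 2.0)
    while position < total_branch_length:
        mutations += 1
        position += rng.expovariate(theta / 2.0)
    return mutations


rng = random.Random(1)
replicates = [simulate_segregating_sites(10, 5.0, rng) for _ in range(1000)]
print(sum(replicates) / len(replicates))  # average SNP count across replicates
```

Changing how the waiting times are drawn, for example by rescaling them during a bottleneck or an expansion, is how a simulator of this kind models different demographic histories.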
42
Medical utility?
clinical phenotype
molecular markers
?
functional understanding
43
Mapping disease-causing loci
genetic linkage
association between allele and phenotype
44
Forensic applications