Biology and Bioinformatics Gabor T. Marth Department of Biology, Boston College BI820 – Seminar in Quantitative and Computational Problems.


1 Biology and Bioinformatics
Gabor T. Marth
Department of Biology, Boston College
marth@bc.edu
BI820 – Seminar in Quantitative and Computational Problems in Genomics

2 The animal cell

3 DNA – the carrier of the genetic code

4 DNA organization – chromosomes

5 Translation of genetic information

6 DNA sequencing informatics

7 DNA organization

8 Genome annotation

9 De novo gene prediction

10 Similarity-based gene prediction

11 Gene localization

12 Genetic mapping

13 Gene function

14 Expression analysis

15 Protein structure

16 RNA structure

17 Protein structure prediction

18 RNA structure prediction

19 DNA evolution

20 Evolution of chromosome organization

21 Evolution of gene structure

22 Evolution of DNA sequence

23 Comparative genomics

24 Phylogenetics

25 Mechanisms of molecular evolution

26 Sequence variations
The Human Genome Project produced a reference genome sequence that is 99.9% common to every human being.
Sequence variations make our genetic makeup unique.
Single-nucleotide polymorphisms (SNPs) are the most abundant, but other types of variation exist and are important.

27 Why do we care about variations?
phenotypic differences
demographic history
inherited diseases

28 How do we find polymorphisms?
Look at multiple sequences from the same genome region.
Diverse sequence resources can be used: EST, WGS, BAC.
Diversion: sequencing informatics

29 SNP discovery – Methods
Sequence clustering
Cluster refinement
Multiple alignment
SNP detection

30 SNP discovery – Computer tools

31 SNP discovery – Mining Projects
>CloneX
ACGTTGCAACGT GTCAATGCTGCA
>CloneY
ACGTTGCAACGT GTCAATGCTGCA
ACCTAGGAGACTGAACTTACTG
ACCTAGGAGACCGAACTTACTG
~30,000 clones
25,901 clones (7,122 finished, 18,779 draft with base quality values)
21,020 clone overlaps (124,356 fragment overlaps)
507,152 high-quality candidate SNPs (validation rate 83-96%)
Marth et al., Nature Genetics 2001
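The two 22-base overlapping sequences on slide 31 differ at a single site, which is what the SNP detection step looks for. The following is a minimal sketch of that idea, not the method used in the cited mining project: it assumes the sequences are already aligned to equal length and ignores the base quality values the slide mentions. The function name `find_candidate_snps` is hypothetical.

```python
from collections import Counter


def find_candidate_snps(aligned_seqs):
    """Return (position, allele_counts) for each alignment column holding more than one base."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    candidates = []
    for pos in range(length):
        # Count the bases in this column, skipping gaps and ambiguous calls.
        column = Counter(
            seq[pos].upper() for seq in aligned_seqs if seq[pos].upper() in "ACGT"
        )
        if len(column) > 1:
            candidates.append((pos, dict(column)))
    return candidates


# The two overlapping sequences shown on slide 31.
overlap = [
    "ACCTAGGAGACTGAACTTACTG",
    "ACCTAGGAGACCGAACTTACTG",
]
snps = find_candidate_snps(overlap)
```

Positions are 0-based. A production tool would also weigh base quality and distinguish true polymorphisms from sequencing errors and paralogous alignments, which this sketch does not attempt.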

32 SNP databases and characteristics
access to variation data
SNP properties
reliability of information
characterizing known polymorphic sites in sample collections – genotyping

33 Where do variations come from?
Sequence variations are the result of mutation events.
Mutations are propagated down through generations.
[Figure: genealogy descending from the most recent common ancestor (MRCA), with the sequences TAAAAAT and TAACAAT at its nodes]

34 Mutation rate
A higher mutation rate (µ) gives rise to more SNPs.
[Figure: two genealogies, each descending from an MRCA, with example sequences accgttatgtaga, accgctatgtaga, actgttatgtaga, accgctatataga]
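The dependence of SNP count on µ can be made concrete with a standard textbook result that is not on the slide: under a neutral, constant-size population with infinite-sites mutation, the expected number of segregating sites in a sample of n sequences is θ · (1 + 1/2 + ... + 1/(n-1)), with θ = 4Nµ for a diploid population. The sketch below assumes that model; the function and parameter names are illustrative.

```python
def expected_snp_count(n_samples, pop_size, mu_per_site, n_sites):
    """Expected number of segregating sites under the neutral, constant-size model.

    Assumes a diploid population (theta = 4 * N * mu) and infinite-sites mutation.
    mu_per_site is the mutation rate per site per generation.
    """
    theta = 4 * pop_size * mu_per_site * n_sites
    harmonic = sum(1.0 / i for i in range(1, n_samples))
    return theta * harmonic


# Doubling the mutation rate doubles the expected number of SNPs.
low = expected_snp_count(n_samples=10, pop_size=10_000, mu_per_site=1e-8, n_sites=100_000)
high = expected_snp_count(n_samples=10, pop_size=10_000, mu_per_site=2e-8, n_sites=100_000)
```

The parameter values are placeholders chosen for illustration, not estimates taken from the presentation.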

35 Recombination
[Figure: example sequence accgttatgtaga]

36 Demographic history
small (effective) population size N
large (effective) population size N
Different world populations have varying long-term effective population sizes (e.g. African N is larger than European).

37 Modeling
[Figure: demographic histories drawn from past to present: stationary, expansion, collapse, bottleneck. Panel labels: MD (simulation), AFS (direct form).]
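Assuming AFS here stands for the allele frequency spectrum (the slide does not spell it out), the observed spectrum is a simple summary of polymorphism data: how many sites carry a minor allele seen once, twice, and so on in the sample. Demographic models such as those named on the slide are typically compared against this summary. The sketch below computes a folded spectrum from aligned haplotypes; the function name and the toy data are made up for illustration.

```python
from collections import Counter


def folded_afs(haplotypes):
    """Return a list where entry k-1 is the number of biallelic sites with minor allele count k."""
    n = len(haplotypes)
    spectrum = Counter()
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) != 2:
            continue  # skip monomorphic and multi-allelic sites
        spectrum[min(counts.values())] += 1
    return [spectrum.get(k, 0) for k in range(1, n // 2 + 1)]


# Toy data: four aligned haplotypes, four sites.
toy_haplotypes = ["ACGT", "ACGA", "TCGA", "TCCA"]
toy_spectrum = folded_afs(toy_haplotypes)
```

For the toy data, two sites have a minor allele seen once and one site has a minor allele seen twice.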

38 Ancestral inference
bottleneck
modest but uninterrupted expansion

39 The signatures of selection
Selective mutations influence the genealogy itself; in the case of neutral mutations, the processes of mutation and genealogy are decoupled.

40 Association and haplotype structure
“linkage disequilibrium”
“haplotype blocks”
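Linkage disequilibrium between two biallelic sites is commonly quantified with the coefficient D = p_AB − p_A·p_B and the squared correlation r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The slide names the concept but no particular statistic, so these two measures are an assumption about what one would compute. A minimal sketch over phased haplotypes, with a hypothetical function name:

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r_squared) for two biallelic sites.

    haplotypes: list of (allele_at_site_1, allele_at_site_2) pairs, one per phased chromosome.
    allele_a, allele_b: the reference alleles at site 1 and site 2.
    """
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes if h[0] == allele_a and h[1] == allele_b) / n

    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either site is monomorphic in the sample.
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Complete association: allele A always travels with G, allele C with T.
linked = [("A", "G"), ("A", "G"), ("C", "T"), ("C", "T")]
# No association: every combination is equally frequent.
unlinked = [("A", "G"), ("A", "T"), ("C", "G"), ("C", "T")]
```

A run of adjacent sites with high pairwise r² is one informal way to picture a haplotype block, although block definitions vary between methods.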

41 Computer simulations: the Coalescent
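As a rough illustration of what a coalescent simulation does, the sketch below draws the waiting times between successive coalescence events for a sample of n lineages. It assumes the standard Kingman coalescent for a constant-size population, with time measured in units of 2N generations so that k lineages coalesce at rate k(k − 1)/2. These modeling choices and all function names are assumptions, not details taken from the slide.

```python
import random


def simulate_coalescent_times(n_samples, rng):
    """Return the n-1 waiting times during which n, n-1, ..., 2 lineages exist."""
    times = []
    k = n_samples
    while k > 1:
        rate = k * (k - 1) / 2  # every pair of lineages coalesces at rate 1
        times.append(rng.expovariate(rate))
        k -= 1
    return times


def time_to_mrca(times):
    """Total time back to the most recent common ancestor of the sample."""
    return sum(times)


def total_branch_length(n_samples, times):
    """Sum of all branch lengths: k lineages persist through the k-lineage interval."""
    return sum(k * t for k, t in zip(range(n_samples, 1, -1), times))


rng = random.Random(1)
replicates = [simulate_coalescent_times(10, rng) for _ in range(20_000)]
mean_tmrca = sum(time_to_mrca(t) for t in replicates) / len(replicates)
```

Under this model the expected time to the MRCA is 2(1 − 1/n), so `mean_tmrca` should land close to 1.8 for n = 10. Mutations could then be placed on the tree in proportion to branch length, and demographic scenarios modeled by rescaling the coalescence rate through time.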

42 Medical utility?
clinical phenotype
molecular markers
functional understanding (?)

43 Mapping disease-causing loci
genetic linkage
association between allele and phenotype
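One simple way to test for association between an allele and a phenotype is a chi-square statistic on a 2×2 table of allele counts in cases and controls. The slide does not name a test, so this choice is an assumption, and the counts below are invented for illustration.

```python
def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must have a non-zero total")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts: the alternate allele is seen on 30 of 100 case
# chromosomes and on 50 of 100 control chromosomes.
statistic = allelic_chi_square(30, 70, 50, 50)
```

The statistic is compared against a chi-square distribution with one degree of freedom; values above about 3.84 correspond to p < 0.05 before any correction for testing many markers.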

44 Forensic applications

