Algorithms for Biological Sequence Analysis


1 Algorithms for Biological Sequence Analysis
Kun-Mao Chao (趙坤茂)
Department of Computer Science and Information Engineering
National Taiwan University, Taiwan
Date: Feb. 22, 2011
WWW:

2 About this course
Course: Algorithms for biological sequence analysis
Some basic knowledge of algorithm development and program design is required.
We will focus on sequence-related algorithmic problems.
Genomic sequences are our main target.
The oldest language
The largest program
Spring semester, 2011
9: :10 Tuesday, 107 CSIE Building
3 credits
Web site:

3 Coursework
Homework assignments and class participation (10%)
Two midterm exams (70%; 35% each):
April 12, 2011 (tentatively)
May 24, 2011 (tentatively)
Oral presentation of selected papers (20%)

4 Outlines
Part I: Sequence Homology
Introduction to basic algorithmic strategies
Pairwise sequence alignment
Multiple sequence alignment
Chaining algorithms for genomic sequence analysis
Suboptimal alignment
Comparative genomics
Compressed / constrained sequence comparison
Hidden Markov models (the Viterbi algorithm et al.)
Part II: Sequence Composition
Maximum-sum and maximum-density segments
SNP and haplotype data analysis
Approximate gapped palindrome
Genome annotation
Other advanced topics

5 A Brief History of Genetics
1859 Charles Darwin published “The Origin of Species.”
1865 Genes are particulate factors. [Gregor Mendel]
1869 Discovery of nucleic acid [Friedrich Miescher]
1903 Chromosomes are hereditary units. [Walter Sutton]
1910 Genes lie on chromosomes. [Thomas Hunt Morgan]
1913 Chromosomes are linear arrays of genes. [Alfred Sturtevant]

6 A Brief History of Genetics (cont’d)
1931 Recombination occurs by crossing over. [Harriet Creighton and Barbara McClintock]
1944 DNA is the genetic material. [Oswald Avery, Colin MacLeod and Maclyn McCarty]
1953 DNA is a double helix. [James Watson and Francis Crick]
Genetic code is triplet. [Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner & Francis Crick]
1977 DNA was sequenced for the first time. [Fred Sanger, Walter Gilbert, and Allan Maxam]
21st century: Many genomes completely sequenced

7 Milestones of Bioinformatics
1962 Pauling's theory of molecular evolution
1965 Margaret Dayhoff's Atlas of Protein Sequences
1970 Needleman-Wunsch algorithm
1977 DNA sequencing and software to analyze it (Staden)
1981 Smith-Waterman algorithm developed
1981 The concept of a sequence motif (Doolittle)
1982 GenBank Release 3 made public
1982 Phage lambda genome sequenced

8 Milestones of Bioinformatics (cont’d)
1983 Sequence database searching algorithm (Wilbur-Lipman)
1985 FASTP/FASTN: fast sequence similarity searching
1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
1988 EMBnet network for database distribution
1990 BLAST: fast sequence similarity searching
1991 EST: expressed sequence tag sequencing
1993 Sanger Centre, Hinxton, UK
1994 EMBL European Bioinformatics Institute, Hinxton, UK

9 Milestones of Bioinformatics (cont’d)
1995 First bacterial genomes completely sequenced
1996 Yeast genome completely sequenced
1997 PSI-BLAST
1998 Worm (multicellular) genome completely sequenced
1999 Fly genome completely sequenced

10 Milestones of Bioinformatics (cont’d)
Human Genome Project ( )
Mouse 2002
Rat 2004
Chimpanzee 2005
Completed Genomes

11 Chimpanzee Genome

12 The Primate Family Tree
Source: Nature

13 A Sequence Analysis Book Published by Springer in 2009 (ISBN 978-1848003194)
Sequence Comparison: Theory and Methods by Kun-Mao Chao and Louxin Zhang

