Sequence Analysis. Today How to retrieve a DNA sequence? How to search for other related DNA sequences? How to search for its protein sequence? How to.


1 Sequence Analysis

2 Today How to retrieve a DNA sequence? How to search for other related DNA sequences? How to search for its protein sequence? How to determine what is known about this sequence biologically?

3 Gene structure Genes contain introns and exons. Introns are transcribed into RNA but are then removed by splicing, i.e., they are non-coding regions. Exons are the coding regions and are present in the mRNA. [Diagram labels: mRNA, Exon1, Intron1, E2, E3, I2]
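As a minimal sketch of the idea on this slide, the snippet below joins exon intervals and drops the introns between them. The toy sequence, the exon coordinates, and the `splice` helper are hypothetical illustrations, not real gene data.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive) in order, dropping introns."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Hypothetical toy transcript: uppercase = exon, lowercase = intron.
pre_mrna = "ATGGCC" + "gtaagtttcag" + "GATTAC" + "gtatgcacag" + "TGA"

# Exon coordinates matching the uppercase blocks above (assumed known).
exons = [(0, 6), (17, 23), (33, 36)]

mrna = splice(pre_mrna, exons)
print(mrna)  # only the exon sequence remains
```

In practice the exon coordinates would come from a genome annotation; here they are written by hand.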

4 Types of DNA sequence Genomic – Contains both genes and non-genic regions – Genes have both introns and exons cDNA (complementary DNA) – Sequence corresponds to genes that are expressed. – Sequence contains only the

5 What could you do with genomic sequence? What about with cDNA sequence?

6 What is an EST? Expressed sequence tag. Part or all of a cDNA that has been sequenced.

7 What is NCBI? National Center for Biotechnology Information. Part of the National Library of Medicine, NIH. Created in 1988 to develop information systems for molecular biology. Provides data retrieval systems and computational resources.

8 Database Resources Database retrieval tools BLAST family of sequence-similarity search programs. Resources for gene-level sequences Resources for genome-scale analysis

9 Database Retrieval Tools Entrez – for DNA and protein sequences PubMed Central – for literature Taxonomy – organisms and associated sequences LocusLink – provides links from sequence information to map and other information.

10 BLAST family Basic local alignment search tool Sequence similarity search against various databases in GenBank
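To make "local alignment" concrete, here is a small Smith-Waterman sketch. This is the exact dynamic-programming method for local alignment, not BLAST's own algorithm; BLAST uses a faster seed-and-extend heuristic to approximate this kind of result. The scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # H[i][j]: best score ending at a[i-1], b[j-1]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until the score reaches 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("TTTACGTACGTTT", "GGGACGTACGGGG")
print(score)
print(top)
print(bottom)
```

Only the shared region is reported; the unrelated flanking bases at either end are left out of the alignment, which is what distinguishes a local alignment from a global one.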

11 BLAST Pairwise alignment. Each alignment has a statistical significance (e-value). Accounts for amino acid sequence Outputs a list of matches including start, stop, score, and e-value.
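The E-value mentioned on this slide is usually computed with the Karlin-Altschul formula, E = K·m·n·e^(−λS), where m is the query length, n is the database length, and S is the raw alignment score. The sketch below applies that formula; the values used for λ and K are illustrative assumptions only, since the real values depend on the scoring matrix and gap penalties in use.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalise a raw score so it is comparable across scoring systems."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Illustrative parameter values (assumed, not taken from the slides).
LAMBDA, K = 0.267, 0.041
QUERY_LEN, DB_LEN = 300, 1_000_000_000

for s in (40, 80, 120):
    print(s, round(bit_score(s, LAMBDA, K), 1), e_value(s, QUERY_LEN, DB_LEN, LAMBDA, K))
```

Higher scores give smaller E-values, and the same score becomes less significant as the database grows.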

12 5 BLAST Programs (query vs. database) BLASTN – Nucleotide vs. nucleotide BLASTP – Protein vs. protein BLASTX – Nucleotide translation vs. protein TBLASTN – Protein vs. nucleotide translation TBLASTX – Nucleotide translation vs. nucleotide translation.
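The choice among the five programs depends on the query type, the database type, and whether nucleotide sequences are translated before comparison. The sketch below encodes that choice and shows the six-frame translation that the translated searches rely on. The function names (`choose_program`, `six_frames`) are hypothetical helpers, and the codon table is the standard genetic code.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def choose_program(query, database, translate_both=False):
    """Pick a BLAST program from sequence types ('nucleotide' or 'protein')."""
    if query == database == "nucleotide":
        return "tblastx" if translate_both else "blastn"
    return {
        ("protein", "protein"): "blastp",
        ("nucleotide", "protein"): "blastx",   # query is translated
        ("protein", "nucleotide"): "tblastn",  # database is translated
    }[(query, database)]


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Translate three frames on each strand, keyed '+1'..'+3' and '-1'..'-3'."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


print(choose_program("nucleotide", "protein"))
print(six_frames("ATGGCCGATTACTGA"))
```

A translated search compares all six frames because the reading frame and strand of an uncharacterised nucleotide sequence are not known in advance.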

13 Genome-Scale Analysis Entrez Genomes – taxonomic, genome or chromosome view of the current sequence data for an organism. COGs – List of orthologous protein groups from completely sequenced organisms. Retroviral genotyping tools – Important in viral genetic diversity, tracking outbreaks, and vaccine development.

14 Genome-Scale Analysis Eukaryotic Genomic Resources – location of Plant Genomes Central with information from various plant genome projects. Map Viewer – Displays genome assemblies using chromosome map views.

15 Genome-Scale Analysis Human-Mouse Homology Maps – List of genes in homologous segments. Cancer Chromosome Aberration Project – List of recurrent chromosome aberrations associated with cancer.

16 Gene Expression/Phenotype OMIM – Catalog of human genes and genetic disorders including phenotypes and polymorphism information. Gene Expression Omnibus (GEO) – Data repository and retrieval system for expression data from all sources.

17 MMDB, CDDB, CDART Molecular Modeling Database Conserved Domain Database Conserved Domain Architecture Retrieval Tool – Identifies conserved domains and displays their structure.

18 Sequence Analysis References Korf, Yandell, and Bedell. 2003. An Essential Guide to the Basic Local Alignment Search Tool: BLAST. O’Reilly & Associates, Sebastopol, CA. Markel and Leon. 2003. Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases. O’Reilly & Associates, Sebastopol, CA.

19 Sequence Analysis References Baxevanis and Ouellette. 2001. Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins. Wiley Interscience, New York. Mount. 2000. Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory, New York.

20 What can you do with the sequence? Gene prediction Motif identification Promoter identification Survey gene expression across tissues Full length gene isolation Identify mutations (SNP, InDel)

21 InDel Insertions/Deletions. Usually small. Can use the same protocols and equipment as for SSR analysis, or can be separated on a capillary system using fluorescently labelled material.
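When two sequences have already been aligned, InDels show up as runs of gap characters. The following sketch lists them from a gapped pairwise alignment; the two aligned strings and the `find_indels` helper are hypothetical, and the alignment itself is assumed to have been produced elsewhere.

```python
import re


def find_indels(ref_aln, alt_aln):
    """Return (type, 0-based alignment column, length) for each gap run.

    A gap in the reference row means the other sequence carries an insertion;
    a gap in the other row means it carries a deletion.
    """
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have equal length")
    events = [("insertion", m.start(), len(m.group())) for m in re.finditer(r"-+", ref_aln)]
    events += [("deletion", m.start(), len(m.group())) for m in re.finditer(r"-+", alt_aln)]
    return sorted(events, key=lambda e: e[1])


# Hypothetical aligned alleles ('-' marks a gap).
ref = "ACGT--ACGTAC"
alt = "ACGTTTACG-AC"
indels = find_indels(ref, alt)
print(indels)
```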

22 Single Nucleotide Polymorphism (SNP) A single base-pair difference between the DNA sequences of two alleles. Best identified from high-quality sequence and confirmed in multiple lines or multiple experiments.
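A minimal sketch of SNP detection between two alleles follows, assuming the sequences are already aligned without gaps and are the same length. The example sequences and the `find_snps` helper are hypothetical. Positions with ambiguous base calls such as N are skipped, in keeping with the slide's point about using high-quality sequence.

```python
def find_snps(allele_a, allele_b):
    """Return (1-based position, base in a, base in b) for each single-base difference."""
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must be aligned to the same length")
    snps = []
    for pos, (x, y) in enumerate(zip(allele_a.upper(), allele_b.upper()), start=1):
        # Ignore ambiguous calls; only unambiguous A/C/G/T differences count.
        if x != y and x in "ACGT" and y in "ACGT":
            snps.append((pos, x, y))
    return snps


# Hypothetical alleles of the same locus.
allele_1 = "ATGGCCGATTAC"
allele_2 = "ATGGCTGATTGC"
snps = find_snps(allele_1, allele_2)
print(snps)
```

A difference found this way is only a candidate SNP until it has been confirmed in other lines or repeated experiments.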

23 SNP popularity Difficult to identify human disease loci by other methods. Most abundant class of polymorphisms in many species. Ease of use for genotyping, i.e., they can be automated easily.

24 What can you do with ESTs? Gene expression analysis Colinearity studies Protein prediction SNP identification Genetic mapping

25 Today How to retrieve a DNA sequence? How to search for other related DNA sequences? How to search for its protein sequence? How to determine what is known about this sequence biologically?

26 Using adh as an example Find adh1 sequence in corn. Find related sequences. Determine its function in corn. Find adh in human. Find related sequences. Determine its function in human.
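The first step of this exercise, finding the adh1 sequence in corn, can also be done programmatically through NCBI's Entrez E-utilities. The sketch below only builds the search URL and does not contact the server. The `esearch_url` helper, the choice of the `nucleotide` database, and the exact search term are assumptions; the term uses Entrez field tags and may need adjusting to return the intended records.

```python
from urllib.parse import urlencode

# Base address of the Entrez ESearch utility.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(gene, organism, db="nucleotide", retmax=5):
    """Build an ESearch URL for a gene name restricted to one organism."""
    term = f"{gene}[Gene Name] AND {organism}[Organism]"
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


corn_url = esearch_url("adh1", "Zea mays")
print(corn_url)
```

Changing the organism to `Homo sapiens`, or the database to `protein`, gives the corresponding searches for the human and protein parts of the exercise.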

