Biological Databases: Biology outside the lab


1 Biological Databases: Biology outside the lab

2 Why do we need Bioinformatics? Over the past few decades, major advances in molecular biology, coupled with advances in genomic technologies, have led to explosive growth in the biological information generated by the scientific community. This deluge of genomic information has, in turn, created an absolute requirement for computerized databases to store, organize, and index the data, and for specialized tools to view and analyze it.

3 Information flux from data to decision. Biology, chemistry and pharmaceutical research generate a huge amount of data. Data are analyzed more slowly than they are produced. Human Genome Project: 22.1 billion bases sequenced, but … what do we really know about it?

4 Bioinformatics - Building and managing biological databases (nucleotides, proteins, structures, small molecules, pathways, literature, …) - Data mining and data analysis (Computational Biology) - Ab initio protein modelling, homology modelling, simulations (Molecular Modeling)

5 Literature databases http://www.ncbi.nlm.nih.gov/

6 Nucleotide databases

7 Protein databases UniProt databases: - Swiss-Prot: provides a high level of annotation, a minimal level of redundancy and a high level of integration with other databases - TrEMBL: a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot. NCBI protein database: a meta-database containing sequences from UniProt entries, PDB-derived sequences and translations of predicted ORFs in GenBank.

8 Structural Database Protein structures obtained by crystallography or NMR are stored in PDB.

9 Microarray Databases GEO (Gene Expression Omnibus), SMD (Stanford Microarray Database). Gene expression databases provide the raw data from microarray expression experiments. Data from different experiments can be merged to obtain previously unidentified results.

10 EST Databases EST: Expressed Sequence Tag. 5' EST: generated from the 5' end of a transcript, which usually falls within the protein-coding region; these regions tend to be conserved across species and do not change much within a gene family. 3' EST: because these ESTs are generated from the 3' end of a transcript, they are likely to fall within non-coding, or untranslated, regions (UTRs), and therefore tend to exhibit less cross-species conservation than coding sequences do. Sequence Tagged Site (STS): helps to locate a gene in the genome. 3' ESTs are a good source of STSs. Available DBs: GenBank, dbEST, UniGene

11 Tools: ORF Finder, BLAST, multiple alignment, conserved domain identification, secondary structure and folding prediction

12 Example 1 A clone containing a recombinant plasmid shows an interesting phenotype -> sequencing -> rough sequence -> ORF identification -> in-frame sequence -> BLAST -> phylogenetically similar sequences, conserved domains
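
The ORF identification step in Example 1 can be illustrated with a minimal six-frame scan. This is only a sketch under simple assumptions (ORFs run from an ATG to the first in-frame stop codon, standard genetic code, unambiguous A/C/G/T input); it is not the algorithm used by NCBI's ORF Finder, and the names `find_orfs`, `reverse_complement` and `min_codons` are hypothetical.

```python
# Minimal six-frame ORF scan (illustrative sketch, not NCBI ORF Finder).
# Assumption: an ORF starts at ATG and ends at the first in-frame stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf_sequence) tuples.

    Coordinates are 0-based, end-exclusive, and for the '-' strand they
    refer to positions on the reverse-complemented sequence.
    min_codons counts the stop codon as well.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Walk the sequence codon by codon in this reading frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return results


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

The in-frame sequence returned for each hit is what would then be submitted to BLAST in the workflow above.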

13 CDS

14 Example 2

15

16

17 Example 2

18 Example 2 Tune the method: a) Increase the window size used to evaluate the score: this adds local information by integrating “environmental” data from neighbouring residues (2-residue window -> 2 frames, 3-residue window -> 3 frames, …). b) Use degenerate matching methods (based on size, polarity, H-bond behavior, …)
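
The two tuning options can be sketched in code. The slides for Example 2 carry no surviving text, so the following assumes the method is a pairwise, dot-matrix style comparison of two protein sequences; that assumption, the function names, and the four-class polarity grouping are all illustrative and not taken from the presentation.

```python
# Windowed, optionally degenerate residue matching (illustrative sketch).
# Assumption: Example 2 compares two protein sequences position by position.
# The residue classes below are one common coarse grouping by polarity/charge.
RESIDUE_CLASS = {}
for cls, residues in {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}.items():
    for r in residues:
        RESIDUE_CLASS[r] = cls


def exact(x, y):
    """Strict matching: residues must be identical."""
    return x == y


def same_class(x, y):
    """Degenerate matching: residues match if they share a class."""
    return x == y or (x in RESIDUE_CLASS and RESIDUE_CLASS[x] == RESIDUE_CLASS.get(y))


def dot_matrix(a, b, window=1, match=exact):
    """Return the set of (i, j) where all `window` consecutive residues match."""
    hits = set()
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            if all(match(a[i + k], b[j + k]) for k in range(window)):
                hits.add((i, j))
    return hits


if __name__ == "__main__":
    a, b = "ALKD", "AIRE"
    print(sorted(dot_matrix(a, b)))                              # exact, window 1
    print(sorted(dot_matrix(a, b, match=same_class)))            # degenerate, window 1
    print(sorted(dot_matrix(a, b, window=2, match=same_class)))  # degenerate, window 2
```

In this toy case degenerate matching recovers the similarity that exact matching misses, and the wider window then removes the off-diagonal hits while keeping the diagonal.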

