Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou

Slides:



Advertisements
Similar presentations
On line (DNA and amino acid) Sequence Information Lecture 7.
Advertisements

Integration of Protein Family, Function, Structure Rich Links to >90 Databases Value-Added Reports for UniProtKB Proteins iProClass Protein Knowledgebase.
HCS806 “Methods in Horticulture and Crop Science” Introduction to methods in Bioinformatics for plant science. David Francis (Coordinator) Ian Holford.
The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology.
Bioinformatics What is bioinformatics? Why bioinformatics? The major molecular biology facts Brief history of bioinformatics Typical problems of bioinformatics:
How to use the web for bioinformatics Molecular Technologies Ethan Strauss X 1171
Bioinformatics David Brodin BEA core facility MOLEKYLÄRBIOLOGI MED GENETIK – BIOINFORMATIK HT -07 Course web page:
GENBANK, SWISSPROT AND OTHERS As Problem Sources for CSE 549 Andriy Tovkach Genetics.
Bioinformatics Needs for the post-genomic era Dr. Erik Bongcam-Rudloff The Linnaeus Centre for Bioinformatics.
Systems Biology Existing and future genome sequencing projects and the follow-on structural and functional analysis of complete genomes will produce an.
Bioinformatics: a Multidisciplinary Challenge Ron Y. Pinter Dept. of Computer Science Technion March 12, 2003.
Data-intensive Computing: Case Study Area 1: Bioinformatics B. Ramamurthy 6/17/20151.
The Cell, Central Dogma and Human Genome Project.
Integration of Bioinformatics into Inquiry Based Learning by Kathleen Gabric.
Bioinformatics Unit 1: Data Bases and Alignments Lecture 3: “Homology” Searches and Sequence Alignments (cont.) The Mechanics of Alignments.
Ayesha Masrur Khan Spring Course Outline Introduction to Bioinformatics Definition of Bioinformatics and Related Fields Earliest Bioinformatics.
Application of Bioinformatics in Genetics Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Drs. Michele Tennant / & Rolando Milian Dr. Lei.
ExPASy - Expert Protein Analysis System The bioinformatics resource portal and other resources An Overview.
From T. MADHAVAN, & K.Chandrasekaran Lecturers in Zoology.. EXIT.
Login: BITseminar Pass: BITseminar2011 Login: BITseminar Pass: BITseminar2011.
Digital Image Processing Lecture3: Introduction to MATLAB.
On line (DNA and amino acid) Sequence Information
Computational Methods to study Sequencing data -Meenakshi Sharma.
Cytoscape A powerful bioinformatic tool Mathieu Michaud
Bioinformatics.
Development of Bioinformatics and its application on Biotechnology
Pathway Assignments. The assignment – Annotating Pathways KEGG Pathway Database.
NCBI’s Bioinformatics Resources Michele R. Tennant, Ph.D., M.L.I.S. Health Science Center Libraries U.F. Genetics Institute January 2015.
Generic substitution matrix -based sequence similarity evaluation Q: M A T W L I. A: M A - W T V. Scr: 45 -?11 3 Scr: Q: M A T W L I. A: M A W.
Introduction to Bioinformatics Spring 2002 Adapted from Irit Orr Course at WIS.
High Throughput Sequence (HTS) data analysis 1.Storage and retrieving of HTS data. 2.Representation of HTS data. 3.Visualization of HTS data. 4.Discovering.
Copyright © 2009 Pearson Education, Inc. Art and Photos in PowerPoint ® Concepts of Genetics Ninth Edition Klug, Cummings, Spencer, Palladino Chapter 21.
What is Genetic Research?. Genetic Research Deals with Inherited Traits DNA Isolation Use bioinformatics to Research differences in DNA Genetic researchers.
Bioinformatics: Theory and Practice – Striking a Balance (a plea for teaching, as well as doing, Bioinformatics) Practice (Molecular Biology) Theory: Central.
Copyright © 2009 Pearson Education, Inc. Genomics, Bioinformatics, and Proteomics Chapter 21 Lecture Concepts of Genetics Tenth Edition.
Biological Databases Biology outside the lab. Why do we need Bioinfomatics? Over the past few decades, major advances in the field of molecular biology,
Function preserves sequences
Introduction to Bioinformatics Dr. Rybarczyk, PhD University of North Carolina-Chapel Hill
The UCSC Table Browser & Custom Tracks Advanced searching and discovery using the UCSC Table Browser and Custom Tracks Osvaldo Graña CNIO Bioinformatics.
Epidemiology 217 Molecular and Genetic Epidemiology Bioinformatics & Proteomics John Witte.
Bioinformatics and Computational Biology
Copyright OpenHelix. No use or reproduction without express written consent1.
Bioinformatics Chem 434 Dr. Nancy Warter-Perez Computer Engineering Dr. Jamil Momand Chemistry & Biochemistry.
Copyright OpenHelix. No use or reproduction without express written consent1 1.
Practice – file types (Cont.) Load the “Mysequence.doc” file to Webcutter using “Choose file” and then “Upload sequence file”. -Notice that the “sequence”
Copyright OpenHelix. No use or reproduction without express written consent1.
Tools in Bioinformatics Genome Browsers. Retrieving genomic information Previous lesson(s): annotation-based perspective of search/data Today: genomic-based.
Summer Bioinformatics Workshop 2008 BLAST Chi-Cheng Lin, Ph.D., Professor Department of Computer Science Winona State University – Rochester Center
1 Survey of Biodata Analysis from a Data Mining Perspective Peter Bajcsy Jiawei Han Lei Liu Jiong Yang.
PROTEIN IDENTIFIER IAN ROBERTS JOSEPH INFANTI NICOLE FERRARO.
High Throughput Sequence (HTS) data analysis 1.Storage and retrieving of HTS data. 2.Representation of HTS data. 3.Visualization of HTS data. 4.Discovering.
Presenter: Bradley Green.  What is Bioinformatics?  Brief History of Bioinformatics  Development  Computer Science and Bioinformatics  Current Applications.
BLAST: Basic Local Alignment Search Tool Robert (R.J.) Sperazza BLAST is a software used to analyze genetic information It can identify existing genes.
Bioinformatics Overview
Introduction to Bioinformatics and Functional Genomics
Biological Databases By: Komal Arora.
Data-intensive Computing: Case Study Area 1: Bioinformatics
What is Bioinformatics?
Predict Protein Sequence by Fuzzy-Association Rules
Genomes and Their Evolution
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Searching the NCBI Databases
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Nancy Baker SILS Bioinformatics Seminar January 21, 2004
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Explore Evolution: Instrument for Analysis
Evolution of Genomes Chapter 21.
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Lab 3 – BLAST – Directed It’s a BLAST! (too easy?)
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Presentation transcript:

Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou

Application of Bioinformatics in Genetic Research Time and location: Monday: 12:00-12:50 in CGRC291. Wednesday: 12:00-12:50 or 11:40-12:30, CGRC-291 Fridays (11/18. 12/2): 12:00-12:50 in CGRC-391 or 11:40-12:30 in CGRC291.

Evaluation 50% classroom participation 50% homework

History of bioinformatics – sequence analysis Sequence comparison Similarity search Phylogenetic analysis Structure predication Gene prediction

Bioinformatics in the post genome era Information Representation. - many new types of data, such as Function, Location, Interaction, Regulatory pathway, Expression profile, etc. needs to be recorded Data Management - Infrastructure for inputting, managing, access and retrieval of relevant information in a “sea of databases”. Cloud computing. Systematics The opportunity provided by genome sequence and genomic / proteomic technology is matched by the challenge to bioinformatics / computational biology

Bioinformatics in the post genome era SNP and whole genome wide association studies. Genomic expression profiling (RNA and protein levels). Comparative genomics, Epigenomics … Individual genomes, epigenomes, transcriptomes. Regulatory pathway simulation – systems biology. $1,000 genome and … $500,000 analysis ?

Objectives of GMS6014 Basic skills for retrieving and storing data, using web-based applications. Ability to install and run stand alone local applications. Understanding the basis of bioinformatics applications using sequence similarity search as the example. A brief survey of available bioinformatics tools and introduction to functional genomics and systems biology.

Sequence Representation - nucleotide N G R C W T G Y C Y A G A C A T G C C C C G T T T G T For complete list, see table 2.1, Mount 2 nd Ed Or

Sequence Representation - amino acids Q: What’s the common property of these amino acids ? 1.D, E 2.I, L, V, M, F 3.A, S, P

Sequence Representation - amino acids Example: Coloring based on aa property. WDLLAQILCYALRIY WRFLATVVLETLRQY WKFLAITMCKVLKQF RCLLCNKLYYLLRKV LNRLLAELYEVLCHI LRLLQQQQMVLQRQY WDLLAQILCYALRIY WRFLATVVLETLRQY WKFLAITMCKVLKQF RCLLCNKLYYLLRKV LNRLLAELYEVLCHI LRLLQQQQMVLQRQY

Representation of sequence – sequence file format 1.) FASTA – simple and clean > gene_name, (other info) MASASASKJHKLJLKJLDSDFSF SSDSASFSFD… Practice / DIY: retrieve sequence in Fasta format and save the file in the local computer.

How to store sequence files.txt format is clean and allows down stream sequence analysis.doc or.rtf allows formatting during annotation – however, extra information are inserted thus NOT suitable for computational analysis.

Practice – file types Using Windows Explorer (with your own computer) or IE with “C:\” in the address window. Change the “Tools  Folder Options” so that the file extensions (.xxx) are revealed. Edit the downloaded sequence file in MS Word, highlight a section of the sequence with Bold font or color and save as.doc Open the.doc file in NotePad – observe the inserted characters.

Practice – file types (Cont.) Load the.doc file to Webcutter using “Browse” and then “Upload sequence file”. -Notice that the “sequence” in the sequence box are nonsense characters. Clear input; Browse and then load the.txt file. Run an analysis. Always keep you sequences in.txt file for downstream analysis.

Representation of sequence The need to include annotations and functional information with each sequence. Structured data entry GeneBank EMBL / SwissProt Observe: The difference of data structure between SwissProt, NCBI protein, and NCBI Genes.

Representation of sequence The need to represent associated info with sequence Structured data entry Specialized databases  3-d Structure  Mutation / Diseases  Protein family / Protein domain  Interaction  Pathway  ….

Representation of sequence The need to represent associated info with sequence Structured data entry Specialized databases Complex / customized data structure - Object-oriented data representation (Mount, p44-45)

XML – Extensible Markup language Define highly structured data for sharing and exchange. Observe: 1.) The differences between the XML format and the GenPept format. 2.) The differences among XML, TinySeqXML, and INSDXML.

Bioinformatics / Computational biology Bioinformatics - Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data. Computational Biology - The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems. (Working Definition of Bioinformatics and Computational Biology - July 17, 2000). NIH / BISTI

Genetic code Codon usage special code – mitochondria genes