Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW C E N T.

Slides:



Advertisements
Similar presentations
Approaches, Tools, and Applications Islam A. El-Shaarawy Shoubra Faculty of Eng.
Advertisements

Prof. Carolina Ruiz Computer Science Department Bioinformatics and Computational Biology Program WPI WELCOME TO BCB4003/CS4803 BCB503/CS583 BIOLOGICAL.
School of Computer Engineering Master of Science (Bioinformatics) A/P Kwoh Chee Keong 2009 presented by.
Bioinformatics For MNW 2 nd Year Jaap Heringa FEW/FALW Integrative Bioinformatics Institute VU (IBIVU) Tel ,
Bioinformatics at IU - Ketan Mane. Bioinformatics at IU What is Bioinformatics? Bioinformatics is the study of the inherent structure of biological information.
Bioinformatics Master’s Course Genome Analysis ( Integrative Bioinformatics ) Lecture 1: Introduction Centre for Integrative Bioinformatics VU (IBIVU)
Bioinformatics Needs for the post-genomic era Dr. Erik Bongcam-Rudloff The Linnaeus Centre for Bioinformatics.
Systems Biology Existing and future genome sequencing projects and the follow-on structural and functional analysis of complete genomes will produce an.
Bioinformatics Dr. Aladdin HamwiehKhalid Al-shamaa Abdulqader Jighly Lecture 1 Introduction Aleppo University Faculty of technical engineering.
Computational Molecular Biology (Spring’03) Chitta Baral Professor of Computer Science & Engg.
Using Bioinformatics to Make the Bio- Math Connection The Confessions of a Biology Teacher.
Bioinformatics: a Multidisciplinary Challenge Ron Y. Pinter Dept. of Computer Science Technion March 12, 2003.
Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW C E N T.
Introduction to BioInformatics GCB/CIS535
Scientific Data Mining: Emerging Developments and Challenges F. Seillier-Moiseiwitsch Bioinformatics Research Center Department of Mathematics and Statistics.
The bioinformatics of biological processes The challenge of temporal data Per J. Kraulis CMCM, Tartu University.
Genetics: From Genes to Genomes
Ayesha Masrur Khan Spring Course Outline Introduction to Bioinformatics Definition of Bioinformatics and Related Fields Earliest Bioinformatics.
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
Medical Informatics Basics
Bioinformatics Jan Taylor. A bit about me Biochemistry and Molecular Biology Computer Science, Computational Biology Multivariate statistics Machine learning.
Meer perspectief Master’s programme in Bioinformatics Bio-tools for the Future International 2-year Bioinformatics Master’s Programme.
Bioinformatics for biomedicine Protein domains and 3D structure Lecture 4, Per Kraulis
9/30/2004TCSS588A Isabelle Bichindaritz1 Introduction to Bioinformatics.
Knowledgebase Creation & Systems Biology: A new prospect in discovery informatics S.Shriram, Siri Technologies (Cytogenomics), Bangalore S.Shriram, Siri.
Bioinformatics Sean Langford, Larry Hale. What is it?  Bioinformatics is a scientific field involving many disciplines that focuses on the development.
1 Bio + Informatics AAACTGCTGACCGGTAACTGAGGCCTGCCTGCAATTGCTTAACTTGGC An Overview پرتال پرتال بيوانفورماتيك ايرانيان.
Beyond the Human Genome Project Future goals and projects based on findings from the HGP.
Introduction to Pharmacoinformatics
GTL Facilities Computing Infrastructure for 21 st Century Systems Biology Ed Uberbacher ORNL & Mike Colvin LLNL.
Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW
High-throughput Biological Data The data deluge and bioinformatics algorithms Introduction to bioinformatics 2005 Lecture 3.
A New Oklahoma Bioinformatics Company. Microarray and Bioinformatics.
Introduction to Bioinformatics Spring 2002 Adapted from Irit Orr Course at WIS.
C E N T R F O I G A V B M S U 2MNW/3I/3AI/3PHAR bachelor course Introduction to Bioinformatics Lecture 1: Introduction Centre for Integrative Bioinformatics.
GTL User Facilities Facility IV: Analysis and Modeling of Cellular Systems Jim K. Fredrickson.
CSCI 6900/4900 Special Topics in Computer Science Automata and Formal Grammars for Bioinformatics Bioinformatics problems sequence comparison pattern/structure.
Function first: a powerful approach to post-genomic drug discovery Stephen F. Betz, Susan M. Baxter and Jacquelyn S. Fetrow GeneFormatics Presented by.
Bioinformatics For MNW 2 nd Year Jaap Heringa FEW/FALW Centre for Integrative Bioinformatics VU (IBIVU) Tel ,
Integrating the Bioinformatic Technology Group into your research programme Introduction People and Skills Examples Integrating the BTG Contacts BHRC Away.
Harbin Institute of Technology Computer Science and Bioinformatics Wang Yadong Second US-China Computer Science Leadership Summit.
1 Departament of Bioengineering, University of California 2 Harvard Medical School Department of Genetics Metabolic Flux Balance Analysis and the in Silico.
Systems Biology ___ Toward System-level Understanding of Biological Systems Hou-Haifeng.
Biological Signal Detection for Protein Function Prediction Investigators: Yang Dai Prime Grant Support: NSF Problem Statement and Motivation Technical.
DIALing the IBIVU Vrije Universiteit Amsterdam Jaap Heringa Centre for Integrative Bioinformatics VU (IBIVU) Faculty of Sciences / Faculty of Earth and.
Course Sequence Analysis for Bioinformatics Master’s Bart van Houte, Radek Szklarczyk, Victor Simossis, Jens Kleinjung, Jaap Heringa
Overview of Bioinformatics 1 Module Denis Manley..
AdvancedBioinformatics Biostatistics & Medical Informatics 776 Computer Sciences 776 Spring 2002 Mark Craven Dept. of Biostatistics & Medical Informatics.
Microarrays.
Proteomics Session 1 Introduction. Some basic concepts in biology and biochemistry.
Genes and Genomic Datasets. DNA compositional biases Base composition of genomes: E. coli: 25% A, 25% C, 25% G, 25% T P. falciparum (Malaria parasite):
Information Technology in the Natural Sciences Biology – Chemistry – Physics.
Central dogma: the story of life RNA DNA Protein.
EB3233 Bioinformatics Introduction to Bioinformatics.
Bioinformatics and Computational Biology
341- INTRODUCTION TO BIOINFORMATICS Overview of the Course Material 1.
داده های عظیم در دوران پساژنوم Big Data in Post Genome Era مهدی صادقی پژوهشگاه ملی مهندسی ژنتیک و زیست فناوری پژوهشکده علوم زیستی، پژوهشگاه دانش های بنیادی.
Biological Networks. Can a biologist fix a radio? Lazebnik, Cancer Cell, 2002.
Clinical Research Informatics [CRI]. Informatics, defined generally as the intersection of information and computer science with a health-related discipline,
Dynamical Modeling in Biology: a semiotic perspective Junior Barrera BIOINFO-USP.
High throughput biology data management and data intensive computing drivers George Michaels.
Effect of Alcohol on Brain Development NormalFetal Alcohol Syndrome.
생물정보학 Bioinformatics.
High-throughput Biological Data The data deluge
C E N T R F O I G A V B M S U 2MNW/3I/3AI/3PHAR bachelor course Introduction to Bioinformatics Lecture 1: Introduction Centre for Integrative Bioinformatics.
9 Future Challenges for Bioinformatics
Bioinformatics For MNW 2nd Year
BIOINFORMATICS Summary
Introduction to Bioinformatic
Introduction to Bioinformatics
Presentation transcript:

Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E

Course objectives There are two extremes in bioinformatics work –Tool users (biologists): know how to press the buttons and know the biology but have no clue what happens inside the program –Tool shapers (informaticians): know the algorithms and how the tool works but have no clue about the biology Both extremes are dangerous, need a breed that can do both

Course objectives How do you become a good bioinformatics problem solver? –You need to know basic analysis and data mining modes –You need to know some important backgrounds of analysis and prediction techniques (e.g. statistical thermodynamics) –You need to have knowledge of what has been done and what can be done (and what not) Is this enough to become a creative tool developer? –Need to like doing it –Experience helps

Contents (tentative dates) DateLecture TitleLecturer 1 [wk 19] 07/05/07IntroductionJaap Heringa 2 [wk 19]10/05/07Microarray data analysisJaap Heringa 3 [wk 20]14/05/07Molecular simulations & sampling techniques Anton Feenstra 4 [wk 21] 22/05/07Introduction to Statistical Thermodynamics IAnton Feenstra 5 [wk 21] 24/05/07Introduction to Statistical Thermodynamics II Anton Feenstra 6[wk 23] 05/06/07Machine learningElena Marchiori 7[wk 23] 07/06/07Clustering algorithmsBart van Houte 8[wk 24] 11/06/07Support vector machines and feature selection in bioinformaticsElena Marchiori 9[wk 24] 12/06/07Databases and parsingSandra Smit 10[wk 24] 14/06/07OntologiesFrank van Harmelen 11[wk 25] 18/06/07 Benchmarking, parallelisation & grid computing Thilo Kielmann 12[wk 25] 19/06/07Method development I: Protein domain prediction Jaap Heringa 13[wk 25] 21/06/07Method development II Jaap Heringa

At the end of this course… You will have seen a couple of algorithmic examples You will have got an idea about methods used in the field You will have a firm basis of the physics and thermodynamics behind a lot of processes and methods You will have an idea of and some experience as to what it takes to shape a bioinformatics tool

Bioinformatics “Studying informatic processes in biological systems” (Hogeweg) Applying algorithms and mathematical formalisms to biology (genomics) “Information technology applied to the management and analysis of biological data” (Attwood and Parry-Smith)

This course General theory of crucial algorithms (GA, NN, HMM, SVM, etc..) Method examples Research projects within own group –Repeats –Domain boundary prediction Physical basis of biological processes and of (stochastic) tools

Bioinformatics Large - external (integrative)ScienceHuman Planetary ScienceCultural Anthropology Population Biology Sociology SociobiologyPsychology Systems Biology Biology Medicine Molecular Biology Chemistry Physics Small – internal (individual) Bioinformatics

Genomic Data Sources DNA/protein sequence Expression (microarray) Proteome (xray, NMR, mass spectrometry, PPI) Metabolome Physiome (spatial, temporal) Integrative bioinformatics

Protein structural data explosion Protein Data Bank (PDB): Structures (6 March 2001) x-ray crystallography, 1810 NMR, 278 theoretical models, others...

Mathematics Statistics Computer Science Informatics Biology Molecular biology Medicine Chemistry Physics Bioinformatics Bioinformatics inspiration and cross-fertilisation

Algorithms in bioinformatics string algorithms dynamic programming machine learning (NN, k-NN, SVM, GA,..) Markov chain models hidden Markov models Markov Chain Monte Carlo (MCMC) algorithms stochastic context free grammars EM algorithms Gibbs sampling clustering tree algorithms (suffix trees) graph algorithms text analysis hybrid/combinatorial techniques and more…

Joint international programming initiatives Bioperl Biopython BioTcl BioJava

Integrative VU Studying informational processes at biological system level From gene sequence to intercellular processes Computers necessary We have biology, statistics, computational intelligence (AI), HTC,.. VUMC: microarray facility, cancer centre, translational medicine Enabling technology: new glue to integrate New integrative algorithms Goals: understanding cellular networks in terms of genomes; fighting disease (VUMC)

VU Progression: DNA: gene prediction, predicting regulatory elements, alternative splicing mRNA expression Proteins: (multiple) sequence alignment, docking, domain prediction, PPI Metabolic pathways: metabolic control Cell-cell communication

Fold recognition by threading: THREADER and GenTHREADER Query sequence Compatibility scores Fold 1 Fold 2 Fold 3 Fold N

Polutant recognition by microarray mapping: Compatibility scores Cond. 1 Cond. 2 Cond. 3 Cond. N Contaminant 1 Contaminant 2 Contaminant 3 Contaminant N Query array

ENFIN WP4 Functional threading From sequence to function –Multiple alignment –Secondary structure prediction, Solvation prediction, Conservation patterns, Loop enumeration

ENFIN WP4 Functional threading From sequence to function –Multiple alignment –Secondary structure prediction, Solvation prediction, Conservation patterns, Loop enumeration D H S StructFunc DHS DB of active site descrip tors

ENFIN WP5 - BioRange (Anton Feenstra) Protein-protein interaction prediction Mesoscopic modelling Soft-core Molecular Dynamics (MD) –Fuzzy residues –Fuzzy (surface) locations

ENFIN WP6 Silicon Cell –Database of fully parametrized pathway model (differential equations) solver Jacky Snoep (Stellenbosch, VU/IBIVU) Hans Westerhoff (VU, Manchester)

Where are important new questions?

New neighbouring disciplines Translational Medicine A branch of medical research that attempts to more directly connect basic research to patient care. Translational medicine is growing in importance in the healthcare industry, and is a term whose precise definition is in flux. In particular, in drug discovery and development, translational medicine typically refers to the "translation" of basic research into real therapies for real patients. The emphasis is on the linkage between the laboratory and the patient's bedside, without a real disconnect. This is often called the "bench to bedside" definition. Computational Systems Biology Computational systems biology aims to develop and use efficient algorithms, data structures and communication tools to orchestrate the integration of large quantities of biological data with the goal of modeling dynamic characteristics of a biological system. Modeled quantities may include steady-state metabolic flux or the time-dependent response of signaling networks. Algorithmic methods used include related topics such as optimization, network analysis, graph theory, linear programming, grid computing, flux balance analysis, sensitivity analysis, dynamic modeling, and others. Neuro-informatics Neuroinformatics combines neuroscience and informatics research to develop and apply the advanced tools and approaches that are essential for major advances in understanding the structure and function of the brain

Translational Medicine “From bench to bed side” Genomics data to patient data Integration

Natural progression of a gene

TERTIARY STRUCTURE (fold) Genome Expressome Proteome Metabolome Functional Genomics From gene to function

Systems Biology is the study of the interactions between the components of a biological system, and how these interactions give rise to the function and behaviour of that system (for example, the enzymes and metabolites in a metabolic pathway). The aim is to quantitatively understand the system and to be able to predict the system’s time processes the interactions are nonlinear the interactions give rise to emergent properties, i.e. properties that cannot be explained by the components in the system

Systems Biology understanding is often achieved through modeling and simulation of the system’s components and interactions. Many times, the ‘four Ms’ cycle is adopted: Measuring Mining Modeling Manipulating

A system response Apoptosis: programmed cell deathNecrosis: accidental cell death

Neuroinformatics Understanding the human nervous system is one of the greatest challenges of 21st century science. Its abilities dwarf any man-made system - perception, decision-making, cognition and reasoning. Neuroinformatics spans many scientific disciplines - from molecular biology to anthropology.

Neuroinformatics Main research question: How does the brain and nervous system work? Main research activity: gathering neuroscience data, knowledge and developing computational models and analytical tools for the integration and analysis of experimental data, leading to improvements in existing theories about the nervous system and brain. Results for the clinic: Neuroinformatics provides tools, databases, models, networks technologies and models for clinical and research purposes in the neuroscience community and related fields.

VU Qualitative challenges: High quality alignments (alternative splicing) In-silico structural genomics In-silico functional genomics: reliable annotation Protein-protein interactions. Metabolic pathways: assign the edges in the networks Fluxomics, quantitative description (through time) of fluxes through metabolic networks New algorithms

VU Quantitative challenges: Understanding mRNA expression levels Understanding resulting protein activity Time dependencies Spatial constraints, compartmentalisation Are classical differential equation models adequate or do we need more individual modeling (e.g macromolecular crowding and activity at oligomolecular level)? Metabolic pathways: calculate fluxes through time Cell-cell communication: tissues, hormones, innervations Need ‘complete’ experimental data for good biological model system to learn to integrate

VU VUMC Neuropeptide – addiction Oncogenes – disease patterns Reumatic diseases

VU Quantitative challenges: How much protein produced from single gene? What time dependencies? What spatial constraints (compartmentalisation)? Metabolic pathways: assign the edges in the networks Cell-cell communication: find membrane associated components

Integrate data sources Integrate methods Integrate data through method integration (biological model) Integrative bioinformatics

Data Algorithm Biological Interpretation (model) tool Integrative bioinformatics Data integration

Data 1Data 2Data 3

Integrative bioinformatics Data integration Data 1 Algorithm 1 Biological Interpretation (model) 1 tool Algorithm 2 Biological Interpretation (model) 2 Algorithm 3 Biological Interpretation (model) 3 Data 2Data 3

“Nothing in Biology makes sense except in the light of evolution” (Theodosius Dobzhansky ( )) “Nothing in Bioinformatics makes sense except in the light of Biology” Bioinformatics