Presentation is loading. Please wait.

Presentation is loading. Please wait.

Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW C E N T.

Similar presentations


Presentation on theme: "Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW C E N T."— Presentation transcript:

1 Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW heringa@few.vu.nl C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E

2 Course objectives There are two extremes in bioinformatics work –Tool users (biologists): know how to press the buttons and know the biology but have no clue what happens inside the program –Tool shapers (informaticians): know the algorithms and how the tool works but have no clue about the biology Both extremes are dangerous, need a breed that can do both

3 Course objectives How do you become a good bioinformatics problem solver? –You need to know basic analysis and data mining modes –You need to know some important backgrounds of analysis and prediction techniques (e.g. statistical thermodynamics) –You need to have knowledge of what has been done and what can be done (and what not) Is this enough to become a creative tool developer? –Need to like doing it –Experience helps

4 Contents (tentative dates) DateLecture TitleLecturer 1 [wk 19] 07/05/07IntroductionJaap Heringa 2 [wk 19]10/05/07Microarray data analysisJaap Heringa 3 [wk 20]14/05/07Molecular simulations & sampling techniques Anton Feenstra 4 [wk 21] 22/05/07Introduction to Statistical Thermodynamics IAnton Feenstra 5 [wk 21] 24/05/07Introduction to Statistical Thermodynamics II Anton Feenstra 6[wk 23] 05/06/07Machine learningElena Marchiori 7[wk 23] 07/06/07Clustering algorithmsBart van Houte 8[wk 24] 11/06/07Support vector machines and feature selection in bioinformaticsElena Marchiori 9[wk 24] 12/06/07Databases and parsingSandra Smit 10[wk 24] 14/06/07OntologiesFrank van Harmelen 11[wk 25] 18/06/07 Benchmarking, parallelisation & grid computing Thilo Kielmann 12[wk 25] 19/06/07Method development I: Protein domain prediction Jaap Heringa 13[wk 25] 21/06/07Method development II Jaap Heringa

5 At the end of this course… You will have seen a couple of algorithmic examples You will have got an idea about methods used in the field You will have a firm basis of the physics and thermodynamics behind a lot of processes and methods You will have an idea of and some experience as to what it takes to shape a bioinformatics tool

6 Bioinformatics “Studying informatic processes in biological systems” (Hogeweg) Applying algorithms and mathematical formalisms to biology (genomics) “Information technology applied to the management and analysis of biological data” (Attwood and Parry-Smith)

7 This course General theory of crucial algorithms (GA, NN, HMM, SVM, etc..) Method examples Research projects within own group –Repeats –Domain boundary prediction Physical basis of biological processes and of (stochastic) tools

8 Bioinformatics Large - external (integrative)ScienceHuman Planetary ScienceCultural Anthropology Population Biology Sociology SociobiologyPsychology Systems Biology Biology Medicine Molecular Biology Chemistry Physics Small – internal (individual) Bioinformatics

9 Genomic Data Sources DNA/protein sequence Expression (microarray) Proteome (xray, NMR, mass spectrometry, PPI) Metabolome Physiome (spatial, temporal) Integrative bioinformatics

10 Protein structural data explosion Protein Data Bank (PDB): 14500 Structures (6 March 2001) 10900 x-ray crystallography, 1810 NMR, 278 theoretical models, others...

11 Mathematics Statistics Computer Science Informatics Biology Molecular biology Medicine Chemistry Physics Bioinformatics Bioinformatics inspiration and cross-fertilisation

12 Algorithms in bioinformatics string algorithms dynamic programming machine learning (NN, k-NN, SVM, GA,..) Markov chain models hidden Markov models Markov Chain Monte Carlo (MCMC) algorithms stochastic context free grammars EM algorithms Gibbs sampling clustering tree algorithms (suffix trees) graph algorithms text analysis hybrid/combinatorial techniques and more…

13 Joint international programming initiatives Bioperl http://www.bioperl.org/wiki/Main_Page http://bioperl.org/wiki/How_Perl_saved_human_genome Biopython http://www.biopython.org/ BioTcl http://wiki.tcl.tk/12367 BioJava www.biojava.org/wiki/Main_Page

14 Integrative bioinformatics @ VU Studying informational processes at biological system level From gene sequence to intercellular processes Computers necessary We have biology, statistics, computational intelligence (AI), HTC,.. VUMC: microarray facility, cancer centre, translational medicine Enabling technology: new glue to integrate New integrative algorithms Goals: understanding cellular networks in terms of genomes; fighting disease (VUMC)

15 Bioinformatics @ VU Progression: DNA: gene prediction, predicting regulatory elements, alternative splicing mRNA expression Proteins: (multiple) sequence alignment, docking, domain prediction, PPI Metabolic pathways: metabolic control Cell-cell communication

16 Fold recognition by threading: THREADER and GenTHREADER Query sequence Compatibility scores Fold 1 Fold 2 Fold 3 Fold N

17 Polutant recognition by microarray mapping: Compatibility scores Cond. 1 Cond. 2 Cond. 3 Cond. N Contaminant 1 Contaminant 2 Contaminant 3 Contaminant N Query array

18 ENFIN WP4 Functional threading From sequence to function –Multiple alignment –Secondary structure prediction, Solvation prediction, Conservation patterns, Loop enumeration

19 ENFIN WP4 Functional threading From sequence to function –Multiple alignment –Secondary structure prediction, Solvation prediction, Conservation patterns, Loop enumeration D H S StructFunc DHS DB of active site descrip tors

20 ENFIN WP5 - BioRange (Anton Feenstra) Protein-protein interaction prediction Mesoscopic modelling Soft-core Molecular Dynamics (MD) –Fuzzy residues –Fuzzy (surface) locations

21 ENFIN WP6 Silicon Cell –Database of fully parametrized pathway model (differential equations) solver Jacky Snoep (Stellenbosch, VU/IBIVU) Hans Westerhoff (VU, Manchester)

22 Where are important new questions?

23 New neighbouring disciplines Translational Medicine A branch of medical research that attempts to more directly connect basic research to patient care. Translational medicine is growing in importance in the healthcare industry, and is a term whose precise definition is in flux. In particular, in drug discovery and development, translational medicine typically refers to the "translation" of basic research into real therapies for real patients. The emphasis is on the linkage between the laboratory and the patient's bedside, without a real disconnect. This is often called the "bench to bedside" definition. Computational Systems Biology Computational systems biology aims to develop and use efficient algorithms, data structures and communication tools to orchestrate the integration of large quantities of biological data with the goal of modeling dynamic characteristics of a biological system. Modeled quantities may include steady-state metabolic flux or the time-dependent response of signaling networks. Algorithmic methods used include related topics such as optimization, network analysis, graph theory, linear programming, grid computing, flux balance analysis, sensitivity analysis, dynamic modeling, and others. Neuro-informatics Neuroinformatics combines neuroscience and informatics research to develop and apply the advanced tools and approaches that are essential for major advances in understanding the structure and function of the brain

24 Translational Medicine “From bench to bed side” Genomics data to patient data Integration

25 Natural progression of a gene

26 TERTIARY STRUCTURE (fold) Genome Expressome Proteome Metabolome Functional Genomics From gene to function

27

28

29 Systems Biology is the study of the interactions between the components of a biological system, and how these interactions give rise to the function and behaviour of that system (for example, the enzymes and metabolites in a metabolic pathway). The aim is to quantitatively understand the system and to be able to predict the system’s time processes the interactions are nonlinear the interactions give rise to emergent properties, i.e. properties that cannot be explained by the components in the system

30 Systems Biology understanding is often achieved through modeling and simulation of the system’s components and interactions. Many times, the ‘four Ms’ cycle is adopted: Measuring Mining Modeling Manipulating

31

32

33 A system response Apoptosis: programmed cell deathNecrosis: accidental cell death

34 Neuroinformatics Understanding the human nervous system is one of the greatest challenges of 21st century science. Its abilities dwarf any man-made system - perception, decision-making, cognition and reasoning. Neuroinformatics spans many scientific disciplines - from molecular biology to anthropology.

35 Neuroinformatics Main research question: How does the brain and nervous system work? Main research activity: gathering neuroscience data, knowledge and developing computational models and analytical tools for the integration and analysis of experimental data, leading to improvements in existing theories about the nervous system and brain. Results for the clinic: Neuroinformatics provides tools, databases, models, networks technologies and models for clinical and research purposes in the neuroscience community and related fields.

36

37 Bioinformatics @ VU Qualitative challenges: High quality alignments (alternative splicing) In-silico structural genomics In-silico functional genomics: reliable annotation Protein-protein interactions. Metabolic pathways: assign the edges in the networks Fluxomics, quantitative description (through time) of fluxes through metabolic networks New algorithms

38 Bioinformatics @ VU Quantitative challenges: Understanding mRNA expression levels Understanding resulting protein activity Time dependencies Spatial constraints, compartmentalisation Are classical differential equation models adequate or do we need more individual modeling (e.g macromolecular crowding and activity at oligomolecular level)? Metabolic pathways: calculate fluxes through time Cell-cell communication: tissues, hormones, innervations Need ‘complete’ experimental data for good biological model system to learn to integrate

39 Bioinformatics @ VU VUMC Neuropeptide – addiction Oncogenes – disease patterns Reumatic diseases

40 Bioinformatics @ VU Quantitative challenges: How much protein produced from single gene? What time dependencies? What spatial constraints (compartmentalisation)? Metabolic pathways: assign the edges in the networks Cell-cell communication: find membrane associated components

41 Integrate data sources Integrate methods Integrate data through method integration (biological model) Integrative bioinformatics

42 Data Algorithm Biological Interpretation (model) tool Integrative bioinformatics Data integration

43 Data 1Data 2Data 3

44 Integrative bioinformatics Data integration Data 1 Algorithm 1 Biological Interpretation (model) 1 tool Algorithm 2 Biological Interpretation (model) 2 Algorithm 3 Biological Interpretation (model) 3 Data 2Data 3

45 “Nothing in Biology makes sense except in the light of evolution” (Theodosius Dobzhansky (1900-1975)) “Nothing in Bioinformatics makes sense except in the light of Biology” Bioinformatics


Download ppt "Master’s course Bioinformatics Data Analysis and Tools Lecture 1: Introduction Centre for Integrative Bioinformatics FEW/FALW C E N T."

Similar presentations


Ads by Google