Bioinformatics and Natural Computing DISCo Departmental Workshop 2010-06-03.

1 Bioinformatics and Natural Computing DISCo Departmental Workshop 2010-06-03

2 Outline
BIMIB: BIoinformatics MIlano Bicocca (http://bimib.disco.unimib.it)
Research areas and new directions
People
Cooperations
2010-06-03 DISCo UNIMIB Departmental Workshop

3 Research areas and directions
Sequence Analysis
– Motif Finding, SNP classification, Haplotyping, Alternative Splicing Prediction
Statistical Analysis of Biological Experiments
– Association Studies, Microarray Analysis, Clustering, Redescriptions
Algorithmics
– Approximation Algorithms for Combinatorial Problems in Computational Biology (MAST, LCS, Fingerprint clustering …)
Biomedical Ontologies
– Collaborative Association Studies, Phenotype Ontology Development
Natural Computing
– Theory and applications of Membrane Systems
– Splicing Systems and Formal Languages
– DNA Word Design
– Evolutionary computing
Systems Biology
– Models of biological systems
– Stochastic Simulation of Biochemical Processes
– Data Mining

4 Natural Computing

5 Natural computing
The work conducted in this area concerns the study of models of computation that are inspired by nature.
The most important research lines that the BIMIB group is pursuing center on:
– DNA computing
– Membrane systems
– Evolutionary and Genetic computing

6 Natural Computing: basic research
Much of the research done in these areas can be characterized as theoretical computer science, where questions of decidability, computational complexity and expressive power are paramount.
In particular:
– Relations with languages in the usual Chomsky hierarchy
– Comparison with other computational models
– Complexity aspects related to time and space resources
– Application of the model to the solution of computationally hard problems
– Fitness-driven Importance Sampling techniques for evolutionary algorithms
– Operators-Driven Distance Measures

7 Natural Computing: applications
Some applications include:
– Description of cellular phenomena or cellular structures (e.g., mechanosensitive channels, sodium-potassium pump, …)
– Analysis of the behaviour of complex systems by means of stochastic models
– Design of software simulators to return meaningful information to biologists
– Automatic assessment of systems biology parameters
– Automatic mining of microarray datasets

8 Bioinformatics

9 Bioinformatics: sequence analysis applications
One of the major applications of informatics to molecular biology is the use of string analysis algorithms to study nucleic acid and protein sequences.

10 Bioinformatics: sequence analysis applications
Alternative splicing prediction
– Alternative splicing (AS) is considered one of the main mechanisms able to explain the huge gap between the number of predicted genes and the high complexity of the human proteome.
– The main goal is the development of fast and reliable computational tools for analyzing and predicting AS from Expressed Sequence Tags (ESTs) and other genomic data.
– ASPIC (Alternative Splicing PredICtion) tool

11 Bioinformatics: sequence analysis applications
Approximate Pattern Discovery
– Given a set of nucleotide or protein sequences, find all the motifs or conserved patterns, i.e.:
  – All patterns that occur (with a maximum allowed number of mutations, insertions or deletions) in every sequence of the set
  – All patterns that occur (as above) in a “surprisingly” high number of sequences
  – The pattern “closest” to the sequences under some distance measure
– Pattern discovery: the WeederWeb system
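The first formulation on this slide (patterns that occur, up to a bounded number of mutations, in every sequence of the set) can be illustrated with a brute-force sketch. This is not the algorithm used by WeederWeb; it is a minimal illustration that assumes substitutions only (no insertions or deletions), a DNA alphabet, and a fixed pattern length. The function names and the toy sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs(pattern, sequence, max_mismatches):
    """True if some window of `sequence` is within `max_mismatches` of `pattern`."""
    k = len(pattern)
    return any(
        hamming(pattern, sequence[i:i + k]) <= max_mismatches
        for i in range(len(sequence) - k + 1)
    )


def common_motifs(sequences, k, max_mismatches, alphabet="ACGT"):
    """All length-k patterns that occur (with mismatches) in every sequence.

    Exhaustive enumeration of the alphabet**k candidates: fine for a toy
    example, exponential in k for real data.
    """
    candidates = ("".join(p) for p in product(alphabet, repeat=k))
    return [
        c for c in candidates
        if all(occurs(c, s, max_mismatches) for s in sequences)
    ]


# Toy input (made up for illustration).
SEQS = ["ACGTACGT", "TTACGAAC", "GGACGTTT"]
exact = common_motifs(SEQS, k=3, max_mismatches=0)
relaxed = common_motifs(SEQS, k=3, max_mismatches=1)
```

Allowing one mismatch can only enlarge the result set, which is why practical tools also need a statistical notion of a "surprising" number of occurrences.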

12 Bioinformatics: sequence analysis applications
Phylogenetic Reconstruction and Comparison
– Computational complexity and algorithmic solution of optimization problems derived from specific instances of the more general problem of comparing phylogenies (or evolutionary networks) in order to combine them into a single representation (i.e., an evolutionary tree or network).
– A basic problem we investigate in comparative phylogenetics is the reconciliation (or inference) of a species tree from gene trees.

13 Bioinformatics: sequence analysis applications
Haplotype Inference (HI) and Genetic Variation Analysis
– Design of, and experimentation with, algorithms for solving combinatorial problems related to haplotype inference and genetic variation analysis.
– Specific computational problems of interest are:
  – inferring the complete information on haplotypes from (incomplete or partial) haplotypes or genotypes
  – efficient reconstruction of the perfect phylogeny describing the evolutionary history of Single Nucleotide Polymorphism (SNP) data in the presence of recurrent mutations
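As background for the perfect phylogeny problem, the classical four-gamete test decides whether a binary SNP matrix admits an (unrooted) perfect phylogeny when there are no recurrent mutations. The sketch below covers only that simpler setting; the variant with recurrent mutations named on the slide is harder and is not addressed here. The function name and the input encoding (rows are haplotypes, columns are SNP sites, entries are 0/1) are assumptions made for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on a binary matrix.

    Rows are haplotypes, columns are SNP sites. Returns False as soon as
    some pair of sites exhibits all four combinations 00, 01, 10 and 11,
    and True if no pair does.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(row[i], row[j]) for row in haplotypes}
        if len(gametes) == 4:
            return False
    return True


# Toy matrices (made up for illustration).
compatible = [[0, 0, 1], [0, 1, 1], [1, 0, 0]]
conflicting = [[0, 0], [0, 1], [1, 0], [1, 1]]
```

The test is quadratic in the number of sites and linear in the number of haplotypes, which makes it a cheap first check before attempting a tree reconstruction.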

14 Statistical Data Analysis of High Throughput Data

15 Statistical Data Analysis of Biological Experiments
The amount of data generated by high-throughput (non-sequencing) biotechnology apparatuses is huge:
– Microarray
– microRNA
– Proteomic machinery (cf. mass spectrometry)

16 Statistical Data Analysis of Biological Experiments
Statistical methods of various kinds are necessary to validate hypotheses and perform data mining operations.
The research pursued by the group in this area has concentrated on:
– Time course data analysis with kernel methods and evaluation of ontological “enrichments”
– Integration of multiple data sources for mass-spectrometry data with mutual information scoring
– Application of Evolutionary and Genetic computing to the assessment of features (biological markers and combinations of biological markers) in gene assays
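The slide does not say how mutual information is used for scoring, so the following is only a generic sketch of the quantity itself: the empirical mutual information, in bits, between two discrete variables observed as paired samples. The function name and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two paired discrete samples.

    I(X;Y) = sum over (x, y) of p(x, y) * log2(p(x, y) / (p(x) * p(y))),
    with all probabilities estimated by relative frequencies.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


# Toy data (made up): a variable shares 1 bit with itself when balanced,
# and 0 bits with a variable that is empirically independent of it.
mi_identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Continuous measurements such as spectral intensities would first have to be discretized (binned) before this estimator applies; the binning scheme is a modeling choice not specified in the source.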

17 Biomedical Ontologies Engineering

18 Biomedical Ontologies
The need for common vocabularies and “ontologies” used to label and/or model data has been recognized as a cornerstone of community research by biologists and physicians.
The BIMIB group has worked on using ontologies for two applications:
– Enrichment studies (cf. statistical analysis)
– Definition of new ontologies for clinical applications and genotype-phenotype associations

19 Biomedical Ontologies NeuroWEB
The NeuroWEB project was concluded in 2009.
– The aim of the NEUROWEB project was to support association studies in the field of neurovascular medicine, with a special commitment to genotype-phenotype relations.
– In particular, in the NEUROWEB project the phenotype is formulated on the basis of the patients’ clinical data, eventually leading to a comprehensive assessment of the patients’ pathological state.

20 Biomedical Ontologies NeuroWEB
Three main ontological layers (10 Top Phenotypes, ~200 Low Phenotypes, ~300 Core Data Set elements), organized in taxonomies
A set of ontological relations (17 object properties) to:
– Connect the leaves of the three layers
– Enable complex phenotype construction
Accessory layers (anatomical parts, quantitative/qualitative attributes, …)

21 Biomedical Ontologies NeuroWEB
[Figure: diagram with the labels CDS, TOP PHENOTYPE, LOW PHENOTYPE and Onto Relations]

22 Systems Biology Simulation and Analysis

23 Simulation of biological systems
Systems biology is the study of the emergent properties of a biological system once it is modeled (and simulated) as a set of interacting parts.
Different kinds of simulations are possible:
– Deterministic (differential equations)
– Stochastic (Gillespie’s algorithm, a Monte Carlo method)
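To make the stochastic option concrete, here is a minimal sketch of Gillespie's direct method. The reversible reaction A <-> B, its rate constants, the function names and the data layout (a dictionary of molecule counts plus a list of propensity/change pairs) are all assumptions chosen for illustration; they are not taken from the group's simulator.

```python
import random


def gillespie(state, reactions, t_end, rng):
    """Gillespie's direct method.

    state:     dict mapping species name -> molecule count
    reactions: list of (propensity_fn, change) pairs, where propensity_fn
               maps a state to a non-negative rate and change maps species
               names to integer count changes
    Returns the trajectory as a list of (time, state) snapshots.
    """
    t = 0.0
    state = dict(state)
    trajectory = [(t, dict(state))]
    while True:
        propensities = [rate(state) for rate, _ in reactions]
        total = sum(propensities)
        if total <= 0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for a, (_, change) in zip(propensities, reactions):
            cumulative += a
            if threshold < cumulative:
                break
        for species, delta in change.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical model: A -> B at rate 1.0 * A, and B -> A at rate 0.5 * B.
REACTIONS = [
    (lambda s: 1.0 * s["A"], {"A": -1, "B": +1}),
    (lambda s: 0.5 * s["B"], {"A": +1, "B": -1}),
]
trajectory = gillespie({"A": 100, "B": 0}, REACTIONS, t_end=5.0,
                       rng=random.Random(0))
```

Each run yields one sample path; estimating distributions or steady states requires many independent runs, which is where cluster execution becomes useful.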

24 Stochastic Simulation
The modeling formalism:
– Membrane (P) systems
The simulator:
– C language
– Desktop PC
– Clusters at DISCo and CINECA, with an MPI implementation
– Algorithm: modified Gillespie’s algorithm with τ-leaping
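The slide does not describe the group's modification of Gillespie's algorithm, and the actual simulator is written in C. As a rough idea of what τ-leaping means, the sketch below shows the plain fixed-step variant in Python: propensities are frozen over a step of length τ and each reaction fires a Poisson-distributed number of times. The step size, the toy reaction A <-> B, the function names, and the crude guard against negative counts are all assumptions made for illustration.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample via Knuth's multiplication method (small means only)."""
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def tau_leap(state, reactions, t_end, tau, rng):
    """Fixed-step tau-leaping; returns the state after about t_end time units."""
    state = dict(state)
    for _ in range(int(round(t_end / tau))):
        # Propensities are evaluated once and held constant over the leap.
        propensities = [rate(state) for rate, _ in reactions]
        for a, (_, change) in zip(propensities, reactions):
            for _ in range(poisson(a * tau, rng)):
                # Crude guard: skip firings that would make a count negative.
                if all(state[s] + d >= 0 for s, d in change.items()):
                    for s, d in change.items():
                        state[s] += d
    return state


# Hypothetical model: A -> B at rate 1.0 * A, and B -> A at rate 0.5 * B.
REACTIONS = [
    (lambda s: 1.0 * s["A"], {"A": -1, "B": +1}),
    (lambda s: 0.5 * s["B"], {"A": +1, "B": -1}),
]
final_state = tau_leap({"A": 100, "B": 0}, REACTIONS, t_end=5.0, tau=0.01,
                       rng=random.Random(0))
```

Compared with the exact method, many reaction events are processed per step, trading accuracy for speed; practical implementations choose τ adaptively rather than fixing it.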

25 Studying stochasticity in biological systems
Two kinds of noise:
– intrinsic noise, due to the inherent nature of the biochemical interactions
– extrinsic noise, due to the external environmental conditions
Complex systems such as biological ones are non-linear and often exhibit multiple steady states, bifurcations or chaotic behavior.

26 Stochastic simulations: applications
Molecular and cellular scale:
– transport proteins: Na+/K+ pump, Ca2+ channels, mechanosensitive channels
– chemical reactions: Belousov-Zhabotinsky, Michaelis-Menten
– cellular signaling pathways: EGFR, Ras/cAMP/PKA
– bacterial colonies: Vibrio fischeri, Pseudomonas aeruginosa

27 Biological systems simulations: Colorectal Crypts
Three-dimensional schematic of a crypt in the mouse small intestine. The positions of the individual cells show how things might look in a typical crypt. The Paneth cells tend toward the bottom, where they contribute to innate immunity by responding to bacterial infection (Ayabe et al. 2000). The numbers on the cells show the transit cell generation i, as in the Ti of Figure 12.6. The stem cells vary in actual cellular position in the range 3–7, but on average appear to be around cell position 4 when numbered from the bottom. The figure only shows the bottom 7 cell positions of the approximately 15 positions. CSC abbreviates "clonogenic stem cell" (see Figure 12.6). Redrawn from Marshman et al. (2002). Copied from Frank’s online book on NCBI.

28 People BIMIB DISCo
Marco Antoniotti, Paola Bonizzoni, Claudio Ferretti, Alberto Leporati, Giancarlo Mauri, Raffaella Rizzi, Leonardo Vanneschi, Claudio Zandron, Italo Zoppis, Roslyn Sagaya Mary Antonath, Stefano Beretta, Mauro Castelli, Paolo Cazzaniga, Gianluca Colombo, Antonella Farinaccio, Luca Manzoni, Dario Pescini, Yuri Pirola, Antonio Enrico Porreca, Andrea Valsecchi

29 Other People
Francesco Archetti, DISCo
Enza Messina, DISCo
Enzo Martegani, BtBs
Marco Vanoni, BtBs
Riccardo Dondi, Un. Bergamo
Gianluca Della Vedova, Statistica, UNIMIB
Daniela Besozzi, Un. Milano
Giulio Pavesi, Un. Milano
Graziano Pesole, Un. Bari
Mario Giacobini, Un. Torino
Paolo Provero, Un. Torino
Manuela Gariboldi, IFOM-IEO
James Reid, IFOM-IEO
Luciano Milanesi, ITB CNR
Marco Pierotti, Istituto Nazionale dei Tumori
Giovanna Castoldi, Medicina, UNIMIB
Fulvio Magni, Medicina, UNIMIB

30 Other People International
Daniele Merico – Un. Toronto, Toronto, Canada
Gary Bader – Un. Toronto, Toronto, Canada
Bud Mishra – NYU, New York, USA
Naren Ramakrishnan – Virginia Tech, Blacksburg, VA, USA
Victor Moreno – ICOncologia, Barcelona, Spain
Miguel-Angel Pujana – ICOncologia, Barcelona, Spain
Laura Slaughter – National Technical University of Norway (NTNU), Norway
Aristotelis Chatzioannou – EIE, Athens, Greece
Viktor Malyshkyn – Center for Supercomputing, Russian Academy of Sciences, Novosibirsk, Russia

31 Conferences and Workshops
Signs Symptoms and Findings Workshop 2009, September 2009, Milan, Italy

32 International cooperation
BIMIB DISCo is the institutional contact point for all initiatives concerning the EC Virtual Physiological Human Network of Excellence (www.vph-noe.eu)

33 Funding
Ongoing
– FAR
– EnviGP: Improving Genetic Programming for the Environment and Other Applications, Programa Operacional Factores de Competitividade, Fundação para a Ciência e a Tecnologia (FCT), Portugal (PTDC/EIA-CCO/103363/2008)
– ProteomeNet: Rete Nazionale per lo studio della proteomica umana, FIRB
Pending
– EU FP7 ICT Virtual Physiological Human: CRControl (coordinator), BioBridge (partner)
– Regione Lombardia, Programma ASTIL
– Regione Lombardia, Programma Quadro/Università
– PRIN 2009

34 Publications
All publications authored by BIMIB affiliates and collaborators are listed on the group web site and on the digidisco platform:
http://bimib.disco.unimib.it/index.php/Special:Publications/en

35 THANK YOU

