Use of Machine Learning in Chemoinformatics Irene Kouskoumvekaki Associate Professor December 12th, 2012 Biological Sequence Analysis course.

1 Use of Machine Learning in Chemoinformatics. Irene Kouskoumvekaki, Associate Professor. December 12th, 2012. Biological Sequence Analysis course.

2 CBS, Department of Systems Biology. Major Aspects of Chemoinformatics. Databases: development of databases for storage and retrieval of small-molecule structures and their properties. Machine learning: training of Decision Trees, Neural Networks, Self Organizing Maps, etc. on molecular data. Predictions: molecular properties relevant to drugs, virtual screening of chemical libraries, systems chemical biology networks…

3 Machine Learning

18 Machine learning classifiers

19 Clustering: Self Organizing Maps. Distinguishing molecules of different biological activities and finding a new lead structure.
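The slide's figures were not preserved, so the following is only a minimal sketch of how a self-organizing map can separate molecules by activity. It assumes a 1-D map, two made-up descriptors per molecule, and hypothetical "active" and "inactive" groups; the function names, parameters, and data are illustrative and not taken from the presentation.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))

def train_som(data, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 1-D self-organizing map (n_nodes >= 2); returns the node weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(x[k] for x in data) for k in range(dim)]
    hi = [max(x[k] for x in data) for k in range(dim)]
    # Linear initialisation: spread the nodes evenly between the data extremes.
    weights = [[lo[k] + (hi[k] - lo[k]) * i / (n_nodes - 1) for k in range(dim)]
               for i in range(n_nodes)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                    # learning rate shrinks over time
        sigma = max(sigma0 * decay, 0.3)    # neighbourhood radius shrinks too
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            winner = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Nodes close to the winner on the map are pulled more strongly.
                h = math.exp(-((i - winner) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Hypothetical, already-scaled descriptor vectors for two activity classes.
actives = [(0.10, 0.20), (0.15, 0.10), (0.05, 0.15), (0.12, 0.18)]
inactives = [(0.90, 0.80), (0.85, 0.95), (0.95, 0.90), (0.88, 0.85)]

som = train_som(actives + inactives)
print("active nodes:  ", sorted({best_matching_unit(som, x) for x in actives}))
print("inactive nodes:", sorted({best_matching_unit(som, x) for x in inactives}))
```

Molecules that land on the same or neighbouring nodes have similar descriptors, which is the basis for looking for new lead structures among the neighbours of known actives.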

23 Machine Learning

24 Machine Learning: Molecular Structures, Properties, Molecular Descriptors, QSAR, Virtual Screening, Clustering, Classification

25 Different descriptor types. Simple feature counts (such as number of rotatable bonds or molecular weight); fragmental descriptors, which indicate the presence or absence (or count) of groups of atoms and substructures; physicochemical properties (density, solubility, van der Waals volume); topological indices (size, branching, overall shape).
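As an illustration of the simplest descriptor type, the sketch below derives a few counts and the molecular weight from a molecular formula. It assumes a plain formula string without brackets or charges; the mass table is deliberately small, and the function names are hypothetical. Descriptors such as rotatable bonds or topological indices need the full structure and a cheminformatics toolkit, which is not shown here.

```python
import re

# Average atomic masses in g/mol for a few common elements (extend as needed).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
               "F": 18.998, "S": 32.06, "Cl": 35.45, "Br": 79.904}

def atom_counts(formula):
    """Count atoms per element in a simple formula such as 'C9H8O4'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def simple_descriptors(formula):
    """Return a small dictionary of count-based descriptors."""
    counts = atom_counts(formula)
    return {
        "n_atoms": sum(counts.values()),
        "n_heavy_atoms": sum(n for el, n in counts.items() if el != "H"),
        "molecular_weight": sum(ATOMIC_MASS[el] * n for el, n in counts.items()),
    }

# Aspirin has the formula C9H8O4.
print(simple_descriptors("C9H8O4"))
```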

27 Quantitative Structure-Activity Relationships (QSAR). In QSAR models, structural parameters (descriptors) are fitted to experimental data for biological activity (or another given property, P).
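The fitting step can be sketched with the simplest possible model: a straight line relating one descriptor to the property P, fitted by ordinary least squares. This is an assumption made for illustration; the slide does not say which model form was used, and the descriptor values and measurements below are invented.

```python
def fit_line(x, y):
    """Ordinary least squares for P = slope * descriptor + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, descriptor_value):
    """Predict the property for a new molecule from its descriptor value."""
    slope, intercept = model
    return slope * descriptor_value + intercept

# Hypothetical training set: one descriptor value and one measured property per molecule.
descriptor = [1.0, 2.0, 3.0, 4.0]
measured_p = [2.4, 4.6, 6.4, 8.6]

model = fit_line(descriptor, measured_p)
print("slope, intercept:", model)
print("predicted P for descriptor 5.0:", predict(model, 5.0))
```

Real QSAR models usually combine many descriptors and are validated on molecules held out from the fit; the single-descriptor case only shows the principle.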

28 Prediction of Solubility, ADME & Toxicity

29 hERG Classification with SVM

30 Evaluation of the data set

31 Performance of SVM

32 Performance of SVM
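The performance figures from these slides were not preserved, so which measures were reported is unknown. As a sketch under that caveat, the measures commonly used for a two-class model (here assumed to be 1 = positive class, 0 = negative class) can be computed from true and predicted labels as follows; the labels are invented.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # By convention the MCC is reported as 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical experimental labels and classifier predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(classification_metrics(y_true, y_pred))
```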

33 Virtual screening: computational techniques for the rapid assessment of large libraries of chemical structures, in order to guide the selection of likely drug candidates.

34 Similarity Search. Similar Property Principle: molecules having similar structures and properties are expected to exhibit similar biological activity. Thus, molecules that are located close together in chemical space are often considered to be functionally related.

35 Fingerprint-based Similarity Search
– Widely used similarity search tool
– Consists of descriptors encoded as bit strings
– Bit strings of the query and the database are compared using a similarity metric such as the Tanimoto coefficient
MACCS fingerprints: 166 structural keys that answer questions of the type "Is there a ring of size 4?" or "Is at least one F, Br, Cl, or I present?", where the answer is either TRUE (1) or FALSE (0).
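The idea of structural keys can be sketched with a toy fingerprint of four yes/no questions. This is not the MACCS definition: real MACCS keys are computed from the full structure by a cheminformatics toolkit, whereas this sketch assumes each molecule is already summarised as a set of element symbols and a set of ring sizes. The key list, the dictionary layout, and the example molecules are hypothetical.

```python
HALOGENS = {"F", "Cl", "Br", "I"}

# Each key is a (question, test) pair; the test returns True or False.
TOY_KEYS = [
    ("Is there a ring of size 4?", lambda mol: 4 in mol["ring_sizes"]),
    ("Is at least one F, Br, Cl, or I present?",
     lambda mol: bool(mol["elements"] & HALOGENS)),
    ("Is nitrogen present?", lambda mol: "N" in mol["elements"]),
    ("Is oxygen present?", lambda mol: "O" in mol["elements"]),
]

def toy_fingerprint(mol):
    """Encode the answers to the toy keys as a bit string (TRUE = 1, FALSE = 0)."""
    return "".join("1" if test(mol) else "0" for _, test in TOY_KEYS)

# Hand-written summaries of two molecules.
chlorobenzene = {"elements": {"C", "H", "Cl"}, "ring_sizes": {6}}
cyclobutanol = {"elements": {"C", "H", "O"}, "ring_sizes": {4}}

print(toy_fingerprint(chlorobenzene))  # prints 0100
print(toy_fingerprint(cyclobutanol))   # prints 1001
```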

36 Tanimoto Similarity: … or 90% similarity
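For two bit-string fingerprints, the Tanimoto coefficient is commonly written as c / (a + b - c), where a and b are the numbers of bits set in each fingerprint and c is the number of bits set in both. The sketch below implements that formula; the two fingerprints are invented (they are not the example from the slide), and returning 0.0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two equal-length bit strings made of '0' and '1'."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = fp_a.count("1")
    b = fp_b.count("1")
    c = sum(1 for x, y in zip(fp_a, fp_b) if x == "1" and y == "1")
    union = a + b - c
    # Assumed convention: two fingerprints with no bits set have similarity 0.
    return c / union if union else 0.0

# Hypothetical 16-bit fingerprints: 10 bits set, 9 bits set, 9 bits in common.
query = "1111111111000000"
hit = "1111111110000000"
print(tanimoto(query, hit))  # prints 0.9
```

In a similarity search, this value would be computed between the query and every database fingerprint, and the database molecules ranked by it.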

37 Similarity Search

38 Questions?

39 Molecular editors and viewers: http://www.chemaxon.com/products/marvin/

40 Molecular editors and viewers: http://jmol.sourceforge.net/

41 Format conversion: http://cactus.nci.nih.gov/translate/

