1 Intelligent Database Systems Lab N.Y.U.S.T. I. M. An integrated scheme for feature selection and parameter setting in the support vector machine modeling and its application to the prediction of pharmacokinetic properties of drugs. Presenter: Wu, Jia-Hao. Authors: Sheng-Yong Yang, Qi Huang, Lin-Li Li, Chang-Ying Ma, Hui Zhang, Ru Bai, Qi-Zhi Teng, Ming-Li Xiang, Yu-Quan Wei. AIM (2009). National Yunlin University of Science and Technology

2 Outline: Motivation, Objective, Methodology, Experiments, Conclusion, Personal Comments

3 Motivation
Many drug candidates fail in clinical trials because of unfavorable absorption, distribution, metabolism, and excretion properties and toxicity (ADMET). Applying computational tools to predict the ADMET properties of chemical compounds in the early design stages is therefore important.

4 Objective
SVM has recently been evaluated for predicting the ADMET properties of new drugs, but two problems remain in SVM modeling:
- Feature selection.
- Parameter setting.
The authors propose an integrated scheme that addresses both problems:
- Feature selection: genetic algorithm (GA).
- Parameter setting: conjugate gradient (CG) method.
The GA-CG SVM scheme is compared with the results of previous SVM studies.
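To make the feature-selection half of the scheme concrete, here is a minimal sketch of a genetic algorithm over binary feature masks. It is not the authors' implementation: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), every parameter value, and the `toy_fitness` function are assumptions chosen for illustration. In an actual GA-SVM setup the fitness would be something like the cross-validated accuracy of an SVM trained on the selected descriptors.

```python
import random

def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (illustrative GA)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: the best mask always survives
        while len(next_population) < pop_size:
            # Tournament selection: the fitter of three random masks is a parent.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best

# Hypothetical fitness: features 0, 2 and 5 are "informative"; every selected
# feature costs a small penalty. A real fitness would evaluate an SVM instead.
INFORMATIVE = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(1 for i, g in enumerate(mask) if g and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

best_mask = genetic_feature_selection(8, toy_fitness)
```

Because of elitism, the best fitness never decreases from one generation to the next, which keeps the sketch easy to reason about.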

5 Objective (Cont.)
Use the GA-CG SVM scheme to build four classification models of ADMET-related properties:
- Identification of P-glycoprotein substrates and nonsubstrates (P-gp).
- Prediction of human intestinal absorption (HIA).
- Prediction of compounds inducing torsades de pointes (TdP).
- Prediction of blood-brain barrier penetration (BBB).

6 Methodology – SVM – linearly separable cases: optimization problem
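The slide's formula is not part of the transcript. For reference, the textbook hard-margin formulation for training data (x_i, y_i) with labels y_i ∈ {−1, +1} is sketched below; the symbols w, b, and n are the usual notation and may not match the slide's.

```latex
\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, n
```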

7 Methodology – SVM – linearly non-separable cases: optimization problem
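The slide's formula is not part of the transcript. The standard soft-margin formulation, given here as a reference sketch, introduces slack variables ξ_i and the penalty parameter C; the notation (w, b, ξ_i, n) is the conventional one and is an assumption about what the slide showed.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n
```

A larger C penalizes margin violations more heavily, which is why C is one of the two parameters that has to be tuned.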

8 Methodology – nonlinear → linear
Nonlinearly separable cases can be transformed into linear cases by projecting the input variables into a new high-dimensional feature space using a kernel function K(x_i, x_j). Common kernels:
- Polynomial.
- Radial basis function.
- Sigmoid.
The model then depends on two parameters: the penalty parameter C and the kernel parameter γ.
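The three kernel families can be written out as plain functions. This is a sketch using a common parameterization; the slide only names the kernels and γ, so the extra parameters `coef0` and `degree` and all default values are assumptions.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    # K(x, y) = (gamma * <x, y> + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.01):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, y, gamma=0.01, coef0=0.0):
    # K(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)
```

For the RBF kernel, γ controls how quickly similarity decays with distance: identical points always give K = 1, and larger γ makes the kernel more local.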

9 Methodology – Conjugate gradient
Two parameters must be set before training an SVM, and different pairs (C, γ) give different levels of accuracy. The problem reduces to finding the optimal pair (C, γ) that minimizes an objective function [equation not captured in the transcript].

10 Methodology – Conjugate gradient
The minimization problem can be solved with the conjugate gradient method. (example)
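The slide's worked example is not in the transcript. As a stand-in, here is a minimal sketch of the conjugate gradient method on a toy quadratic, f(x) = ½ xᵀAx − bᵀx with a symmetric positive-definite A. The matrix, vector, and starting point are made up for illustration; in the presented scheme the variables being optimized would be (C, γ) and the objective would be the SVM error measure, which is not quadratic and would need a nonlinear CG variant.

```python
def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=100):
    """Minimize f(x) = 0.5 * x^T A x - b^T x for symmetric positive-definite A."""
    def matvec(M, v):
        return [sum(m * vj for m, vj in zip(row, v)) for row in M]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    x = list(x0)
    # Residual r = b - A x is the negative gradient of f at x.
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    d = list(r)               # first search direction: steepest descent
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:      # squared residual norm small enough: converged
            break
        Ad = matvec(A, d)
        alpha = rs_old / dot(d, Ad)                  # exact line search along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                       # keeps directions A-conjugate
        d = [ri + beta * di for ri, di in zip(r, d)]
        rs_old = rs_new
    return x

# Toy problem (illustrative values only).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_min = conjugate_gradient(A, b, [2.0, 1.0])
```

For this 2-variable quadratic the minimizer is the solution of Ax = b, namely (1/11, 7/11), and CG reaches it in at most two iterations up to rounding.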

11 Methodology – Conjugate gradient (Cont.)

12 Methodology – feature selection
The following descriptors were removed:
- Descriptors with too many zero values (> 90%).
- Descriptors with very small standard deviation values (< 0.5%).
- Descriptors that are highly correlated with others (correlation coefficients > 90%).
The initial values of C and γ were set to 256 and 0.01, respectively.
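The three removal rules above can be sketched as a simple pre-filter. This is an illustration under assumptions, not the authors' code: the function and column names are hypothetical, the standard-deviation threshold is read as an absolute value of 0.005 (the slide does not say what the 0.5% is relative to), and of two highly correlated descriptors the one encountered first is kept.

```python
import statistics

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

def prefilter(columns, max_zero_frac=0.90, min_std=0.005, max_corr=0.90):
    """Return the names of descriptors that survive the three removal rules."""
    kept = []
    for name, values in columns.items():
        zero_frac = sum(1 for v in values if v == 0) / len(values)
        if zero_frac > max_zero_frac:
            continue  # rule 1: too many zero values
        if statistics.pstdev(values) < min_std:
            continue  # rule 2: nearly constant descriptor
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue  # rule 3: highly correlated with a descriptor already kept
        kept.append(name)
    return kept

# Hypothetical descriptor table: column name -> values over ten compounds.
d1 = [0.1, 0.4, 0.2, 0.9, 0.5, 0.7, 0.3, 0.8, 0.6, 1.0]
columns = {
    "d1": d1,
    "d_zero": [0.0] * 10,              # removed by rule 1
    "d_const": [0.5] * 10,             # removed by rule 2
    "d_dup": [2.0 * v for v in d1],    # removed by rule 3 (correlation 1 with d1)
    "d_ok": [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5, 0.5],
}
kept = prefilter(columns)
```

The correlation rule depends on column order, since a descriptor is only compared against those already kept; a different tie-breaking rule would be equally consistent with the slide.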

13 Experiments

14 Experiments

15 Conclusion
Parameter optimization (CG) further improved the prediction accuracy of the SVM model. The results indicate that considering both feature selection and parameter optimization in SVM modeling helps to develop better prediction models of ADMET-related properties.

16 Comments
- Advantage: a good integrated scheme for SVM.
- Drawback: the paper uses a number of specialized terms.
- Application: the prediction of pharmacokinetic properties of drugs.

