Hierarchical Classification of Calculated Molecular Descriptors


Prediction of Biological Partition Coefficients: Calculated Molecular Descriptors vs Experimentally Determined Properties

Denise Mills¹, Subhash C. Basak¹, Brian D. Gute¹, and Moiz M. Mumtaz²
¹ Natural Resources Research Institute, University of Minnesota Duluth, Duluth, MN, USA
² Computational Toxicology Laboratory, Division of Toxicology, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA

Abstract

Biological partition coefficients are routinely used as input parameters in physiologically based pharmacokinetic (PBPK) models, which are developed for the assessment of chemical toxicity. In this study, predictive quantitative structure-activity relationship (QSAR) models for rat and human biological partition coefficients, namely blood:air, fat:air, brain:air, liver:air, muscle:air, and kidney:air, were developed utilizing experimentally determined partition coefficients for 131 chemicals obtained from the literature and calculated molecular descriptors based solely on chemical structure. The descriptors were partitioned into four hierarchical classes: topostructural, topochemical, 3-dimensional, and ab initio quantum chemical. Three regression methodologies (ridge regression, principal components regression, and partial least squares) were used comparatively in the development of these models. In addition to the structure-based models, ordinary least squares regression was used to develop comparative models based on experimentally determined properties, including saline:air and olive oil:air partition coefficients. The results of the study indicate that many of the structure-based models are comparable or superior to their respective property-based models. This is an important result considering that structural descriptors can be calculated quickly and inexpensively for both existing chemicals and those not yet synthesized.
With respect to the structure-based models, it was also found that ridge regression outperformed principal components regression and partial least squares regression, and that generally the topochemical descriptors alone produced models of good predictive ability. The descriptors found to be most influential in biological partitioning of chemicals include those which encode information regarding hydrogen bonding, polarity, and molecular size and shape.

Hierarchical Classification of Calculated Molecular Descriptors

[Figure: representations of the example molecule 3-methylcyclohexanone, ordered by increasing complexity]
- Chemist's representation of structure
- Topostructural Model (TS): simple graph; purely structural representation
- Topochemical Model (TC): chemical graph; contains chemical and valency information
- Geometrical Model (3D): 3-dimensional; based on chemical graph
- Quantum Chemical Model (QC): HΨ = EΨ; based on quantum mechanics

[Results table not recovered. Table footnotes: * Topochemical descriptors; ** Saline:air + oil:air partition coefficients]

[Figure: Actual vs Predicted Values: Rat Brain:Air PC]

[Flow diagram, Current Study: Structural Descriptors (QSAR) or Experimental Properties (QPAR) -> Predicted Biological Partition Coefficients -> PBPK Model -> Predicted Chemical Toxicity]

Statistical Analysis

Regression Methodologies

Conventional ordinary least squares (OLS) regression was used to create the QPAR models. However, OLS is not appropriate when the number of descriptors exceeds the number of chemicals in the data set; therefore, ridge regression (RR) was used to develop the QSAR models.* RR is an alternative linear method that:
- Makes use of all descriptors, as opposed to subset regression
- Is useful when the number of descriptors exceeds the number of observations
- Is useful when the descriptors are highly intercorrelated

Cross Validation

The cross-validated R² is based on the leave-one-out approach. Unlike a fitted R², the cross-validated R² does not increase upon the addition of irrelevant descriptors, but rather tends to decrease, providing a reliable measure of model predictive ability.
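The ridge regression and leave-one-out ideas described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names (`solve`, `ridge_fit`, `r2_fitted`, `r2_loo`), the ridge constant `k`, and the synthetic data are all hypothetical, the poster does not say how the ridge constant was chosen, and descriptors are only centered here rather than standardized. The cross-validated R² is computed as 1 - PRESS / SS_total, a common definition that the poster does not spell out.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, k):
    """Return (intercept, coefficients) minimising RSS + k * sum(b_j ** 2)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations with the ridge penalty on the diagonal: (Xc'Xc + kI) b = Xc'yc.
    # For k > 0 this system is solvable even when p exceeds n.
    A = [[sum(r[i] * r[j] for r in Xc) + (k if i == j else 0.0) for j in range(p)]
         for i in range(p)]
    rhs = [sum(r[i] * v for r, v in zip(Xc, yc)) for i in range(p)]
    b = solve(A, rhs)
    intercept = y_mean - sum(bj * mj for bj, mj in zip(b, x_mean))
    return intercept, b


def predict(model, row):
    intercept, b = model
    return intercept + sum(bj * xj for bj, xj in zip(b, row))


def r2_fitted(X, y, k):
    """Ordinary fitted R2: the model is evaluated on the data it was fitted to."""
    model = ridge_fit(X, y, k)
    y_mean = sum(y) / len(y)
    rss = sum((yi - predict(model, xi)) ** 2 for xi, yi in zip(X, y))
    return 1.0 - rss / sum((yi - y_mean) ** 2 for yi in y)


def r2_loo(X, y, k):
    """Leave-one-out R2: 1 - PRESS / SS_total, refitting once per held-out chemical."""
    y_mean = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        model = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], k)
        press += (y[i] - predict(model, X[i])) ** 2
    return 1.0 - press / sum((yi - y_mean) ** 2 for yi in y)


if __name__ == "__main__":
    random.seed(0)
    n_chem, n_desc = 12, 20  # more descriptors than chemicals (synthetic data)
    X = [[random.gauss(0, 1) for _ in range(n_desc)] for _ in range(n_chem)]
    # Synthetic response driven by the first two descriptors plus noise
    y = [2.0 * row[0] - 1.0 * row[1] + random.gauss(0, 0.3) for row in X]
    print("fitted R2:", round(r2_fitted(X, y, k=1.0), 3))
    print("LOO R2   :", round(r2_loo(X, y, k=1.0), 3))
```

Because every held-out chemical is predicted by a model that never saw it, the leave-one-out value cannot be inflated simply by adding descriptors, which is the property the poster relies on. In practice a numerical library would replace the hand-written solver; the standard-library version is used here only to keep the sketch self-contained.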
Identification of Important Descriptors

Important descriptors were identified according to high |t| value, where t is the model coefficient divided by its standard error.

[Table: Experimentally Determined Biological Partition Coefficient Data Used in QSAR and QPAR Model Development (data not recovered). Source: C.J.W. Meulenberg, H.P.M. Vijverberg. Toxicol. Appl. Pharmacol. 165, 206 (2000).]

Conclusions

- The structure-based models are comparable to the property-based models with respect to predictive ability, an important result considering that structural descriptors can be calculated quickly and inexpensively for any chemical, real or hypothetical.
- The most predictive structure-based models were those based on the easily calculated topological indices. Addition of the 3-dimensional and/or quantum chemical descriptors did not result in model improvement.
- The structural descriptors most important in the prediction of biological partitioning are those which encode information regarding hydrogen bonding, polarity, and molecular size and shape.
- This and other studies have shown that a large pool of structural descriptors representing diverse molecular and submolecular features is capable of predicting a wide range of properties.

* Although PLS and PCR were also used to develop QSAR models, RR provided the best results. Only RR results are reported here.

Disclaimer: The opinions expressed are those of the authors and do not necessarily represent the opinion or policy of the agency (ATSDR).

