S TUDY O F T HE S ECOND V IRIAL C OEFFICIENTS : N EW C HALLENGE F OR QSPR Elena Mokshyna, Victor E. Kuz’min, Vadim I. Nedostup.

Slides:



Advertisements
Similar presentations
Data Mining For Credit Card Fraud: A Comparative Study
Advertisements

Correlation and Regression Analysis Many engineering design and analysis problems involve factors that are interrelated and dependent. E.g., (1) runoff.
32W MW (kg/mol) T (K)P (torr)P (Pa)A (m 2 ) /4N/VdN/dt (s -1 ) dN/dt (g/hr) E E dN/dt = A /4 N/V Density.
Physical Chemistry I (TKK-2246)
PTT 201/4 THERMODYNAMICS SEM 1 (2013/2014) 1. 2 Objectives Develop rules for determining nonreacting gas mixture properties from knowledge of mixture.
Support Vector Machines Elena Mokshyna Odessa 2014.
Pattern Recognition and Machine Learning: Kernel Methods.
CHEE 311Lecture 161 Correlation of Liquid Phase Data SVNA 12.1 Purpose of this lecture: To show how activity coefficients can be calculated by means of.
VOLUMETRIC PROPERTIES OF PURE FLUIDS
« هو اللطیف » By : Atefe Malek. khatabi Spring 90.
Ideal Gases and the Compressibility Factor 1 mol, 300K for various gases A high pressure, molecules are more influenced by repulsive forces. V real.
Support Vector Machine
Kernel methods - overview
A Parallel Statistical Learning Approach to the Prediction of Building Energy Consumption Based on Large Datasets Hai Xiang ZHAO, Phd candidate Frédéric.
Fig Graph of absolute pressure versus temperature for a constant-volume low- density gas thermometer.
4 Th Iranian chemometrics Workshop (ICW) Zanjan-2004.
Thermodynamic Property Relations
Quantitative Structure-Activity Relationships (QSAR) Comparative Molecular Field Analysis (CoMFA) Gijs Schaftenaar.
A Kernel-based Support Vector Machine by Peter Axelberg and Johan Löfhede.
Lasso regression. The Goals of Model Selection Model selection: Choosing the approximate best model by estimating the performance of various models Goals.
Ensemble Learning (2), Tree and Forest
Chemical Quantities.  Calculate the mass of compounds.  Calculate the molar volumes of gases.  Solve problems requiring conversions between mass, and.
Application and Efficacy of Random Forest Method for QSAR Analysis
1 The ideal gas EOS can be written in many different ways...
Real gases 1.4 Molecular interactions 1.5 The van de Waals equation 1.6 The principle of corresponding states Real gases do not obey the perfect gas law.
The Distributive Property Purpose: To use the distributive property Outcome: To simplify algebraic expressions.
Walter Hop Web-shop Order Prediction Using Machine Learning Master’s Thesis Computational Economics.
Predicting Highly Connected Proteins in PIN using QSAR Art Cherkasov Apr 14, 2011 UBC / VGH THE UNIVERSITY OF BRITISH COLUMBIA.
Combining Statistical and Physical Considerations in Deriving Targeted QSPRs Using Very Large Molecular Descriptor Databases Inga Paster and Mordechai.
Unit 9 Reg Chem Review. The Kinetic Molecular Theory states that gas particles are ____________ and are separated from one another by lots of _________.
Biostatistics Unit 9 – Regression and Correlation.
Identifying Applicability Domains for Quantitative Structure Property Relationships Mordechai Shacham a, Neima Brauner b Georgi St. Cholakov c and Roumiana.
when system is subdivided? Intensive variables: T, P Extensive variables: V, E, H, heat capacity C.
Chapter 8 Real Gases.
Mole-Volume Relationships- Gases The volumes of one mole of different solid and liquid substances are not the same (water vs. glucose). One mole of glucose.
ChemE 260 Equations of State April 4, 2005 Dr. William Baratuci Senior Lecturer Chemical Engineering Department University of Washington TCD 2: E & F CB.
GASES.
Advance Chemical Engineering Thermodynamics By Dr.Dang Saebea.
Evaluation of a Targeted-QSPR Based Pure Compound Property Prediction System Abstract The use of the DD – TQSPR (Dominant-Descriptor Targeted QSPR) method.
Classification Derek Hoiem CS 598, Spring 2009 Jan 27, 2009.
Analysis of sapflow measurements of Larch trees within the inner alpine dry Inn- valley PhD student: Marco Leo Advanced Statistics WS 2010/11.
Physical Property Modeling from Equations of State David Schaich Hope College REU 2003 Evaluation of Series Coefficients for the Peng-Robinson Equation.
Blind Information Processing: Microarray Data Hyejin Kim, Dukhee KimSeungjin Choi Department of Computer Science and Engineering, Department of Chemical.
Note: You must memorize STP and the gas laws!!. The Kinetic Molecular Theory states that gas particles are ____________ and are separated from one another.
컴퓨터 과학부 김명재.  Introduction  Data Preprocessing  Model Selection  Experiments.
Selection of Molecular Descriptor Subsets for Property Prediction Inga Paster a, Neima Brauner b and Mordechai Shacham a, a Department of Chemical Engineering,
MUNICIPAL SOLID WASTE BIODEGRADATION, COMPACTION AND WATER HOLDING CAPACITY. RAJENDRA.D. VAIDYA.
Find the total area of the rectangles using two different expressions 5 ft 10 ft+ 4 ft 5 ft.
INCLUDING UNCERTAINTY MODELS FOR SURROGATE BASED GLOBAL DESIGN OPTIMIZATION The EGO algorithm STRUCTURAL AND MULTIDISCIPLINARY OPTIMIZATION GROUP Thanks.
Chapter 5 Single Phase Systems
A) I. I. Mechnikov National University, Chemistry Department, Dvorianskaya 2, Odessa 65026, Ukraine, b) Department of Molecular.
A molecular descriptor database for homologous series of hydrocarbons ( n - alkanes, 1-alkenes and n-alkylbenzenes) and oxygen containing organic compounds.
Unit 3 Section : Regression  Regression – statistical method used to describe the nature of the relationship between variables.  Positive.
GASES. Gases  The physical state of gases is defined by several physical properties  Volume  Temperature  Amount (commonly expressed as number of.
Tree and Forest Classification and Regression Tree Bagging of trees Boosting trees Random Forest.
Notes Over 1.2 Express the power in words. Then write the meaning. Word Meaning.
Energy. Energy is classified: Kinetic energy – energy of motion Potential energy – energy of position Both energies can be transferred from one object.
Perturbation Theory. Perturbation theory for the square-well potential Hard spheres is one of the few cases in which we can find the radial distribution.
공정 열역학 Chapter 3. Volumetric Properties of Pure Fluids
CH 13 The Chemistry of Gases Gases are elements (He), elemental substances (O 2 ), or compounds (CO 2 ) in which the particles of the substance are widely.
Chapter 2 The First Law Unit 5 state function and exact differentials Spring 2009.
Parul Institute of Engineering & Technology
Fundamentals of Chemical Engineering - II
Molar Volume.
دانشگاه شهیدرجایی تهران
1-7: The Distributive Property
تعهدات مشتری در کنوانسیون بیع بین المللی
Notes Over 1.2.
C2H6 CH3 B. Empirical Formula
Development of a Novel Equation of State (EoS) Model
Presentation transcript:

S TUDY O F T HE S ECOND V IRIAL C OEFFICIENTS : N EW C HALLENGE F OR QSPR Elena Mokshyna, Victor E. Kuz’min, Vadim I. Nedostup

W HY C HALLENGE ? The compressibility factor is expressed as a series expansion in either density (reciprocal molar volume) or pressure: Main purposes: Development of approach to QSPR of T-dependent properties Calibration of the descriptors Prediction for new complex organic compounds

E XPERIMENTAL D ATA  Number of compounds: 262  Number of points: 4787  Range of virial coefficients: – 391 cm 3 /mol  Range of temperatures: 110 – 773 K

D ESCRIPTORS & M ODELLING T ECHNIQUES  SiRMS descriptors:  Temperature as a single descriptor: B = f(T)  Two-layer QSPR model: a = f(descriptors) b = f(descriptors) B = f(a, b)

S TATISTICAL A NALYSIS  Various statistical methods: MLR (Multi-Linear Regression) PLS (Projection on Latent Structures) RF (Random Forest) SVM (Support Vector Machines) with radial basis function kernel  Rigorous 3x5-fold stratified external cross-validation Training set Test set ! Data on virial coefficient of compound under all the temperatures are put in the test set

R ESULTS for B = f(T) R 2 ws = 0.53 R 2 ts = 0.19 R 2 ws = 0.71 R 2 ts = 0.45 R 2 ws = 0.87 R 2 ts = 0.68 R 2 ws = 0.94 R 2 ts = 0.71

R ESULTS for B = f (a,b) a = f(descriptors), b = f(descriptors ) R 2 ws = 0.88 R 2 ts = 0.51 R 2 ws = 0.90 R 2 ts = 0.72 R 2 ws = 0.98 R 2 ts = 0.85 R 2 ws = 0.95 R 2 ts = 0.75

E XPERIMENTAL E RRORS VS. E RRORS OF M ODELS

Relative Variable Influence

Influential fragments * * * * * Some examples from the generated fragments library :

So…. M ISSION I S P OSSIBLE, B UT C HALLENGE I S N OT C OMPLETED !

Thank you for the attention!