Daylight and Discovery


Daylight and Discovery How do I impress the boss when I get back? 11/18/2018

What is Discovery? A constant fight against the hedgehogs!!

What have I learned this week? Above all, you have learned new languages that allow you to communicate chemical concepts to, and between, machines. These languages also allow you to communicate these concepts via machines to your colleagues. You have also learned about other descriptions of a molecular structure, such as fingerprints.

Language recap: SMILES, SMARTS, SMIRKS (fingerprints)

SMILES
SMILES contains the same information as might be found in an extended connection table. The primary reason SMILES is more useful than a connection table is that it is a linguistic construct rather than a computer data structure. SMILES is a true language, albeit with a simple vocabulary (atom and bond symbols) and only a few grammar rules.
SMILES can be canonicalised, i.e. there is a unique, universal “name” for a structure.
SMILES representations of structure can in turn be used as “words” in the vocabulary of other languages designed for storage and retrieval of chemical information, e.g. HTML, XML, or query languages such as SQL.

SMILES syntax
[atom]bond[atom] etc.
atom : '[' <mass> symbol <chiral> <hcount> <sign<charge>> <':'class> ']' ;
bond : <empty> | '-' | '=' | '#' | ':' | '.' ;
Common elements in the organic subset (B, C, N, O, P, S, F, Cl, Br, I), in their lowest common valence state(s), can be written without brackets.
If bonds are omitted, they default to single or aromatic, as appropriate, for juxtaposed atoms.
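The bracket-atom rule above can be sketched as a regular expression. This is a minimal illustration, not the Daylight parser: the names `BRACKET_ATOM` and `parse_bracket_atom` are hypothetical, only the simple `@` and `@@` chirality marks are handled, and the element symbol is not checked against the periodic table.

```python
import re

# Simplified pattern for one bracket atom, following the slide's grammar:
# '[' <mass> symbol <chiral> <hcount> <sign<charge>> <':'class> ']'
BRACKET_ATOM = re.compile(
    r"\[(?P<mass>\d+)?"                          # optional isotope mass
    r"(?P<symbol>se|as|[cnops]|[A-Z][a-z]?|\*)"  # aromatic, element, or wildcard
    r"(?P<chiral>@{1,2})?"                       # optional @ or @@
    r"(?P<hcount>H\d*)?"                         # optional attached hydrogens
    r"(?P<charge>[+-]\d+|\++|-+)?"               # +, ++, -, +2, -1, ...
    r"(?::(?P<cls>\d+))?\]"                      # optional atom class
)


def parse_bracket_atom(text):
    """Split one bracket atom such as '[13CH4]' into its fields."""
    m = BRACKET_ATOM.fullmatch(text)
    if m is None:
        raise ValueError(f"not a bracket atom this sketch understands: {text!r}")

    h = m.group("hcount")
    hcount = 0 if h is None else (1 if h == "H" else int(h[1:]))

    c = m.group("charge")
    if c is None:
        charge = 0
    elif c[1:].isdigit():          # forms like +2 or -1
        charge = int(c)
    else:                          # forms like + or --
        charge = len(c) if c[0] == "+" else -len(c)

    mass = m.group("mass")
    cls = m.group("cls")
    return {
        "mass": int(mass) if mass else None,
        "symbol": m.group("symbol"),
        "chiral": m.group("chiral"),
        "hcount": hcount,
        "charge": charge,
        "cls": int(cls) if cls else None,
    }


if __name__ == "__main__":
    for atom in ("[13CH4]", "[NH4+]", "[C@@H]", "[Fe+2]", "[O-]"):
        print(atom, parse_bracket_atom(atom))
```

Each optional field in the grammar becomes an optional named group, so the regular expression reads in the same order as the slide's rule.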

Example SMILES

SMARTS
In the SMILES language, there are two fundamental types of symbols: atoms and bonds. Using these SMILES symbols, one can specify a molecule's graph (its "nodes" and "edges") and assign "labels" to the components of the graph (that is, say what type of atom each node represents, and what type of bond each edge represents).
The same is true in SMARTS: one uses atomic and bond symbols to specify a graph. However, in SMARTS the labels for the graph's nodes and edges (its "atoms" and "bonds") are extended to include "logical operators" and special atomic and bond symbols; these allow SMARTS atoms and bonds to be more general. For example, the SMARTS atomic symbol [C,N] is an atom that can be aliphatic C or aliphatic N; the SMARTS bond symbol "~" (tilde) matches any bond.

Example SMARTS

Useful SMARTS
Heavy atom: [!$([#6,#7,#8,#9,#15,#16,#17,#35,#53])]
Rotatable bonds: [!$(*#*)&!D1]-&!@[!$(*#*)&!D1]
Secondary amides: [N&H1&D2]-&!@[#6&X3]
H-donors: [!#6;!H0]
H-acceptors: [$([!#6;+0]);!$([F,Cl,Br,I]);!$([o,s,nX3]);!$([Nv5,Pv5,Sv4,Sv6])]
Isolating carbons: [#6;!$(C(F)(F)F);!$(c(:[!c]):[!c]);!$([#6]=,#[!#6]);!$([#6;!+0])]
Stereo atoms: [$([X4&!v6&!v5;H0,H1]),$([SX3]([#6])([#6])~O)]
Stereo bonds: [CX3;!H2]=[CX3;!H2]
Stereo allenes: [CX3;H0]=C=[CX3;H0,H1]

Rotatable bonds: [!$(*#*)&!D1]-&!@[!$(*#*)&!D1]
[!$(*#*)&!D1]: an atom which is NOT triply bonded to another atom AND NOT 1-connected (i.e. not terminal)
-&!@: bonded by a single bond AND NOT a ring bond
[!$(*#*)&!D1]: to the same type of atom
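The three conditions in this pattern can be checked by hand on a small molecular graph. The sketch below is not a SMARTS matcher; it assumes a molecule given as a list of `(atom_i, atom_j, bond_order)` tuples over explicitly listed atoms (hydrogens left implicit), and the function name `count_rotatable_bonds` is hypothetical.

```python
from collections import defaultdict


def count_rotatable_bonds(bonds):
    """Count bonds satisfying the slide's rule on a (i, j, order) bond list."""
    neighbours = defaultdict(set)
    triply_bonded = set()
    for i, j, order in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)
        if order == 3:
            triply_bonded.update((i, j))

    def is_ring_bond(i, j):
        # The bond i-j is in a ring if j can still be reached from i
        # without walking along the bond itself.
        stack, seen = [i], {i}
        while stack:
            a = stack.pop()
            for b in neighbours[a]:
                if a == i and b == j:
                    continue  # skip the bond under test
                if b == j:
                    return True
                if b not in seen:
                    seen.add(b)
                    stack.append(b)
        return False

    def end_atom_ok(a):
        # NOT triply bonded AND NOT 1-connected (not terminal)
        return a not in triply_bonded and len(neighbours[a]) > 1

    return sum(
        1
        for i, j, order in bonds
        if order == 1                      # single bond
        and end_atom_ok(i)
        and end_atom_ok(j)
        and not is_ring_bond(i, j)         # NOT a ring bond
    )


if __name__ == "__main__":
    butane = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
    cyclohexane = [(k, (k + 1) % 6, 1) for k in range(6)]
    pent_2_yne = [(0, 1, 1), (1, 2, 1), (2, 3, 3), (3, 4, 1)]
    print(count_rotatable_bonds(butane))       # central C-C bond only
    print(count_rotatable_bonds(cyclohexane))  # all bonds are ring bonds
    print(count_rotatable_bonds(pent_2_yne))   # single bonds next to the triple bond are excluded
```

Only the central bond of butane qualifies: the two outer bonds each end on a terminal atom, which the `!D1` test rejects.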

Chemical Information Concepts in Discovery
Matching: total, partial
Similarity: qualitative, quantitative
Both matching and similarity are opinions, as they depend on descriptors.

Filtering
Quite often you may wish to eliminate compounds which are inappropriate for some activity or test, e.g. delete from a list any molecule which contains a “heavy metal”, i.e. a non-common element:
> $CONTRIB/smarts_filter -v \
    '[!$([#6,#7,#8,#9,#15,#16,#17,#35,#53])]'
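The logic of this filter can be illustrated without a SMARTS engine. The sketch below assumes each molecule is already reduced to a list of the atomic numbers of its explicitly listed atoms (hydrogens implicit); the names `COMMON_ELEMENTS` and `keep_common_only` are hypothetical and this is not how `smarts_filter` is implemented.

```python
# Atomic numbers listed in the slide's pattern: C, N, O, F, P, S, Cl, Br, I.
COMMON_ELEMENTS = {6, 7, 8, 9, 15, 16, 17, 35, 53}


def has_uncommon_element(atomic_numbers):
    """True if any atom falls outside the common-element set."""
    return any(z not in COMMON_ELEMENTS for z in atomic_numbers)


def keep_common_only(molecules):
    """Drop every molecule that contains at least one uncommon element."""
    return {
        name: atoms
        for name, atoms in molecules.items()
        if not has_uncommon_element(atoms)
    }


if __name__ == "__main__":
    molecules = {
        "ethanol": [6, 6, 8],
        "chloroform": [6, 17, 17, 17],
        "cisplatin": [7, 7, 17, 17, 78],  # contains Pt (Z = 78)
    }
    print(sorted(keep_common_only(molecules)))
```

Note that the element set contains neither hydrogen nor boron, so an explicitly written atom of either element would also count as uncommon under this pattern.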

Counting things
Count matches to patterns defined in SMARTS: molecular formula, H-donors, H-acceptors, rotatable bonds, chiral centres, rings, fragments.

Example
Molecular formula: C13H22N4O3S
H-donors: 2
H-acceptors: 6
Rotatable bonds: 8
Chiral centres: 1
Rings: 1
Fragments: 6

Estimating Measured Properties
Any property which is an additive constitutive property of a molecule can be calculated by:
- counting the matches of the constituent patterns
- looking up the weight for each pattern
- summing the products of the counts and the individual pattern weights
- applying any correction factors

Examples of properties to calculate: molecular weight, logP, parachor, molar volume, molar refractivity, …

Molecular weight: a simple example
Molecular weight = sum over the atoms i of the molecular formula of count(atom(i)) * atomic_weight(atom(i))
Accuracy depends on the accuracy of the atomic weights (IUPAC).
C13H22N4O3S: 314.45 (average molecular weight); 314.141235 (accurate mass of commonest isotope)
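The sum over the molecular formula can be sketched in a few lines. The atomic weight tables below are rounded values typed in for illustration and cover only the five elements in the example; the names `AVERAGE`, `MONOISOTOPIC`, `parse_formula`, and `formula_mass` are hypothetical. The parser handles only flat formulas with no parentheses or charges.

```python
import re

# Rounded atomic weights (assumed values for this sketch).
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}
# Rounded masses of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.007825,
    "N": 14.003074,
    "O": 15.994915,
    "S": 31.972071,
}


def parse_formula(formula):
    """Turn 'C13H22N4O3S' into {'C': 13, 'H': 22, 'N': 4, 'O': 3, 'S': 1}."""
    counts = {}
    for symbol, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(n) if n else 1)
    return counts


def formula_mass(formula, table):
    """Sum count(atom) * atomic_weight(atom) over the formula."""
    return sum(n * table[symbol] for symbol, n in parse_formula(formula).items())


if __name__ == "__main__":
    formula = "C13H22N4O3S"
    print(f"average:      {formula_mass(formula, AVERAGE):.3f}")
    print(f"monoisotopic: {formula_mass(formula, MONOISOTOPIC):.6f}")
```

With these rounded tables the results come out near 314.40 and 314.1413. They differ in the later digits from the figures on the slide, which illustrates the point that the accuracy of the result depends on the atomic weights used.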

CLOGP: A more complicated example
- Algorithmic definition of a fragment: pattern = NOT an isolating carbon.
- Match the pattern to find all the fragments.
- Look up the fragment value(s) (if they exist) using the unique string(s) from the match.
- Accumulate the values for fragments and non-fragments (isolating carbons).
- Correct for proximity.

CLOGP example
2 * Cl: +1.880
guanidyl: -1.930
2 * C: +0.390
6 * c: +0.780
7 * H: +1.589
Proximity: -0.984
Total: +1.727
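The accumulation step can be reproduced by summing the listed contributions. This sketch assumes each figure on the slide is already the total for its line (count times per-item value); the name `accumulate` is hypothetical and nothing here calls the real CLOGP program.

```python
# Contributions as listed on the slide, read as per-line totals (assumption).
CONTRIBUTIONS = [
    ("2 * Cl", +1.880),
    ("guanidyl", -1.930),
    ("2 * C", +0.390),
    ("6 * c", +0.780),
    ("7 * H", +1.589),
    ("Proximity", -0.984),  # correction factor, applied last
]


def accumulate(contributions):
    """Add up fragment, non-fragment, and correction contributions."""
    return sum(value for _, value in contributions)


if __name__ == "__main__":
    for label, value in CONTRIBUTIONS:
        print(f"{label:<10} {value:+.3f}")
    print(f"{'Sum':<10} {accumulate(CONTRIBUTIONS):+.3f}")
```

Summing the displayed figures gives about +1.725, slightly below the listed total of +1.727; the gap is presumably due to rounding of the individual contributions on the slide.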

Estimating values for concepts
Flexibility: ratio of the number of rotatable bonds to the total number of bonds
Rigidity: molecular similarity between the original molecule and the molecules formed by breaking all rotatable bonds
Difficulty of synthesis: ratio of the number of potential chiral centres, weighted for rings, to the total number of heavy atoms in a molecule

Example
Flexibility: 0.38
Rigidity: 0.3819
Difficulty of synthesis: 0.05

Example
Flexibility: 0.38 (0.00)
Rigidity: 0.3819 (1.00)
Difficulty of synthesis: 0.05 (0.85)
Figures in parentheses are for morphine.
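The flexibility figure can be reproduced as a simple ratio. The sketch below assumes that "total number of bonds" means bonds between heavy atoms and that the example molecule is a single connected component; neither assumption is stated in the deck. The function names are hypothetical. For a connected graph, the bond count follows from heavy atoms + rings - 1.

```python
def heavy_atom_bonds(n_heavy_atoms, n_rings, n_components=1):
    """Bond count of a molecular graph: atoms + rings - components."""
    return n_heavy_atoms + n_rings - n_components


def flexibility(n_rotatable, n_bonds):
    """Ratio of rotatable bonds to total bonds."""
    if n_bonds <= 0:
        raise ValueError("a molecule needs at least one bond")
    return n_rotatable / n_bonds


if __name__ == "__main__":
    # C13H22N4O3S has 13 + 4 + 3 + 1 = 21 heavy atoms; the deck lists
    # 1 ring and 8 rotatable bonds for it.
    n_bonds = heavy_atom_bonds(n_heavy_atoms=21, n_rings=1)
    print(n_bonds)
    print(round(flexibility(8, n_bonds), 2))
```

Under these assumptions the ratio is 8/21, which rounds to the 0.38 shown in the example. A molecule with no rotatable bonds gives 0.00, as listed for morphine.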

Relationships between compounds: compound sets
Molecular descriptors: fingerprints etc.
Similarity measures: Tanimoto etc.
Clustering: Jarvis-Patrick etc.
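The Tanimoto measure named above is the number of bits two fingerprints share divided by the number of bits set in either. A minimal sketch, assuming each fingerprint is represented as the set of its "on" bit positions; the function name and the toy fingerprints are hypothetical, and returning 0.0 for two empty fingerprints is a choice made here, not a standard.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: |A and B| / |A or B| over sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # both fingerprints empty; convention chosen for this sketch
    return len(a & b) / union


if __name__ == "__main__":
    fp1 = {1, 2, 3, 4}   # toy fingerprints, not generated from real molecules
    fp2 = {3, 4, 5, 6}
    print(round(tanimoto(fp1, fp2), 3))  # 2 shared bits out of 6 set in total
    print(tanimoto(fp1, fp1))            # identical fingerprints
```

Because the value depends entirely on which bits the fingerprint defines, two molecules can be similar under one descriptor and dissimilar under another.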

Relationships between compounds: mixtures
Molecular descriptors: modal fingerprints etc.
Similarity measures: Tanimoto etc.
Prototypes
Family resemblance

Relationships between compounds: reactions
Molecular descriptors: fingerprints
Rôles
Schemes/pathways
Similarity and clustering

Examples
Creating a spreadsheet of properties.
Non-standard fingerprinting and similarity.

Don’t let the hedgehogs take over…