Cyclic Structures
Numbers indicate the start and end of a ring.
The same number marks both the opening and the closing atom of a ring and is entered immediately after each of those atoms.
Only the numbers 1 – 9 are used.
A number should appear only twice.
An atom can be associated with two consecutive numbers, e.g., naphthalene: c12ccccc1cccc2
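The ring-numbering rules above can be checked mechanically. The following is a minimal sketch, not a full SMILES parser: the function names are hypothetical, digits inside square brackets (isotopes, charges) are skipped, and the "each number appears exactly twice" test follows the convention stated on this slide rather than any particular toolkit.

```python
from collections import Counter

def ring_closure_counts(smiles: str) -> Counter:
    """Count ring-closure digits, ignoring anything inside [...] brackets."""
    counts = Counter()
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch in "0123456789" and not in_bracket:
            counts[ch] += 1
    return counts

def rings_are_paired(smiles: str) -> bool:
    """True if every ring number is 1-9 and appears exactly twice."""
    counts = ring_closure_counts(smiles)
    return all(digit != "0" and n == 2 for digit, n in counts.items())

# Naphthalene: the first atom carries both ring numbers 1 and 2.
print(rings_are_paired("c12ccccc1cccc2"))
# An unclosed ring: the number 1 appears only once.
print(rings_are_paired("c1ccccc"))
```

A structure with no rings passes trivially, since there are no numbers to pair.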
SMILES Conventions
Avoid two consecutive left parentheses if possible.
Strive for the fewest possible branches.
Tautomeric bonds are not designated; enter the appropriate form.
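The first two conventions are simple enough to test on the raw string. The sketch below uses hypothetical helper names and only counts characters; it does not judge whether a notation is chemically correct. Both example strings describe 2-methylpropane, so the one with fewer branches would be preferred.

```python
def branch_count(smiles: str) -> int:
    """Number of branches, taken as the number of left parentheses."""
    return smiles.count("(")

def has_consecutive_left_parens(smiles: str) -> bool:
    """True if the notation contains two left parentheses in a row."""
    return "((" in smiles

# Two notations for 2-methylpropane: one branch versus two branches.
for notation in ("CC(C)C", "C(C)(C)C"):
    print(notation, branch_count(notation), has_consecutive_left_parens(notation))
```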
Further Restrictions
A branch cannot begin a SMILES notation.
A branch cannot immediately follow a double- or triple-bond symbol.
Example: C=(CC)C is invalid, but C(=CC)C and C(CC)=C are valid SMILES.
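These two restrictions can be expressed as string checks. The function below is an illustrative sketch with a hypothetical name; it tests only the two rules on this slide and says nothing about the rest of the notation.

```python
def restriction_errors(smiles: str) -> list:
    """Return a message for each of the two restrictions that is violated."""
    errors = []
    if smiles.startswith("("):
        errors.append("a branch cannot begin a SMILES notation")
    for bond in ("=", "#"):
        # A left parenthesis directly after a bond symbol opens a branch there.
        if bond + "(" in smiles:
            errors.append(f"a branch cannot immediately follow '{bond}'")
    return errors

# The slide's examples: the first is invalid, the other two are valid.
for notation in ("C=(CC)C", "C(=CC)C", "C(CC)=C"):
    print(notation, restriction_errors(notation))
```

An empty list means neither restriction is violated.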
SMILES Fragments
Nitro: N(=O)(=O)
Nitrate: ON(=O)(=O)
Nitrite: ON(=O)
Sulfonic acid: S(=O)(=O)O
Cyanide/Nitrile: C#N
Azide: N=N#N
Azido: N+=N-
SMILES Metals
[Al] [As] [Au] [Be] [Bi] [Cd] [Ca] [Fe] [Hg] [K] [Li] [Mg] [Na] [Ni] [Pt] [Sb] [Sn] [Zn] [Zr]
Disconnected Structures
A period separates the disconnected parts of a structure.
Tetramethylammonium bromide: C[N+](C)(C)C.[Br-]
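A disconnected structure can be taken apart at the period, and the charges written in square brackets should balance for a neutral salt. The sketch below assumes tetramethylammonium bromide written as C[N+](C)(C)C.[Br-]; the helper names are hypothetical, and the charge parser handles only the common forms such as +, -, ++ and +2.

```python
import re

_BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")
_CHARGE = re.compile(r"(\++|-+)(\d*)$")

def components(smiles: str) -> list:
    """Split a SMILES notation into its disconnected parts."""
    return smiles.split(".")

def formal_charge(component: str) -> int:
    """Sum the charges written inside square-bracket atoms."""
    total = 0
    for atom in _BRACKET_ATOM.findall(component):
        match = _CHARGE.search(atom)
        if match:
            signs, digits = match.groups()
            magnitude = int(digits) if digits else len(signs)
            total += magnitude if signs[0] == "+" else -magnitude
    return total

salt = "C[N+](C)(C)C.[Br-]"
for part in components(salt):
    print(part, formal_charge(part))
```

Summing the per-component charges gives the net charge of the whole notation, which is zero for this salt.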
Isomeric and Chiral SMILES
Isomeric configuration is indicated by forward and backward slashes: / \
Examples:
–trans-1,2-dibromoethene: Br/C=C/Br
–cis-1,2-dibromoethene: Br/C=C\Br
Chirality is indicated by the symbol @
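For the simple pattern used in the two examples, where one substituent is written before the double bond and the other after it, matching slashes mean trans and opposite slashes mean cis. The sketch below covers only that pattern (a slash, then C=C, then a slash); the function name is hypothetical, and branched or ring-containing notations return None instead of a guess.

```python
import re

# A slash, the literal text C=C, then another slash.
_SIMPLE_DOUBLE_BOND = re.compile(r"([/\\])C=C([/\\])")

def double_bond_geometry(smiles: str):
    """Return 'trans', 'cis', or None if the simple pattern is absent."""
    match = _SIMPLE_DOUBLE_BOND.search(smiles)
    if match is None:
        return None
    return "trans" if match.group(1) == match.group(2) else "cis"

# Backslashes must be escaped inside ordinary Python string literals.
print(double_bond_geometry("Br/C=C/Br"))
print(double_bond_geometry("Br/C=C\\Br"))
print(double_bond_geometry("BrC=CBr"))
```

Only the first matching double bond is examined, so a notation with several double bonds would need a real parser.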
Some Applications
JMDraw/SMILESViewer (Christoph Steinbeck)
JME Molecular Editor (Peter Ertl)
STN Express (SMILES as output)
Tripos (dbtranslate: SMILES to MOL)
Marvin (Ferenc Csizmadia)
CACTVS erlangen.de/cactvs/
Another Application
SMILESCAS Database
Over 103,000 SMILES notations
Input: CAS Registry Number
Leads to SMILES and thence to a structure search