Relating Small Molecule Structure to Small Molecule Performance

Presentation transcript:

Relating Small Molecule Structure to Small Molecule Performance Establishing structure-activity relationships (SAR) for novel disaccharides exposed to preadipocyte cell states Hi, my name is Isaac Joseph and I’m a pre-freshman at MIT. This summer, my project has been relating small molecule structure to small molecule performance. This is called a structure-activity relationship (SAR). As you saw in the previous presentation, finding the correct small molecule to produce a particular effect is useful in many situations. The data I used to develop my methods came from disaccharide small molecules tested in preadipocyte (pre-fat-cell) assays. Isaac Joseph

Why is finding SAR useful? Drug development Effect enhancement through structure optimization So why should we find SARs? Two major applications are in the pharmaceutical industry and in chemical biology research. Both involve the action of small molecules on living systems. During the development of drugs or research tools, we often use SARs to learn how to modify small molecules to optimize their effects.

How did we previously assess SAR? Visual inspection of data and structures Introduces bias Time-consuming Computational analysis – often used with isolated proteins Doesn’t account for cellular context Usually deals with only one biological outcome How is this currently done? Mostly, a group of chemists get together and say: we think part X of the molecule is relevant to effect Y, so maybe we should change X and hopefully Y will change. One problem is that this introduces bias; by the nature of science, the best discoveries are not always the expected ones. It is also time-consuming, because every molecule has to be examined individually. Some computational analysis is done, but usually only at the level of an isolated protein. Once proteins are inside cells, they behave differently; and since only one protein is studied, only one outcome can be assessed at a time.

How do I assess SAR? Describe structure and performance Alter structure description to match performance Interpret results So what’s my method of finding SAR that accounts for all these problems? First, for an assay, I describe the structure of the molecules and the effect seen with these molecules computationally. Then, I simply alter the structural description until it most highly accords with the performance seen with the same molecules. Once structures are chosen, we then interpret the results for optimizing the molecules.

How do we describe elements of small molecule structure? regiochemistry structural descriptors How do I represent structure chemically? Since the molecules all share certain features, I simply represent each difference as a descriptor. There are three kinds of diversity in the library we studied; here’s regiochemistry.

How do we describe elements of small molecule structure? stereochemistry structural descriptors Here’s stereochemistry.

How do we describe elements of small molecule structure? appendages structural descriptors And here’s appendage diversity.

How do we represent small molecule structure computationally? structure matrix Now, I take these descriptors and represent them in a binary structure matrix. Each row is the structural fingerprint of a specific compound, and a red cell indicates the presence of a certain structural feature. (Figure: structure matrix, 64 small molecules by 50 structural descriptors; legend: presence, absence.)
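
As a minimal sketch of the binary structure matrix described above: the descriptor names and compounds here are hypothetical placeholders, not the descriptors used in the talk. Each row is one compound's fingerprint, with 1 for a feature that is present and 0 for one that is absent.

```python
# Hypothetical descriptor names and compounds, for illustration only.
DESCRIPTORS = ["regio_1_4", "regio_1_6", "stereo_alpha", "stereo_beta", "appendage_A"]

compounds = {
    "cmpd_1": {"regio_1_4", "stereo_alpha"},
    "cmpd_2": {"regio_1_6", "stereo_beta", "appendage_A"},
    "cmpd_3": {"regio_1_4", "stereo_beta"},
}


def structure_matrix(compounds, descriptors):
    """Return one binary row per compound: 1 = feature present, 0 = absent."""
    return [
        [1 if name in features else 0 for name in descriptors]
        for features in compounds.values()
    ]


matrix = structure_matrix(compounds, DESCRIPTORS)
for compound_id, row in zip(compounds, matrix):
    print(compound_id, row)
```

In the talk the matrix has 64 rows and 50 columns; the toy version only shows the layout.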

What biological effects were measured? 5 preadipocyte cell states Wild-type brown preadipocyte IRS1-/- brown preadipocyte IRS2-/- brown preadipocyte 3T3-L1 white preadipocyte Tunicamycin-modified white preadipocyte 2 measured effects Lipid accumulation Mitochondrial membrane potential What effects were measured? We took 5 states of preadipocytes – cells that will differentiate into adipocytes, which are fat cells – and measured two effects: differentiation (measured as lipid accumulation) and mitochondrial membrane potential.

How do we represent biological effects computationally? performance matrix We represented these measurements in a performance matrix analogous to the structure matrix shown earlier. Here, each row is a compound’s performance fingerprint, and a red cell indicates that a particular effect was observed. (Figure: performance matrix, 64 small molecules by 80 assays, with 2 columns each for + and - effects; legend: effect, no effect.)
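
One way to picture the two columns per assay is sketched below. The talk does not say how a measurement becomes an "effect" or "no effect" call, so the z-score-like input and the threshold of 2.0 are assumptions made only for illustration.

```python
def encode_performance(scores, threshold=2.0):
    """Turn per-assay scores into a binary row with two columns per assay.

    For each assay, the first column marks a "+" effect and the second a
    "-" effect. The scoring scale and threshold are assumed, not from the talk.
    """
    row = []
    for score in scores:
        row.append(1 if score >= threshold else 0)    # "+" effect column
        row.append(1 if score <= -threshold else 0)   # "-" effect column
    return row


# Three hypothetical assay scores for one compound.
print(encode_performance([2.5, -3.0, 0.1]))
```

A compound can never be flagged in both columns of the same assay, so each pair of columns carries three states: increase, decrease, or no effect.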

How do I assess SAR? Describe structure and performance Alter structure description to match performance Interpret results So now we’ve described structure and performance computationally. To figure out what parts of structure are relevant to performance, we simply alter our structural description until structure similarity of compounds aligns with their biological performance similarity

Step 1: Calculate similarities between small molecules in performance To do this, we first create a similarity matrix from the performance matrix, to find how similar each pair of compounds is in performance. Each cell in the similarity matrix holds the similarity value between two compounds. (Figure: performance matrix, 64 small molecules by 80 assays, legend: effect, no effect; pairwise performance similarity matrix, 64 by 64 small molecules, legend: dissimilar to similar.)
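
The talk does not name the similarity measure, so the sketch below assumes the Jaccard (Tanimoto) coefficient, a common choice for binary fingerprints. It builds the square compound-by-compound similarity matrix from any binary matrix, whether performance or structure.

```python
def jaccard(a, b):
    """Shared 1s divided by positions where either row has a 1."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero rows are treated as identical (a convention chosen here).
    return both / either if either else 1.0


def similarity_matrix(rows):
    """Return the square matrix of pairwise similarities between rows."""
    return [[jaccard(r1, r2) for r2 in rows] for r1 in rows]


# Toy performance fingerprints for three compounds (hypothetical values).
performance = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
sim = similarity_matrix(performance)
for row in sim:
    print(row)
```

The result is symmetric with 1.0 on the diagonal, so only the entries above the diagonal carry independent information.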

Step 2: Calculate similarities between small molecules in structure using partial description Then we choose a random subset of the structural descriptors (for example, 8 of the 50) and create an analogous pairwise structure similarity matrix for the compounds. (Figure: structure matrix, 64 small molecules by 50 descriptors; pick a subset; candidate structure matrix, 64 small molecules by 8 descriptors, legend: presence, absence; pairwise structure similarity matrix, 64 by 64 small molecules, legend: dissimilar to similar.)
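
Picking a partial description amounts to keeping a subset of columns from the structure matrix. A minimal sketch, using a toy matrix and helper names that are assumptions rather than anything from the talk:

```python
import random


def random_subset(n_descriptors, k, rng):
    """Pick k distinct descriptor indices uniformly at random."""
    return sorted(rng.sample(range(n_descriptors), k))


def candidate_matrix(structure, columns):
    """Keep only the chosen descriptor columns of a binary structure matrix."""
    return [[row[c] for c in columns] for row in structure]


# Toy 3-compound, 5-descriptor structure matrix (hypothetical values).
structure = [
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
columns = random_subset(5, 2, rng)
candidate = candidate_matrix(structure, columns)
print(columns, candidate)
```

The candidate matrix can then be turned into a pairwise similarity matrix in the same way as the performance matrix.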

Step 3: Check similarity between performance and structure We then check how closely the pairwise performance similarities agree with the pairwise similarities from our structural representation. (Figure: pairwise performance similarity matrix vs. pairwise structure similarity matrix, each 64 by 64 small molecules; legend: dissimilar to similar.)
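
A later slide labels the structure-performance similarity as a correlation coefficient. The sketch below assumes a Pearson correlation taken over the above-diagonal entries of the two similarity matrices; both choices are assumptions, since the talk does not spell them out.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal (each compound pair once)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def agreement(performance_sim, structure_sim):
    """Correlate two pairwise similarity matrices over compound pairs."""
    return pearson(upper_triangle(performance_sim), upper_triangle(structure_sim))


# Toy 3-compound similarity matrices (hypothetical values).
perf_sim = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.0]]
flipped = [[1.0, 0.5, 1.0], [0.5, 1.0, 0.8], [1.0, 0.8, 1.0]]
print(agreement(perf_sim, perf_sim), agreement(perf_sim, flipped))
```

Using only the upper triangle avoids counting each pair twice and keeps the diagonal of 1.0 values from inflating the correlation.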

Step 4: Change structural description until similarity between performance and structure is maximum We repeatedly change our structural representation until we find the one that best accords with performance. Then we look at which descriptors were selected to give this similarity; these have the most to do with the performance in question. (Figure: pairwise performance similarity matrix, 64 by 64 small molecules; pairwise structure similarity from all descriptors vs. optimized pairwise structure similarity from the selected descriptors.)
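
The talk does not say how the structural description is changed from one try to the next. The sketch below assumes a simple stochastic hill-climb that swaps one descriptor at a time and keeps a swap only if agreement improves; the Jaccard similarity, Pearson correlation, and toy data are likewise assumptions.

```python
import random
from math import sqrt


def jaccard(a, b):
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def pairwise(rows):
    """Similarity for every compound pair, each pair listed once."""
    n = len(rows)
    return [jaccard(rows[i], rows[j]) for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y) if var_x and var_y else 0.0


def agreement(structure, columns, perf_sims):
    subset = [[row[c] for c in columns] for row in structure]
    return pearson(pairwise(subset), perf_sims)


def select_descriptors(structure, performance, k, steps=200, seed=0):
    """Hill-climb over k-descriptor subsets (requires k < number of descriptors)."""
    rng = random.Random(seed)
    n = len(structure[0])
    perf_sims = pairwise(performance)
    current = rng.sample(range(n), k)
    best = agreement(structure, current, perf_sims)
    for _ in range(steps):
        outside = [c for c in range(n) if c not in current]
        proposal = list(current)
        proposal[rng.randrange(k)] = rng.choice(outside)  # swap one descriptor
        score = agreement(structure, proposal, perf_sims)
        if score > best:  # keep only strict improvements
            current, best = proposal, score
    return sorted(current), best


# Toy data: descriptors 0 and 1 track performance; 2 and 3 do not.
structure = [[1, 0, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 0, 1]]
performance = [[1, 0], [1, 0], [0, 1], [0, 1]]
selected, score = select_descriptors(structure, performance, k=2)
print(selected, score)
```

A greedy search like this can stall in a local optimum on real data, so restarts or a different search strategy may be needed there.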

How do I assess SAR? Describe structure and performance Alter structure description to match performance Interpret results So now that we’ve found the structures that have the most to do with performance, what can we do with this data?

How do we know it’s working? First, how can we be sure that our results are significant? To test this, we randomly permuted the results from each assay and selected a structure description for the permuted data. We then compared the results from many such permutations with our best description obtained from the real performance data. This gave a p-value below 0.05, which means there is less than a 5% chance of seeing agreement this strong by accident. (Figure: performance matrix with columns permuted 22 times; structure-performance similarity (correlation coefficient); p < 0.046.)
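
A minimal sketch of such a permutation test is given below. It assumes each assay column is shuffled independently across compounds and that the p-value is the fraction of permuted scores that reach the real one; the null scores in the example are made up. Under that assumption, if none of 22 permutations reaches the real score, the bound is 1/22, about 0.0455. That would be consistent with the slide's p < 0.046, though the talk does not state how the value was computed.

```python
import random


def permute_columns(matrix, rng):
    """Shuffle each column independently across rows (compounds)."""
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def empirical_p_bound(observed, null_scores):
    """Fraction of permuted scores >= observed, floored at 1/n when none are."""
    n = len(null_scores)
    hits = sum(1 for s in null_scores if s >= observed)
    return max(hits, 1) / n


# Toy performance matrix (hypothetical values).
performance = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
shuffled = permute_columns(performance, random.Random(0))

# Made-up scores standing in for 22 permuted runs of descriptor selection.
null_scores = [0.10, 0.15, 0.20, 0.05] * 5 + [0.12, 0.18]
p_bound = empirical_p_bound(0.60, null_scores)
print(shuffled, p_bound)
```

Each permuted matrix would be run through the same descriptor selection as the real data, so that the null scores reflect the best agreement achievable by chance.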

What do the selected structures mean? With the chosen descriptors, we can then create structural drawings by looking at blocks of similar small molecules in the similarity matrix built from the selected descriptors. If we want to optimize the effects of these similar small molecules, we should keep the chosen structural features constant (because these are already creating the desired effect) and modify the rest to see whether the effect can be increased further. Here I show the disaccharides from my particular data, but we hope to extend our method to optimizing small molecules for all purposes. (Figure: selected descriptors; optimized structure similarity matrix; performance similarity matrix; a block of 8 small molecules.)

Acknowledgements Analysis: Paul A. Clemons (mentor), Joshua C. Gilbert, Nicole E. Bodycombe. Chemistry: Micha Fridman, Tetsuya Tanikawa, Daniel Kahne. Biology: Bridget K. Wagner. Administration: Shawna L. Young, Bruce Birren. I’d like to thank the following people, who were vital to my project: the people who helped with analysis, especially my mentor Paul Clemons; the biologists; administration; and the people who actually created the molecules, the Kahne group at Harvard’s Department of Chemistry & Chemical Biology.