A novel approach to analysis of primary HTS data: Compound Set Enrichment
Thibault Varin, Ansgar Schuffenhauer, Gubler, H., Parker, C., Zhang, JH., Raman, P., Ertl, P.

INTRODUCTION | Compound Set Enrichment | Thibault Varin | 10/07/14

Introduction
- Active series identification: can relevant SAR be extracted from primary HTS data?
- Are activity data binary or continuous?

Introduction: Active series identification
Hypothesis 1: Within primary HTS screening data, structure-activity relationships (SAR) are apparent and can be used to help select active compound classes.

Introduction: Are the activity data binary or continuous?
[Figure: activity of the compounds of Scaffold 1 and Scaffold 2. Legend: active compound (binary), inactive compound (binary).]
Binary activity:
- 1 active / 5 inactives
- Scaffold 1 = Scaffold 2
Continuous activity: Scaffold 1 > Scaffold 2

Introduction: Are the activity data binary or continuous?
[Figure: the same compound activities split at Threshold 1 and at Threshold 2. Legend: active compound (binary), inactive compound (binary).]
Binary scaffold activity differs according to the threshold.
Hypothesis 2: Methods based on an activity cut-off distort the activity information, leading to the incorrect assignment of active series of compounds.
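
The threshold effect described on this slide can be illustrated with a small sketch. The activity values below are invented for illustration (they are not taken from the presentation), and the helper names `count_actives` and `summarize` are hypothetical.

```python
from statistics import mean

# Hypothetical percent-inhibition values, invented for illustration only.
scaffold_1 = [85.0, 45.0, 42.0, 40.0, 38.0, 35.0]
scaffold_2 = [85.0, 5.0, 4.0, 3.0, 2.0, 1.0]


def count_actives(values, threshold):
    """Binary view: number of compounds at or above the activity cut-off."""
    return sum(v >= threshold for v in values)


def summarize(threshold):
    """Return (actives in scaffold 1, actives in scaffold 2) for one cut-off."""
    return count_actives(scaffold_1, threshold), count_actives(scaffold_2, threshold)


# Binary view: the comparison between scaffolds depends on the cut-off.
for threshold in (50.0, 30.0):
    a1, a2 = summarize(threshold)
    print(f"cut-off {threshold}: scaffold 1 = {a1}/6, scaffold 2 = {a2}/6 actives")

# Continuous view: no cut-off is needed to see that scaffold 1 is more active.
print(f"mean activity: {mean(scaffold_1):.1f} vs {mean(scaffold_2):.1f}")
```

With these made-up numbers, a cut-off of 50 gives 1 active and 5 inactives for both scaffolds, while a cut-off of 30 makes every compound of the first scaffold active. The continuous values rank the first scaffold higher in both cases.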

METHODS

Methods: The Scaffold Tree classification
The Scaffold Tree – Visualization of the Scaffold Universe by Hierarchical Scaffold Classification. A. Schuffenhauer, P. Ertl et al., J. Chem. Inf. Model., 47, 47, 2007

Methods: Datasets
PubChem annotation from CRC
Simulation of the primary screening data
- 7 PubChem bioassays
- Ranging from 9389 to [...] compounds
- Ranging from 0.03 to 26.29% of active compounds
Hypothesis 1

Methods: Single hypothesis test, summary procedure
1. State the null and the alternative hypotheses
   - H0: "the scaffold is inactive"
   - H1: "the scaffold is active"
2. Specify a significance level: α = 0.01
3. Compute the test statistic and the p-value
   → p-value = probability, if the scaffold is inactive (H0), of observing a result at least as extreme as the one measured
4. Decision step:
   - p-value > α: H0 is not rejected
   - p-value < α: H0 is rejected and H1 is accepted: "the scaffold is active"

Methods: The KS and the binomial hypothesis tests
[Figure: activity distribution of a bioassay and of one scaffold, shown as continuous data and as binary data (actives / inactives).]
Continuous data: KS test
- H0: there is no difference between the activity distribution of the compounds having the scaffold S3-2 and the background distribution.
Binary data: binomial test
- H0: there is no difference between the proportion of active compounds among the compounds having the scaffold S3-2 and the proportion of active compounds in the full dataset.
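
For the continuous case, a rough sketch of a two-sample KS comparison between a scaffold and the background follows. Several points are assumptions rather than statements from the slides: the test is taken as one-sided (higher value means more active), the p-value uses the simple large-sample approximation exp(-2 n m D² / (n + m)), and the data and function names are invented. An exact p-value would normally come from a statistics library.

```python
from bisect import bisect_right
from math import exp


def ecdf(sorted_values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ks_one_sided(scaffold, background):
    """One-sided two-sample KS test.

    D+ is the largest amount by which the background ECDF exceeds the
    scaffold ECDF, which is large when scaffold activities are shifted up.
    Returns (d_plus, approximate p-value).
    """
    s, b = sorted(scaffold), sorted(background)
    # Both ECDFs are step functions, so checking the data points is enough.
    d_plus = max(ecdf(b, x) - ecdf(s, x) for x in s + b)
    n, m = len(s), len(b)
    # Large-sample approximation of the one-sided p-value (an assumption).
    p_value = exp(-2.0 * (n * m / (n + m)) * d_plus**2)
    return d_plus, p_value


# Hypothetical data: 100 background activities and a 10-compound scaffold
# whose activities all sit in the top tenth of the background.
background = [float(v) for v in range(100)]
scaffold = [float(v) for v in range(90, 100)]

d_plus, p_value = ks_one_sided(scaffold, background)
print(f"D+ = {d_plus:.2f}, approximate p-value = {p_value:.2e}")
```

Unlike the binomial test, no activity cut-off enters this calculation: the whole distribution of the scaffold is compared with the background.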

Methods: Multiple hypothesis tests, Bonferroni correction
- Problem of false positives: α = probability of identifying an inactive scaffold as active (for each test performed)
  - With 100 inactive scaffolds, the probability of identifying at least one "active" by chance is 63% (1 - (1 - 0.01)^100 ≈ 0.63)
- The Bonferroni correction suggests testing each scaffold at a critical significance level of α = 0.01 / number of scaffolds
- Makes the assumption that the individual tests are independent
- Each level in the Scaffold Tree has been treated separately
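
The arithmetic behind the 63% figure and the per-level correction can be sketched as follows. The scaffold identifiers and p-values are hypothetical, as are the function names; the family-wise error formula assumes independent tests.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def significant_by_level(p_values_by_level, alpha=0.01):
    """Apply the correction separately to each Scaffold Tree level.

    p_values_by_level maps a level number to {scaffold_id: p_value}.
    Returns {level: [scaffold ids significant at alpha / n_scaffolds]}.
    """
    result = {}
    for level, p_values in p_values_by_level.items():
        threshold = bonferroni_threshold(alpha, len(p_values))
        result[level] = [s for s, p in p_values.items() if p < threshold]
    return result


print(f"{familywise_error(0.01, 100):.3f}")  # about 0.634

# Hypothetical p-values for two levels of the tree.
p_values_by_level = {
    1: {"S1-1": 0.0001, "S1-2": 0.02},
    2: {"S2-1": 0.004, "S2-2": 0.5, "S2-3": 0.0009},
}
print(significant_by_level(p_values_by_level))
```

In this made-up example the scaffold with p = 0.004 passes the uncorrected level of 0.01 but not the corrected level of 0.01 / 3 used for its tree level.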

Methods: Determining the activity of classes
[Workflow diagram for Hypothesis 1 and Hypothesis 2, with the boxes: scaffold activity evaluation; comparison of results; multiple hypothesis test correction (Bonferroni).]

RESULTS

Results: Comparison of KSP and BTP predictions
[Table; cell values not recoverable.]
Columns: Bioassay | Total (KSP, BTP, Δ) | BPCA significantly actives (BPCA, KSP, BTP, Δ) | BPCA non significantly actives (KSP, BTP, Δ)
Rows: Hydroxysteroid dehydrogenase, Caspase, PK, Luciferase, Luciferase, CYP450 2C, CYP450 3A
With:
- KSP: KS Prediction
- BTP: Binomial Threshold Prediction
- Δ: KSP - BTP
- BPCA: Binomial PubChem Annotation
Both KSP and BTP retrieve the BPCA significantly active classes.
Number of active classes: KSP > BTP.
Most of the new KSP active classes are not BPCA significantly active.

Results: KSP significantly active scaffolds that are among the PubChem inactives
[Figure: example scaffolds, each annotated "Inconclusives?"; compound activity (PubChem annotation) shown as active, inconclusive, inactive, WA.]

Results: Prioritize nodes instead of individual scaffolds
[Figure: Scaffold Tree nodes marked by scaffold activity (KS Prediction / Bonferroni): non significantly active, significantly active.]

Results: Visualization tool (Peter Ertl)

CONCLUSION

Conclusion: Compound Set Enrichment
- Validation of the initial hypotheses
- A method to mine HTS data and identify active series of compounds
  - Chemical classification: Scaffold Tree
  - Statistical analysis: Kolmogorov-Smirnov hypothesis test
  - Multiple hypothesis test correction: Bonferroni correction
- Uses all primary data
- No activity cut-off
- Identification of new active scaffolds not necessarily represented by very active compounds (latent hits) during the primary screen

Acknowledgments: With many thanks to
Primary mentor:
- Ansgar Schuffenhauer
Scientific advisers:
- Christian Parker
- Hanspeter Gubler
- Ji-Hu Zhang
- Peter Ertl
- Edgar Jacoby
Help: MLI group
Fellowship: Education office
Discussions:
- Martin Beibel
- Sebastian Bergling
- Meir Glick
- Alain Dietrich
- Marie-Cecile Didiot

Questions?