Comparison of several prediction errors for model evaluation
K. Brendel 1, E. Comets 1, C. Laveille 2, R. Jochemsen 2, F. Mentré 1

METHODS
The model (model B) was built from two phase II studies of an antidiabetic drug. Model B was a one-compartment model with zero-order absorption and first-order elimination, with exponential random effects on the apparent volume of distribution (V/F) and on the apparent clearance (CL/F). A proportional error model was selected.

Two "validation" datasets were simulated according to the design of a real phase I study (12 subjects with 16 samples): the first (Vtrue) was simulated with the parameter values estimated previously with model B; the second (Vfalse) was simulated with the same model, but with the mean values of V/F and CL/F divided by two, corresponding to a bioavailability multiplied by two. We simulated these two datasets to check the ability of the metrics to validate Vtrue and to reject Vfalse.

We considered metrics without and with simulations. The latter, called posterior predictive check (PPC) metrics, evaluate the adequacy between data and model by comparing a given statistic, computed from the data, to its posterior predictive distribution under the model. This distribution was estimated using Monte Carlo simulations with model B to obtain K datasets simulated according to the phase I design. In the following, DVsim_ijk is the jth simulated concentration for the ith subject in the kth simulation.

Metrics based on concentrations

Standardized Prediction Error for Concentration (SPEC)
SPEC_ij = WRES_ij = (OBS_ij - PRED_ij) / SD(PRED_ij), obtained on dataset V using model B and its hyperparameters.

Standardized Prediction Error for Concentration with Simulation (SPECS)
SPECS_ij = (DV_ij - E(DVsim_ij)) / SD(DVsim_ij), where E(DVsim_ij) and SD(DVsim_ij) are the empirical mean and SD of the K simulated DVsim_ijk.
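The SPECS computation can be sketched as follows. This is a minimal illustration, not the authors' NONMEM implementation: the `simulate_concentrations` stand-in, K, and the array shapes are assumptions made only to show how the empirical mean and SD over the K simulations standardize the observations.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_concentrations(n_sub, n_obs, rng):
    # Hypothetical stand-in for simulating one dataset under model B
    # (one-compartment, zero-order absorption, exponential random effects,
    # proportional error); random positive values here, for illustration only.
    return rng.lognormal(mean=1.0, sigma=0.3, size=(n_sub, n_obs))

K, n_sub, n_obs = 1000, 12, 16  # K Monte Carlo datasets, phase I design

# K simulated datasets under model B, stacked as (K, subjects, samples)
dvsim = np.stack([simulate_concentrations(n_sub, n_obs, rng) for _ in range(K)])

dv = simulate_concentrations(n_sub, n_obs, rng)  # "observed" validation data

# SPECS_ij = (DV_ij - E(DVsim_ij)) / SD(DVsim_ij), with the empirical mean
# and SD taken over the K simulations for each (subject, sample) pair
specs = (dv - dvsim.mean(axis=0)) / dvsim.std(axis=0, ddof=1)
```

Under the true model, the SPECS values should be approximately standard normal, which is what the tests described next check.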
Normalized Prediction Distribution Errors for Concentrations with Simulation (NPDECS)
NPDECS_ij are normalized PDECS_ij, which are computed as the percentile of DV_ij in the whole predicted distribution of the DVsim_ijk. More precisely, the K values of DVsim_ijk are sorted, and the percentile of DV_ij is the number of DVsim_ijk lower than DV_ij, divided by K. These PDECS_ij were then normalized using the inverse of the cumulative distribution function of a normal distribution with mean 0 and variance 1.

For each of these metrics, we tested whether the mean was significantly different from 0 using a Wilcoxon signed-rank test, and whether the variance was significantly different from 1 using a Fisher test. We then tested normality using the Shapiro-Wilk (SW) and Kolmogorov-Smirnov (KS) tests.

Metrics based on random effects

Standardized Prediction Error for the post hoc Random effect (SPERHOC)
For each random effect, SPERHOC_i is computed from the Bayes estimate of the random effect (eta_i) obtained on V using model B and its hyperparameters: SPERHOC_i = eta_i / omega_B, where omega_B is the standard deviation of the random effect for that parameter estimated with model B on the initial dataset. Tests for SPERHOC are the same as for the metrics on concentrations.

Metrics based on hyperparameters

Standardized Prediction Error on the Hyperparameters (SPEH)
The hyperparameters (Psi_V) estimated with an independent population analysis on V were compared with those estimated with model B on the learning dataset (Psi_B). SPEH is based on the difference between the estimated values: SPEH = (Psi_V - Psi_B) / [(SE_V)^2 + (SE_B)^2]^(1/2), where SE_V and SE_B are the standard errors of estimation. We tested whether SPEH = 0 using a Wald test.
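The NPDECS computation and the associated tests can be sketched as below. The data are simulated placeholders and `scipy.stats` is assumed to be available; the clipping of the extreme percentiles is a practical assumption to avoid infinite values, not something stated in the poster.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
K, n_sub, n_obs = 1000, 12, 16
dvsim = rng.lognormal(1.0, 0.3, size=(K, n_sub, n_obs))  # K simulated datasets
dv = rng.lognormal(1.0, 0.3, size=(n_sub, n_obs))        # observed data

# PDECS_ij: percentile of DV_ij among its K simulated counterparts,
# i.e. the number of DVsim_ijk lower than DV_ij, divided by K
pdecs = (dvsim < dv).sum(axis=0) / K
# Keep percentiles away from 0 and 1, which map to +/- infinity below
pdecs = np.clip(pdecs, 1 / (2 * K), 1 - 1 / (2 * K))
# NPDECS: inverse CDF of the standard normal applied to the percentiles
npdecs = stats.norm.ppf(pdecs).ravel()

p_mean = stats.wilcoxon(npdecs).pvalue      # H0: mean equal to 0
p_sw = stats.shapiro(npdecs).pvalue         # Shapiro-Wilk normality test
p_ks = stats.kstest(npdecs, "norm").pvalue  # Kolmogorov-Smirnov vs N(0, 1)
```

Under the true model the percentiles are approximately uniform, so the NPDECS should be approximately standard normal and none of the tests should reject.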
Standardized Prediction Error on the Hyperparameters with simulations
Each of the K simulated datasets was fitted, giving the posterior predictive distribution of the hyperparameter estimates under model B. The hyperparameters Psi_V, estimated with an independent population analysis on V, were then compared to the region of acceptance of this distribution.

1 INSERM E0357, Department of Epidemiology, Biostatistics and Clinical Research, AP-HP, Bichat University Hospital, Paris, France; 2 Servier, Courbevoie, France

[Figures: QQ plots of SPEC, SPECS, NPDECS and SPERHOC on CL for the datasets Vtrue and Vfalse.]

INTRODUCTION
External validation refers to a comparison between the validation dataset (V) and the predictions from the model built using the learning dataset (B); the validation dataset is used neither for model building nor for parameter estimation. The aim of this study was to compare criteria for the evaluation of a population pharmacokinetic model. Several types of prediction errors on concentrations, random effects and hyperparameters are proposed and evaluated on simulated validation datasets.

RESULTS
Even on Vtrue, both SPEC and SPECS were found to differ significantly from a normal distribution. In contrast, NPDECS and SPERHOC on CL/F and V/F followed a normal distribution, as illustrated by the QQ plots. The mean was not significantly different from 0 for any of the four metrics. For hyperparameters without simulation, the Wald tests were not significant. For hyperparameters with simulation, the p-value was not significant for CL/F or OmegaCL/F, as illustrated by the histograms (same results for V/F and OmegaV/F).
On Vfalse, SPEC, SPECS and NPDECS were not found to follow a normal distribution, but SPERHOC on CL/F and V/F did. The mean was significantly different from 0 for all four metrics. For hyperparameters without simulation, the Wald tests were significant for V/F and CL/F; with simulation, the test was significant for CL/F but not for OmegaCL/F (same results for V/F and OmegaV/F).
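The Wald test on the hyperparameters can be illustrated with the CL/F estimates reported in the results. The two-sided p-value from a standard normal reference distribution is an assumption about how the Wald statistic was evaluated; only the estimates and standard errors come from the poster.

```python
import math

def wald_p(psi_v, se_v, psi_b, se_b):
    # SPEH = (Psi_V - Psi_B) / sqrt(SE_V^2 + SE_B^2);
    # two-sided p-value under a standard normal reference
    t = (psi_v - psi_b) / math.sqrt(se_v**2 + se_b**2)
    return math.erfc(abs(t) / math.sqrt(2))  # equals 2 * (1 - Phi(|t|))

# CL/F: B estimate 1.0 (SE 0.042); Vtrue 0.98 (SE 0.159); Vfalse 0.48 (SE 0.083)
p_true = wald_p(0.98, 0.159, 1.0, 0.042)   # not significant
p_false = wald_p(0.48, 0.083, 1.0, 0.042)  # significant at the 0.01 level
```

This reproduces the pattern in the table below: the test accepts CL/F on Vtrue and rejects it on Vfalse, where the mean clearance was halved.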
[Tables: for each metric (SPEC, SPECS, NPDECS, SPERHOC) on Vtrue and Vfalse, p-values of the tests on the mean (Wilcoxon), the variance (Fisher) and normality (KS and SW); ns = not significant.]

Wald tests on estimated hyperparameters:

Hyperparameter   B estimate (SE)   Vtrue estimate (SE)   p    Vfalse estimate (SE)   p
CL               1.0 (0.042)       0.98 (0.159)          ns   0.48 (0.083)           <0.01
V                40 (2.3)          42 (3.3)              ns   20 (1.5)               <0.01
D1               6.6 (0.22)        6.5 (0.29)            ns   7.0 (0.063)            ns
OmegaCL          (0.057)           0.27 (0.13)           ns   0.34 (0.13)            ns
OmegaV           (0.028)           (0.026)               ns   (0.022)                ns
Sigma            (0.0064)          (0.0048)              ns   (0.0064)               ns

[Histograms: posterior predictive distributions of CL (with 500 simulations and fittings; pVtrue = ns, pVfalse = 0.002) and of OmegaCL (with 1000 simulations; pVtrue = ns, pVfalse = ns).]

CONCLUSION
Among the metrics on concentrations, NPDECS was able to validate Vtrue and to reject Vfalse, while SPEC and SPECS were less discriminant and rejected Vtrue with the normality tests. Wald tests as well as PPC on hyperparameters were also able, in this example, to detect the model misfit. These metrics will next be applied to the real phase I dataset (whose design was used here for the simulations) and evaluated on repeated simulations of validation datasets, in order to estimate the type I error and the power of the proposed tests.