Project Plan Task 8 and VERSUS2 Installation Problems. Anatoly Myravyev and Anastasia Bundel, Hydrometcenter of Russia, March 2010.

Task 8: Statistical features like confidence intervals and the Bootstrap method

Formal definition of confidence intervals (CIs): Suppose an unknown value $\theta$ determines the distribution $P_\theta$ of a random sample $X$ from the population $\mathcal{P} = \{P_\theta\}$. If for a given $\alpha > 0$ there exist random variables $\theta^{\pm} = \theta^{\pm}(\alpha, X)$ such that $P_\theta(\theta^{-} < \theta < \theta^{+}) \ge 1 - \alpha$, then the interval $(\theta^{-}, \theta^{+})$ is called a confidence interval for $\theta$ of level $1 - \alpha$. The interval is random; the unknown value $\theta$ that it contains is not.
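The point that the interval is random while the unknown value is fixed can be checked by simulation. The following is a minimal Python sketch, assuming a normal population with known standard deviation; the function names and the chosen parameter values are illustrative and do not come from the slides.

```python
import random
import statistics


def normal_mean_ci(sample, sigma, level=0.95):
    """CI for the mean of a normal population whose sigma is known."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    centre = statistics.fmean(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return centre - half_width, centre + half_width


def coverage(theta=2.0, sigma=1.0, n=30, level=0.95, trials=5000, seed=1):
    """Fraction of simulated samples whose interval contains the fixed theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A new sample gives a new (random) interval; theta never changes.
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        lower, upper = normal_mean_ci(sample, sigma, level)
        hits += lower < theta < upper
    return hits / trials


# The observed fraction should be close to the nominal level 0.95.
print(coverage())
```

The level is a statement about the procedure over repeated samples, not about any single computed interval.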

The statistical problem lies in the construction of CIs
– Cases with a known probability distribution function (pdf) of the population: parametric CIs
– Cases where the pdf is not known: non-parametric CIs

Parametric CIs
The normal distribution assumption is the most frequent. The underlying sample must be iid (independent and identically distributed).
Pluses:
– Easy and not computer-intensive
Minuses:
– For scores with non-normal distributions (proportions, odds ratio, correlation coefficients, …) they either cannot be used without some normalization or require complicated formulas
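As an illustration of what "some normalization" can look like for a correlation coefficient, the sketch below uses the Fisher z-transform, a standard choice that is not necessarily the one the authors had in mind. It assumes the forecast/observation pairs are iid and roughly bivariate normal; the data values are hypothetical.

```python
import math
import statistics


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Parametric CI for a correlation via the Fisher z-transform."""
    z = math.atanh(r)               # transformed value is approximately normal
    se = 1.0 / math.sqrt(n - 3)     # approximate standard error on the z scale
    q = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    # Build the interval on the z scale, then map it back to the r scale.
    return math.tanh(z - q * se), math.tanh(z + q * se)


# Hypothetical forecast and observation values, for illustration only.
fcs = [2.1, 3.4, 1.8, 4.0, 3.1, 2.7, 3.9, 1.5, 2.9, 3.6]
obs = [2.4, 3.0, 2.0, 4.4, 2.8, 2.2, 3.5, 1.9, 3.3, 3.2]
r = pearson_r(fcs, obs)
lower, upper = fisher_ci(r, len(fcs))
print(r, lower, upper)
```

The resulting interval stays inside (-1, 1) and is asymmetric around r, which a plain normal interval on the correlation itself would not guarantee.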

Non-parametric CIs
Construction of artificial datasets from a given collection of real data by resampling the observations.
Pluses:
– Highly adaptable to different testing situations because no assumptions regarding an underlying theoretical distribution of the data are required
– Computational ease
Minuses:
– The assumptions for sample statistics must not be overlooked: representativeness, iid

Bootstrapping
– Operates by constructing the artificial data using sampling with replacement from the original data (Efron 1979, Wasserman 2006)
– A well-developed computational technique (available in the R project)
– The most common and popular resampling method in verification (Wilks 1995)
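Sampling with replacement can be sketched in a few lines. The example below is written in plain Python rather than with R, assumes an iid sample, and uses hypothetical forecast-minus-observation errors; `bootstrap_replicates` is an illustrative helper name, not part of any package mentioned in the slides.

```python
import random
import statistics


def bootstrap_replicates(data, statistic, n_boot=1000, seed=0):
    """Recompute `statistic` on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    # Each artificial dataset has the same size as the original sample.
    return [statistic(rng.choices(data, k=n)) for _ in range(n_boot)]


# Hypothetical forecast-minus-observation errors, for illustration only.
errors = [0.4, -1.2, 0.7, 2.1, -0.3, 0.9, -0.8, 1.5, 0.2, -0.6]

reps = bootstrap_replicates(errors, statistics.fmean)
print("mean error:", statistics.fmean(errors))
print("bootstrap standard error:", statistics.stdev(reps))
```

The spread of the replicates estimates the sampling uncertainty of the score without assuming any theoretical distribution for it.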

Different bootstrap methods: how to construct CIs from the samples obtained
– Percentile CIs
– Bias-corrected and accelerated CIs (BCa)
– Normal approximation CIs
– Basic bootstrap CIs
– Bootstrap-t CIs
– Approximate bootstrap confidence intervals (ABC), etc.
A compromise between their accuracy and computational burden must be made. Bootstrap CIs are used at present in the MET package.
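Three of the simpler constructions (percentile, basic, and normal approximation) can be compared on the same set of replicates. This is a minimal Python sketch under the iid assumption; the quantile interpolation rule, the helper names, and the data are illustrative choices. BCa, bootstrap-t, and ABC intervals need extra ingredients (acceleration or variance estimates) and are left out.

```python
import random
import statistics


def quantile(sorted_values, p):
    """Linearly interpolated empirical quantile of an already sorted list."""
    position = p * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    frac = position - low
    return sorted_values[low] * (1 - frac) + sorted_values[high] * frac


def bootstrap_cis(data, statistic, level=0.95, n_boot=2000, seed=0):
    """Percentile, basic and normal-approximation CIs from one set of replicates."""
    rng = random.Random(seed)
    n = len(data)
    estimate = statistic(data)
    reps = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_boot))
    alpha = 1.0 - level
    q_low = quantile(reps, alpha / 2)
    q_high = quantile(reps, 1 - alpha / 2)
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    se = statistics.stdev(reps)
    bias = statistics.fmean(reps) - estimate
    return {
        # Quantiles of the replicates themselves.
        "percentile": (q_low, q_high),
        # Replicate quantiles reflected around the original estimate.
        "basic": (2 * estimate - q_high, 2 * estimate - q_low),
        # Bias-shifted estimate plus or minus z times the bootstrap standard error.
        "normal": (estimate - bias - z * se, estimate - bias + z * se),
    }


# Hypothetical, skewed absolute errors, for illustration only.
abs_errors = [0.1, 0.3, 0.2, 1.8, 0.4, 0.6, 0.2, 2.5, 0.5, 0.3, 0.9, 0.1]
for name, (lower, upper) in bootstrap_cis(abs_errors, statistics.fmean).items():
    print(name, lower, upper)
```

For skewed replicates the three intervals differ noticeably, which is one reason to try several procedures on real data before settling on one.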

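Implementation of CIs using the R package boot
boot is one of the packages required by the R verification package.
The intention is to introduce commands analogous to the MySQL v_index table, in a form like:
index_booted <- boot(data.frame(fcs, obs), index, R = 1000)
index_ci <- boot.ci(index_booted, conf = c(0.95, 0.99), type = c("perc", "bca"))
Here index is assumed to be a user-written function of the form index(d, i) that computes the score from the resampled forecast/observation pairs d[i, ].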

Conclusions
The accuracy of statistical scores depends, among other things, on the following:
– Sampling uncertainty
– Validity of the assumptions about representativeness and iid of the sample
– Observational uncertainty
– Uncertainty in the physical processes (Gilleland, 2008)
Different confidence levels $1 - \alpha$ can be used (e.g. 0.95, 0.99, or even 0.70) depending on the scope of the analysis.
Bayesian prediction intervals?

Conclusions (2)
– Since it is unclear which method of CI construction is the "most precise", we should try several procedures on the real forecast (frc) and observation (obs) data available.
– Both parametric and non-parametric statistics are legitimate (MET experience!)
– Decisions about what is good and what is bad should be made on a multi-criteria basis.

Problems with VERSUS2 functioning at the Hydrometcenter of Russia

Problems with VERSUS2 functioning
– Installation in the RedHat environment completes without errors.
– The new data leave traces in the MySQL tables, and the test (Pirmin-) files are acquired.
– However, the data information gets lost somewhere around the Data Availability tab (Model? Date Intervals? ...)
– A tutorial version of the package with valid obs and frc data is urgently needed.

Thank you for your attention!