R_SimuSTAT_1 Prof. Ke-Sheng Cheng Dept. of Bioenvironmental Systems Eng. National Taiwan University

Outline
– Conducting random experiments using the sample function.
– Histograms of sample mean, median, and standard deviation.
– Calculation of sample quantiles.
– Box-and-whisker plot.
1/31/2014 Laboratory for Remote Sensing Hydrology and Spatial Modeling, Dept of Bioenvironmental Systems Engineering, National Taiwan Univ.

Key aspects of a random experiment
– Conducted under uniform conditions
– Unpredictable outcomes
Diagram labels: Random experiment, Sample space, Event space, Probability space

Random experiment: using the sample function in R
– sample takes a sample of the specified size from the elements of x, either with or without replacement. [The default is sampling without replacement.]
– Simulation of a lotto draw using the sample function.
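A lotto draw can be sketched with sample as follows. The pool of 49 numbers, the draw size of 6, and the seed are assumptions made for illustration; the slide does not specify the lotto rules.

```r
# Minimal sketch of a lotto draw (assumed rules: 6 numbers out of 1..49).
set.seed(2014)                 # arbitrary seed, only for reproducibility

lotto_pool <- 1:49             # assumed pool of lotto numbers
draw <- sample(lotto_pool, size = 6)   # default: without replacement
print(sort(draw))

# For comparison: sampling WITH replacement, e.g. ten rolls of a die.
rolls <- sample(1:6, size = 10, replace = TRUE)
print(rolls)
```

Because the default is sampling without replacement, no number can appear twice in draw, whereas rolls may contain repeated values.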

– Calculation of the sample mean, median, and standard deviation.
– Assessing the variation of the sample mean (median, standard deviation) across different samples.
– Plot histograms of the sample mean, median, and standard deviation.
– How about the maximum values of random samples? Observe the differences in the histograms.
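One possible way to carry out these steps is sketched below. The population (the integers 1 to 100), the sample size, the number of repetitions, and all variable names are assumptions chosen for illustration.

```r
# Sketch: variation of sample statistics across repeated random samples.
set.seed(1)                    # arbitrary seed
population <- 1:100            # assumed population
n <- 30                        # assumed sample size
n_samples <- 1000              # assumed number of repeated samples

# Each column of 'samples' is one random sample of size n.
samples <- replicate(n_samples, sample(population, n, replace = TRUE))

means   <- colMeans(samples)
medians <- apply(samples, 2, median)
sds     <- apply(samples, 2, sd)
maxima  <- apply(samples, 2, max)

# Four histograms side by side for visual comparison.
par(mfrow = c(2, 2))
hist(means,   freq = FALSE, main = "Sample mean")
hist(medians, freq = FALSE, main = "Sample median")
hist(sds,     freq = FALSE, main = "Sample standard deviation")
hist(maxima,  freq = FALSE, main = "Sample maximum")
par(mfrow = c(1, 1))
```

Under these assumptions the histogram of the maxima piles up against the upper end of the population, unlike the roughly symmetric histogram of the means.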

hist(x, freq=FALSE, breaks=…)
– breaks= either a vector giving the breakpoints between histogram cells, or a single number giving the number of cells for the histogram.
– freq= if TRUE, the histogram shows frequencies (the counts component of the result); if FALSE, it shows probability densities (the density component), so that the histogram has a total area of one.
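The two arguments can be tried out as in the sketch below. The data (200 standard normal values) and the bin width of 0.5 are assumptions for illustration.

```r
# Sketch: effect of the 'freq' and 'breaks' arguments of hist().
set.seed(42)                   # arbitrary seed
x <- rnorm(200)                # assumed example data

# breaks as a single number: a suggested number of cells, counts on the y-axis.
h_counts <- hist(x, freq = TRUE, breaks = 10)

# breaks as a vector of breakpoints, densities on the y-axis.
bp <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)
h_dens <- hist(x, freq = FALSE, breaks = bp)

# With freq = FALSE the bar areas (density * width) add up to one.
total_area <- sum(h_dens$density * diff(h_dens$breaks))
print(total_area)
```

When breaks is a single number, R treats it as a suggestion and may choose a slightly different number of cells so that the breakpoints fall on round values.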

Sample quantiles
– Linear interpolation
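As an illustration of linear interpolation, the sketch below compares a hand calculation with R's quantile function, whose default method (type = 7) interpolates linearly between adjacent order statistics. The data values are made up, and whether this is the exact formula shown on the slide is an assumption.

```r
# Sketch: sample quantile by linear interpolation (R's default, type = 7).
x <- c(2, 4, 7, 10, 15)        # assumed example data
p <- 0.3                       # probability of the desired quantile

xs <- sort(x)
n  <- length(xs)
h  <- (n - 1) * p + 1          # fractional position among the order statistics
lo <- floor(h)
hi <- min(lo + 1, n)

# Interpolate between the two neighbouring order statistics.
manual  <- xs[lo] + (h - lo) * (xs[hi] - xs[lo])
builtin <- quantile(x, probs = p, names = FALSE)

print(c(manual = manual, builtin = builtin))
```

Here the position is h = 2.2, so the quantile lies one fifth of the way from the second order statistic (4) to the third (7).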

Not linear interpolation! These three numbers define the box. Whiskers are defined differently.
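The slide's figure is not available here, so the following is only a guess at what the note refers to: in R, the box of a boxplot is drawn from the hinges returned by fivenum (also reported by boxplot.stats), and these can differ from the quartiles that quantile computes by linear interpolation. The data 1:10 are an assumption for illustration.

```r
# Sketch: boxplot hinges versus interpolated quartiles.
x <- 1:10                      # assumed example data

five   <- fivenum(x)           # minimum, lower hinge, median, upper hinge, maximum
box    <- five[2:4]            # the three numbers that define the box
quarts <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)

print(box)      # hinges: 3.0 5.5 8.0
print(quarts)   # interpolated quartiles: 3.25 5.50 7.75

# boxplot.stats() reports the same box values in elements 2 to 4 of $stats.
bs <- boxplot.stats(x)$stats
```

The medians agree, but the lower and upper box edges (3 and 8) are not the interpolated quartiles (3.25 and 7.75).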

The boxplot in R
– boxplot(x, range=0)
– boxplot(x) [Default, range=1.5]
– boxplot(x, range=3)
A box-and-whisker plot includes two major parts: the box and the whiskers. The parameter range determines how far the whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point which is no more than range times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.
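The effect of range can be checked numerically as well as visually. In the sketch below the data set (the integers 1 to 10 plus one large value, 20) is an assumption chosen so that the three settings give different whiskers; plot = FALSE is used to obtain the computed statistics without drawing.

```r
# Sketch: how 'range' changes the whiskers of a boxplot.
x <- c(1:10, 20)               # assumed data with one large value

# Draw the three variants side by side.
par(mfrow = c(1, 3))
boxplot(x, range = 0, main = "range = 0")
boxplot(x,            main = "range = 1.5 (default)")
boxplot(x, range = 3, main = "range = 3")
par(mfrow = c(1, 1))

# Upper whisker end (fifth row of $stats) for each setting.
upper_whisker <- function(r) boxplot(x, range = r, plot = FALSE)$stats[5, 1]
w0  <- upper_whisker(0)        # data maximum
w15 <- upper_whisker(1.5)      # largest point within 1.5 * IQR of the box
w3  <- upper_whisker(3)        # largest point within 3 * IQR of the box
print(c(w0, w15, w3))

# Points beyond the whiskers at the default setting are reported in $out.
outliers_default <- boxplot(x, plot = FALSE)$out
```

For these data the box runs from 3.5 to 8.5, so with the default setting the upper whisker stops at 10 and the value 20 is plotted as a separate point, while range=0 and range=3 both carry the whisker out to 20.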

Comparison of multiple boxplots

Can also use boxplot(x1, x2, x3, names=c("x1", "x2", "x3"))
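A minimal sketch of comparing several boxplots in one figure follows. The three samples (normal data with different means) are made up for illustration; note that names expects character strings, one label per group.

```r
# Sketch: side-by-side boxplots of three samples.
set.seed(7)                    # arbitrary seed
x1 <- rnorm(50, mean = 0)      # assumed example samples
x2 <- rnorm(50, mean = 1)
x3 <- rnorm(50, mean = 2)

# Several vectors in one call, labelled through 'names'.
bp <- boxplot(x1, x2, x3, names = c("x1", "x2", "x3"))

# Equivalent alternative: pass a named list; the labels come from the list names.
bp_list <- boxplot(list(x1 = x1, x2 = x2, x3 = x3))
```

Both calls draw the three boxes on a common vertical axis, which makes differences in location and spread easy to compare.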

