Lecture 2: Randomization techniques
Approximate random distribution of coefficients of correlation for two random variates; = 0.03. Under a normal approximation we can use Z-transformed scores for statistical inference.
The standard normal distribution: Z is standard normally distributed.
$P(\mu - \sigma < X < \mu + \sigma) = 68\%$
$P(\mu - 1.65\sigma < X < \mu + 1.65\sigma) = 90\%$
$P(\mu - 1.96\sigma < X < \mu + 1.96\sigma) = 95\%$
$P(\mu - 2.58\sigma < X < \mu + 2.58\sigma) = 99\%$
$P(\mu - 3.29\sigma < X < \mu + 3.29\sigma) = 99.9\%$
The Fisherian significance levels
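The coverage percentages on this slide can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `normal_coverage` is introduced here for illustration and is not part of the lecture.

```python
import math

def normal_coverage(z):
    """Probability that a normal variate lies within z standard deviations of its mean."""
    return math.erf(z / math.sqrt(2.0))

# Multipliers commonly paired with the 68%, 90%, 95%, 99% and 99.9% levels.
for z in (1.0, 1.645, 1.96, 2.576, 3.291):
    print(f"z = {z:5.3f}  coverage = {normal_coverage(z):.4f}")
```

The relation used is P(|Z| < z) = erf(z / sqrt(2)) for a standard normal Z.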
Average temperature difference in European countries/islands
Table columns: Country, sq. km, Delta T. Rows (values given where they survive): Albania; Andorra (468, 15); Austria; Azores (2200, 7); Balearic Islands; Belarus; Belgium; Bosnia and Herzegovina; Bulgaria; Canary Islands (7270, 5); Channel Is.; Corsica; Crete; Croatia; Cyclades Is.; Cyprus; Czech Republic; Denmark; Dodecanese Is.; Estonia; Faroe Is.; Finland; France; Franz Josef Land; Germany; Gibraltar (6.5, 10); Greece; Hungary; Iceland; Ireland; Italy; Kaliningrad Region; Latvia; Liechtenstein (160, 14); Lithuania; Luxembourg; Macedonia; Madeira (Funchal) (789, 5); Malta (316, 14); Moldova; Monaco; …
[Diagram labels: Reshuffling; Permutation test probability; Bootstrap probability; Probability level; Parameters and standard errors]
Consider the coefficient of correlation. The statistical significance of r > 0 (H1) is tested against the null hypothesis H0 of r = 0. Most statistics programs do this using Fisher's Z-transformation.
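As a rough illustration of the parametric test mentioned here, the sketch below applies Fisher's Z-transformation to a correlation coefficient. It assumes the usual large-sample form (z = atanh(r) with standard error 1/sqrt(n - 3)); the function name and the example inputs r = 0.457 and n = 20 are illustrative assumptions, not values confirmed by the slide.

```python
import math

def fisher_z_test(r, n):
    """Test H0: r = 0 using Fisher's Z-transformation.

    Returns the Z score and the two-sided normal probability.
    """
    z = math.atanh(r)                 # Fisher transform: 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    score = z / se
    p_two_sided = math.erfc(abs(score) / math.sqrt(2.0))
    return score, p_two_sided

# Illustrative inputs only (assumed, not taken from the lecture data).
score, p = fisher_z_test(0.457, 20)
print(f"Z = {score:.3f}, two-sided P = {p:.4f}")
```

The approximation relies on normality of the transformed coefficient, which is the assumption the randomization techniques in this lecture avoid.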
Permutation testing
[Spreadsheet columns: Random number; ln area; ln Delta T; r; Sim r. Cells: Average r = ŚREDNIA(H2:H21) (AVERAGE in English-language Excel); StdDev r = ODCH.STANDARDOWE(H2:H21) (STDEV); t = (H2-J2)/J4*20^; P(t) = E-09, ROZKŁAD.T(J7,19,2) (TDIST); Z = (G2-H2)/J; P(Z) = ROZKŁAD.T(J12,19,2)]
We reorder one of the variables at random (at least 1000 times). From the resulting correlation coefficients we calculate the mean, the standard deviation, and the upper and lower confidence limits. This gives us an estimate of how probable the observed correlation is.
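The reshuffling procedure described on this slide can be sketched in a few lines of Python. This is a minimal sketch rather than the spreadsheet used in the lecture: the data are synthetic, and the names `pearson_r`, `permutation_null` and `mean_and_sd` are introduced here for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson coefficient of correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_null(x, y, n_perm=1000, seed=0):
    """Correlations obtained after reordering y at random n_perm times."""
    rng = random.Random(seed)
    shuffled = list(y)
    null_r = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_r.append(pearson_r(x, shuffled))
    return null_r

def mean_and_sd(values):
    """Mean and sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, sd

# Synthetic data (assumed for illustration; not the country data of the lecture).
data_rng = random.Random(1)
x = [float(i) for i in range(20)]
y = [xi + data_rng.gauss(0.0, 8.0) for xi in x]

r_obs = pearson_r(x, y)
null_r = permutation_null(x, y)
mean_r, sd_r = mean_and_sd(null_r)
print(f"observed r = {r_obs:.3f}, null mean = {mean_r:.3f}, null SD = {sd_r:.3f}")
print(f"Z = {(r_obs - mean_r) / sd_r:.3f}")
```

The Z score in the last line is only meaningful if the null distribution is roughly symmetric.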
The distribution of randomized correlation coefficients
[Figure labels: Observed value; Lower two-sided 1% confidence limit; Upper two-sided 1% confidence limit]
The distribution is not symmetric, so we can't use Z-transformed values (the normal approximation) and we can't use a t-test. We have to use the upper and lower probability levels, which we get directly from the random distribution. Probability level for r = 0.457: P =
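Reading probability levels directly from a randomized distribution can be sketched as follows. The function names are hypothetical, the quantile rule (cutting off an equal number of values in each tail) is one simple choice among several, and the skewed null distribution is synthetic.

```python
import random

def empirical_limits(null_values, alpha=0.01):
    """Lower and upper two-sided limits, cutting alpha/2 of the values off each tail."""
    ordered = sorted(null_values)
    n = len(ordered)
    k = int(n * alpha / 2)            # number of values cut off in each tail
    return ordered[k], ordered[n - 1 - k]

def upper_tail_probability(observed, null_values):
    """Fraction of randomized values at least as large as the observed value."""
    return sum(v >= observed for v in null_values) / len(null_values)

# Synthetic, deliberately asymmetric null distribution (assumed for illustration).
rng = random.Random(0)
null_values = [rng.betavariate(2.0, 5.0) - 0.3 for _ in range(1000)]

lower, upper = empirical_limits(null_values, alpha=0.01)
print(f"two-sided 1% limits: {lower:.3f} to {upper:.3f}")
print(f"P(null >= 0.457) = {upper_tail_probability(0.457, null_values):.3f}")
```

No symmetry or normality is assumed; the limits and the probability come only from the ranks of the randomized values.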
Jackknifing The jackknifed standard error of the coefficient of variation
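The formula on this slide did not survive, so the sketch below assumes the standard delete-one jackknife: the statistic is recomputed n times, each time leaving out one observation, and the standard error is sqrt((n - 1)/n * sum of squared deviations of these values from their mean). Function names and data are illustrative.

```python
import math
import statistics

def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def jackknife_se(values, stat):
    """Delete-one jackknife standard error of stat(values)."""
    n = len(values)
    leave_one_out = [stat(values[:i] + values[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out))

# Hypothetical measurements (not from the lecture).
data = [12.1, 9.8, 11.4, 10.2, 13.5, 8.9, 10.7, 11.9]
print(f"CV = {cv(data):.4f}, jackknife SE = {jackknife_se(data, cv):.4f}")
```

A quick sanity check of the approach: applied to the mean, the jackknife standard error equals the familiar s / sqrt(n).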
Bootstrapping
1. Take the original values and calculate the parameter you need.
2. Take 1000 random samples of different size.
3. Calculate 1000 parameters from the bootstrap samples.
4. Compare the observed value with the distribution of these parameters and calculate the confidence limits for the observed value.
We use at least 1000 random samples and calculate the CV for each sample. The standard deviation of these CV values is an estimate of the standard error of the original CV. In other words, the standard error of a statistic is the standard deviation of its sampling distribution.
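A minimal bootstrap sketch for the standard error of the CV follows. Note the assumption: it uses the common variant that resamples with replacement at the original sample size, which differs from the "samples of different size" mentioned in the lecture. Names and data are illustrative.

```python
import random
import statistics

def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def bootstrap_se(values, stat, n_boot=1000, seed=0):
    """Standard deviation of stat over n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    replicates = []
    for _ in range(n_boot):
        resample = [rng.choice(values) for _ in range(n)]
        replicates.append(stat(resample))
    return statistics.stdev(replicates), replicates

# Hypothetical positive measurements (not from the lecture).
data = [12.1, 9.8, 11.4, 10.2, 13.5, 8.9, 10.7, 11.9, 9.5, 12.8]
se, replicates = bootstrap_se(data, cv)
print(f"CV = {cv(data):.4f}, bootstrap SE = {se:.4f}")
```

The list of replicates is returned as well, so that confidence limits can be read from its percentiles.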
Bootstrap distribution
The mean CV values are based on samples of different size. The scores therefore carry different weight, and we have to use weighted averages.
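One plausible reading of the weighting is that each CV is weighted by the size of the sample it came from. The sketch below assumes exactly that; the lecture does not state the weights, and the numbers are made up for illustration.

```python
def weighted_mean(values, weights):
    """Average of values in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical CV values and the sizes of the samples they were computed from.
cv_values = [0.42, 0.38, 0.45]
sample_sizes = [10, 40, 25]
print(f"weighted mean CV = {weighted_mean(cv_values, sample_sizes):.4f}")
```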
Monte Carlo simulation.
Null models
[Photo: Darwin finch. Photo: Guardian Unlimited]
Do the beak lengths of Darwin finches, as a measure of resource usage, differ more or less than expected just by chance? The classical method to answer this question is to compare the observed variance in beak length differences with the variances obtained from random draws of beak lengths inside the observed range (the smallest and largest beak sizes being fixed). This is a null model approach. We test whether this null model approach is reliable.
We have randomly assigned beak lengths of 20 species, measured in mm. The null distribution gives us the H0 probability directly: P(H0) = 21/1000 = 0.021.
[Figure labels: Observed variance; Randomized variances]
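The null model for beak lengths can be sketched as below. Several details are assumptions: "beak length differences" is taken to mean differences between neighbours in the size-ordered list, random lengths are drawn uniformly between the fixed minimum and maximum, and the probability is the share of randomized variances not larger than the observed one. The beak lengths are randomly generated, as on the slide, but are not the lecture's values.

```python
import random
import statistics

def variance_of_gaps(lengths):
    """Variance of the differences between neighbouring values after sorting."""
    ordered = sorted(lengths)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return statistics.variance(gaps)

def null_model_test(lengths, n_null=1000, seed=0):
    """Compare the observed gap variance with variances from random draws.

    The smallest and largest lengths stay fixed; the others are redrawn
    uniformly inside that range.
    """
    rng = random.Random(seed)
    low, high = min(lengths), max(lengths)
    observed = variance_of_gaps(lengths)
    null_variances = []
    for _ in range(n_null):
        inner = [rng.uniform(low, high) for _ in range(len(lengths) - 2)]
        null_variances.append(variance_of_gaps([low, high] + inner))
    p_low = sum(v <= observed for v in null_variances) / n_null
    return observed, null_variances, p_low

# Randomly assigned beak lengths of 20 species in mm (synthetic).
data_rng = random.Random(42)
beaks = [data_rng.uniform(8.0, 20.0) for _ in range(20)]
observed, null_variances, p_low = null_model_test(beaks)
print(f"observed variance = {observed:.4f}, P(null <= observed) = {p_low:.3f}")
```

A small probability would indicate more evenly spaced beak lengths than expected by chance; for randomly assigned lengths a reliable null model should rarely produce one.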
[Maps: Meningitis in Europe; Distribution of forests in Europe]
Is the probability of meningitis infection correlated with the distribution of forests in Europe? We use a grid approach and calculate the coefficient of correlation between the entries of both grids: R = 0.06; P(R = 0) > 0.1. The distance between the sites might be of importance.
[Maps: Meningitis in Europe; Distribution of forests in Europe]
To get the null model distribution we reshuffle rows and columns only. P(H0) = 26/1000 = 0.026
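Reshuffling whole rows and columns of one grid, rather than individual cells, can be sketched as follows. The grids are synthetic stand-ins for the two maps, the function names are hypothetical, and the one-sided probability (share of randomized correlations at least as large as the observed one) is an assumed choice.

```python
import math
import random

def pearson_r(x, y):
    """Pearson coefficient of correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flatten(grid):
    """All grid entries, row by row."""
    return [value for row in grid for value in row]

def shuffle_rows_cols(grid, rng):
    """Permute the order of whole rows and whole columns."""
    row_order = rng.sample(range(len(grid)), len(grid))
    col_order = rng.sample(range(len(grid[0])), len(grid[0]))
    return [[grid[i][j] for j in col_order] for i in row_order]

def grid_null_test(grid_a, grid_b, n_null=1000, seed=0):
    """Observed grid correlation and its probability under row/column reshuffling."""
    rng = random.Random(seed)
    flat_a = flatten(grid_a)
    r_obs = pearson_r(flat_a, flatten(grid_b))
    null_r = [pearson_r(flat_a, flatten(shuffle_rows_cols(grid_b, rng)))
              for _ in range(n_null)]
    p = sum(r >= r_obs for r in null_r) / n_null
    return r_obs, p

# Synthetic 6 x 8 grids (assumed for illustration; not the meningitis or forest data).
data_rng = random.Random(3)
forest = [[data_rng.random() for _ in range(8)] for _ in range(6)]
disease = [[0.3 * f + data_rng.random() for f in row] for row in forest]
r_obs, p = grid_null_test(forest, disease)
print(f"R = {r_obs:.3f}, P(H0) = {p:.3f}")
```

Keeping rows and columns intact preserves part of the spatial structure of each grid, which a cell-by-cell shuffle would destroy.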
Mantel test
The Mantel test is a test for the correlation between two distance matrices. It tests whether distances are correlated. The test statistic is the coefficient of correlation between the matrix entries. For convenience we use Z-transformed data.
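A minimal permutation version of the Mantel test is sketched below. It assumes symmetric distance matrices, uses only the entries above the diagonal, Z-transforms them so that the statistic equals the coefficient of correlation, and permutes rows and columns of one matrix with the same reordering. The site positions and all function names are hypothetical.

```python
import random
import statistics

def upper_triangle(matrix):
    """Entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def z_transform(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def mantel_r(a, b):
    """Correlation between the off-diagonal entries of two distance matrices."""
    za, zb = z_transform(upper_triangle(a)), z_transform(upper_triangle(b))
    return sum(p * q for p, q in zip(za, zb)) / (len(za) - 1)

def permute(matrix, order):
    """Reorder rows and columns with the same permutation."""
    return [[matrix[i][j] for j in order] for i in order]

def mantel_test(a, b, n_perm=1000, seed=0):
    """Observed Mantel correlation and its one-sided permutation probability."""
    rng = random.Random(seed)
    n = len(a)
    r_obs = mantel_r(a, b)
    hits = 0
    for _ in range(n_perm):
        order = rng.sample(range(n), n)
        if mantel_r(a, permute(b, order)) >= r_obs:
            hits += 1
    return r_obs, hits / n_perm

def distance_matrix(positions):
    """Absolute differences between all pairs of one-dimensional positions."""
    return [[abs(p - q) for q in positions] for p in positions]

# Hypothetical sites: geographic positions and an environmental variable.
geo = distance_matrix([0.0, 1.0, 3.0, 6.0, 10.0, 15.0, 21.0])
env = distance_matrix([2.1, 2.4, 3.9, 5.2, 7.7, 9.8, 13.0])
r_obs, p = mantel_test(geo, env)
print(f"Mantel r = {r_obs:.3f}, P = {p:.3f}")
```

Permuting rows and columns together keeps each matrix a valid distance matrix, which is why the entries are not shuffled independently.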