By James Miller et al. Presented by Siv Hilde Houmb, 1 November 2002


1 Statistical Power and its subcomponents: missing and misunderstood concepts in Software Engineering Empirical Research
By James Miller et al. Presented by Siv Hilde Houmb, 1 November 2002

2 Outline
Statistical Power Analysis
Statistical significance testing
Direction/Non-direction of statistical test
Parametric and Non-parametric tests
Calculating Statistical Power
Normative and judgment effect estimation approach

3 Introduction
The authors' informal review of the software engineering empirical literature failed to find many articles that report the statistical power of the described experiment.
Why do we need it? The power to identify false hypotheses.

4 Statistical Power Analysis
Statistical power analysis is a method of increasing the probability that an effect is found in an empirical study.
Reject or accept a null hypothesis (denoted H0); H1 is the alternative hypothesis.
Statistical power is defined as the probability that a statistical test will correctly reject a false null hypothesis.
Example: a power level of 0.4 means that if an experiment is run ten times, an existing effect will be discovered, on average, only four times out of ten.

5 Statistical Power Analysis cont.
High power means that if an effect exists, there is a high probability that it will be found.
Conversely, if no effect is found in a high-power study, there is a solid statistical argument for accepting the null hypothesis.
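The definition of power as the probability of rejecting a false null hypothesis can be checked by simulation. The sketch below is only an illustration, not the authors' method: the function name `simulate_power`, the choice of a two-tailed z-test, the unit standard deviation, and the default significance level of 0.05 are all assumptions introduced here.

```python
import random
from statistics import NormalDist

def simulate_power(effect_size, n_per_group, alpha=0.05, trials=5000, seed=1):
    """Estimate power as the fraction of simulated experiments that reject H0.

    Assumes two groups drawn from normal populations with known standard
    deviation 1, compared with a two-tailed z-test (an illustrative choice).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    std_error = (2 / n_per_group) ** 0.5           # std. error of the mean difference
    rejections = 0
    for _ in range(trials):
        group_a = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diff = sum(group_a) / n_per_group - sum(group_b) / n_per_group
        if abs(diff / std_error) > z_crit:
            rejections += 1
    return rejections / trials

if __name__ == "__main__":
    # An effect exists (effect_size > 0): the rejection rate estimates the power.
    print(simulate_power(effect_size=0.5, n_per_group=20))
    # No effect exists: the rejection rate should stay close to alpha.
    print(simulate_power(effect_size=0.0, n_per_group=20))
```

With a true effect, the returned fraction is the share of experiments that "discover" it, which is how the "four times out of ten" reading of a power level of 0.4 should be understood. With no effect, the fraction approximates the Type I error rate instead.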

6 Statistical significance testing
Concerned with controlling the influence of chance ("fate/luck") on the result.
An experiment should have a power level of 0.8.

7 Direction/Non-direction of statistical test
One-tailed and two-tailed tests
Directional (one-tailed). Example: the phenomenon exists if A is larger than B.
Non-directional (two-tailed). Example: the phenomenon exists if A and B differ.
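The practical difference between the two kinds of test shows up in the p-value computed from the same test statistic. The following is a minimal sketch assuming a z statistic with a standard normal reference distribution; the helper name `p_value` and its `tails` parameter are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tails=2):
    """p-value of a z statistic under a standard normal null distribution.

    tails=1 tests the directional hypothesis "A is larger than B";
    tails=2 tests the non-directional hypothesis "A and B differ".
    """
    upper = 1 - NormalDist().cdf(z)      # P[Z >= z]
    if tails == 1:
        return upper
    return 2 * min(upper, 1 - upper)     # both tails count as evidence

if __name__ == "__main__":
    z = 1.8
    print(p_value(z, tails=1))  # directional test
    print(p_value(z, tails=2))  # non-directional test, twice as large here
```

For a positive statistic such as z = 1.8, the one-tailed p-value falls below 0.05 while the two-tailed one does not, so the directional test is the more powerful of the two when the direction of the effect is predicted correctly.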

8 Parametric and Non-parametric tests
A parametric statistical test requires the estimation of one or more population parameters.
Example: an estimate of the difference in the average between the first and the second population.
A non-parametric test does not involve estimation of a specific parameter.
Example: it provides an estimate of P[X>Y], the probability that a randomly selected patient from the first population has a larger value than a randomly selected patient from the second population.
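The quantity P[X>Y] from the non-parametric example can be estimated directly from two samples by comparing every pair of observations. The sketch below assumes small in-memory samples; the function name `prob_x_greater_than_y` is hypothetical, and counting ties as half a win is a common convention adopted here, not something stated in the slides.

```python
def prob_x_greater_than_y(xs, ys):
    """Estimate P[X > Y] as the share of (x, y) pairs in which x exceeds y.

    Ties count as half a win (an assumed convention). The pair count is the
    Mann-Whitney U statistic, so the result is U divided by len(xs) * len(ys).
    """
    wins = sum(1 for x in xs for y in ys if x > y)
    ties = sum(1 for x in xs for y in ys if x == y)
    return (wins + 0.5 * ties) / (len(xs) * len(ys))

if __name__ == "__main__":
    first_population = [5, 7, 9]     # made-up illustrative values
    second_population = [4, 6, 8]
    print(prob_x_greater_than_y(first_population, second_population))
```

No mean or variance is estimated anywhere, which is what makes the approach independent of a normality assumption.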

9 Parametric and Non-parametric tests
Advantage of non-parametric tests: they do not require the sample population to be normally distributed.
Disadvantage: non-parametric tests do not have the same statistical power as parametric tests.
Non-parametric tests should only be used when substantial non-normality of the sample is believed to exist, or when one wishes to be particularly conservative on the side of Type I errors.

10 Calculating Statistical Power
The significance criterion ( ): the chosen risk of committing a Type I error, that is the probability of incorrectly rejecting the null hypothesis (H0), when performing significance testing. The sample size (N): number of subjects (as large as possible) The effect size ( ): the degree to which the phenomenon under study is present in the population.

11 Calculating Statistical Power cont.
Significance criterion: α
Sample size: N
Effect size: for the comparison of two means, the most frequently used index is the standardized difference between the two means.
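The three components can be combined into a power estimate. What follows is a sketch under stated assumptions rather than the calculation used by the authors: it takes the effect size to be the difference between two means divided by a common standard deviation, uses a two-tailed test, and relies on the normal approximation. The function name `power_two_sample` and the default α of 0.05 are introduced here for illustration.

```python
from statistics import NormalDist

def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed test comparing two means.

    effect_size: difference between the means in units of the common
                 standard deviation (assumed definition).
    n_per_group: number of subjects in each of the two groups.
    alpha:       significance criterion (probability of a Type I error).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the effect really exists.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

if __name__ == "__main__":
    for n in (10, 20, 64):
        print(n, round(power_two_sample(0.5, n), 3))
```

Raising any one of the three inputs while holding the other two fixed raises the computed power; with an effect size of zero the function returns α itself.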

12 The significance criterion
Type I error: the probability of incorrectly rejecting the null hypothesis (H0), α
Type II error: the probability of incorrectly accepting the null hypothesis (H0), β
Probability of correctly rejecting the null hypothesis: 1 - β
The relationship between α and β (Type I and Type II error) is β/α = x, which means that a false rejection of H0 is x times more serious than erroneously accepting it.
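As a worked instance of this ratio, take the power level of 0.8 recommended on slide 6 together with a significance criterion of 0.05. The value 0.05 is an assumption here (a widely used convention), not a figure given in the slides.

```latex
\[
\alpha = 0.05, \qquad 1 - \beta = 0.8 \;\Rightarrow\; \beta = 0.2, \qquad
x = \frac{\beta}{\alpha} = \frac{0.2}{0.05} = 4
\]
```

Under these values a false rejection of H0 is treated as four times more serious than erroneously accepting it.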

13 The sample size N
With the effect size and the significance criterion held constant, the power level of the test depends directly on the sample size. As N increases, the probability of error decreases; thus the precision is greater and the chance of rejecting a false null hypothesis is higher.
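The dependence on N can also be turned around: given a target power, one can ask how many subjects are needed. The sketch below uses the standard normal-approximation formula for a two-tailed comparison of two means, with the effect size expressed as a standardized mean difference. The function name `required_n_per_group` and the defaults (α = 0.05, power = 0.8) are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def required_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Subjects needed per group to reach the target power.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / effect_size)^2,
    rounded up to the next whole subject.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(d, required_n_per_group(d))
```

Under this approximation a standardized effect of 0.5 calls for about 63 subjects per group, and smaller effects require sharply larger samples because N grows with the inverse square of the effect size.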

14 Effect size
The effect size is the degree to which the phenomenon under study is present in the population. The larger the effect size, the more likely the phenomenon is to be detected and the null hypothesis to be rejected.

15 Evaluating effect size
Normative: rely on other related empirical studies, or on the establishment of an empirical norm for the subject of the experiment.
The normative approach is used when conducting a replication study in ESE, since this is a fairly young research field.
Expert judgment: rely on experts providing the estimate.
Guesswork.

16 Expert Judgment
The judgmental approach to the estimation and evaluation of effect sizes can simply be regarded as a consensus opinion of experts within the field of experimentation.
The difficulty of this task within SE is that the experts will often not fully understand the concepts of significance and effect size, and hence their opinions may address these concepts only in a relatively indirect manner.

17 Expert Judgment cont.
Experts also have problems providing quantitative opinions, so one then needs to translate the qualitative opinion into a quantitative value.
Opinions can be collected using:
Formal structured interviews
Formal questionnaire surveys

