
1 STATISTICAL HYPOTHESES AND THEIR VERIFICATION Kazimieras Pukėnas.



2 DEFINITIONS Population: in statistics, the complete set of items that share at least one property in common and are the subject of a statistical analysis; Sample: a subset of the population used to make inferences about the characteristics of the population; Representative sample: a subset of a statistical population that accurately reflects the characteristics of the entire population; Sample statistic: a numerical characteristic of the sample data, such as the mean, proportion, or variance, which can be used to estimate the corresponding population parameter.
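
As a minimal sketch of the "sample statistic" definition, the snippet below computes a mean, a variance, and a proportion from a small sample. The data values and the threshold of 6 are made up for illustration and do not come from the slides.

```python
import statistics

# Hypothetical sample of 8 measurements drawn from a larger population.
sample = [4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 8.0, 6.0]

# Sample mean: an estimate of the population mean.
sample_mean = statistics.mean(sample)

# Sample variance with the (n - 1) denominator: an estimate of the
# population variance.
sample_variance = statistics.variance(sample)

# Sample proportion of observations above 6 (an arbitrary threshold):
# an estimate of the corresponding population proportion.
sample_proportion = sum(x > 6 for x in sample) / len(sample)

print(sample_mean, sample_variance, sample_proportion)
```

Each printed value is a statistic of this one sample; a different sample from the same population would generally give different values.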

3 POPULATION AND SAMPLE [Diagram: POPULATION_1 ... POPULATION_k and SAMPLE_1 ... SAMPLE_k]

4 STATISTICAL HYPOTHESES Hypothesis testing: a set of logical and statistical rules for drawing conclusions about population characteristics from sample statistics; The null hypothesis (H0) is a general statement or default position that there is no difference between two measured phenomena in the population; The alternative hypothesis (H1) is the statement that is hoped or expected to be true instead of the null hypothesis; These two hypotheses are mutually exclusive and exhaustive.

5 STATISTICAL HYPOTHESES Statistical criterion: the rule according to which we reject or do not reject the null hypothesis H0; Significance level α: the probability, fixed before the test, of rejecting H0 when it is true; Critical value: the value of the test statistic corresponding to the chosen significance level α, beyond which H0 can be rejected; The probability value (p-value) of a statistical hypothesis test is the probability, if the null hypothesis H0 is true, of getting by chance alone a value of the test statistic as extreme as or more extreme than the one observed; If the test statistic falls within the critical region (equivalently, p < α), the null hypothesis H0 is rejected, and the probability of wrongly rejecting a true H0 is at most α; Otherwise, if the test statistic does not reach the rejection point (p > α), we cannot reject the null hypothesis H0.
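
The decision rule can be sketched for one concrete case. The example assumes a two-sided z test, where the test statistic is standard normal under H0; the slides do not name a specific test, and the function names and the observed value 2.5 are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z, i.e. the p-value under H0."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule from the slide: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


alpha = 0.05
# Critical value of the test statistic for a two-sided test at level alpha.
critical_value = NormalDist().inv_cdf(1.0 - alpha / 2.0)

z_observed = 2.5  # hypothetical observed test statistic
p = two_sided_p_value(z_observed)
print(critical_value, p, decide(p, alpha))
```

Comparing |z| with the critical value and comparing p with α are two views of the same rule: |z| exceeds the critical value exactly when p is below α.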

6 STATISTICAL HYPOTHESES Type I and Type II errors: A Type I error occurs when the null hypothesis H0 is rejected when it is in fact true; its probability is α; A Type II error occurs when the null hypothesis H0 is not rejected when it is in fact false; its probability is β; For a fixed sample size, the chances of both types of error cannot be made arbitrarily small simultaneously: the larger the chance α of making a Type I error, the smaller the chance β of making a Type II error; The power of a statistical hypothesis test measures the test's ability to reject the null hypothesis when it is actually false, that is, to make a correct decision. In other words, the power of a hypothesis test is the probability of not committing a Type II error: power = 1 − β.
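
The trade-off between α and β can be illustrated numerically. This is a sketch under stated assumptions: a one-sided z test with known standard deviation. The effect size, sigma, and sample size are hypothetical values, not figures from the slides.

```python
from statistics import NormalDist


def power_one_sided_z(effect: float, sigma: float, n: int, alpha: float) -> float:
    """Power (1 - beta) of a one-sided z test of H0: mean = 0 against
    H1: mean = effect > 0, with known standard deviation sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)   # rejection threshold under H0
    shift = effect * n ** 0.5 / sigma          # mean of the z statistic under H1
    beta = std_normal.cdf(z_crit - shift)      # P(fail to reject | H1 true)
    return 1.0 - beta


# Same hypothetical effect and sample size, two significance levels.
power_at_05 = power_one_sided_z(effect=0.5, sigma=1.0, n=25, alpha=0.05)
power_at_01 = power_one_sided_z(effect=0.5, sigma=1.0, n=25, alpha=0.01)
print(power_at_05, power_at_01)
```

With everything else fixed, lowering α from 0.05 to 0.01 raises β and so lowers the power. Increasing n is the usual way to reduce both error probabilities together.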


8 STATISTICAL HYPOTHESES [Figure with two labeled regions: "The area of Type I errors" and "The area of Type II errors"]

9 STATISTICAL HYPOTHESES Partial eta-squared (η²) is the proportion of variance in the dependent variable accounted for by the variance between categories (groups) formed by an independent variable, relative to that variance plus the error variance. The coefficient is "partial" because it reflects the effect after excluding the variance explained by the other variables in the model.
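
The slide does not show a formula, so the sketch below assumes the commonly used definition: partial η² = SS_effect / (SS_effect + SS_error), contrasted with plain η² = SS_effect / SS_total. The sums of squares are hypothetical numbers for a two-factor design.

```python
def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Effect variance relative to effect plus error variance
    (assumed standard definition, not taken from the slides)."""
    return ss_effect / (ss_effect + ss_error)


def eta_squared(ss_effect: float, ss_total: float) -> float:
    """Effect variance relative to the total variance."""
    return ss_effect / ss_total


# Hypothetical sums of squares: factor A, factor B, and error.
ss_a, ss_b, ss_error = 30.0, 50.0, 20.0
ss_total = ss_a + ss_b + ss_error

print(partial_eta_squared(ss_a, ss_error))  # variance explained by B is excluded
print(eta_squared(ss_a, ss_total))          # variance explained by B is included
```

In a design with a single factor the two coefficients coincide, because the total variance is then just effect plus error.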

10 DATA TYPES Scale. Data measured on an interval or ratio scale, where the data values indicate both the order of values and the distance between values. Also referred to as quantitative or continuous data. Scale data can be recorded at any desired resolution; Categorical. Data with a limited number of distinct values or categories (for example, gender or marital status). Also referred to as qualitative data. Categorical variables can be string (alphanumeric) data or numeric variables that use numeric codes to represent categories. There are two basic types of categorical data:

11 DATA TYPES Nominal. Categorical data where there is no inherent order to the categories. For example, a job category of sales is not higher or lower than a job category of marketing or research; Ordinal. Categorical data where there is a meaningful order of categories, but there is no measurable distance between categories. For example, there is an order to the values high, medium, and low, but the "distance" between the values cannot be calculated.

12 PARAMETRIC & NON-PARAMETRIC TESTS A statistical test is called parametric when the form of the population distribution is assumed known and is described completely by its parameters; Parametric tests: statistical procedures that use interval- or ratio-scaled data and assume normally distributed populations or sampling distributions; Non-parametric tests: statistical procedures that can be applied to nominal- or ordinal-scaled data and do not assume a particular form for the population distribution.

13 TESTS FOR HOMOGENEITY & INDEPENDENCE A test of homogeneity determines whether several populations are similar (homogeneous) with respect to some characteristic; A test of independence assesses whether paired observations on two variables, drawn from the same population, are independent of each other.
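
The slide does not name a specific procedure. As one common choice, the sketch below applies Pearson's chi-square test of independence to a 2×2 contingency table, without a continuity correction. The counts are hypothetical, and the p-value shortcut holds only for 1 degree of freedom.

```python
import math


def chi_square_statistic(table: list[list[float]]) -> float:
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected,
    where expected counts assume the row and column variables are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of
    freedom (the 2x2 case): P(X >= stat) = erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical paired observations on two binary variables.
table = [[30, 10],
         [20, 40]]
stat = chi_square_statistic(table)
p = p_value_df1(stat)
print(stat, p)
```

A small p-value is evidence against independence of the two variables. A table whose rows are proportional, such as [[10, 10], [20, 20]], gives a statistic of 0.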

14 PARAMETRIC & NON-PARAMETRIC TESTS Parametric tests for comparison of means: independent-samples t test (two-sample t test); paired-samples t test (dependent t test); analysis of variance (ANOVA, MANOVA, etc.). Non-parametric tests: Kolmogorov-Smirnov (KS) test of the hypothesis that a distribution is normal; Mann-Whitney U test, among others, for 2 independent samples; Wilcoxon test, among others, for 2 related samples; Kruskal-Wallis test, among others, for K independent samples; Friedman test, among others, for K related samples.
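
To make the contrast concrete for two independent samples, the sketch below computes the test statistic of a parametric test (pooled-variance t test) and of its non-parametric counterpart (Mann-Whitney U). The data and function names are hypothetical. Only the statistics are computed, since p-values would need the t and U reference distributions.

```python
import statistics


def pooled_t_statistic(a: list[float], b: list[float]) -> float:
    """Independent-samples t statistic with a pooled variance estimate
    (assumes equal population variances)."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    std_error = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / std_error


def mann_whitney_u(a: list[float], b: list[float]) -> float:
    """U statistic for sample a: the number of pairs (x, y) with x > y,
    counting ties as 0.5. It depends only on the ordering of the values."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )


# Hypothetical measurements from two independent groups.
group_a = [12, 15, 14, 10, 13]
group_b = [8, 9, 11, 7, 10]

t_stat = pooled_t_statistic(group_a, group_b)
u_stat = mann_whitney_u(group_a, group_b)
print(t_stat, u_stat)
```

The t statistic uses the means and variances, so it relies on interval-scaled data. The U statistic uses only which value in each pair is larger, which is why it also suits ordinal data.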


