Ondrej Ploc Part 2 The main methods of mathematical statistics, Probability distribution.



Outline
2.1 Assignment of a theoretical distribution to an empirical distribution
2.2 Comparison of empirical and theoretical parameters, estimation of theoretical parameters, testing of parametric hypotheses
2.3 Measurement of statistical dependences: some fundamentals of regression and correlation analysis

2.1 Assignment of a theoretical distribution to an empirical distribution

Goals. Probabilistic investigation of the sample (selective statistical set): choice of an acceptable theoretical distribution. Probabilistic description of the sample: testing of non-parametric hypotheses.

Acquired concepts and knowledge. Theoretical distributions, a partial survey in alphabetical order: Bernoulli, Beta, Binomial, Chi-square, Discrete Uniform, Erlang, Exponential, F, Gamma, Geometric, Lognormal, Negative binomial, Normal, Poisson, Student's, Triangular, Uniform, Weibull. Testing of non-parametric hypotheses. Test of the null hypothesis H0. Accepting or rejecting the null hypothesis H0. Level of statistical significance α, e.g. α = 0.05.

Assigned example. Hypothesis: the empirical distribution can be substituted by the normal distribution.

Assigned example (2). The results of 50 elaborations of the test.

Assignment of a theoretical distribution to an empirical distribution = testing of a non-parametric hypothesis. A theoretical distribution is preferable because its simple mathematical apparatus makes it possible to obtain information that is inaccessible in any other way.

2.1.1 Interval division of frequencies. It is recommended to construct 5 to 20 equidistant intervals covering the range of values of the statistical variable. Sturges' rule (empirical): k ≈ 1 + 3.3 log10 n, where n is the size of the sample (selective statistical set).
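Sturges' rule in its usual form, k = 1 + log2 n (approximately 1 + 3.3 log10 n), can be sketched as follows. The function name and the choice to round up to a whole interval are assumptions made for this illustration; the slides do not state a rounding convention.

```python
import math


def sturges_intervals(n: int) -> int:
    """Suggested number of intervals k for a sample of size n (Sturges' rule)."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    # k = 1 + log2(n); rounding up is an assumption of this sketch
    return math.ceil(1 + math.log2(n))


# Sample size of the assigned example (50 results)
print(sturges_intervals(50))  # prints 7
```

For n = 50 the rule suggests 7 intervals, which lies inside the recommended range of 5 to 20.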

2.1.2 Theoretical distribution. A fundamental concept of probability theory: it is the rule that assigns a probability to every value of a random variable. A random variable is a variable whose value is uniquely determined by the result of a random trial. A random trial is a realization of activities or processes whose result cannot be anticipated with certainty. Probability = number of favourable trial results / number of all trial results (e.g. shooting at a target). Random variables can be discrete or continuous.

Random variable. The values of a random variable can be assigned the probabilities with which they occur in the course of a random trial. This assignment is the theoretical distribution.

Distribution function F. The (cumulative) distribution function gives the probability that a random variable X takes a value smaller than or equal to a chosen value x_i (or x); this cumulative probability is expressed as a sum (or integral) of the partial probabilities. The probability that X lies in the semi-closed interval (a, b], where a < b, is therefore P(a < X ≤ b) = F(b) − F(a). Properties: F is non-decreasing, with F(x) → 0 as x → −∞ and F(x) → 1 as x → +∞.

Parameters of theoretical distributions. The theoretical general (raw), central and standardized moments of order j are denoted O_j, C_j and N_j. Discrete case: P_i denotes the probability function and x_i the values of the random variable (RV). Continuous case: the probability density, a function of x, is used, where x denotes the value of the RV.

Parameters of theoretical distributions. The names and symbols "mean value (expected value) E" and "dispersion (variance) D" are often used as well. The expected value E is a location parameter that measures the level of the random variable. The dispersion D is a variability parameter that measures the spread of the random variable's values. The expected value E equals the theoretical general moment of order 1, O_1; the dispersion D equals the theoretical central moment of order 2, C_2. The theoretical general moment of order 1, O_1, is the location parameter; the theoretical central moment of order 2, C_2, is the variability parameter; the theoretical standardized moment of order 3, N_3, is the skewness parameter; and the theoretical standardized moment of order 4, N_4, is the kurtosis parameter. The relation between empirical and theoretical parameters is described by the law of large numbers. Provided certain conditions are met, the empirical distribution and its empirical parameters can be expected to approximate the theoretical distribution and its theoretical parameters, and the approximation improves as the sample size grows (the larger the number of realized random trials). This approach of the empirical parameters to the theoretical parameters is not convergence in the ordinary mathematical sense but convergence in probability.
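The moment definitions for the discrete case can be illustrated with a short sketch. It assumes the standard formulas O_j = Σ x_i^j P_i, C_j = Σ (x_i − O_1)^j P_i and N_j = C_j / C_2^(j/2); the function names and the fair-die example are hypothetical and not taken from the slides.

```python
def general_moment(xs, ps, j):
    """Theoretical general (raw) moment O_j of a discrete random variable."""
    return sum(x ** j * p for x, p in zip(xs, ps))


def central_moment(xs, ps, j):
    """Theoretical central moment C_j, taken about the expected value O_1."""
    mean = general_moment(xs, ps, 1)
    return sum((x - mean) ** j * p for x, p in zip(xs, ps))


def standardized_moment(xs, ps, j):
    """Theoretical standardized moment N_j = C_j / C_2 ** (j / 2)."""
    return central_moment(xs, ps, j) / central_moment(xs, ps, 2) ** (j / 2)


# Hypothetical example: a fair six-sided die
values = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6

expected_value = general_moment(values, probabilities, 1)   # E = O_1
dispersion = central_moment(values, probabilities, 2)       # D = C_2
skewness = standardized_moment(values, probabilities, 3)    # N_3
kurtosis = standardized_moment(values, probabilities, 4)    # N_4
print(expected_value, dispersion, skewness, kurtosis)
```

For the die, E is 3.5 and D is 35/12, and the skewness N_3 vanishes because the distribution is symmetric about its mean.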

Binomial distribution. Discrete distribution. Characteristic of the random phenomenon: n independent random trials are carried out, and the probability of the monitored random phenomenon is the same in all trials and equal to p. We seek the probability that the phenomenon occurs 0, 1, …, n times. By this definition, the values x_0, x_1, …, x_n of the relevant random variable are the numbers 0, 1, …, n. Theoretical distribution (probability function): for the described random phenomenon, the probability function is the rule that assigns the probabilities P_i, i = 0, 1, …, n, to the values x_i of the random variable. The distribution function is the cumulative sum of these probabilities.
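As a minimal sketch, the binomial probability function can be written with the standard formula P_i = C(n, i) p^i (1 − p)^(n − i), and the distribution function as a running sum of the P_i. The function names and the values n = 4, p = 0.5 are illustrative assumptions.

```python
from math import comb


def binomial_probabilities(n, p):
    """Probabilities P_i that the phenomenon occurs i = 0, 1, ..., n times."""
    return [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]


def distribution_function(probabilities):
    """Cumulative sums F(x_i) = P_0 + P_1 + ... + P_i."""
    cumulative = []
    running_total = 0.0
    for probability in probabilities:
        running_total += probability
        cumulative.append(running_total)
    return cumulative


# Hypothetical example: 4 independent trials, p = 0.5
probs = binomial_probabilities(4, 0.5)
print(probs)                         # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(distribution_function(probs))  # last value is 1.0
```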

Binomial distribution. Discrete distribution. The significance of the binomial distribution: a typical example of independent random trials is the random selection of elements from a set in which each selected element is returned, the so-called sampling with replacement. It can be shown that when the sample is small in comparison with the population (basic statistical set), the difference between sampling with replacement and sampling without replacement is insignificant. The binomial distribution can therefore serve as a suitable criterion for judging whether the sample was created by random selection.

Normal distribution Continuous distribution

Normal distribution. Continuous distribution. Standardized normal distribution: N(0, 1). Its distribution function F(u) is the Laplace function.
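The distribution function of the standardized normal distribution has no closed form, but it can be evaluated through the error function. The sketch below assumes that F(u) means the cumulative probability P(U ≤ u) for U distributed as N(0, 1); the function names are hypothetical.

```python
import math


def standard_normal_cdf(u):
    """F(u) = P(U <= u) for U ~ N(0, 1), computed via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """Distribution function of a general normal variable via standardization."""
    # u = (x - mu) / sigma converts x to the standardized scale
    return standard_normal_cdf((x - mu) / sigma)


print(standard_normal_cdf(0.0))             # 0.5 by symmetry
print(round(standard_normal_cdf(1.96), 3))  # 0.975
```

Standardization is what allows a single table of N(0, 1) to serve every normal distribution.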

Normal distribution Continuous distribution

Alternative distribution. Discrete distribution. It is the special case of the binomial distribution for n = 1. The alternative distribution A(p) is a discrete theoretical distribution with one theoretical parameter p, describing a zero-one random variable (the random variable takes the values x_i = i = 0, 1).

Poisson distribution Discrete distribution

Geometric distribution Discrete distribution

Geometric distribution Discrete distribution

Lognormal distribution. Continuous distribution. The lognormal distribution LN(μ, σ) is a continuous theoretical distribution of a random variable X that is an increasing function of a random variable Y of the form x = e^y, where Y has the normal distribution N(μ, σ). The lognormal distribution has two theoretical parameters, μ and σ.
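Because x = e^y is increasing, the lognormal distribution function follows directly from the normal one: P(X ≤ x) = P(Y ≤ ln x). The sketch below uses the standard library and assumes that σ in N(μ, σ) denotes the standard deviation; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) for X = exp(Y), where Y ~ N(mu, sigma) and sigma is the standard deviation."""
    if x <= 0:
        return 0.0  # a lognormal variable takes only positive values
    return NormalDist(mu, sigma).cdf(math.log(x))


# The median of LN(mu, sigma) is exp(mu), so the distribution function there equals 0.5
print(lognormal_cdf(math.exp(1.0), mu=1.0, sigma=0.5))
```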

Lognormal distribution Continuous distribution

Apparatus of non-parametric testing. The null hypothesis H0 supposes that the empirical distribution can be substituted by the intended theoretical distribution. The alternative hypothesis HA supposes that this assumption is not correct. The essence of testing non-parametric hypotheses is a comparison between the theoretical and empirical absolute frequencies.

Apparatus of non-parametric testing. A special group of theoretical distributions was developed for verifying non-parametric and parametric hypotheses; these distributions are not intended to replace empirical distributions but serve as statistical criteria. The normal distribution is the only exception: in its standardized form it can play the role of a statistical criterion, and in its non-standardized form it can substitute for empirical distributions. The most frequent statistical criteria are the standardized normal distribution (u-test), Student's distribution (t-test), Pearson's χ² distribution (χ²-test) and the Fisher-Snedecor distribution (F-test).

Apparatus of non-parametric testing. To verify the hypotheses H0 and HA, a suitable statistical criterion must be selected. The χ²-test is the criterion used most frequently for verifying a non-parametric hypothesis. Its application requires an interval division of frequencies in which every partial interval has an absolute frequency of at least 5. If this condition is not fulfilled, neighbouring partial intervals must be merged, and the same merging must then be carried through the whole interval division of frequencies.
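The merging of partial intervals can be sketched as below. The slides do not prescribe how to merge, so the greedy left-to-right strategy, the function name and the example frequencies are all assumptions made for illustration.

```python
def merge_small_intervals(frequencies, minimum=5):
    """Merge neighbouring intervals until every merged interval reaches `minimum`."""
    merged = []
    pending = 0  # frequency accumulated in the interval currently being built
    for frequency in frequencies:
        pending += frequency
        if pending >= minimum:
            merged.append(pending)
            pending = 0
    if pending > 0:
        if merged:
            merged[-1] += pending   # a too-small tail joins the last interval
        else:
            merged.append(pending)  # the whole sample is smaller than `minimum`
    return merged


# Hypothetical absolute frequencies of 7 intervals (50 observations in total)
print(merge_small_intervals([2, 4, 9, 14, 11, 7, 3]))  # [6, 9, 14, 11, 10]
```

The number of intervals after merging, here 5, is the k that enters the degrees of freedom of the test.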

Apparatus of non-parametric testing. After the statistical criterion has been selected (e.g. the χ²-test), the experimental value of the criterion (e.g. χ²-exp.) and the critical theoretical value (e.g. χ²-theor.) must be determined. The critical theoretical value defines the so-called critical domain W of the statistical criterion. If the experimental value of the selected criterion is an element of the critical domain W, the alternative hypothesis HA must be accepted, i.e. the empirical distribution cannot be substituted by the intended theoretical distribution. Otherwise (the experimental value is not an element of the critical domain W), the null hypothesis H0 can be accepted, i.e. the empirical distribution can be substituted by the intended theoretical distribution.
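The decision rule can be sketched in a few lines. The sketch assumes the usual Pearson form of the experimental value, the sum of (empirical − theoretical)² / theoretical over all intervals, and a critical domain W consisting of values above the critical theoretical value. The frequencies are hypothetical; 5.991 is the tabulated χ² value for 2 degrees of freedom at α = 0.05.

```python
def chi_square_experimental(empirical, theoretical):
    """Pearson statistic: sum of (empirical - theoretical)^2 / theoretical."""
    if len(empirical) != len(theoretical):
        raise ValueError("both frequency lists must cover the same intervals")
    return sum((e - t) ** 2 / t for e, t in zip(empirical, theoretical))


def in_critical_domain(chi2_exp, chi2_theor):
    """True if the experimental value lies in the critical domain W (H0 is rejected)."""
    return chi2_exp > chi2_theor


# Hypothetical empirical and theoretical absolute frequencies for 5 intervals
empirical = [6, 9, 14, 11, 10]
theoretical = [5.0, 10.0, 15.0, 12.0, 8.0]

chi2_exp = chi_square_experimental(empirical, theoretical)
chi2_theor = 5.991  # table value for 5 - 2 - 1 = 2 degrees of freedom, alpha = 0.05
print(chi2_exp, in_critical_domain(chi2_exp, chi2_theor))
```

With these made-up frequencies the experimental value is about 0.95, well outside W, so H0 would be accepted.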

Significance level. Determining the significance level α is an essential element of testing both non-parametric and parametric hypotheses. The significance level gives the probability of erroneously rejecting the tested hypothesis when it is in fact true (i.e. the probability of a type I error). The most frequently used significance levels include α = 0.05. For example, for a positive test of normality at the significance level 0.05 (i.e. the hypothesis H0 that the empirical distribution can be substituted by the normal distribution is accepted and the hypothesis HA is rejected), the conclusion can be read as follows: if H0 holds and the sample (selective statistical set, SSS) is drawn 100 times from the population (basic statistical set, BSS), then in about 95 cases the test will show that the empirical distribution can be substituted by the normal distribution.

Illustration of non-parametric testing. Hypothesis: the empirical distribution can be substituted by the normal distribution.

Illustration of non-parametric testing. The χ²-test is applied. In its application, the letter k denotes the number of intervals of the interval division of frequencies and the letter r the number of theoretical parameters of the normal distribution (i.e. r = 2). The expression ν = k − r − 1 gives the number of degrees of freedom, which together with the selected significance level determines the critical theoretical value χ²-theor. = χ²(k − r − 1) from statistical tables. The significance level is chosen as α = 0.05.
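Looking up the critical theoretical value can be sketched as a small table lookup. The dictionary holds standard tabulated χ² critical values for α = 0.05; the names and the example k = 7 are assumptions made for illustration.

```python
# Tabulated critical values of the chi-square distribution for alpha = 0.05,
# keyed by the number of degrees of freedom
CHI2_CRITICAL_005 = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
    6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307,
}


def degrees_of_freedom(k, r=2):
    """nu = k - r - 1; r = 2 for the normal distribution (two parameters)."""
    nu = k - r - 1
    if nu < 1:
        raise ValueError("too few intervals for the chosen distribution")
    return nu


def critical_value(k, r=2):
    """Critical theoretical chi-square value at alpha = 0.05 for k intervals."""
    return CHI2_CRITICAL_005[degrees_of_freedom(k, r)]


# Hypothetical case: 7 intervals and a normal distribution
print(degrees_of_freedom(7), critical_value(7))  # 4 9.488
```

Each merged interval lowers k by one, so merging intervals with small frequencies also lowers the critical value used in the test.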