Review of Basic Statistics. Definitions Population - The set of all items of interest in a statistical problem e.g. - Houses in Sacramento Parameter -


1 Review of Basic Statistics

2 Definitions
Population - The set of all items of interest in a statistical problem, e.g., houses in Sacramento.
Parameter - A descriptive measure of a population, e.g., the mean (average) appraised value of all houses.
Sample - A set of items drawn from a population, e.g., 100 randomly selected homes.
Statistic - A descriptive measure of a sample, e.g., the mean appraised value of the selected homes.
Statistical inference - The process of making an estimate, prediction, or decision based upon sample data.

3 Types of Data
Qualitative - Categorical, i.e., the data represent categories, e.g., existence of an attached garage.
Quantitative - The data are numerical values.
Discrete (countable) - Counts of things, e.g., number of bedrooms.
Continuous (interval) - Measurements, e.g., appraised value or square footage.

4 Cross-sectional - Observations in a sample are collected at the same time, e.g., our sample of 100 homes; most surveys.
Time series - Data are collected at successive points in time, e.g., housing starts, recorded monthly from July 1985 to June 1997.

5 Numerical Descriptive Measures: Notation
$N$ = size of population
$n$ = size of sample
$\mu$ = population mean
$\bar{x}$ = sample mean
$\sigma^2$ = population variance
$\sigma$ = population standard deviation
$s^2$ = sample variance
$s$ = sample standard deviation

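6 Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ = $i$th observation, and $n$ = sample size

7 Sample Variance, $s^2$: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$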

8 Example Find the mean and variance of the following sample values (in years): 3.4, 2.5, 4.1, 1.2, 2.8, 3.7
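
The example above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and do not come from the slides. By hand, the values sum to 17.7, so the mean is 17.7/6 = 2.95, and the squared deviations sum to 5.375, so the sample variance is 5.375/5 = 1.075.

```python
from statistics import mean, variance, stdev

# Sample values (in years) from the example slide
values = [3.4, 2.5, 4.1, 1.2, 2.8, 3.7]

x_bar = mean(values)      # sample mean: sum of x_i divided by n
s2 = variance(values)     # sample variance: divides by n - 1, not n
s = stdev(values)         # sample standard deviation: square root of s2

print(f"mean = {x_bar:.3f}, variance = {s2:.3f}, std dev = {s:.3f}")
```

Note that `statistics.variance` uses the n - 1 divisor; `statistics.pvariance` would divide by N and is the population version.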

9 Random Variables Definition - A numerical variable whose value is determined by chance! e.g. - For a randomly selected house: Let X = Appraised value Y = Number of bedrooms W = Then X, Y, and W are all random variables. (Why?)

10 Note - Let $X$ be a random variable; then $\bar{X}$, $S^2$, and $S$ are also random variables. What is the difference between $X$, $\bar{X}$, $S^2$, $S$ and $x$, $\bar{x}$, $s^2$, $s$?

11 Probability Distributions Definition - A probability distribution for a random variable describes the values that the random variable can assume together with the corresponding probabilities. Importance - Its probability distribution describes the behavior of a random variable. Therefore, questions concerning a random variable cannot be answered without reference to its probability distribution.

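12 Normal Distributions [Figure: density of a normal random variable X with mean $\mu$ and standard deviation $\sigma$; horizontal axis marked from $\mu - 3\sigma$ to $\mu + 3\sigma$, density axis from 0 to 0.4.]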

13 Empirical (68, 95, 99.7) Rule
For a normally distributed random variable:
i) Approx. 68% of the values lie within 1 standard deviation, $\sigma$, of the mean $\mu$, i.e., $P(\mu - \sigma < X < \mu + \sigma) \approx 0.68$
ii) Approx. 95% lie within $2\sigma$ of $\mu$: $P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$
iii) Approx. 99.7% lie within $3\sigma$ of $\mu$: $P(\mu - 3\sigma < X < \mu + 3\sigma) \approx 0.997$
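
The three percentages in the rule above can be reproduced numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is made up for illustration.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """Return P(mu - k*sigma < X < mu + k*sigma) for a normal X."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# The result does not depend on mu or sigma, only on k.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The same probabilities come out for any choice of `mu` and `sigma`, which is why the rule can be stated once for all normal distributions.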

14 Standard Normal Distribution [Figure: density of the standard normal random variable Z, with mean 0 and standard deviation 1; horizontal axis from -3 to 3, density axis from 0 to 0.4.]

15 Examples Determine the following: a. P(0 < Z < 1.28) b. P(Z > 1.46) c. P(1.28 < Z < 1.46) d. P(Z < -1.28)

16 Solutions Using a table or Excel: a. P(0 < Z < 1.28) = 0.3997 b. P(Z > 1.46) = 0.5 - 0.4279 = 0.0721 c. P(1.28 < Z < 1.46) = 0.4279 - 0.3997 = 0.0282 d. P(Z < -1.28) = P(Z > 1.28) = 0.5 - 0.3997 = 0.1003
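
As an alternative to a table or Excel, the same probabilities can be computed in Python. This is a sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative. The table entries 0.3997 and 0.4279 used in the solutions are the areas between 0 and 1.28 and between 0 and 1.46.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_0_to_128 = Z.cdf(1.28) - Z.cdf(0)    # table entry for z = 1.28
area_0_to_146 = Z.cdf(1.46) - Z.cdf(0)    # table entry for z = 1.46

p_above_146 = 1 - Z.cdf(1.46)             # P(Z > 1.46)
p_between = Z.cdf(1.46) - Z.cdf(1.28)     # P(1.28 < Z < 1.46)
p_below_neg128 = Z.cdf(-1.28)             # P(Z < -1.28), equal to P(Z > 1.28) by symmetry

for name, p in [("P(Z > 1.46)", p_above_146),
                ("P(1.28 < Z < 1.46)", p_between),
                ("P(Z < -1.28)", p_below_neg128)]:
    print(f"{name} = {p:.4f}")
```

Results computed this way can differ from the table-based answers in the fourth decimal place, because table entries are themselves rounded before being subtracted.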

17 Example Using a table or Excel, find and interpret $z_{0.05}$, where $P(Z > z_{0.05}) = 0.05$. Ans. $z_{0.05} = 1.645$ because P(Z > 1.645) = 0.05
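
The critical value can also be found by inverting the standard normal CDF. The sketch below assumes the standard-library `statistics.NormalDist`; the function name `z_upper` is hypothetical.

```python
from statistics import NormalDist

def z_upper(alpha):
    """Return z_alpha, the value with upper-tail area alpha under the standard normal."""
    # An upper-tail area of alpha means a cumulative (left) area of 1 - alpha.
    return NormalDist().inv_cdf(1 - alpha)

z_05 = z_upper(0.05)
print(f"z_0.05 = {z_05:.3f}")
```

The interpretation is the same as on the slide: 5% of the standard normal distribution lies to the right of this value.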

18 z-scores and standardized random variables For a random variable $X$ with mean $\mu$ and standard deviation $\sigma$, the number of standard deviations that a value $x$ lies above or below the mean is $z = \frac{x - \mu}{\sigma}$. $Z = \frac{X - \mu}{\sigma}$ is the Standardized Random Variable for $X$.

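19 The Distribution of $\bar{X}$ (the Sampling Distribution of the Mean)
Properties of $\bar{X}$: Let
$\mu_{\bar{x}}$ = mean of all sample means of size n
$\sigma^2_{\bar{x}}$ = variance of all sample means of size n
Then:
i) $\mu_{\bar{x}} = \mu$
ii) $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$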

20 The Central Limit Theorem
I. Central Limit Theorem - If a large sample is drawn randomly from any population, the distribution of the sample mean, $\bar{X}$, is at least approximately normal!
II. Properties of $\bar{X}$
1. $\mu_{\bar{x}} = \mu$
2. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
3. If $X$ is normally distributed, then $\bar{X}$ is normal regardless of the size of the sample!

21 Example (filling problem) Suppose that the amount of beer in a 16 oz bottle is normally distributed with a mean of 16.2 oz and a standard deviation of 0.3 oz. Find the probability that a customer buys
a. one bottle and the bottle contains more than 16 oz.
b. four bottles and the mean of the four is more than 16 oz.

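22 Let $X$ = amount of beer in a bottle.
a. $P(X > 16) = P\left(Z > \frac{16 - 16.2}{0.3}\right) = P(Z > -0.67) = 0.5 + 0.2486 = 0.7486$
b. $P(\bar{X} > 16) = P\left(Z > \frac{16 - 16.2}{0.3/\sqrt{4}}\right) = P(Z > -1.33) = 0.5 + 0.4082 = 0.9082$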
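
Parts a and b of the filling problem can be worked numerically as follows. This is a sketch under the problem's stated assumptions (normal fill amounts with mean 16.2 oz and standard deviation 0.3 oz); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 16.2, 0.3   # population mean and standard deviation (oz), from the problem

# a. One bottle: X is normal with mean mu and standard deviation sigma.
p_one = 1 - NormalDist(mu, sigma).cdf(16)

# b. Mean of four bottles: X-bar is normal with mean mu and
#    standard deviation sigma / sqrt(n).
n = 4
p_four = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(16)

print(f"P(X > 16)     = {p_one:.4f}")
print(f"P(X-bar > 16) = {p_four:.4f}")
```

Answers obtained from a printed table may differ slightly in the third decimal place, because a table requires rounding the z-score to two decimals first. Either way, the mean of four bottles is more likely to exceed 16 oz than a single bottle, since averaging shrinks the standard deviation.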

23 Suppose you randomly selected 36 bottles and, after carefully measuring the amount of beer each contains, you determine that the mean amount for the sample is less than 16 oz. What would you conclude? Why?
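
One way to approach this question is to ask how likely such a sample mean would be if the stated mean (16.2 oz) and standard deviation (0.3 oz) were correct. The sketch below computes that probability; it is an illustration of the reasoning, not necessarily the answer intended on the slide.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 16.2, 0.3, 36       # values stated in the filling problem

std_error = sigma / sqrt(n)        # standard deviation of X-bar: 0.3 / 6 = 0.05
z = (16 - mu) / std_error          # z-score of a sample mean of exactly 16 oz
p_below_16 = NormalDist(mu, std_error).cdf(16)   # P(X-bar < 16)

print(f"z = {z:.2f}, P(X-bar < 16) = {p_below_16:.6f}")
```

A sample mean below 16 oz lies about four standard errors under the claimed mean, so it would almost never occur if the claim were true. Observing it is therefore strong evidence against the claimed mean of 16.2 oz, or against the stated standard deviation.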

24 Inference - Confidence Intervals Let $X$ be a random variable with mean $\mu$ and standard deviation $\sigma$. Suppose that $X$ is normally distributed OR the sample is large (n > 30); then $\bar{X}$ is at least approximately normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.

25-31 A. Logic [Figures: density of $\bar{X}$, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Successive slides shade the central area within 1, 2, and 3 standard deviations of the mean: 0.6826, 0.9544, and 0.9974.]

32 A. Logic [Figure: density of $\bar{X}$ with central area = $1 - \alpha$ and area = $\alpha/2$ in each tail.]

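33 Confidence Interval for $\mu$ ($\sigma$ known) (when the Central Limit Theorem applies) A $(1 - \alpha)100\%$ confidence interval for $\mu$ is given by $\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$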
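
The interval for a known population standard deviation is the sample mean plus or minus the critical value z for alpha/2 times sigma over the square root of n. The sketch below implements that formula; the function name `z_confidence_interval` and the inputs in the usage lines are hypothetical and chosen only for illustration, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z_crit * sigma / sqrt(n)              # half-width of the interval
    return x_bar - margin, x_bar + margin

# Hypothetical inputs for illustration only (not data from the slides)
low, high = z_confidence_interval(x_bar=16.1, sigma=0.3, n=36)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Raising the confidence level widens the interval, while a larger sample narrows it.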

34 Student's t Distributions (for 1 and 30 degrees of freedom) [Figure: t densities for 1 and 30 degrees of freedom; horizontal axis from -6 to 6.]

35 The Distribution of $\bar{X}$ when $\sigma$ is unknown If $X$ is normally distributed with mean $\mu$, then the Studentized Random Variable $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t Distribution with n - 1 degrees of freedom.

36 Student's t Distribution [Figure: t density with df = 5; horizontal axis from -4 to 4, density axis from 0 to 0.4; areas labeled $1 - \alpha$ and $\alpha$.]

37 Confidence Interval for $\mu$ ($\sigma$ unknown) (when X is normal or n > 30) A $(1 - \alpha)100\%$ confidence interval for $\mu$ is given by $\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$

38 Example The general manager of a fleet of taxis surveys taxi drivers to determine the number of miles traveled by a total of 41 randomly selected customers. If $\bar{x}$ = 7.7 miles and $s$ = 2.93 miles, estimate the mean distance traveled with 95% confidence.

39 Solution $(1 - \alpha)100\% = 95\%$, therefore $1 - \alpha = 0.95$, so $\alpha = 0.05$ (and $\alpha/2 = 0.025$). Since n = 41, we have n - 1 = 40 degrees of freedom. The critical value is $t_{0.025,\,40} = 2.021$, so a 95% CI for the mean distance traveled is given by $7.7 \pm 2.021\,\frac{2.93}{\sqrt{41}}$ or (6.78, 8.62)
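
The arithmetic behind this interval can be checked as follows. This is a sketch: Python's standard library has no t-distribution quantile function, so the critical value for 40 degrees of freedom and an upper-tail area of 0.025 is hard-coded from a t table (2.021). The variable names are illustrative.

```python
from math import sqrt

n, x_bar, s = 41, 7.7, 2.93   # sample size, sample mean, sample std dev (miles)

# Assumption: t critical value for 40 df and upper-tail area 0.025,
# read from a t table rather than computed.
t_crit = 2.021

margin = t_crit * s / sqrt(n)              # half-width of the interval
ci = (round(x_bar - margin, 2), round(x_bar + margin, 2))
print(ci)
```

With a statistics package available, the critical value could be computed instead of looked up, but the table value is enough to reproduce the interval to two decimals.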

40 Hypothesis Tests for $\mu$ ($\sigma$ Known) Assumptions: $X$ has mean $\mu$; $X$ is normally distributed OR the sample is large, i.e., n > 30


45 Example You own a factory producing sulfuric acid. The current output = 8,200 liters/hour, normally distributed. To test a new process, 16 hours of output are obtained with the following results: and Can we conclude that the new process is less efficient than the current process?

46 P-Values (Probability Values)
Definition - The p-value is the smallest significance level at which you would reject $H_0$. (The p-value represents a tail probability.)
Using p-values in Hypothesis Tests:
If p-value < $\alpha$, then reject $H_0$.
If p-value > $\alpha$, then "accept" (fail to reject) $H_0$.
We reject $H_0$ for small p-values!
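
The decision rule above can be sketched in code for a z test. The function name `p_value_z` and its `tail` argument are hypothetical; the observed statistic z = -1.28 is borrowed from the earlier standard normal example, where P(Z < -1.28) = 0.1003, and is used here only as an illustration.

```python
from statistics import NormalDist

def p_value_z(z, tail="lower"):
    """Tail probability of an observed z statistic under the standard normal."""
    Z = NormalDist()
    if tail == "lower":        # alternative hypothesis: mean is smaller
        return Z.cdf(z)
    if tail == "upper":        # alternative hypothesis: mean is larger
        return 1 - Z.cdf(z)
    if tail == "two-sided":    # alternative hypothesis: mean is different
        return 2 * (1 - Z.cdf(abs(z)))
    raise ValueError("tail must be 'lower', 'upper', or 'two-sided'")

alpha = 0.05                               # chosen significance level
p = p_value_z(-1.28, tail="lower")         # illustrative observed statistic
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"p-value = {p:.4f}: {decision}")
```

Here the p-value is about 0.10, which exceeds 0.05, so the null hypothesis is not rejected at the 5% level; it would be rejected at any significance level above the p-value.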

