1A.1 Appendix 1A: Review of Some Basic Statistical Concepts
Copyright © 1977 John Wiley & Sons, Inc. All rights reserved.

1A.2 Random Variable
random variable: A variable whose value is unknown until it is observed. The value of a random variable results from an experiment. The term random variable implies the existence of some known or unknown probability distribution defined over the set of all possible values of that variable. In contrast, an arbitrary variable does not have a probability distribution associated with its values.

1A.3
Controlled experiment: values of the explanatory variables are chosen with great care in accordance with an appropriate experimental design.
Uncontrolled experiment: values of the explanatory variables consist of nonexperimental observations over which the analyst has no control.

1A.4 Discrete Random Variable
discrete random variable: A discrete random variable can take only a finite number of values, which can be counted by using the positive integers.
Example: prize money from the following lottery is a discrete random variable:
first prize: $1,000
second prize: $50
third prize: $5.75
since it has only four (a finite number; count: 1, 2, 3, 4) possible outcomes: $0.00, $5.75, $50.00, $1,000.00.

1A.5 Continuous Random Variable
continuous random variable: A continuous random variable can take any real value (not just whole numbers) in at least one interval on the real line.
Examples: gross national product (GNP), money supply, interest rates, price of eggs, household income, expenditure on clothing.

1A.6 Dummy Variable
A discrete random variable that is restricted to two possible values (usually 0 and 1) is called a dummy variable (also binary or indicator variable).
Dummy variables account for qualitative differences: gender (0 = male, 1 = female), race (0 = white, 1 = nonwhite), citizenship (0 = U.S., 1 = not U.S.), income class (0 = poor, 1 = rich).

1A.7
A list of all of the possible values taken by a discrete random variable, along with their chances of occurring, is called a probability function or probability density function (pdf).

die         x    f(x)
one dot     1    1/6
two dots    2    1/6
three dots  3    1/6
four dots   4    1/6
five dots   5    1/6
six dots    6    1/6

1A.8
A discrete random variable X has pdf f(x), which is the probability that X takes on the value x:
f(x) = P(X = xᵢ), with 0 ≤ f(x) ≤ 1.
If X takes on the n values x₁, x₂, ..., xₙ, then f(x₁) + f(x₂) + ... + f(xₙ) = 1.
For example, in a throw of one die:
(1/6) + (1/6) + (1/6) + (1/6) + (1/6) + (1/6) = 1.
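As a quick check, here is a minimal Python sketch (ours, not part of the original slides) verifying both pdf properties for the fair-die example:

```python
# Minimal sketch: the pdf of one fair die satisfies 0 <= f(x) <= 1
# for every value, and the probabilities sum to 1.
die_pdf = {x: 1/6 for x in range(1, 7)}

assert all(0 <= p <= 1 for p in die_pdf.values())
assert abs(sum(die_pdf.values()) - 1.0) < 1e-12  # allow for rounding
```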

1A.9
In a throw of two dice, the discrete random variable X is the total shown, so X takes the values x = 2, 3, ..., 12 with pdf

x:     2     3     4     5     6     7     8     9     10    11    12
f(x):  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

The pdf f(x) can be presented graphically by the height of a bar at each possible outcome x.
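This pdf can also be reproduced by brute-force enumeration; a short sketch, assuming two fair, independent dice:

```python
from collections import Counter
from fractions import Fraction

# Enumerate the 36 equally likely outcomes of two fair dice and count
# how many of them produce each total x = 2, ..., 12.
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
pdf = {x: Fraction(n, 36) for x, n in sorted(counts.items())}

for x, p in pdf.items():
    print(x, p)          # fractions print reduced, e.g. 2/36 shows as 1/18
assert sum(pdf.values()) == 1
```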

1A.10
A continuous random variable uses area under a curve, rather than the height f(x), to represent probability.
[Figure: the density f(x) of per capita income X in the United States, with shaded areas under the curve marking the regions below $34,000 and above $55,000.]

1A.11
Since a continuous random variable has an uncountably infinite number of values, the probability of any one of them occurring is zero:
P[X = a] = P[a ≤ X ≤ a] = 0
Probability is represented by area, and height alone has no area. An interval for X is needed to get an area under the curve.

1A.12
The area under a curve is the integral of the equation that generates the curve:
P[a < X < b] = ∫ₐᵇ f(x) dx
For continuous random variables it is the integral of f(x), and not f(x) itself, which defines the area and, therefore, the probability.
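As an illustrative sketch (the density and the interval endpoints are our own choices, not from the slides), probability as an area can be computed by numerical integration; here f is the standard normal density:

```python
import numpy as np
from scipy.integrate import quad

# Standard normal density, used purely as an example of a continuous pdf.
f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

area, _ = quad(f, -1.0, 1.0)   # P[-1 < X < 1], an area under the curve
print(area)                    # roughly 0.6827

point, _ = quad(f, 1.0, 1.0)   # an interval of zero width ...
print(point)                   # ... has zero area: P[X = 1] = 0
```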

1A.13 Rules of Summation
Rule 1: Σᵢ₌₁ⁿ xᵢ = x₁ + x₂ + ... + xₙ
Rule 2: Σᵢ₌₁ⁿ k xᵢ = k Σᵢ₌₁ⁿ xᵢ
Rule 3: Σᵢ₌₁ⁿ (xᵢ + yᵢ) = Σᵢ₌₁ⁿ xᵢ + Σᵢ₌₁ⁿ yᵢ
Note that summation is a linear operator, which means it operates term by term.

1A.14 Rules of Summation (continued)
Rule 4: Σᵢ₌₁ⁿ (a xᵢ + b yᵢ) = a Σᵢ₌₁ⁿ xᵢ + b Σᵢ₌₁ⁿ yᵢ
Rule 5: x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ = (x₁ + x₂ + ... + xₙ)/n
The definition of x̄ given in Rule 5 implies the following important fact:
Σᵢ₌₁ⁿ (xᵢ − x̄) = 0

1A.15 Rules of Summation (continued)
Rule 6: Σᵢ₌₁ⁿ f(xᵢ) = f(x₁) + f(x₂) + ... + f(xₙ)
Notation: Σₓ f(xᵢ) = Σᵢ f(xᵢ) = Σᵢ₌₁ⁿ f(xᵢ)
Rule 7: Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ f(xᵢ, yⱼ) = Σᵢ₌₁ⁿ [ f(xᵢ, y₁) + f(xᵢ, y₂) + ... + f(xᵢ, yₘ) ]
The order of summation does not matter:
Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ f(xᵢ, yⱼ) = Σⱼ₌₁ᵐ Σᵢ₌₁ⁿ f(xᵢ, yⱼ)

1A.16 The Mean of a Random Variable
The mean or arithmetic average of a random variable is its mathematical expectation or expected value, E(X).

1A.17 Expected Value
There are two entirely different, but mathematically equivalent, ways of determining the expected value:
1. Empirically: the expected value of a random variable X is the average value of the random variable in an infinite number of repetitions of the experiment. In other words, draw an infinite number of samples and average the values of X that you get.

1A.18 Expected Value (continued)
2. Analytically: the expected value of a discrete random variable X is determined by weighting all the possible values of X by the corresponding probability density function values f(x) and summing them up:
E(X) = x₁ f(x₁) + x₂ f(x₂) + ... + xₙ f(xₙ)
In other words: E(X) = Σᵢ₌₁ⁿ xᵢ f(xᵢ)

1A.19 Empirical vs. Analytical
In the empirical case, as the sample goes to infinity the values of X occur with a frequency equal to the corresponding f(x) in the analytical expression. As the sample size goes to infinity, the empirical and analytical methods will produce the same value.

1A.20
Empirical (sample) mean: x̄ = Σᵢ₌₁ⁿ xᵢ / n, where n is the number of sample observations.
Analytical mean: E(X) = Σᵢ₌₁ⁿ xᵢ f(xᵢ), where n is the number of possible values of xᵢ.
Notice how the meaning of n changes.

1A.21
The expected value of X: E(X) = Σᵢ₌₁ⁿ xᵢ f(xᵢ)
The expected value of X-squared: E(X²) = Σᵢ₌₁ⁿ xᵢ² f(xᵢ)
The expected value of X-cubed: E(X³) = Σᵢ₌₁ⁿ xᵢ³ f(xᵢ)
It is important to notice that f(xᵢ) does not change!

1A.22
E(X) = 0(.1) + 1(.3) + 2(.3) + 3(.2) + 4(.1) = 1.9
E(X²) = 0²(.1) + 1²(.3) + 2²(.3) + 3²(.2) + 4²(.1) = 0 + .3 + 1.2 + 1.8 + 1.6 = 4.9
E(X³) = 0³(.1) + 1³(.3) + 2³(.3) + 3³(.2) + 4³(.1) = 0 + .3 + 2.4 + 5.4 + 6.4 = 14.5
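A short sketch reproducing these three moments, plus the empirical route from slide 1A.17 (the sample size is an arbitrary choice):

```python
import numpy as np

# The discrete pdf used above: x = 0,...,4 with the stated probabilities.
x = np.array([0, 1, 2, 3, 4])
f = np.array([0.1, 0.3, 0.3, 0.2, 0.1])

print(np.sum(x * f))       # E(X)   = 1.9
print(np.sum(x**2 * f))    # E(X^2) = 4.9
print(np.sum(x**3 * f))    # E(X^3) = 14.5

# Empirical check: the average of many draws approaches E(X).
rng = np.random.default_rng(0)
print(rng.choice(x, size=200_000, p=f).mean())   # close to 1.9
```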

1A.23
E[g(X)] = Σᵢ₌₁ⁿ g(xᵢ) f(xᵢ)
If g(X) = g₁(X) + g₂(X), then
E[g(X)] = Σᵢ₌₁ⁿ [g₁(xᵢ) + g₂(xᵢ)] f(xᵢ) = Σᵢ₌₁ⁿ g₁(xᵢ) f(xᵢ) + Σᵢ₌₁ⁿ g₂(xᵢ) f(xᵢ)
so E[g(X)] = E[g₁(X)] + E[g₂(X)]

1A.24 Adding and Subtracting Random Variables
E(X + Y) = E(X) + E(Y)
E(X − Y) = E(X) − E(Y)

1A.25
Adding a constant to a variable will add that constant to its expected value:
E(X + a) = E(X) + a
Multiplying by a constant will multiply its expected value by that constant:
E(bX) = b E(X)

1A.26 Variance
var(X) = average squared deviation around the mean of X
var(X) = expected value of the squared deviations around the expected value of X
var(X) = E[(X − E(X))²]

1A.27
var(X) = E[(X − EX)²]
= E[X² − 2X·EX + (EX)²]
= E(X²) − 2·EX·EX + (EX)²
= E(X²) − 2(EX)² + (EX)²
= E(X²) − (EX)²
Therefore: var(X) = E(X²) − (EX)²

1A.28
Variance of a discrete random variable X:
var(X) = Σᵢ₌₁ⁿ (xᵢ − EX)² f(xᵢ)
The standard deviation is the square root of the variance.

1A.29
Calculate the variance for a discrete random variable X:

xᵢ    f(xᵢ)   (xᵢ − EX)   (xᵢ − EX)² f(xᵢ)
2     .1      −2.3        5.29 (.1) = .529
3     .3      −1.3        1.69 (.3) = .507
4     .1       −.3         .09 (.1) = .009
5     .2        .7         .49 (.2) = .098
6     .3       1.7        2.89 (.3) = .867

Σᵢ₌₁ⁿ xᵢ f(xᵢ) = EX = 4.3
Σᵢ₌₁ⁿ (xᵢ − EX)² f(xᵢ) = var(X) = 2.01
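A sketch confirming the table both by the definition and by the shortcut var(X) = E(X²) − (EX)² from slide 1A.27:

```python
import numpy as np

# The pdf from the table above: x = 2,...,6 with f(x) = .1, .3, .1, .2, .3.
x = np.array([2, 3, 4, 5, 6])
f = np.array([0.1, 0.3, 0.1, 0.2, 0.3])

EX = np.sum(x * f)                    # 4.3
var_def = np.sum((x - EX)**2 * f)     # definition: sum of (x - EX)^2 f(x)
var_alt = np.sum(x**2 * f) - EX**2    # shortcut: E(X^2) - (EX)^2
print(EX, var_def, var_alt)           # 4.3  2.01  2.01
```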

1A.30
For Z = a + cX:
var(Z) = var(a + cX) = E[(a + cX) − E(a + cX)]² = c² var(X)
so var(a + cX) = c² var(X)

1A.31 Covariance
The covariance between two random variables, X and Y, measures the linear association between them:
cov(X,Y) = E[(X − EX)(Y − EY)]
Note that variance is a special case of covariance:
cov(X,X) = var(X) = E[(X − EX)²]

1A.32
cov(X,Y) = E[(X − EX)(Y − EY)]
= E[XY − X·EY − Y·EX + EX·EY]
= E(XY) − EX·EY − EY·EX + EX·EY
= E(XY) − 2·EX·EY + EX·EY
= E(XY) − EX·EY
Therefore: cov(X,Y) = E(XY) − EX·EY

1A.33 Covariance

joint pdf f(x,y)   Y = 1   Y = 2
X = 0              .45     .15
X = 1              .05     .35

EX = 0(.60) + 1(.40) = .40
EY = 1(.50) + 2(.50) = 1.50
E(XY) = (0)(1)(.45) + (0)(2)(.15) + (1)(1)(.05) + (1)(2)(.35) = .75
EX·EY = (.40)(1.50) = .60
cov(X,Y) = E(XY) − EX·EY = .75 − .60 = .15
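A sketch reproducing this covariance from the joint table:

```python
import numpy as np

# Joint pdf from the table: rows are X = 0, 1; columns are Y = 1, 2.
joint = np.array([[0.45, 0.15],
                  [0.05, 0.35]])
xs, ys = np.array([0, 1]), np.array([1, 2])

EX = xs @ joint.sum(axis=1)              # 0.40
EY = ys @ joint.sum(axis=0)              # 1.50
EXY = np.sum(np.outer(xs, ys) * joint)   # 0.75
print(EXY - EX * EY)                     # cov(X,Y) = 0.15
```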

1A.34 Joint pdf
A joint probability density function, f(x,y), provides the probabilities associated with the joint occurrence of all of the possible pairs of X and Y.

1A.35 Joint pdf
Survey of College City, NY: X = number of vacation homes owned, Y = number of college grads in the household.

joint pdf f(x,y)   Y = 1          Y = 2
X = 0              f(0,1) = .45   f(0,2) = .15
X = 1              f(1,1) = .05   f(1,2) = .35

1A.36
Calculating the expected value of functions of two random variables:
E[g(X,Y)] = Σᵢ Σⱼ g(xᵢ, yⱼ) f(xᵢ, yⱼ)
E(XY) = Σᵢ Σⱼ xᵢ yⱼ f(xᵢ, yⱼ)
E(XY) = (0)(1)(.45) + (0)(2)(.15) + (1)(1)(.05) + (1)(2)(.35) = .75

1A.37 Marginal pdf
The marginal probability density functions, f(x) and f(y), for discrete random variables can be obtained by summing the joint pdf f(x,y): sum over the values of Y to obtain f(x), and over the values of X to obtain f(y).
f(xᵢ) = Σⱼ f(xᵢ, yⱼ)
f(yⱼ) = Σᵢ f(xᵢ, yⱼ)

1A.38 Marginal pdf

              Y = 1   Y = 2   marginal pdf for X
X = 0         .45     .15     f(X = 0) = .60
X = 1         .05     .35     f(X = 1) = .40

marginal pdf for Y:  f(Y = 1) = .50,  f(Y = 2) = .50

1A.39 Conditional pdf
The conditional probability density functions of X given Y = y, f(x|y), and of Y given X = x, f(y|x), are obtained by dividing f(x,y) by f(y) to get f(x|y), and by f(x) to get f(y|x):
f(x|y) = f(x,y) / f(y)
f(y|x) = f(x,y) / f(x)

1A.40 Conditional pdf

              Y = 1   Y = 2
X = 0         .45     .15
X = 1         .05     .35

f(Y=1 | X=0) = .45/.60 = .75      f(Y=2 | X=0) = .15/.60 = .25
f(Y=1 | X=1) = .05/.40 = .125     f(Y=2 | X=1) = .35/.40 = .875
f(X=0 | Y=1) = .45/.50 = .90      f(X=1 | Y=1) = .05/.50 = .10
f(X=0 | Y=2) = .15/.50 = .30      f(X=1 | Y=2) = .35/.50 = .70
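All of these marginals and conditionals fall out of the joint table with a couple of lines of array arithmetic; a sketch:

```python
import numpy as np

# Joint pdf: rows are X = 0, 1; columns are Y = 1, 2.
joint = np.array([[0.45, 0.15],
                  [0.05, 0.35]])

f_x = joint.sum(axis=1)       # marginal of X: [0.60, 0.40]
f_y = joint.sum(axis=0)       # marginal of Y: [0.50, 0.50]

print(joint / f_x[:, None])   # f(y|x): [[0.75, 0.25], [0.125, 0.875]]
print(joint / f_y[None, :])   # f(x|y): [[0.90, 0.30], [0.10, 0.70]]
```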

1A.41 Independence
X and Y are independent random variables if their joint pdf, f(x,y), is the product of their respective marginal pdfs, f(x) and f(y):
f(xᵢ, yⱼ) = f(xᵢ) f(yⱼ)
For independence this must hold for all pairs of i and j.

1A.42 Not Independent

              Y = 1   Y = 2   marginal pdf for X
X = 0         .45     .15     f(X = 0) = .60
X = 1         .05     .35     f(X = 1) = .40

marginal pdf for Y:  f(Y = 1) = .50,  f(Y = 2) = .50

The numbers required to have independence are .50 × .60 = .30 in each X = 0 cell and .50 × .40 = .20 in each X = 1 cell. Since the actual joint pdf differs from these products, X and Y are not independent.
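The same check in code: independence requires f(x,y) = f(x) f(y) in every cell, which fails here. A sketch:

```python
import numpy as np

joint = np.array([[0.45, 0.15],
                  [0.05, 0.35]])
f_x = joint.sum(axis=1)               # [0.60, 0.40]
f_y = joint.sum(axis=0)               # [0.50, 0.50]

required = np.outer(f_x, f_y)         # [[0.30, 0.30], [0.20, 0.20]]
print(np.allclose(joint, required))   # False -> X and Y are not independent
```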

1A.43 Correlation
The correlation between two random variables X and Y is their covariance divided by the square roots of their respective variances:
ρ(X,Y) = cov(X,Y) / √( var(X) var(Y) )
Correlation is a pure number falling between −1 and 1.

1A.44 Correlation

              Y = 1   Y = 2
X = 0         .45     .15
X = 1         .05     .35

EX = 0(.60) + 1(.40) = .40
E(X²) = 0²(.60) + 1²(.40) = .40
var(X) = E(X²) − (EX)² = .40 − (.40)² = .24
EY = 1(.50) + 2(.50) = 1.50
E(Y²) = 1²(.50) + 2²(.50) = 2.50
var(Y) = E(Y²) − (EY)² = 2.50 − (1.50)² = .25
cov(X,Y) = .15
ρ(X,Y) = cov(X,Y) / √( var(X) var(Y) ) = .15 / √(.24 × .25) ≈ .61
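A sketch reproducing ρ ≈ .61 from the joint table:

```python
import numpy as np

joint = np.array([[0.45, 0.15],    # rows: X = 0, 1; columns: Y = 1, 2
                  [0.05, 0.35]])
xs, ys = np.array([0, 1]), np.array([1, 2])
f_x, f_y = joint.sum(axis=1), joint.sum(axis=0)

EX, EY = xs @ f_x, ys @ f_y
var_x = xs**2 @ f_x - EX**2                        # 0.24
var_y = ys**2 @ f_y - EY**2                        # 0.25
cov = np.sum(np.outer(xs, ys) * joint) - EX * EY   # 0.15
print(cov / np.sqrt(var_x * var_y))                # about 0.612
```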

1A.45 Zero Covariance & Correlation
Independent random variables have zero covariance and, therefore, zero correlation. The converse is not true.

1A.46
The expected value of the weighted sum of random variables is the sum of the expectations of the individual terms. Since expectation is a linear operator, it can be applied term by term:
E[c₁X + c₂Y] = c₁EX + c₂EY
In general, for random variables X₁, ..., Xₙ:
E[c₁X₁ + ... + cₙXₙ] = c₁EX₁ + ... + cₙEXₙ

1A.47
The variance of a weighted sum of random variables is the sum of the variances, each times the square of the weight, plus twice the covariances of all the random variables times the products of their weights.
Weighted sum of random variables:
var(c₁X + c₂Y) = c₁² var(X) + c₂² var(Y) + 2c₁c₂ cov(X,Y)
Weighted difference of random variables:
var(c₁X − c₂Y) = c₁² var(X) + c₂² var(Y) − 2c₁c₂ cov(X,Y)
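A simulation sketch of the weighted-sum rule; the weights, the distributions, and the dependence between X and Y below are arbitrary choices made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n, c1, c2 = 500_000, 2.0, -3.0

X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)   # correlated with X by construction

lhs = np.var(c1 * X + c2 * Y)
rhs = (c1**2 * np.var(X) + c2**2 * np.var(Y)
       + 2 * c1 * c2 * np.cov(X, Y)[0, 1])
print(lhs, rhs)                    # nearly equal, up to sampling noise
```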

1A.48 The Normal Distribution
Y ~ N(μ, σ²)
f(y) = 1/√(2πσ²) · exp( −(y − μ)² / (2σ²) )

1A.49 The Standardized Normal
Z = (y − μ)/σ
Z ~ N(0, 1)
f(z) = 1/√(2π) · exp( −z²/2 )

1A.50
For Y ~ N(μ, σ²):
P[Y > a] = P[ (Y − μ)/σ > (a − μ)/σ ] = P[ Z > (a − μ)/σ ]

1A.51
For Y ~ N(μ, σ²):
P[a < Y < b] = P[ (a − μ)/σ < (Y − μ)/σ < (b − μ)/σ ] = P[ (a − μ)/σ < Z < (b − μ)/σ ]
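A sketch of both tail and interval probabilities via standardization; μ, σ, a, and b below are made-up numbers for illustration:

```python
from scipy.stats import norm

mu, sigma = 10.0, 2.0    # illustrative values only
a, b = 9.0, 13.0

z_a, z_b = (a - mu) / sigma, (b - mu) / sigma
print(1 - norm.cdf(z_a))                # P[Y > a] = P[Z > (a - mu)/sigma]
print(norm.cdf(z_b) - norm.cdf(z_a))    # P[a < Y < b] via Z
print(norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma))   # same, unstandardized
```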

1A.52
If Y₁ ~ N(μ₁, σ₁²), Y₂ ~ N(μ₂, σ₂²), ..., Yₙ ~ N(μₙ, σₙ²), then linear combinations of jointly normally distributed random variables are themselves normally distributed:
W = c₁Y₁ + c₂Y₂ + ... + cₙYₙ ~ N[ E(W), var(W) ]

1A.53 Chi-Square
If Z₁, Z₂, ..., Zₘ denote m independent N(0,1) random variables, and V = Z₁² + Z₂² + ... + Zₘ², then V ~ χ²(m): V is chi-square with m degrees of freedom.
mean: E[V] = E[χ²(m)] = m
variance: var[V] = var[χ²(m)] = 2m

1A.54 Student-t
If Z ~ N(0,1) and V ~ χ²(m), and if Z and V are independent, then
t = Z / √(V/m) ~ t(m)
t is Student-t with m degrees of freedom.
mean: E[t] = E[t(m)] = 0 (symmetric about zero)
variance: var[t] = var[t(m)] = m/(m − 2)

1A.55 F Statistic
If V₁ ~ χ²(m₁) and V₂ ~ χ²(m₂), and if V₁ and V₂ are independent, then
F = (V₁/m₁) / (V₂/m₂) ~ F(m₁, m₂)
F is an F statistic with m₁ numerator degrees of freedom and m₂ denominator degrees of freedom.
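A simulation sketch building chi-square, t, and F variables from N(0,1) draws and checking the stated moments; the degrees of freedom and sample size are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, m1, m2 = 200_000, 5, 4, 8

Z = rng.standard_normal((n, m))
V = (Z**2).sum(axis=1)          # sum of m squared N(0,1)s: chi-square(m)
print(V.mean(), V.var())        # about m = 5 and 2m = 10

t = rng.standard_normal(n) / np.sqrt(rng.chisquare(m, n) / m)
print(t.mean(), t.var())        # about 0 and m/(m - 2) = 5/3

F = (rng.chisquare(m1, n) / m1) / (rng.chisquare(m2, n) / m2)
print(F.mean())                 # about m2/(m2 - 2) = 4/3 (a known F fact)
```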