Variables and Random Variables → A variable is a quantity (such as height, income, the inflation rate, GDP, etc.) that takes on different values across.


2 Variables and Random Variables
→ A variable is a quantity (such as height, income, the inflation rate, GDP, etc.) that takes on different values across individuals, families, nations, months, quarters, etc. A constant, on the other hand, does not vary (e.g., the number of heads on a person).
→ A random variable is a type of variable whose value is determined at least in part by chance.

3 Measures of Central Tendency
→ The mode, median, and mean are measures of the central tendency of a random variable such as the height of males. If the statement is made with respect to this variable that “the mode is 5'10",” it means that the most common height (the height that occurs with the greatest frequency) among males is 5'10".
→ The median is the value of the random variable such that half the observations are above it and half below it. To say that “median family income in the U.S. is $38,450” is to say that half of U.S. families have an income below that figure and half above it.
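As a rough illustration of the mode and the median, the sketch below uses Python's standard `statistics` module on a small, made-up list of heights in inches. The data are hypothetical and are not taken from the presentation.

```python
import statistics

# Hypothetical heights of adult males, in inches (illustrative data only).
heights = [66, 67, 68, 69, 69, 70, 70, 70, 74]

# Mode: the value that occurs with the greatest frequency.
mode_height = statistics.mode(heights)

# Median: half the observations lie below this value and half above it.
median_height = statistics.median(heights)

print("mode:", mode_height, "median:", median_height)
```

In this made-up list the mode and the median differ, which is a reminder that the two measures answer different questions about the same data.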

4 → The population mean (symbolized by the Greek letter µ) is the average value of the variable for the population. Let m denote the number of observations (corresponding to the size of the population). Thus, we have:
$$\mu = \frac{1}{m} \sum_{i=1}^{m} X_i$$
→ Suppose we want to know the average height of adult males in the U.S. The practical approach would be to measure a representative sample of the population (meaning, for example, that basketball players would not be disproportionately represented in the sample) rather than the entire population. That is, we estimate the population mean by calculating a sample mean ($\bar{X}$). Let n be the number of observations in our sample. Thus we have:
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
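The relationship between the population mean and the sample mean can be sketched with a simulated population. Everything here is an assumption made for illustration: the population size of 10,000, the simulated heights (centered on 70 inches with a spread of 3), and the sample size of 200.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of m adult male heights in inches (simulated).
population = [random.gauss(70, 3) for _ in range(10_000)]
m = len(population)
mu = sum(population) / m  # population mean: average over all m observations

# A simple random sample of n observations stands in for a
# "representative sample" of the population.
sample = random.sample(population, 200)
n = len(sample)
x_bar = sum(sample) / n  # sample mean: our estimate of mu

print(f"population mean = {mu:.2f}, sample mean = {x_bar:.2f}")
```

The sample mean will generally not equal the population mean exactly, but with a representative sample it should land close to it.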

5 Measures of Dispersion
→ Often we are interested in the degree of dispersion of a random variable about its mean value. That is, are our observations of adult male height all bunched up around the mean, or are they widely dispersed about the mean? The population variance (σ²) is a measure of the dispersion of a random variable. The variance of random variable X is defined as:
$$\sigma^2 = E\left[(X - \mu)^2\right] = \frac{1}{m} \sum_{i=1}^{m} (X_i - \mu)^2$$

6 → If we observe only a representative sample of the population, then: (1) µ is unknown; and (2) not all the X_i's are known. Thus, we estimate σ² by substituting $\bar{X}$ for µ and summing across our sample observations of X. This is called the sample variance (s²):
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
→ Note that we must divide through by n − 1 to obtain an unbiased estimate of σ²; that is, s² is an unbiased estimator of σ² if E(s²) = σ².
→ The population standard deviation (σ) is given by the square root of the population variance (σ²). You can think of σ, loosely, as the “average deviation from the mean.” In the case of male adult height, one would like to see that measure expressed in inches; hence we take the square root of the variance.
→ Similarly, the sample standard deviation (s) is given by the square root of the sample variance (s²).
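The sample variance and sample standard deviation can be computed directly from their definitions. The sketch below uses a hypothetical list of heights and compares the hand-rolled result with Python's standard `statistics` module, which also divides by n − 1 in `variance` and `stdev`.

```python
import math
import statistics

# Hypothetical sample of adult male heights in inches (illustrative only).
sample = [66, 67, 68, 69, 69, 70, 70, 70, 74]
n = len(sample)
x_bar = sum(sample) / n

# Sum of squared deviations from the sample mean.
squared_deviations = sum((x - x_bar) ** 2 for x in sample)

# Sample variance: divide by n - 1 rather than n.
s2 = squared_deviations / (n - 1)

# Sample standard deviation: square root of s2, back in inches.
s = math.sqrt(s2)

# Dividing by n instead always gives a smaller number for the same data.
divided_by_n = squared_deviations / n

print(f"s2 = {s2:.3f}, s = {s:.3f}, divided by n = {divided_by_n:.3f}")
print("library variance:", statistics.variance(sample))
```

Taking the square root is what returns the measure to the original unit (inches here), since the variance itself is in squared inches.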

7 Probability Distributions
→ The probability density function of variable X is constructed such that, for any interval (a, b), the probability that X takes on a value in that interval is the total area under the curve between a and b. Expressed in terms of integral calculus, we have:
$$\Pr(a \le X \le b) = \int_a^b P(X)\,dX$$

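8 [Figure: probability density curve P(X) plotted against X, with points a and b marked on the horizontal axis. The area under the curve between a and b represents probability.] You should be familiar with this diagram.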

9 → The normal distribution is a probability density function that is symmetric about the mean; i.e., the left-hand side of the distribution is a mirror image of the right-hand side. The formula for the normal probability density function is given by:
$$P(X) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(X - \mu)^2}{2\sigma^2}\right]$$

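10 The normal distribution [Figure: normal density curve with the horizontal axis marked at µ − 2σ, µ − σ, µ, µ + σ, and µ + 2σ. 68.27% of the area lies within one standard deviation of the mean and 95.45% within two.]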
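The "area under the curve" interpretation of probability can be checked numerically for the normal distribution. The sketch below codes the standard normal density formula and approximates the integral with a simple midpoint rule. The function names and the values µ = 70 and σ = 3 are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_curve(a, b, mu, sigma, steps=10_000):
    """Approximate Pr(a <= X <= b) as the area under the density (midpoint rule)."""
    width = (b - a) / steps
    return width * sum(
        normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps)
    )

mu, sigma = 70.0, 3.0  # hypothetical mean and standard deviation, in inches

within_one = area_under_curve(mu - sigma, mu + sigma, mu, sigma)
within_two = area_under_curve(mu - 2 * sigma, mu + 2 * sigma, mu, sigma)

print(f"within one sd: {within_one:.4f}, within two sd: {within_two:.4f}")
```

The two areas should come out close to the 68.27% and 95.45% figures, whatever values of µ and σ are chosen.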

11 → A random variable Z is said to be standard normal if it is normally distributed with a mean of zero and a variance of 1. If X is normally distributed with mean µ and variance σ², we abbreviate with the expression: X ~ N(µ, σ²)
→ Thus, the expression used to indicate that the distribution of Z is standard normal is: Z ~ N(0, 1)

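12 The standard normal distribution [Figure: standard normal density P(Z) plotted against Z, with 0 and a marked on the horizontal axis. The shaded area to the right of a is Pr(Z > a) when Z ~ N(0, 1).]
→ For example:
→ If a = 1.23, then Pr(Z ≥ a) = 0.1093
→ And Pr(Z ≤ a) = 1 − 0.1093 = 0.8907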
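Tail probabilities like these are usually read from a standard normal table, but they can also be computed from the error function in Python's `math` module. The helper name `standard_normal_cdf` is an assumption for illustration. The cutoff a = 1.23 is used because it is the table entry that yields the quoted figures 0.1093 and 0.8907.

```python
import math

def standard_normal_cdf(z):
    """Pr(Z <= z) for Z ~ N(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

a = 1.23  # cutoff matching the table values 0.1093 and 0.8907

lower_tail = standard_normal_cdf(a)  # Pr(Z <= a)
upper_tail = 1.0 - lower_tail        # Pr(Z >= a)

print(f"Pr(Z >= a) = {upper_tail:.4f}, Pr(Z <= a) = {lower_tail:.4f}")
```

Because the total area under the density is 1, the two tails always sum to 1, which is the step used to go from 0.1093 to 0.8907.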

13 Correlation of Random Variables
→ To say that random variables X and Y are correlated is to say that changes in X are associated with changes in Y in the probabilistic or statistical sense. However, this does not necessarily mean that a change in X was the cause of a change in Y, or vice versa. That is, “correlation does not imply causality.”
→ Technically speaking, the statement “X and Y are positively correlated” means that the covariance between random variables X and Y is positive (or greater than zero).

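14 [Figure: scatter diagram of Y against X, divided into four numbered quadrants by a vertical line at E(X) and a horizontal line at E(Y). Quadrants 2 and 1 lie above E(Y); quadrants 3 and 4 lie below it.] X and Y are positively correlated random variables.
Quadrant 1: X > E(X) and Y > E(Y)
Quadrant 2: X < E(X) and Y > E(Y)
Quadrant 3: X < E(X) and Y < E(Y)
Quadrant 4: X > E(X) and Y < E(Y)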

15 → The sample covariance between X and Y (i.e., our estimate of the covariance when we do not observe the entire populations of X's or Y's) is given by the following formula (the “hat” indicates an estimate):
$$\widehat{\mathrm{Cov}}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$
→ The covariance is positive if above-average values of X tend to be paired with above-average values of Y, and vice versa. The covariance is negative (and hence the variables are negatively correlated) if below-average values of X tend to be paired with above-average values of Y, and vice versa. The magnitude of the covariance depends partly on the unit of measurement. Hence, we cannot depend on the size of the covariance to give an accurate measure of the strength of the relationship.
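The dependence of the covariance on the unit of measurement can be seen in a small sketch. The paired data are hypothetical, the function name `sample_covariance` is an assumption, and so is the n − 1 divisor, chosen to match the usual sample variance.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired observations, using an n - 1 divisor (assumed)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / (n - 1)

# Hypothetical paired observations (illustrative data only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = sample_covariance(x, y)

# Re-expressing X in a unit 100 times smaller (say, metres to centimetres)
# multiplies the covariance by 100, although the relationship is unchanged.
cov_rescaled = sample_covariance([100.0 * xi for xi in x], y)

print(cov_xy, cov_rescaled)
```

The sign of the covariance is informative (positive here), but its size changes with the units, so it cannot be read as a measure of strength.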

16 → The correlation coefficient (ρ) is a unit-free measure of correlation. The sample correlation coefficient is given by:
$$\hat{\rho} = \frac{\widehat{\mathrm{Cov}}(X, Y)}{s_X \, s_Y}$$
where s_X and s_Y are the sample standard deviations of X and Y.
→ It will always be the case that −1 ≤ ρ ≤ 1.
→ If ρ = 1, there is a perfect positive (linear) correlation between X and Y. If ρ = −1, there is a perfect negative (linear) correlation between X and Y.
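The unit-free property and the ±1 limits can be illustrated with a short sketch that divides the sample covariance by the product of the two sample standard deviations. The data and the function name `sample_correlation` are hypothetical, and the n − 1 divisor is an assumption (it cancels in the ratio).

```python
import math

def sample_correlation(xs, ys):
    """Sample correlation: sample covariance over the product of sample sds."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)

# Hypothetical paired observations (illustrative data only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = sample_correlation(x, y)

# Changing the unit of X leaves the correlation unchanged.
r_rescaled = sample_correlation([100.0 * xi for xi in x], y)

# Exact linear relationships give the limiting values +1 and -1.
r_perfect_positive = sample_correlation(x, [2.0 * xi + 1.0 for xi in x])
r_perfect_negative = sample_correlation(x, [-3.0 * xi for xi in x])

print(r, r_rescaled, r_perfect_positive, r_perfect_negative)
```

Unlike the covariance, the result is the same whichever units X and Y are measured in, which is what makes it usable as a measure of the strength of a linear relationship.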

