Chapter 4 Probability: Studying Randomness

Randomness and Probability
Random: A process whose outcome in any particular trial is not known in advance, although the distribution of outcomes over a long series of repetitions may be known.
Probability: The proportion of times a particular outcome occurs in a long series of repetitions of a random process.
Independence: Trials are independent when the outcome of one trial does not affect the probabilities of outcomes of subsequent trials.

Probability Models
Probability Model:
– Listing of possible outcomes
– Probability corresponding to each outcome
Sample Space (S): Set of all possible outcomes of a random process
Event: Outcome or set of outcomes of a random process (a subset of S)
Venn Diagram: Graphic description of a sample space and events

Rules of Probability
The probability of an event A, denoted P(A), must lie between 0 and 1: $0 \le P(A) \le 1$.
For the sample space S, P(S) = 1.
Disjoint events have no outcomes in common. For two disjoint events A and B, P(A or B) = P(A) + P(B).
The complement of an event A is the event that A does not occur, denoted $A^c$: $P(A) + P(A^c) = 1$.
When the sample space is finite, the probability of any event A is the sum of the probabilities of the individual outcomes that make up the event.
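The rules above can be checked on a small finite sample space. The sketch below assumes a single roll of a fair six-sided die; the die, the events A and B, and the helper `p` are illustrative choices, not part of the original slides.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
prob = {outcome: Fraction(1, 6) for outcome in sample_space}

def p(event):
    """P(event): sum of the probabilities of the outcomes in the event."""
    return sum((prob[o] for o in event), Fraction(0))

A = {1, 2}               # roll is 1 or 2
B = {5, 6}               # roll is 5 or 6 (disjoint from A)
A_c = sample_space - A   # complement of A

p_S = p(sample_space)             # P(S) = 1
p_A_or_B = p(A | B)               # equals P(A) + P(B) because A and B are disjoint
p_complement_sum = p(A) + p(A_c)  # P(A) + P(A^c) = 1

print(p_S, p_A_or_B, p_complement_sum)
```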

Assigning Probabilities to Events
Assign a probability to each individual outcome and add up the probabilities of all outcomes comprising the event.
When each outcome is equally likely, count the number of outcomes corresponding to the event and divide by the total number of outcomes.
Multiplication Rule: A and B are independent events if knowledge that one occurred does not affect the probability that the other has occurred. If A and B are independent, then P(A and B) = P(A)P(B).
The multiplication rule extends to any finite number of independent events.
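As an illustration of counting equally likely outcomes and of the multiplication rule, the sketch below assumes two fair dice rolled together. The dice and the events "first die is even" and "second die shows 6" are hypothetical examples, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair dice (assumed example).
outcomes = list(product(range(1, 7), repeat=2))

def p(event):
    """Count the outcomes satisfying `event` and divide by the total count."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

p_a = p(lambda o: o[0] % 2 == 0)                      # first die is even
p_b = p(lambda o: o[1] == 6)                          # second die shows 6
p_a_and_b = p(lambda o: o[0] % 2 == 0 and o[1] == 6)  # both events occur

# For independent events the joint probability is the product.
print(p_a, p_b, p_a_and_b, p_a * p_b)
```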

Example - Casualties at Gettysburg
Results from the Battle of Gettysburg [table: counts and proportions]
If killed, wounded, and captured/missing soldiers are all considered casualties, what is the probability that a randomly selected Northern soldier was a casualty? A Southern soldier?
Obtain the distribution across armies.

Random Variables
Random Variable (RV): Variable that takes on the value of a numeric outcome of a random process
Discrete RV: Can take on a finite (or countably infinite) set of possible outcomes
Probability Distribution: List of the values a random variable can take on and their corresponding probabilities
– Individual probabilities must lie between 0 and 1
– Probabilities sum to 1
Notation:
– Random variable: X
– Values X can take on: $x_1, x_2, \ldots, x_k$
– Probabilities: $P(X = x_1) = p_1, \ldots, P(X = x_k) = p_k$

Example: Wars Begun by Year (1482-1939)
Distribution of the number of wars started by year
X = number of wars started in a randomly selected year
Levels: $x_1 = 0$, $x_2 = 1$, $x_3 = 2$, $x_4 = 3$, $x_5 = 4$
Probability distribution: [table: probability distribution of X]

Masters Golf Tournament 1st Round Scores

Continuous Random Variables
The variable can take on any value along a continuous range of numbers (an interval).
The probability distribution is described by a smooth density curve.
Probabilities of ranges of values for X correspond to areas under the density curve:
– Curve must lie on or above the horizontal axis
– Total area under the curve is 1
Special case: Normal distributions

Means and Variances of Random Variables
Mean: Long-run average value a random variable will take on (also the balance point of the probability distribution)
Expected Value is another term for the mean; however, we do not really expect that a realization of X will necessarily be close to its mean.
Notation: E(X)
Mean of a discrete random variable: $\mu_X = E(X) = \sum_{i=1}^{k} x_i p_i$
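A minimal sketch of the mean of a discrete random variable as a probability-weighted sum. The values 0 to 4 mirror the levels of the wars example, but the probabilities below are hypothetical placeholders, since the actual distribution table is not reproduced in this transcript. The helper `discrete_mean` is likewise an illustrative name.

```python
def discrete_mean(values, probs):
    """Return sum of x_i * p_i after checking that probs is a valid distribution."""
    if any(p < 0 or p > 1 for p in probs):
        raise ValueError("each probability must lie between 0 and 1")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))

values = [0, 1, 2, 3, 4]
probs = [0.50, 0.30, 0.10, 0.05, 0.05]  # hypothetical, NOT the slide's data

mu = discrete_mean(values, probs)
print(round(mu, 4))
```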

Examples - Wars & Masters Golf
Wars: $\mu = 0.67$
Masters golf (round 1): $\mu = 73.54$

Statistical Estimation/Law of Large Numbers
In practice we won’t know $\mu$ but will want to estimate it.
We can select a sample of n individuals and observe the sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
By selecting a large enough sample size we can be very confident that our sample mean will be arbitrarily close to the true parameter value.
The margin of error is an upper bound (holding with a high level of confidence) on our sampling error. It decreases as the sample size increases.
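The law of large numbers can be illustrated by simulation. The sketch below assumes rolls of a fair die (true mean 3.5), which is an illustrative choice rather than one of the slide examples. The seed is fixed only so that runs are repeatable.

```python
import random

random.seed(0)   # fixed seed for repeatability (arbitrary choice)
TRUE_MEAN = 3.5  # mean of a fair six-sided die

def sample_mean(n):
    """Average of n simulated rolls of a fair die."""
    return sum(random.randint(1, 6) for _ in range(n)) / n

# Sample means for increasing sample sizes.
means = {n: sample_mean(n) for n in (10, 1_000, 100_000)}

for n, m in means.items():
    print(f"n={n:>7}  sample mean={m:.3f}  error={abs(m - TRUE_MEAN):.3f}")
```

With larger n the sample mean tends to land closer to 3.5, although any single small sample can still be far off.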

Rules for Means
Linear transformations, a + bX (where a and b are constants): $E(a + bX) = \mu_{a+bX} = a + b\mu_X$
Sums of random variables, X + Y (where X and Y are random variables): $E(X + Y) = \mu_{X+Y} = \mu_X + \mu_Y$
Linear functions of random variables: $E(a_1X_1 + \cdots + a_nX_n) = a_1\mu_1 + \cdots + a_n\mu_n$, where $E(X_i) = \mu_i$

Example: Masters Golf Tournament
Mean by round (note ordering): $\mu_1 = 73.54$, $\mu_2 = 73.07$, $\mu_3 = 73.76$, $\mu_4 = 73.91$
Mean score per hole (18 holes) for round 1: $E((1/18)X_1) = (1/18)\mu_1 = (1/18)(73.54) = 4.09$
Mean score versus par (72) for round 1: $E(X_1 - 72) = \mu_{X_1 - 72} = 73.54 - 72 = +1.54$ (1.54 over par)
Mean difference (round 1 minus round 4): $E(X_1 - X_4) = \mu_1 - \mu_4 = 73.54 - 73.91 = -0.37$
Mean total score: $E(X_1 + X_2 + X_3 + X_4) = \mu_1 + \mu_2 + \mu_3 + \mu_4 = 73.54 + 73.07 + 73.76 + 73.91 = 294.28$ (6.28 over par)
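The Masters calculations above can be reproduced with two small helpers. The round means are the ones given on the slide; the function names `mean_linear` and `mean_combo` are illustrative, not from the source.

```python
# Round means from the slide.
mu = {1: 73.54, 2: 73.07, 3: 73.76, 4: 73.91}

def mean_linear(a, b, mu_x):
    """E(a + bX) = a + b * E(X)."""
    return a + b * mu_x

def mean_combo(coeffs, means):
    """E(a_1 X_1 + ... + a_n X_n) = a_1 mu_1 + ... + a_n mu_n."""
    return sum(c * m for c, m in zip(coeffs, means))

per_hole = mean_linear(0, 1 / 18, mu[1])          # mean score per hole, round 1
vs_par = mean_linear(-72, 1, mu[1])               # mean score relative to par 72
diff_1_4 = mean_combo([1, -1], [mu[1], mu[4]])    # round 1 minus round 4
total = mean_combo([1, 1, 1, 1], [mu[r] for r in (1, 2, 3, 4)])

print(round(per_hole, 2), round(vs_par, 2), round(diff_1_4, 2), round(total, 2))
```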

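Variance of a Random Variable
Variance: Measure of the spread of the probability distribution; the average squared deviation from the mean. For a discrete random variable: $\sigma_X^2 = V(X) = \sum_{i=1}^{k} (x_i - \mu_X)^2 p_i$
Standard Deviation: (Positive) square root of the variance, $\sigma_X = \sqrt{\sigma_X^2}$
Rules for variances (X and Y random variables with correlation $\rho$; a and b constants):
– $V(a + bX) = b^2\sigma_X^2$
– $V(aX + bY) = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y$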
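A short sketch of variance as the average squared deviation from the mean, with the standard deviation as its square root. The distribution below is a hypothetical placeholder (the slide's own tables are not reproduced here), and `mean_and_variance` is an illustrative helper name.

```python
import math

def mean_and_variance(values, probs):
    """Return (mu, sigma^2) for a discrete distribution given as values and probabilities."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var

values = [0, 1, 2, 3, 4]
probs = [0.50, 0.30, 0.10, 0.05, 0.05]  # hypothetical, NOT the slide's data

mu, var = mean_and_variance(values, probs)
sd = math.sqrt(var)  # standard deviation is the positive square root

print(round(mu, 4), round(var, 4), round(sd, 4))
```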

Variance of a Random Variable
Special cases of $V(aX + bY)$:
X and Y are independent (the outcome of one does not alter the distribution of the other): $\rho = 0$, so the correlation term drops out
a = b = 1 and $\rho = 0$: $V(X + Y) = \sigma_X^2 + \sigma_Y^2$
a = 1, b = -1 and $\rho = 0$: $V(X - Y) = \sigma_X^2 + \sigma_Y^2$
a = b = 1 and $\rho \ne 0$: $V(X + Y) = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y$
a = 1, b = -1 and $\rho \ne 0$: $V(X - Y) = \sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y$

Wars & Masters (Round 1) Golf Scores
Wars: $\sigma^2 = 0.7362$, $\sigma = 0.8580$
Masters (round 1): $\sigma^2 = 9.47$, $\sigma = \sqrt{9.47} \approx 3.08$

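Masters Scores (Rounds 1 & 4)
$\mu_1 = 73.54$, $\mu_4 = 73.91$, $\sigma_1^2 = 9.48$, $\sigma_4^2 = 11.95$, $\rho = 0.24$
Variance of round 1 scores versus par: $V(X_1 - 72) = \sigma_1^2 = 9.48$ (subtracting a constant does not change the variance)
Variance of the sum and difference of round 1 and round 4 scores: $V(X_1 \pm X_4) = \sigma_1^2 + \sigma_4^2 \pm 2\rho\sigma_1\sigma_4$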
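The variance of the sum and of the difference can be worked out numerically from the values on this slide, using the general rule $V(aX + bY) = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y$. The function name `var_linear_combo` is illustrative; the numeric results are computed here and are not quoted from the original slides.

```python
import math

def var_linear_combo(a, b, var_x, var_y, rho):
    """V(aX + bY) = a^2 var_x + b^2 var_y + 2ab * rho * sd_x * sd_y."""
    correlation_term = 2 * a * b * rho * math.sqrt(var_x) * math.sqrt(var_y)
    return a * a * var_x + b * b * var_y + correlation_term

# Values from the slide: round 1 and round 4 variances and their correlation.
var_1, var_4, rho = 9.48, 11.95, 0.24

var_sum = var_linear_combo(1, 1, var_1, var_4, rho)    # V(X1 + X4)
var_diff = var_linear_combo(1, -1, var_1, var_4, rho)  # V(X1 - X4)

print(round(var_sum, 2), round(var_diff, 2))
```

Because the correlation is positive, the sum is more variable than it would be under independence (9.48 + 11.95 = 21.43) and the difference is less variable.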

General Rules of Probability
Union of a set of events: The event that any (at least one) of the events occurs
Disjoint events: Events that share no common sample points. If A, B, and C are pairwise disjoint, the probability of their union is P(A) + P(B) + P(C)
Intersection of two (or more) events: The event that both (all) events occur
Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
Conditional Probability: The probability that B occurs given that A has occurred, P(B|A)
Multiplication Rule (generalized to conditional probabilities): P(A and B) = P(A)P(B|A) = P(B)P(A|B)

Conditional Probability
We are generally interested in the case where one event precedes another in time (although this is not necessary).
When P(A) > 0 (otherwise the case is trivial): $P(B|A) = \frac{P(A \text{ and } B)}{P(A)}$
Contingency Table: Table that cross-classifies individuals or probabilities across two or more event classifications
Tree Diagram: Graphical description of the cross-classification of two or more events

John Snow London Cholera Death Study
Two water companies (let D be the event of death):
– Southwark & Vauxhall (S): 264913 customers, 3702 deaths
– Lambeth (L): 171363 customers, 407 deaths
– Overall: 436276 customers, 4109 deaths
Note that the probability of death is almost 6 times higher for S&V customers than for Lambeth customers (this was important in showing how cholera spread).
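The conditional probabilities of death can be computed directly from the counts above. This is a minimal sketch; the dictionary names are illustrative.

```python
# Counts from the slide: customers and deaths by water company.
customers = {"S&V": 264913, "Lambeth": 171363}
deaths = {"S&V": 3702, "Lambeth": 407}

# P(D | company) = deaths among that company's customers / its customers.
p_death_given = {c: deaths[c] / customers[c] for c in customers}

# Overall P(D) pools both companies.
p_death = sum(deaths.values()) / sum(customers.values())

# How many times higher the death probability is for S&V than for Lambeth.
risk_ratio = p_death_given["S&V"] / p_death_given["Lambeth"]

for company, prob in p_death_given.items():
    print(f"P(D | {company}) = {prob:.4f}")
print(f"P(D) = {p_death:.4f}, ratio = {risk_ratio:.2f}")
```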

John Snow London Cholera Death Study Contingency Table with joint probabilities (in body of table) and marginal probabilities (on edge of table)
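As a sketch of how such a table could be built, the joint and marginal probabilities below are computed from the customer and death counts given for the two water companies. The labels "D" and "Dc" (death and its complement) and the variable names are illustrative.

```python
# Counts from the cholera study: customers and deaths by water company.
customers = {"S&V": 264913, "Lambeth": 171363}
deaths = {"S&V": 3702, "Lambeth": 407}
total = sum(customers.values())

# Joint probabilities (body of the table): P(company and outcome).
joint = {}
for company in customers:
    joint[(company, "D")] = deaths[company] / total
    joint[(company, "Dc")] = (customers[company] - deaths[company]) / total

# Marginal probabilities (edges of the table): sum over the other classification.
p_company = {c: joint[(c, "D")] + joint[(c, "Dc")] for c in customers}
p_outcome = {d: sum(joint[(c, d)] for c in customers) for d in ("D", "Dc")}

for key, value in joint.items():
    print(key, round(value, 4))
print(p_company, p_outcome)
```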

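John Snow London Cholera Death Study
Tree diagram obtaining joint probabilities by the multiplication rule (water company, then death):
– S&V (.6072), then D (.0140): joint probability .0085
– S&V (.6072), then $D^c$ (.9860): joint probability .5987
– L (.3928), then D (.0024): joint probability .0009
– L (.3928), then $D^c$ (.9976): joint probability .3919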

Example: Florida Lotto
You select 6 distinct numbers from 1 to 53 (without replacement).
The state randomly draws 6 numbers from 1 to 53.
Probability you match all 6 numbers:
– First state draw: P(match 1st) = 6/53
– Given you match the 1st, you have 5 numbers left and the state has 52 left: P(match 2nd given matched 1st) = 5/52
– Process continues: P(match 3rd given 1&2) = 4/51
– P(match 4th given 1&2&3) = 3/50
– P(match 5th given 1&2&3&4) = 2/49
– P(match 6th given 1&2&3&4&5) = 1/48
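Multiplying the six conditional probabilities gives the probability of matching all six numbers. The sketch below carries out that product with exact fractions and cross-checks it against the number of ways to choose 6 numbers from 53; the cross-check is an addition for illustration, not part of the original slide.

```python
from fractions import Fraction
from math import comb

# Multiply P(match 1st) * P(match 2nd | 1st) * ... * P(match 6th | first five).
p_match_all = Fraction(1)
for k in range(6):
    p_match_all *= Fraction(6 - k, 53 - k)  # 6/53, 5/52, 4/51, 3/50, 2/49, 1/48

# Cross-check: one winning combination out of C(53, 6) equally likely draws.
n_combinations = comb(53, 6)

print(p_match_all, n_combinations)
```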

Bayes’s Rule - Updating Probabilities
Let $A_1, \ldots, A_k$ be a set of events that partition a sample space (mutually exclusive and exhaustive):
– each event has known $P(A_i) > 0$ (each event can occur)
– for any two distinct events $A_i$ and $A_j$, $P(A_i \text{ and } A_j) = 0$ (events are disjoint)
– $P(A_1) + \cdots + P(A_k) = 1$ (each outcome belongs to one of the events)
If C is an event such that
– $0 < P(C) < 1$ (C can occur, but will not necessarily occur)
– we know the probability that C will occur given each event $A_i$: $P(C|A_i)$
then we can compute the probability of $A_i$ given that C occurred:
$P(A_i|C) = \frac{P(A_i \text{ and } C)}{P(C)} = \frac{P(C|A_i)P(A_i)}{\sum_{j=1}^{k} P(C|A_j)P(A_j)}$
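A minimal sketch of Bayes's rule as a function of the prior probabilities $P(A_i)$ and the conditional probabilities $P(C|A_i)$. As an illustration it reuses the cholera counts from this presentation (water company as the partition, death as event C) to ask which company a person who died was likely served by. That application and the name `bayes_posterior` are assumptions made for this example, not something the slides work through.

```python
def bayes_posterior(priors, likelihoods):
    """Return ([P(A_i | C)], P(C)) from priors P(A_i) and likelihoods P(C | A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A_i and C)
    p_c = sum(joint)                                      # P(C), total probability
    return [j / p_c for j in joint], p_c

# Cholera counts: customers and deaths for S&V and Lambeth.
customers = [264913, 171363]
deaths = [3702, 407]
total = sum(customers)

priors = [n / total for n in customers]                  # P(S), P(L)
likelihoods = [d / n for d, n in zip(deaths, customers)]  # P(D | S), P(D | L)

posteriors, p_death = bayes_posterior(priors, likelihoods)
print([round(p, 4) for p in posteriors], round(p_death, 4))
```

The posterior for S&V equals the share of all deaths that occurred among S&V customers, which is what the formula should reduce to when the inputs come from raw counts.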

Northern Army at Gettysburg
Regiments: partition of soldiers ($A_1, \ldots, A_9$). Casualty: event C
$P(A_i)$ = (size of regiment) / (total soldiers) = (Col 3)/95369
$P(C|A_i)$ = (# casualties) / (regiment size) = (Col 4)/(Col 3)
$P(C|A_i)P(A_i) = P(A_i \text{ and } C)$ = (Col 5)*(Col 6)
$P(C)$ = sum(Col 7)
$P(A_i|C) = P(A_i \text{ and } C) / P(C)$ = (Col 7)/.2416

Independent Events
Two events A and B are independent if P(B|A) = P(B) and P(A|B) = P(A); otherwise they are dependent (not independent).
Cholera example: P(D) = .0094, P(D|S) = .0140, P(D|L) = .0024. Not independent (which firm would you prefer?).
Union Army example: P(C) = .2416, $P(C|A_1)$ = .6046, $P(C|A_5)$ = .0156. Not independent: the risk is almost 40 times higher for $A_1$ than for $A_5$.
