# Stat 155, Section 2, Last Time Big Rules of Probability: –Not Rule ( 1 – P{opposite}) –Or Rule (glasses – football) –And rule (multiply conditional prob’s)


Stat 155, Section 2, Last Time Big Rules of Probability: –Not Rule ( 1 – P{opposite}) –Or Rule (glasses – football) –And rule (multiply conditional prob’s) –Use in combination for real power Bayes Rule –Turn around conditional probabilities –Write hard ones in terms of easy ones –Recall surprising disease testing result

Reading In Textbook Approximate Reading for Today’s Material: Pages 266-271, 311-323, 277-286 Approximate Reading for Next Class: Pages 291-305, 334-351

Midterm I Coming up: Tuesday, Feb. 27 Material: HW Assignments 1 – 6 Extra Office Hours: Mon. Feb. 26, 8:30 – 12:00, 2:00 – 3:30 (Instead of Review Session) Bring Along: 1 8.5” x 11” sheet of paper with formulas

Recall Pepsi Challenge In class taste test: Removed bias with randomization Double blind approach Asked which was: –Better –Sweeter –which

Recall Pepsi Challenge Results summarized in spreadsheet Eyeball impressions: a. Perhaps no consensus preference between Pepsi and Coke? –Is 54% "significantly different" from 50%? (will develop methods to understand this) –Result of "marketing research"???

Recall Pepsi Challenge b. Perhaps no consensus as to which is sweeter? Very different from the past, when Pepsi was noticeably sweeter This may have driven old Pepsi challenge phenomenon Coke figured this out, and matched Pepsi in sweetness

Recall Pepsi Challenge c. Most people believe they know –Serious cola drinkers, because now flavor driven –In past, was sweetness driven, and there were many advertising caused misperceptions! d. People tend to get it right or not??? (less clear) –Overall 71% right. Seems like it, but again is that significantly different from 50%?

Recall Pepsi Challenge e. Those who think they know tend to be right??? –People who thought they knew: right 71% of the time f. Those who don't think they know seem to be right as well. Wonder why? –People who didn't: also right 70% of time? Why? "Natural sampling variation"??? –Any difference between people who thought they knew, and those who did not think so?

Recall Pepsi Challenge g. Coin toss was fair (or is 57% heads significantly different from 50%?) How accurate are those ideas? Will build tools to assess this Called “hypo tests” and “P-values” Revisit this example later

Independence (Need one more major concept at this level) An event A does not depend on B, when: Knowledge of B does not change chances of A: P{A | B} = P{A}

Independence E.g. I Toss a Coin, and somebody on South Pole does too. P{H(me) | T(SP)} = P{H(me)} = ½. (no way that can matter, i.e. independent)

Independence E.g. I Toss a Coin twice: (toss number indicated with subscript) P{H₂ | H₁} = ? Is it < ½? What if have 5 Heads in a row? (isn’t it more likely to get a Tail?) (Wanna bet?!?)

Independence E.g. I Toss a Coin twice, … Rational approach: Look at Sample Space {H₁H₂, H₁T₂, T₁H₂, T₁T₂} Model all as equally likely Then: P{H₂ | H₁} = P{H₁ & H₂} / P{H₁} = (¼) / (½) = ½ = P{H₂} So independence is good model for coin tosses
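The sample-space argument for two tosses can be checked by brute-force enumeration. This is a minimal Python sketch, not part of the original course materials (which use Excel); the helper names `prob`, `first_is_head`, `second_is_head` and `both_heads` are made up for illustration, and the four outcomes are assumed equally likely as on the slide.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT, modeled as equally likely.
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes) under equal likelihood."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

def first_is_head(outcome):
    return outcome[0] == "H"

def second_is_head(outcome):
    return outcome[1] == "H"

def both_heads(outcome):
    return first_is_head(outcome) and second_is_head(outcome)

# Conditional probability: P{H2 | H1} = P{H1 & H2} / P{H1}
p_h2_given_h1 = prob(both_heads) / prob(first_is_head)
p_h2 = prob(second_is_head)

print(p_h2_given_h1, p_h2)  # prints: 1/2 1/2
```

Knowing the first toss was a Head leaves the chance of a second Head at ½, which is what independence means here.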

New Ball & Urn Example H → R R R R G G, T → R R G Again toss coin, and draw ball: P{R | H} = 4/6 = 2/3 and P{R | T} = 2/3, so P{R} = 2/3 Same, so R & H are independent events Not true above, but works here, since proportions of R & G are same
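A short check of the urn example, as a sketch under stated assumptions: the urn contents are read off the slide (Heads urn has 4 red and 2 green, Tails urn has 2 red and 1 green), the coin is assumed fair, and the names `urns`, `p_coin` and `p_red_given` are hypothetical.

```python
from fractions import Fraction

# Urn contents from the slide: Heads -> R R R R G G, Tails -> R R G.
urns = {"H": "RRRRGG", "T": "RRG"}
p_coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}  # fair coin (assumed)

def p_red_given(side):
    """P{R | side}: fraction of red balls in the urn chosen by the coin."""
    balls = urns[side]
    return Fraction(balls.count("R"), len(balls))

# Overall P{R}: "and" rule within each branch, then add the exclusive branches.
p_red = sum(p_red_given(side) * p_coin[side] for side in urns)

print(p_red_given("H"), p_red_given("T"), p_red)  # prints: 2/3 2/3 2/3
```

Because both urns hold red and green in the same proportion, the coin result carries no information about the ball color.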

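Independence Note, when A is independent of B: P{A} = P{A | B} = P{A & B} / P{B} so P{A & B} = P{A} P{B} And thus P{B | A} = P{A & B} / P{A} = P{A} P{B} / P{A} = P{B} i.e. B is independent of A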

Independence Note, when A is independent of B: It follows that: B is independent of A I.e. “independence” is symmetric in A and B (as expected) More formal treatments use symmetric version as definition (to avoid hassles with 0 probabilities)

Independence HW: 4.31

Special Case of “And” Rule For A and B independent: P{A & B} = P{A | B} P{B} = P{B | A} P{A} = P{A} P{B} i.e. When independent, just multiply probabilities… Textbook: Call this another rule Me: Only learn one, this is a special case

Independent “And” Rule E.g. Toss a coin until the 1st Head appears, find P{3 tosses}: Model: tosses are independent (saw this was reasonable last time, using equally likely sample space ideas) P{3 tosses} = P{T₁ & T₂ & H₃} When have 3: group with parentheses: P{(T₁ & T₂) & H₃}

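Independent “And” Rule E.g. Toss a coin until the 1st Head appears, find P{3 tosses} = P{(T₁ & T₂) & H₃} = P{H₃ | T₁ & T₂} P{T₁ & T₂} = (by indep:) P{H₃} P{T₂ | T₁} P{T₁} = P{H₃} P{T₂} P{T₁} = ½ · ½ · ½ = 1/8 I.e. “just multiply”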

Independent “And” Rule E.g. Toss a coin until the 1st Head appears, P{3 tosses} = P{T₁} P{T₂} P{H₃} = 1/8 Multiplication idea holds in general So from now on will just say: “Since Independent, multiply probabilities” Similarly for Exclusive Or rule, Will just “add probabilities”
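The “since independent, multiply” idea extends to any number of tosses. The sketch below is illustrative only; the function name `p_first_head_on` and its `p_head` parameter are hypothetical, and a fair coin is assumed by default.

```python
from fractions import Fraction

def p_first_head_on(n, p_head=Fraction(1, 2)):
    """P{first Head appears on toss n}: (n - 1) Tails then a Head, multiplied by independence."""
    return (1 - p_head) ** (n - 1) * p_head

print(p_first_head_on(3))  # prints: 1/8
```

For n = 3 this is P{T} · P{T} · P{H} = ½ · ½ · ½, matching the slide's calculation.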

Independent “And” Rule HW: 4.29 (hint: Calculate P{G₁ & G₂ & G₃ & G₄ & G₅ & G₆ & G₇}) 4.33

Overview of Special Cases Careful: these can be tricky to keep separate OR works like adding, for mutually exclusive AND works like multiplying, for independent

Overview of Special Cases Caution: special cases are different Mutually exclusive ≠ independent For A and B mutually exclusive (with P{A} > 0): P{A | B} = 0 ≠ P{A} Thus not independent

Overview of Special Cases HW: C15 Suppose events A, B, C all have probability 0.4, A & B are independent, and A & C are mutually exclusive. (a) Find P{A or B} (0.64) (b) Find P{A or C} (0.8) (c) Find P{A and B} (0.16) (d) Find P{A and C} (0)
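The C15 answers can be reproduced by keeping the two special cases separate: multiply for the independent pair, and use zero overlap for the mutually exclusive pair. This is a sketch with made-up variable names, using the general “or” rule P{A or B} = P{A} + P{B} − P{A & B}.

```python
from fractions import Fraction

p_a = p_b = p_c = Fraction(2, 5)  # all three events have probability 0.4

# A and B independent: multiply.
p_a_and_b = p_a * p_b
# General "or" rule: P{A or B} = P{A} + P{B} - P{A & B}.
p_a_or_b = p_a + p_b - p_a_and_b

# A and C mutually exclusive: they cannot happen together, so the overlap is 0.
p_a_and_c = Fraction(0)
p_a_or_c = p_a + p_c - p_a_and_c

for name, value in [("A or B", p_a_or_b), ("A or C", p_a_or_c),
                    ("A and B", p_a_and_b), ("A and C", p_a_and_c)]:
    print(f"P{{{name}}} = {float(value):.2f}")
```

Note that P{A and C} = 0 while P{A} P{C} = 0.16, which is the concrete sense in which mutually exclusive events fail to be independent.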

Random Variables Text, Section 4.3 (we are currently jumping) Idea: take probability to next level Needed for probability structure of political polls, etc.

Random Variables Definition: A random variable, usually denoted as X, is a quantity that “takes on values at random”

Random Variables Two main types (that require different mathematical models) Discrete, i.e. counting (so look only at “counting numbers”, 1,2,3,…) Continuous, i.e. measuring (harder math, since need all fractions, etc.)

Random Variables E.g: X = # for Candidate A in a randomly selected political poll: discrete (recall all that means) Power of the random variable idea: Gives something to “get a hold of…” Similar in spirit to high school algebra…

High School Algebra Recall Main Idea? Rules for solving equations??? No, major breakthrough is: Give unknown(s) a name Find equation(s) with unknown Solve equation(s) to find unknown(s)

Random Variables E.g: X = # that comes up, in die rolling: Discrete But not very interesting Since can study by simple methods As done above Don’t really need random variable concept

Random Variables E.g: Measurement error: Let X = measurement: Continuous How to model probabilities???

Random Variables HW on discrete vs. continuous: 4.40 ((b) discrete, (c) continuous, (d) could be either, but discrete is more common)

And now for something completely different My idea about “visualization” last time: 30% really liked it 70% less enthusiastic… Depends on mode of thinking –“Visual thinkers” loved it –But didn’t connect with others So hadn’t planned to continue that…

And now for something completely different But here was another viewpoint: Professor Marron, Could you focus on something more intelligent in your "And now for something completely different" section once every two weeks, perhaps, instead of completely abolishing it? I really enjoyed your discussion of how to view three dimensions in 2-D today.

And now for something completely different A fun example: Faces as data Each data point is a digital image Data from Universidad Carlos III de Madrid (hard to do here for confidentiality reasons) Q: What distinguishes men from women?


Context: statistical problem of “classification”, i.e. “discrimination” Basically “automatic disease diagnosis”: Have measurements on sick & healthy cases Given new person, make measurements Closest to sick or healthy populations?

And now for something completely different Approach: Distance Weighted Discrimination (Marron & Todd) Idea: find “best separating direction” in high dimensional data space Here: Data are images Classes: Males & Females Given new image: classify male vs. female

And now for something completely different Fun visualization: March through point clouds Along separating direction Captures “Femaleness” & “Maleness” Note relation to “training data”


Random Variables A die rolling example (where random variable concept is useful) Win \$9 if 5 or 6, Pay \$4, if 1, 2 or 3, otherwise (4) break even Notes: Don’t care about number that comes up Random Variable abstraction allows focusing on important points Are you keen to play? (will calculate…)

Random Variables Die rolling example Win \$9 if 5 or 6, Pay \$4, if 1, 2 or 3 Let X = “net winnings” Note: X takes on values 9, -4 and 0 Probability Structure of X is summarized by: P{X = 9} = 1/3 P{X = -4} = 1/2 P{X = 0} = 1/6 (should you want to play?, study later)

Random Variables Die rolling example, for X = “net winnings”: Win \$9 if 5 or 6, Pay \$4, if 1, 2 or 3 Probability Structure of X is summarized by: P{X = 9} = 1/3 P{X = -4} = 1/2 P{X = 0} = 1/6 Convenient form: a table

| Winning | 9 | -4 | 0 |
| --- | --- | --- | --- |
| Prob. | 1/3 | 1/2 | 1/6 |

Summary of Prob. Structure In general: for discrete X, summarize “distribution” (i.e. full prob. structure) by a table:

| Values | x₁ | x₂ | … | xₖ |
| --- | --- | --- | --- | --- |
| Prob. | p₁ | p₂ | … | pₖ |

Where: i. All pᵢ are between 0 and 1 ii. p₁ + p₂ + … + pₖ = 1 (so get a prob. funct’n as above)

Summary of Prob. Structure Summarize distribution, for discrete X, by a table:

| Values | x₁ | x₂ | … | xₖ |
| --- | --- | --- | --- | --- |
| Prob. | p₁ | p₂ | … | pₖ |

Power of this idea: Get probs by summing table values Special case of disjoint OR rule

Summary of Prob. Structure E.g. Die Rolling game above:

| Winning | 9 | -4 | 0 |
| --- | --- | --- | --- |
| Prob. | 1/3 | 1/2 | 1/6 |

P{X = 9} = 1/3 P{X < 2} = P{X = 0} + P{X = -4} = 1/6 + 1/2 = 2/3 P{X = 5} = 0 (not in table!)
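The “sum the table entries” idea can be sketched in a few lines of Python. This is an illustration, not the course's own method: it takes the payoff rule as first stated for the game (win 9 on a 5 or 6, pay 4 on a 1, 2 or 3, break even on a 4), and the names `net_winnings`, `distribution` and `prob` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def net_winnings(face):
    """Net winnings X for one roll: win 9 on 5 or 6, pay 4 on 1, 2 or 3, else 0."""
    if face in (5, 6):
        return 9
    if face in (1, 2, 3):
        return -4
    return 0  # face 4: break even

# Build the distribution table by adding 1/6 for each equally likely face.
distribution = defaultdict(Fraction)
for face in range(1, 7):
    distribution[net_winnings(face)] += Fraction(1, 6)

def prob(event):
    """P{event(X)}: add up the table entries whose value satisfies the event."""
    return sum(p for x, p in distribution.items() if event(x))

print(dict(distribution))
print(prob(lambda x: x < 2))   # prints: 2/3
print(prob(lambda x: x == 5))  # prints: 0
```

A value that is not in the table, such as 5, simply contributes nothing to the sum, so its probability is 0.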

Summary of Prob. Structure E.g. Die Rolling game above:

| Winning | 9 | -4 | 0 |
| --- | --- | --- | --- |
| Prob. | 1/3 | 1/2 | 1/6 |

Summary of Prob. Structure HW: 4.41 & (c) Find P{X = 3 | X >= 2} (3/7) 4.52 (0.144, …, 0.352)

Probability Histogram Idea: Visualize probability distribution using a bar graph E.g. Die Rolling game above:

| Winning | 9 | -4 | 0 |
| --- | --- | --- | --- |
| Prob. | 1/3 | 1/2 | 1/6 |

Probability Histogram Construction in Excel: Very similar to bar graphs (done before) Bar heights = probabilities Example: Class Example 18

Probability Histogram HW: 4.43

Random Variables Now consider continuous random variables Recall: for measurements (not counting) Model for continuous random variables: Calculate probabilities as areas, under “probability density curve”, f(x)

Continuous Random Variables Model probabilities for continuous random variables, as areas under “probability density curve”, f(x): P{a ≤ X ≤ b} = Area under f(x) between a and b = $\int_a^b f(x)\,dx$ (calculus notation)

Continuous Random Variables Note: Same idea as “idealized distributions” above Recall discussion from: Page 8, of Class Notes, Jan. 23

Continuous Random Variables e.g. Uniform Distribution Idea: choose random number from [0,1] Use constant density: f(x) = C Models “equally likely” To choose C, want: 1 = P{X in [0,1]} = Area under f(x) on [0,1] = C So want C = 1.
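With C = 1 the density is a rectangle of height 1 on [0,1], so every probability is just the length of the interval that falls inside [0,1]. A minimal sketch (the function name `uniform_prob` is made up for illustration):

```python
def uniform_prob(a, b):
    """P{a <= X <= b} for X uniform on [0, 1]: area of a rectangle of height 1."""
    lo = max(a, 0.0)   # clip the interval to [0, 1], where the density is nonzero
    hi = min(b, 1.0)
    return max(hi - lo, 0.0)

print(uniform_prob(0.0, 1.0))    # prints: 1.0
print(uniform_prob(0.25, 0.75))  # prints: 0.5
print(uniform_prob(0.3, 0.3))    # prints: 0.0
```

The last line shows a feature of continuous models: any single point has zero width, hence zero area and zero probability.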

Uniform Random Variable HW: 4.54 (0.73, 0, 0.73, 0.2, 0.5) 4.56 (1, ½, 1/8)

Continuous Random Variables e.g. Normal Distribution Idea: Draw at random from a normal population f(x) is the normal curve (studied above) Review some earlier concepts:

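Normal Curve Mathematics The “normal density curve” is: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ where f(x) is the usual “function” of x, π is the circle constant = 3.14…, and e is the natural number = 2.7…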

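Normal Curve Mathematics Main Ideas: Basic shape is: $e^{-x^2/2}$ “Shifted to mu”: $e^{-(x-\mu)^2/2}$ “Scaled by sigma”: $e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ Make Total Area = 1: divide by $\sigma\sqrt{2\pi}$ $f(x) \to 0$ as $x \to \pm\infty$, but never $= 0$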
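The role of the normalizing constant can be checked numerically. The sketch below uses the standard normal density formula, with mean mu and standard deviation sigma, and a plain trapezoid rule; the names `normal_density` and `area`, the step count, and the ±10 sigma integration range are all illustrative choices rather than anything from the course.

```python
import math

def normal_density(x, mu=0.0, sigma=1.0):
    """Normal density: shifted by mu, scaled by sigma, divided by sigma*sqrt(2*pi)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area(f, a, b, steps=10_000):
    """Trapezoid-rule approximation to the area under f between a and b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

mu, sigma = 1.0, 0.5
# The curve never reaches 0, but beyond 10 sigma the leftover area is negligible.
total_area = area(lambda x: normal_density(x, mu, sigma),
                  mu - 10 * sigma, mu + 10 * sigma)
print(round(total_area, 6))
```

Trying other values of `mu` and `sigma` should leave the total area at 1, which is exactly what dividing by sigma times the square root of 2π is for.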

Computation of Normal Areas EXCEL Computation: works in terms of “lower areas” E.g. for Area < 1.3

Computation of Normal Probs EXCEL Computation: probs given by “lower areas” E.g. for X ~ N(1,0.5) P{X < 1.3} = 0.73

Normal Random Variables As above, compute probabilities as areas, In EXCEL, use NORMDIST & NORMINV E.g. above: X ~ N(1,0.5) P{X < 1.3} =NORMDIST(1.3,1,0.5,TRUE) = 0.73 (as in pic above)
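For readers without Excel, the same lower-area calculation can be sketched with Python's standard library: `statistics.NormalDist` provides `cdf` (a lower area, like NORMDIST with TRUE) and `inv_cdf` (the reverse lookup, like NORMINV). The variable names are illustrative, and 0.5 is treated as the standard deviation, as in the NORMDIST call above.

```python
from statistics import NormalDist

# X ~ N(1, 0.5): mean 1, standard deviation 0.5.
x_dist = NormalDist(mu=1, sigma=0.5)

# Lower area P{X < 1.3}.
p_below = x_dist.cdf(1.3)
print(round(p_below, 2))  # prints: 0.73

# Going backwards: which cutoff has this lower area?
cutoff = x_dist.inv_cdf(p_below)
print(round(cutoff, 2))   # prints: 1.3
```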

Normal Random Variables HW: 4.57, 4.58 (0.965, ~0)
