# 2-1. 2-2 For exams (MD1, MD2, and Final): You may bring one 8.5” by 11” sheet of paper with formulas and notes written or typed on both sides to each.

## Presentation transcript:

2-1

2-2 For exams (MD1, MD2, and Final): You may bring one 8.5” by 11” sheet of paper with formulas and notes written or typed on both sides to each exam.

2-3 Types of Data Quantitative data are measurements that are recorded on a naturally occurring numerical scale. Qualitative data are measurements that cannot be measured on a natural numerical scale; they can only be classified into one of a group of categories.

2-4 Data Presentation. Qualitative data: summary table, bar graph, pie chart, Pareto diagram. Quantitative data: dot plot, stem-and-leaf display, frequency distribution, histogram.

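2-5 Example

Data: 1.72 2.5 2.16 2.13 1.06 2.24 2.31 2.03 1.09 1.40 2.57 2.64 1.26 2.05 1.19 2.13 1.27 1.51 2.41 1.95

Stem-and-leaf display, leaf unit = 0.01:

| Stem | Leaf |
|---|---|
| 1.0 | 6 9 |
| 1.1 | 9 |
| 1.2 | 6 7 |
| 1.4 | 0 |
| 1.5 | 1 |
| 1.7 | 2 |
| 1.9 | 5 |
| 2.0 | 3 5 |
| 2.1 | 3 3 6 |
| 2.2 | 4 |
| 2.3 | 1 |
| 2.4 | 1 |
| 2.5 | 0 7 |
| 2.6 | 4 |

Stem-and-leaf display, leaf unit = 0.1:

| Stem | Leaf |
|---|---|
| 1 | 0 0 1 |
| 1 | 2 2 |
| 1 | 4 5 |
| 1 | 7 |
| 1 | 9 |
| 2 | 0 0 1 1 1 |
| 2 | 2 3 |
| 2 | 4 5 5 |
| 2 | 6 |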

2-6 Example 1.72 2.5 2.16 2.13 1.06 2.24 2.31 2.03 1.09 1.40 2.57 2.64 1.26 2.05 1.19 2.13 1.27 1.51 2.41 1.95

2-7 Example 1.72 2.5 2.16 2.13 1.06 2.24 2.31 2.03 1.09 1.40 2.57 2.64 1.26 2.05 1.19 2.13 1.27 1.51 2.41 1.95

2-8 Two Characteristics: Central Tendency (Location). The central tendency of the set of measurements is the tendency of the data to cluster, or center, about certain numerical values.

2-9 Two Characteristics: Variation (Dispersion). The variability of the set of measurements is the spread of the data around the mean. [Figure: Sample A and Sample B, with the variation of each sample marked.]

2-10 Mean 1. Most common measure of central tendency 2. Acts as ‘balance point’ 3. Affected by extreme values (‘outliers’) 4. Denoted $\bar{x}$ (sample mean), where $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{x_1 + x_2 + \dots + x_n}{n}$

2-11 Median 1. Measure of central tendency 2. Middle value in ordered sequence: if n is odd, the middle value of the sequence; if n is even, the average of the 2 middle values 3. Position of median in sequence: Positioning Point $= \dfrac{n+1}{2}$ 4. Not affected by extreme values

2-12 Median Example: Even-Sized Sample. Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7. Ordered: 4.9 6.3 7.7 8.9 10.3 11.7. Position: 1 2 3 4 5 6. Positioning Point $= \dfrac{n+1}{2} = \dfrac{6+1}{2} = 3.5$. Median $= \dfrac{7.7 + 8.9}{2} = 8.30$
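The median calculation above can be checked with a short Python sketch. It uses only the six values from the slide; the variable names are illustrative and not part of the original presentation.

```python
import statistics

# Data from the even-sized median example (n = 6).
raw = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

ordered = sorted(raw)
n = len(ordered)

# Positioning point (n + 1) / 2 falls between positions 3 and 4.
position = (n + 1) / 2

# n is even, so average the two middle values of the ordered data.
median_by_hand = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print("positioning point:", position)
print("median by hand:", median_by_hand)
print("statistics.median:", statistics.median(raw))
print("mean:", statistics.mean(raw))
```

The standard-library `statistics.median` applies the same rule (average of the two middle values when n is even), so both results should agree.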

2-13 Mode Example. No Mode, Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7. One Mode, Raw Data: 6.3 4.9 8.9 6.3 4.9 4.9. More Than 1 Mode, Raw Data: 21 28 28 41 43 43.

2-14 Shape. Describes how data are distributed. A data set is said to be skewed if one tail of the distribution has more extreme observations than the other tail. Left-Skewed: Mean < Median < Mode. Symmetric: Mean = Median = Mode. Right-Skewed: Mode < Median < Mean.

2-15 Example Mean=45 Median=68 Mode=94 Is this data-set skewed? If it is, which direction is the skewness?

2-16 Example Mean=45 Median=68 Mode=94 Skewed to the left.

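2-17 Sample Variance Formula: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ (n – 1 in denominator!)

2-18 A shortcut formula for variance: $s^2 = \dfrac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$

2-19 Sample Standard Deviation Formula: $s = \sqrt{s^2} = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$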
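As a rough check that the definitional and shortcut forms of the sample variance agree, here is a minimal Python sketch. Reusing the six values from the median example is an assumption made for illustration; the slides do not compute a variance for that data set.

```python
import math
import statistics

# Illustrative data (assumption: borrowed from the median example slide).
x = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]
n = len(x)
mean = sum(x) / n

# Definitional form: squared deviations from the mean, divided by n - 1.
var_definition = sum((xi - mean) ** 2 for xi in x) / (n - 1)

# Shortcut form: (sum of squares - (sum)^2 / n), divided by n - 1.
var_shortcut = (sum(xi ** 2 for xi in x) - sum(x) ** 2 / n) / (n - 1)

# Standard deviation: square root of the variance (back in original units).
s = math.sqrt(var_definition)

print(var_definition, var_shortcut, statistics.variance(x))
print(s, statistics.stdev(x))
```

Both forms should match `statistics.variance` up to floating-point rounding.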

2-20 Thinking Challenge 1. Why do we need to take the square root of the variance to have a meaningful measure? Otherwise the measure would be in squared units rather than the units of the data.

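2-21 Interpreting Standard Deviation: Chebyshev’s Theorem. Within 1 standard deviation of the mean ($\bar{x} \pm s$): no useful information. Within 2 standard deviations ($\bar{x} \pm 2s$): at least 3/4 of the data. Within 3 standard deviations ($\bar{x} \pm 3s$): at least 8/9 of the data. In general, at least $1 - 1/k^2$ of the data lie within k standard deviations of the mean, for any k > 1.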

2-22 Interpreting Standard Deviation: Empirical Rule (for mound-shaped, symmetric distributions). Approximately 68% of the measurements lie between μ – σ and μ + σ, approximately 95% lie between μ – 2σ and μ + 2σ, and approximately 99.7% lie between μ – 3σ and μ + 3σ.

2-23 Empirical Rule Example. With $\bar{x}$ = 15.5 and s = 3.34, according to the Empirical Rule: approximately 68% of the data will lie in the interval ($\bar{x}$ – s, $\bar{x}$ + s), (15.5 – 3.34, 15.5 + 3.34) = (12.16, 18.84). Approximately 95% of the data will lie in the interval ($\bar{x}$ – 2s, $\bar{x}$ + 2s), (15.5 – 2∙3.34, 15.5 + 2∙3.34) = (8.82, 22.18). Approximately 99.7% of the data will lie in the interval ($\bar{x}$ – 3s, $\bar{x}$ + 3s), (15.5 – 3∙3.34, 15.5 + 3∙3.34) = (5.48, 25.52).
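The three intervals in this example follow one pattern, mean ± k standard deviations, which a small Python sketch can reproduce. The helper name `interval` is hypothetical.

```python
def interval(mean, s, k):
    """Return (mean - k*s, mean + k*s), the interval k standard deviations wide."""
    return (mean - k * s, mean + k * s)

# Values taken from the slide: sample mean 15.5, standard deviation 3.34.
x_bar, s = 15.5, 3.34

for k, share in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    low, high = interval(x_bar, s, k)
    print(f"approximately {share}: ({low:.2f}, {high:.2f})")
```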

2-24 Numerical Measures of Relative Standing: Percentiles. Descriptive measures of the relationship of a measurement to the rest of the data are called measures of relative standing. A percentile describes the relative location of a measurement compared to the rest of the data: the pth percentile is a number such that p% of the data falls below it and (100 – p)% falls above it. Median = 50th percentile.

2-25 Percentile Example. You scored 560 on the GMAT exam. This score puts you in the 58th percentile. What percentage of test takers scored lower than you did? What percentage of test takers scored higher than you did?

2-26 Percentile Example. What percentage of test takers scored lower than you did? 58% of test takers scored lower than 560. What percentage of test takers scored higher than you did? (100 – 58)% = 42% of test takers scored higher than 560.

2-27 Quartiles. Percentiles that partition a data set into four categories, each containing 25 percent of the measurements, are called quartiles.

2-28 Example. Ordered data: 1.06 1.09 1.19 1.26 1.27 1.4 1.51 1.72 1.95 2.03 2.05 2.13 2.13 2.16 2.24 2.31 2.41 2.5 2.57 2.64. Position for median = 21/2 = 10.5, so Median = (2.03 + 2.05)/2 = 2.04. Q1 = median of the first half, with position = 5.5 → Q1 = (1.27 + 1.4)/2 ≈ 1.3. Q3 = median of the second half, with position = 5.5 → Q3 = (2.24 + 2.31)/2 ≈ 2.3.
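The median-of-halves approach used here can be sketched in Python as follows. The sketch assumes an even sample size, as in this example, so the two halves share no value; other software may use a different quartile convention and return slightly different numbers.

```python
import statistics

# The 20 measurements from the example slides.
data = [1.72, 2.5, 2.16, 2.13, 1.06, 2.24, 2.31, 2.03, 1.09, 1.40,
        2.57, 2.64, 1.26, 2.05, 1.19, 2.13, 1.27, 1.51, 2.41, 1.95]

ordered = sorted(data)
half = len(ordered) // 2  # n = 20, so each half holds 10 values

median = statistics.median(ordered)   # average of the 10th and 11th values
q1 = statistics.median(ordered[:half])  # median of the lower half
q3 = statistics.median(ordered[half:])  # median of the upper half

print("median:", round(median, 3))
print("Q1:", round(q1, 3))
print("Q3:", round(q3, 3))
print("IQR:", round(q3 - q1, 3))
```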

2-29 Numerical Measures of Relative Standing: z–Scores. A z-score describes the relative location of a measurement (x) compared to the rest of the data. It measures the number of standard deviations away from the mean a data value is located. Sample z–score: $z = \dfrac{x - \bar{x}}{s}$. Population z–score: $z = \dfrac{x - \mu}{\sigma}$.

2-30 The value of the z-score reflects the relative standing of the measurement. A large positive z-score implies that the measurement is larger than almost all other measurements. A large negative z-score indicates that the measurement is smaller than almost all other measurements. A z-score at or near 0 means the measurement is located at or near the mean of the sample or population.

2-31 Interpretation of z–Scores

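2-32 Box Plot. Ordered data: 1.06 1.09 1.19 1.26 1.27 1.4 1.51 1.72 1.95 2.03 2.05 2.13 2.13 2.16 2.24 2.31 2.41 2.5 2.57 2.64. The box runs from Q1 to Q3, with a line at Q2 (the median). The most extreme observation smaller than the upper inner fence (Q3 + 1.5·IQR = 2.3 + 1.5·1.0 = 3.8) is 2.64. The most extreme observation bigger than the lower inner fence (Q1 – 1.5·IQR = 1.3 – 1.5·1.0 = –0.2) is 1.06 ≈ 1.1.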

2-33 Box Plot 3. A second pair of fences, the outer fences, are defined at a distance of 3(IQR) from the hinges. One symbol (*) represents measurements falling between the inner and outer fences, and another (0) represents measurements beyond the outer fences. 4. Symbols that represent the median and extreme data points vary depending on the software used. You may use your own symbols if you are constructing a box plot by hand.

2-34 Outlier An observation (or measurement) that is unusually large or small relative to the other values in a data set is called an outlier. Outliers typically are attributable to one of the following causes: 1.The measurement is observed, recorded, or entered into the computer incorrectly. 2.The measurement comes from a different population. 3.The measurement is correct but represents a rare (chance) event.

2-35 Key Ideas: Rules for Detecting Quantitative Outliers

| Method | Suspect | Highly Suspect |
|---|---|---|
| Box plot | Values between inner and outer fences | Values beyond outer fences |
| z-score | 2 < \|z\| < 3 | \|z\| > 3 |

2-36 Experiments & Sample Spaces 1. Experiment: process of observation that leads to a single outcome that cannot be predicted with certainty 2. Sample point: most basic outcome of an experiment 3. Sample space (S): collection of all sample points. Sample space depends on the experimenter!

2-37 Visualizing Sample Space 1. Listing: for the experiment of tossing a coin once and noting the up face, S = {Head, Tail}, where each element is a sample point 2. Venn diagram: a pictorial method for presenting the sample space, here the points H and T inside the region S

2-38 Example Experiment: Tossing two coins and recording up faces: Is sample space as below? S={HH, HT, TT}

2-39 Tree Diagram. 1st coin: H or T. 2nd coin: H or T after each first-coin outcome, giving the branches HH, HT, TH, TT.

2-40 Sample Space Examples

| Experiment | Sample Space |
|---|---|
| Toss a Coin, Note Face | {Head, Tail} |
| Toss 2 Coins, Note Faces | {HH, HT, TH, TT} |
| Select 1 Card, Note Kind | {2♥, 2♠, ..., A♦} (52) |
| Select 1 Card, Note Color | {Red, Black} |
| Play a Football Game | {Win, Lose, Tie} |
| Inspect a Part, Note Quality | {Defective, Good} |
| Observe Gender | {Male, Female} |

2-41 Events 1. Specific collection of sample points 2. Simple Event Contains only one sample point 3. Compound Event Contains two or more sample points

2-42 What is Probability? 1. Numerical measure of the likelihood that an event will occur, written P(Event), P(A), or Prob(A) 2. Lies between 0 and 1, on a scale from 0 (impossible) through .5 to 1 (certain) 3. Sum of probabilities for all sample points in the sample space is 1

2-43 Equally Likely Probability. P(Event) = X / T, where X = number of outcomes in the event and T = total number of sample points in the sample space. Each of the T sample points is equally likely: P(sample point) = 1/T.

2-44 Thinking Challenge (sol.) Consider rolling two fair dice. Let event A=Having the sum of upfaces 6 or less. So, A={ (1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (2,2), (2,3), (2,4), (3,1), (3,2),(3,3), (4,1), (4,2), (5,1)} each with prob.1/36 P(A)=15/36=5/12
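The count of 15 favorable outcomes can be confirmed by enumerating the sample space directly. This is a minimal Python sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Event A: the sum of the up faces is 6 or less.
event_a = [roll for roll in outcomes if sum(roll) <= 6]

# Equally likely rule: P(A) = (outcomes in A) / (total sample points).
p_a = Fraction(len(event_a), len(outcomes))

print(len(event_a), "of", len(outcomes), "outcomes, P(A) =", p_a)
```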

2-45 Combinations Rule. A sample of n elements is to be drawn from a set of N elements. Then, the number of different samples possible is denoted by $\binom{N}{n}$ and is equal to $\binom{N}{n} = \dfrac{N!}{n!\,(N-n)!}$, where the factorial symbol (!) means that $n! = n(n-1)\cdots 3 \cdot 2 \cdot 1$. Note that 0! is defined to be 1.

2-46 Thinking Challenge. The price of a European tour includes four stopovers to be selected from among 10 cities. In how many different ways can one plan such a tour if the order of the stopovers does not matter?
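Because order does not matter, this is a combinations count of n = 4 stopovers from N = 10 cities, N!/(n!(N – n)!). A short Python sketch computes it two ways; `math.comb` requires Python 3.8 or later.

```python
from math import comb, factorial

N, n = 10, 4  # 10 cities, 4 stopovers, order ignored

# Direct use of the factorial formula N! / (n! (N - n)!).
ways_formula = factorial(N) // (factorial(n) * factorial(N - n))

# The standard library's built-in binomial coefficient.
ways_builtin = comb(N, n)

print(ways_formula, ways_builtin)
```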

2-47 Unions & Intersections 1. Union: outcomes in either event A or B or both; an ‘OR’ statement; denoted by the ∪ symbol (i.e., A ∪ B) 2. Intersection: outcomes in both events A and B; an ‘AND’ statement; denoted by the ∩ symbol (i.e., A ∩ B)

2-48 The table displays the probabilities for each of the six outcomes when rolling a particular unfair die. Suppose that the die is rolled once. Let A be the event that the number rolled is less than 4, and let B be the event that the number rolled is odd. Find P(A  B). Outcome123456 Probability0.10.10.10.20.20.3 A. 0.5B. 0.2C. 0.3D. 0.7
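The set operator in this question is not legible in the transcript, so the following Python sketch computes both the union and the intersection from the table of outcome probabilities. The helper `p` is a hypothetical name.

```python
# Outcome probabilities for the unfair die, as given in the table.
prob = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.2, 5: 0.2, 6: 0.3}

A = {k for k in prob if k < 4}       # number rolled is less than 4
B = {k for k in prob if k % 2 == 1}  # number rolled is odd

def p(event):
    """Probability of an event: the sum of its sample-point probabilities."""
    return sum(prob[k] for k in event)

p_union = p(A | B)         # outcomes in A or B or both
p_intersection = p(A & B)  # outcomes in both A and B

print("P(A or B)  =", round(p_union, 2))
print("P(A and B) =", round(p_intersection, 2))
```

The two results are linked by P(A ∪ B) = P(A) + P(B) – P(A ∩ B), which gives a quick consistency check.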

2-49 Event Probability Using Two–Way Table

| Event | B₁ | B₂ | Total |
|---|---|---|---|
| A₁ | P(A₁ ∩ B₁) | P(A₁ ∩ B₂) | P(A₁) |
| A₂ | P(A₂ ∩ B₁) | P(A₂ ∩ B₂) | P(A₂) |
| Total | P(B₁) | P(B₂) | 1 |

The interior cells are joint probabilities; the row and column totals are marginal (simple) probabilities.

2-50 Thinking Challenge. What’s the Probability? 1. P(A) = 2. P(D) = 3. P(C ∩ B) = 4. P(A ∪ D) = 5. P(B ∩ D) =

| Event | C | D | Total |
|---|---|---|---|
| A | 4 | 2 | 6 |
| B | 1 | 3 | 4 |
| Total | 5 | 5 | 10 |

2-51 Solution* The Probabilities Are: 1. P(A) = 6/10 2. P(D) = 5/10 3. P(C ∩ B) = 1/10 4. P(A ∪ D) = 9/10 5. P(B ∩ D) = 3/10

| Event | C | D | Total |
|---|---|---|---|
| A | 4 | 2 | 6 |
| B | 1 | 3 | 4 |
| Total | 5 | 5 | 10 |
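The five answers can be reproduced from the cell counts of the two-way table. In this Python sketch the third and fifth probabilities are treated as joint (intersection) probabilities and the fourth as a union, which is what the solution values 1/10, 3/10, and 9/10 imply; the function names are hypothetical.

```python
from fractions import Fraction

# Cell counts from the two-way table: rows A, B and columns C, D.
counts = {("A", "C"): 4, ("A", "D"): 2, ("B", "C"): 1, ("B", "D"): 3}
total = sum(counts.values())

def marginal(label):
    """Marginal probability of a row or column label."""
    return Fraction(sum(v for key, v in counts.items() if label in key), total)

def joint(row, col):
    """Joint probability of a row event and a column event."""
    return Fraction(counts[(row, col)], total)

def union(row, col):
    """Additive rule: P(row or col) = P(row) + P(col) - P(row and col)."""
    return marginal(row) + marginal(col) - joint(row, col)

print("P(A)       =", marginal("A"))
print("P(D)       =", marginal("D"))
print("P(C and B) =", joint("B", "C"))
print("P(A or D)  =", union("A", "D"))
print("P(B and D) =", joint("B", "D"))
```

`Fraction` reduces results automatically, so 6/10 prints as 3/5 and 5/10 as 1/2.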

2-52 Complementary Events. Complement of Event A: the event that A does not occur; all events not in A. Denote the complement of A by Aᶜ. [Venn diagram: A and Aᶜ together fill the sample space S.]

2-53 Rule of Complements. The sum of the probabilities of complementary events equals 1: P(A) + P(Aᶜ) = 1.

2-54 3.4 The Additive Rule and Mutually Exclusive Events

2-55 Mutually Exclusive Events Example. Experiment: Draw 1 Card. Note Kind & Suit. Sample Space S: 2♥, 2♠, 2♣, ..., A♦. Outcomes in Event Heart: 2♥, 3♥, 4♥, ..., A♥. Event Spade: 2♠, 3♠, 4♠, ..., A♠. Events ♥ and ♠ are mutually exclusive.

2-56 Additive Rule 1. Used to get compound probabilities for union of events 2. P(A OR B) = P(A ∪ B) = P(A) + P(B) – P(A ∩ B) 3. For mutually exclusive events: P(A OR B) = P(A ∪ B) = P(A) + P(B)

2-57 Thinking Challenge. Let P(A) = 0.25, P(Bᶜ) = 0.4, and P(A ∪ B) = 0.85. Are the two events A and B mutually exclusive? A. True B. False

2-58 Thinking Challenge. Using the additive rule, what is the probability? 1. P(A ∪ D) = 2. P(B ∪ C) =

| Event | C | D | Total |
|---|---|---|---|
| A | 4 | 2 | 6 |
| B | 1 | 3 | 4 |
| Total | 5 | 5 | 10 |

2-59 Solution* Using the additive rule, the probabilities are: 1. P(A ∪ D) = P(A) + P(D) – P(A ∩ D) = 6/10 + 5/10 – 2/10 = 9/10 2. P(B ∪ C) = P(B) + P(C) – P(B ∩ C) = 4/10 + 5/10 – 1/10 = 8/10

2-60 Conditional Probability 1. Event probability given that another event occurred 2. Revise original sample space to account for new information; this eliminates certain outcomes 3. $P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)} = \dfrac{P(A \cap B)}{P(B)}$

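2-61 Thinking Challenge. Using the table then the formula, what’s the probability? 1. P(A|D) = 2. P(C|B) =

| Event | C | D | Total |
|---|---|---|---|
| A | 4 | 2 | 6 |
| B | 1 | 3 | 4 |
| Total | 5 | 5 | 10 |

2-62 Solution* Using the formula, the probabilities are: 1. P(D) = P(A ∩ D) + P(B ∩ D) = 2/10 + 3/10 = 5/10, so P(A|D) = P(A ∩ D)/P(D) = (2/10)/(5/10) = 2/5 2. P(B) = P(B ∩ D) + P(B ∩ C) = 3/10 + 1/10 = 4/10, so P(C|B) = P(C ∩ B)/P(B) = (1/10)/(4/10) = 1/4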
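The conditional probabilities follow from dividing a joint probability by the marginal probability of the conditioning event. A minimal Python sketch with the table's counts, using a hypothetical helper `conditional`:

```python
from fractions import Fraction

# Cell counts from the two-way table: rows A, B and columns C, D.
counts = {("A", "C"): 4, ("A", "D"): 2, ("B", "C"): 1, ("B", "D"): 3}
total = sum(counts.values())

def conditional(row, col, given):
    """P(row | col) if given == 'col', or P(col | row) if given == 'row'."""
    joint = Fraction(counts[(row, col)], total)
    if given == "col":
        marginal = Fraction(sum(v for (r, c), v in counts.items() if c == col), total)
    else:
        marginal = Fraction(sum(v for (r, c), v in counts.items() if r == row), total)
    return joint / marginal

p_a_given_d = conditional("A", "D", given="col")  # P(A | D)
p_c_given_b = conditional("B", "C", given="row")  # P(C | B)

print("P(A|D) =", p_a_given_d)
print("P(C|B) =", p_c_given_b)
```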

2-63 Multiplicative Rule 1. Used to get compound probabilities for intersection of events 2. P(A and B) = P(A ∩ B) = P(A) · P(B|A) = P(B) · P(A|B) 3. The key words both and and in the statement imply an intersection of two events, so we multiply probabilities to obtain the probability of interest.

2-64 Thinking Challenge. Suppose that 23% of adults smoke cigarettes. Given that a selected adult is a smoker, the probability that he/she has a lung condition before the age of 60 is 57%. What is the probability that a randomly selected person is a smoker and has a lung condition before the age of 60? A. 0.57 B. 0.99 C. 0.77 D. 0.13

2-65 Multiplicative Rule Example. Experiment: Draw 1 Card. Note Kind & Color.

| Type | Red | Black | Total |
|---|---|---|---|
| Ace | 2 | 2 | 4 |
| Non-Ace | 24 | 24 | 48 |
| Total | 26 | 26 | 52 |

P(Ace ∩ Black) = P(Ace)∙P(Black | Ace) = (4/52)∙(2/4) = 2/52

2-66 Statistical Independence 1. Event occurrence does not affect probability of another event (example: toss 1 coin twice) 2. Causality not implied 3. Tests for independence: P(A | B) = P(A); P(B | A) = P(B); P(A ∩ B) = P(A) · P(B)

2-67 Thinking Challenge. Consider a regular deck of 52 cards with two black suits, i.e., ♠ (13) and ♣ (13), and two red suits, i.e., ♥ (13) and ♦ (13). Given that you have a red card, what is the probability that it is a queen? Also, are the events getting a red card and getting a queen independent? A. 1/13, No (i.e., P(Queen) ≠ P(Queen|Red)) B. 1/13, Yes (i.e., P(Queen) = P(Queen|Red)) C. 2/13, Yes (i.e., P(Queen) ≠ P(Queen|Red)) D. 2/13, No (i.e., P(Queen) = P(Queen|Red))
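One way to work through this question is to build the deck and count. The Python sketch below is an illustration only; the rank and suit labels are chosen for readability and are not from the slides.

```python
from fractions import Fraction
from itertools import product

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suit_color = {"spades": "black", "clubs": "black",
              "hearts": "red", "diamonds": "red"}

# A regular 52-card deck as (rank, suit) pairs.
deck = list(product(ranks, suit_color))

red = [card for card in deck if suit_color[card[1]] == "red"]
queens = [card for card in deck if card[0] == "Q"]
red_queens = [card for card in red if card[0] == "Q"]

p_queen = Fraction(len(queens), len(deck))            # P(Queen)
p_red = Fraction(len(red), len(deck))                 # P(Red)
p_queen_given_red = Fraction(len(red_queens), len(red))  # P(Queen | Red)
p_queen_and_red = Fraction(len(red_queens), len(deck))   # P(Queen and Red)

# Independence test: P(Queen and Red) == P(Queen) * P(Red).
independent = p_queen_and_red == p_queen * p_red

print("P(Queen) =", p_queen)
print("P(Queen | Red) =", p_queen_given_red)
print("independent:", independent)
```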

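2-68 Bayes’s Rule. Given k mutually exclusive and exhaustive events B₁, B₂, ..., B_k, such that P(B₁) + P(B₂) + … + P(B_k) = 1, and an observed event A, then $P(B_i \mid A) = \dfrac{P(B_i \cap A)}{P(A)} = \dfrac{P(B_i)\,P(A \mid B_i)}{P(B_1)P(A \mid B_1) + P(B_2)P(A \mid B_2) + \dots + P(B_k)P(A \mid B_k)}$. Bayes’s rule is useful for finding one conditional probability when other conditional probabilities are already known.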

2-69 Bayes’s Rule Example A company manufactures MP3 players at two factories. Factory I produces 60% of the MP3 players and Factory II produces 40%. Two percent of the MP3 players produced at Factory I are defective, while 1% of Factory II’s are defective. An MP3 player is selected at random and found to be defective. What is the probability it came from Factory I?

2-70 Bayes’s Rule Example (tree diagram). Factory I (0.6): Defective 0.02, Good 0.98. Factory II (0.4): Defective 0.01, Good 0.99.
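The tree's branch probabilities are enough to apply Bayes's rule to the factory question. A minimal Python sketch, with dictionary names chosen for illustration:

```python
# Prior probabilities: share of MP3 players produced by each factory.
priors = {"I": 0.60, "II": 0.40}

# Conditional probabilities: P(defective | factory).
defect_rate = {"I": 0.02, "II": 0.01}

# Total probability of a defective player (denominator of Bayes's rule).
p_defective = sum(priors[f] * defect_rate[f] for f in priors)

# Posterior: P(factory | defective) for each factory.
posterior = {f: priors[f] * defect_rate[f] / p_defective for f in priors}

print("P(defective) =", round(p_defective, 4))
for factory, value in posterior.items():
    print(f"P(Factory {factory} | defective) = {value:.2f}")
```

Under these inputs the posterior for Factory I is 0.012 / 0.016, so a defective player is three times as likely to have come from Factory I as from Factory II.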

2-71 Random Variable A random variable is a variable that assumes numerical values associated with the random outcomes of an experiment, where one (and only one) numerical value is assigned to each sample point.

2-72 Random Variable (cont.) There are two types of random variables. Discrete random variables can take one of a finite number of distinct outcomes. Example: number of credit hours. Continuous random variables can take any numeric value within a range of values. Example: cost of books this term.

2-73 Discrete Probability Distribution The probability distribution of a discrete random variable is a graph, table, or formula that specifies the probability associated with each possible value the random variable can assume.

2-74 Requirements for the Probability Distribution of a Discrete Random Variable x 1. p(x) ≥ 0 for all values of x 2. Σ p(x) = 1, where the summation of p(x) is over all possible values of x.

2-75 a) It is not valid b) It is valid c) It is not valid d) It is not valid

2-76 a){HHH,HTT,THT,TTH,THH,HTH,HHT,TTT} {0,1,2,3} b) {1/8,3/8,3/8,1/8} d) P(x=2 or x=3)= P(x=2)+P(x=3)=3/8+1/8=1/2

2-77 Summary Measures 1. Expected Value (mean of probability distribution): weighted average of all possible values, $\mu = E(x) = \sum x\,p(x)$ 2. Variance: weighted average of squared deviation about the mean, $\sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2 p(x) = \sum x^2 p(x) - \mu^2$ 3. Standard Deviation: $\sigma = \sqrt{\sigma^2}$

2-78 Thinking Challenge. For the probability model given below, what is the value of P and E(X)?

| X | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| P(x) | 0.2 | 0.3 | 0.1 | P |

A. 0.1, 3.0 B. 0.2, 4.7 C. 0.3, 1.7 D. 0.4, 3.7
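The unknown probability follows from the requirement that the probabilities sum to 1, and the expected value is the probability-weighted sum of the values. A short Python sketch with illustrative variable names:

```python
values = [2, 3, 4, 5]
known_probs = [0.2, 0.3, 0.1]  # probabilities given for x = 2, 3, 4

# The probabilities of a discrete distribution must sum to 1.
p_missing = 1 - sum(known_probs)
probs = known_probs + [p_missing]

# Expected value: weighted average of the possible values.
expected = sum(x * p for x, p in zip(values, probs))

# Variance via the shortcut: sum of x^2 p(x) minus the squared mean.
variance = sum(x ** 2 * p for x, p in zip(values, probs)) - expected ** 2

print("P =", round(p_missing, 2))
print("E(X) =", round(expected, 2))
print("Var(X) =", round(variance, 2))
```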

2-79 Binomial Probability: Characteristics of a Binomial Experiment 1. The experiment consists of n identical trials. 2. There are only two possible outcomes on each trial. We will denote one outcome by S (for success) and the other by F (for failure). 3. The probability of S remains the same from trial to trial. This probability is denoted by p, and the probability of F is denoted by q. Note that q = 1 – p. 4. The trials are independent. 5. The binomial random variable x is the number of S’s in n trials.

2-80 Binomial Probability Distribution: $p(x) = \binom{n}{x} p^x q^{n-x} = \dfrac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$, where p(x) = probability of x ‘successes’, p = probability of a ‘success’ on a single trial, q = 1 – p, n = number of trials, x = number of ‘successes’ in n trials (x = 0, 1, 2, ..., n), and n – x = number of failures in n trials.

2-81 Binomial Probability Distribution Example. Experiment: Toss 1 coin 5 times in a row. Note number of tails. What’s the probability of 3 tails?

2-82 Binomial Distribution Characteristics. Mean: $\mu = np$. Standard deviation: $\sigma = \sqrt{npq}$. [Figure: binomial distributions for n = 5, p = 0.1 and for n = 5, p = 0.5.]

2-83 Binomial Distribution Thinking Challenge You’re a telemarketer selling service contracts for Macy’s. You’ve sold 20 in your last 100 calls (p =.20). If you call 12 people tonight, what’s the probability of A. No sales? B. Exactly 2 sales? C. At most 2 sales? D. At least 2 sales?

2-84 Binomial Distribution Solution* n = 12, p = .20. E(X) = n·p = 12·0.2 = 2.4. σ = (np(1 – p))^(1/2) = (12·0.2·0.8)^(1/2) ≈ 1.39. A. p(0) = .0687 B. p(2) = .2835 C. p(at most 2) = p(0) + p(1) + p(2) = .0687 + .2062 + .2835 = .5584 D. p(at least 2) = p(2) + p(3) + ... + p(12) = 1 – [p(0) + p(1)] = 1 – .0687 – .2062 = .7251
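These values can be recomputed from the binomial formula p(x) = C(n, x) pˣ (1 – p)ⁿ⁻ˣ. The Python sketch below defines a hypothetical helper `binom_pmf`; small differences in the last decimal place come from summing rounded versus unrounded terms.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 12, 0.20  # 12 calls tonight, 20% chance of a sale per call

p0 = binom_pmf(0, n, p)
p1 = binom_pmf(1, n, p)
p2 = binom_pmf(2, n, p)

at_most_2 = p0 + p1 + p2   # P(X <= 2)
at_least_2 = 1 - p0 - p1   # P(X >= 2), via the complement

mean = n * p
sd = sqrt(n * p * (1 - p))

print(f"mean = {mean:.1f}, sd = {sd:.4f}")
print(f"p(0) = {p0:.4f}, p(1) = {p1:.4f}, p(2) = {p2:.4f}")
print(f"P(at most 2) = {at_most_2:.4f}, P(at least 2) = {at_least_2:.4f}")
```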

2-85 By using TI-84: B. P(X = 2) = p(2) = P(X ≤ 2) – P(X ≤ 1) = binomcdf(12,.20,2) – binomcdf(12,.20,1)
