1 Prof. Indrajit Mukherjee, School of Management, IIT Bombay
Sampling Techniques
Samples: Non-Probability Samples (Judgment, Convenience, Others) and Probability Samples (Simple Random, Systematic, Stratified, Cluster)
2 Type of sampling: Convenience. Selection strategy: select cases based on their availability for the study. Purpose: saves time, money and effort, but at the expense of information and credibility.
3 Simple random sampling. Sample method: the population is identified uniquely by number, and selection is by random number. Resulting method: every member of the population has an equal chance of being selected into the sample.
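A minimal sketch of simple random sampling, assuming the population units are numbered 1 to N. The function name, the population size of 100, the sample size of 20, and the seed are illustrative assumptions, not values from the slide.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)  # the seed is only for reproducibility
    # random.sample draws without replacement, so every unit has the
    # same chance of selection and no unit is chosen twice.
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical population of 100 numbered units, sample of 20.
sample = simple_random_sample(population_size=100, sample_size=20, seed=1)
print(sorted(sample))
```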
4 Simulating from a continuous uniform distribution. A random number r from the Uniform [0, 1] distribution maps to the Uniform [a, b] distribution as x = a + r(b - a): stretch by (b - a), then shift by a.
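The stretch-and-shift mapping can be sketched as follows. The function name and the endpoints a = 2, b = 5 are assumptions chosen only for illustration.

```python
import random

def uniform_ab(r, a, b):
    """Map r from Uniform[0, 1] to Uniform[a, b]: stretch by (b - a), shift by a."""
    return a + r * (b - a)

rng = random.Random(0)  # seeded only so the run is repeatable
# Hypothetical endpoints a = 2, b = 5.
draws = [uniform_ab(rng.random(), 2.0, 5.0) for _ in range(1000)]
print(min(draws), max(draws))
```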
5 How to use a random number table to select a random sample
A number in the table corresponds to a number on the list of your population. In the example, #08 has been chosen as the starting point and the first student chosen is Carol Chan.
Step 3: Move to the next number, 42, and select the person corresponding to that number into the sample. #87: Tan Teck Wah.
Step 4: Continue to the next number that qualifies and select that person into the sample: Jerry Lewis, followed by #89, #53 and #19.
Step 5: After you have selected student #19, go to the next line and choose #90. Continue in the same manner until the full sample is selected. If you encounter a number selected earlier (e.g., 90, 06 in this example), simply skip over it and choose the next number.
Reading order from the starting point: move right to the end of the row, then down to the next row; move left to the end, then down to the next row, and so on.
6 Systematic sampling (contd.) Example: N = 100, want n = 20, so N/n = 5. Select a random number from 1 to 5: chose 4. Start with #4 and take every 5th unit.
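A short sketch of the systematic selection in this example, assuming units are numbered 1 to N and that N is an exact multiple of n. The function name is an assumption; the start of 4 is the value chosen on the slide, though in practice it would be drawn at random from 1 to N/n.

```python
def systematic_sample(N, n, start):
    """Take every k-th unit, k = N // n, beginning at unit number `start`."""
    k = N // n  # sampling interval (5 in the slide's example)
    return list(range(start, N + 1, k))

units = systematic_sample(N=100, n=20, start=4)
print(units)  # begins 4, 9, 14, ... and contains 20 units
```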
7 Stratified Random Sample: Stratified by Age. Three age groups ("... years old"), each homogeneous (alike) within; heterogeneous (different) between groups.
8 Sample Spaces and Events: Random Experiments. Noise variables affect the transformation of inputs to outputs. Diagram: Input, System, Output, with controlled variables and noise variables acting on the system.
9 Example: machining a shaft on a lathe. Variables: rotation speed, traverse speed, tool type, tool sharpness, shaft material, shaft length, material removal per cut, part cleanliness, coolant flow, operator, material variation, ambient temperature, coolant age. Outputs (Y's): diameter, taper, surface finish.
10 Four Types of Probability
Marginal: the probability of X occurring, P(X)
Union: the probability of X or Y occurring, P(X or Y)
Joint: the probability of X and Y occurring, P(X and Y)
Conditional: the probability of X occurring given that Y has occurred, P(X|Y)
11 P(A and B) (Venn diagram: the overlap of the regions P(A) and P(B))
12 P(A or B)
13 Sample Spaces and Events: Venn Diagrams
14 Venn diagram of four mutually exclusive events E1, E2, E3, E4
15 Collectively Exhaustive Events. Events are said to be collectively exhaustive if the list of outcomes includes every possible outcome, for example heads and tails as the possible outcomes of a coin flip.
16 Example 3
Draw | Mutually Exclusive | Collectively Exhaustive
Draw a spade and a club | Yes | No
Draw a face card and a number card | Yes | Yes
Draw an ace and a 3 | Yes | No
Draw a club and a nonclub | Yes | Yes
Draw a 5 and a diamond | No | No
Draw a red card and a diamond | No | No
17 The following circuit operates only if there is a path of functional devices from left to right. The probability that each device functions is shown on the graph. Assume that devices fail independently. What is the probability that the circuit operates?
Let T and B denote the events that the top and bottom devices operate, respectively. There is a path if at least one device operates. The probability that the circuit operates is P(T or B) = 1 - P[(T or B)'] = 1 - P(T' and B'). A simple formula for the solution can be derived from the complements T' and B'. From the independence assumption, P(T' and B') = P(T') P(B') = (1 - 0.95)^2 = 0.05^2, so P(T or B) = 1 - 0.05^2 = 0.9975.
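The complement argument generalizes to any number of independent devices in parallel. The sketch below assumes independence, as the slide does; the function name is an assumption.

```python
def parallel_reliability(p_devices):
    """P(at least one device works) = 1 - product of P(device fails)."""
    p_all_fail = 1.0
    for p in p_devices:
        p_all_fail *= (1.0 - p)  # independence lets the failure probabilities multiply
    return 1.0 - p_all_fail

# Top and bottom devices each function with probability 0.95.
p_circuit = parallel_reliability([0.95, 0.95])
print(round(p_circuit, 4))
```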
18 Conditional probability of D given F: P(D|F) = P(DF)/P(F), where P(DF) is the probability that D and F both occur. (Venn diagram regions: P(D), P(DF), P(F).)
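As an illustration of the formula, the sketch below uses a hypothetical pair of events that is not on the slide: in a standard 52-card deck, let D be "draw a diamond" and F be "draw a red card". The function name is also an assumption.

```python
from fractions import Fraction

def conditional(p_joint, p_given):
    """P(D|F) = P(DF) / P(F); undefined when P(F) = 0."""
    if p_given == 0:
        raise ValueError("P(F) must be positive")
    return p_joint / p_given

p_f = Fraction(26, 52)   # P(F): 26 red cards out of 52
p_df = Fraction(13, 52)  # P(DF): every diamond is red, so 13 cards out of 52
p_d_given_f = conditional(p_df, p_f)
print(p_d_given_f)
```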
19 Random Variables (Numeric)
Experiment | Outcome | Random variable | Range of random variable
Stock 50 Xmas trees | Number of trees sold | X = number of trees sold | 0, 1, 2, ..., 50
Inspect 600 items | Number acceptable | Y = number acceptable | 0, 1, 2, ..., 600
Send out 5,000 sales letters | Number of people responding | Z = number of people responding | 0, 1, 2, ..., 5,000
Build an apartment building | % completed after 4 months | R = % completed after 4 months | 0 ≤ R ≤ 100
Test the lifetime of a light bulb (minutes) | Time bulb lasts, up to 80,000 minutes | S = time bulb burns | 0 ≤ S ≤ 80,000
20 Compressive strength (in psi) of 80 aluminum-lithium alloy specimens
21 Frequency Distributions and Histograms. Figure: histogram of compressive strength for 80 aluminum-lithium alloy specimens (x-axis: compressive strength (psi); y-axis: frequency).
22 Histograms: useful for large data sets. Group values of the variable into bins, then count the number of observations that fall into each bin. Plot frequency (or relative frequency) versus the values of the variable.
23 Figure: Minitab histogram for the metal layer thickness data in table (x-axis: metal thickness; y-axis: frequency).
24 Histogram Example. Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58. Histogram of frequency against class midpoints; no gaps between bars, since the data are continuous.
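A sketch of the binning step for the data above. The class boundaries (width 10, starting at 10, lower bound inclusive) are an assumption, since the slide's class limits did not survive; only the data values come from the slide.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

def bin_counts(values, lower, width, n_bins):
    """Count observations in classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - lower) // width)
        if 0 <= idx < n_bins:  # ignore anything outside the assumed classes
            counts[idx] += 1
    return counts

# Assumed classes: 10 to under 20, 20 to under 30, ..., 50 to under 60.
counts = bin_counts(data, lower=10, width=10, n_bins=5)
midpoints = [10 + 10 * i + 5 for i in range(5)]
for m, c in zip(midpoints, counts):
    print(m, c)
```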
25 How Many Class Intervals?
Many (narrow class intervals): may yield a very jagged distribution with gaps from empty classes; can give a poor indication of how frequency varies across classes.
Few (wide class intervals): may compress variation too much and yield a blocky distribution; can obscure important patterns of variation.
26 Calculation of Grouped Mean. Table columns: Class Interval (from "20-under ..." upward), Frequency, Class Midpoint, fM.
27 Mode of Grouped Data: the midpoint of the modal class. The modal class has the greatest frequency. Table columns: Class Interval (from "20-under ..." up to "... under 80"), Frequency.
29 Frequency Distribution: Discrete Data. Discrete data: possible values are countable. Example: An advertiser asks 200 customers how many days per week they read the daily newspaper. Table columns: Number of days read, Frequency (total 200).
30 Relative Frequency: what proportion is in each category? 22% of the people in the sample report that they read the newspaper a given number of days per week. Table columns: Number of days read, Frequency, Relative frequency.
31 Relative Frequency Plot and Probability Distributions. A histogram approximates a probability density function. (Figure axes: x, f(x).)
32 Interpretations of Probability. Relative frequency of corrupted pulses sent over a communications channel: relative frequency of corrupted pulse = 2/10. (Figure axes: time, voltage; corrupted pulses marked.)
33 Interpretations of Probability. The probability of the event E is the sum of the probabilities of the outcomes in E: P(E) = 30(0.01) = 0.30. (Figure: diodes in sample space S, with event E.)
34 Random Variables
Question | Random variable x | Type
Family size | x = number of dependents in family reported on tax return | Discrete
Distance from home to store | x = distance in miles from home to the store site | Continuous
Own dog or cat | x = 1 if own no pet; 2 if own dog(s) only; 3 if own cat(s) only; 4 if own dog(s) and cat(s) | Discrete
35 Using past data on TV sales, a tabular representation of the probability distribution for TV sales was developed. Table 1 columns: Units sold, Number of days (total 200). Table 2 columns: x, f(x) (total 1.00).
36 Graphical Representation of the Probability Distribution (x-axis: values of random variable x (TV sales); y-axis: probability)
37 Uniform Probability Distribution, Normal Probability Distribution, Exponential Probability Distribution (figure: three density curves, f(x) against x)
38 Probability distributions
(a) Discrete case: probabilities p(x1), p(x2), p(x3), p(x4), p(x5) at the points x1, ..., x5. Sometimes called a probability mass function.
(b) Continuous case: density f(x) over the interval from a to b. Sometimes called a probability density function.
39 Throwing a die. Distribution of X: P(X) = 1/6 for each outcome (figure axes: X, P(X)).
40 Example 2a. Rolling two dice results in a total of five spots showing. There are a total of 36 possible outcomes. Table columns: Outcome (Die 1, Die 2), Probability of Roll = 5; each listed outcome has probability 1/36.
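The outcomes behind this example can be enumerated directly. The sketch assumes two fair six-sided dice, so all 36 ordered pairs are equally likely; the variable names are assumptions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die 1, die 2) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes whose spots total five.
fives = [o for o in outcomes if sum(o) == 5]

p_five = Fraction(len(fives), len(outcomes))
print(fives, p_five)
```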
41 All Samples of Subgroup Size 2 from a Population
Sample Mean | Sample Mean | Sample Mean
1,1 1.0 | 3,1 2.0 | 5,1 3.0
1,2 1.5 | 3,2 2.5 | 5,2 3.5
1,3 2.0 | 3,3 3.0 | 5,3 4.0
1,4 2.5 | 3,4 3.5 | 5,4 4.5
1,5 3.0 | 3,5 4.0 | 5,5 5.0
1,6 3.5 | 3,6 4.5 | 5,6 5.5
2,1 1.5 | 4,1 2.5 | 6,1 3.5
2,2 2.0 | 4,2 3.0 | 6,2 4.0
2,3 2.5 | 4,3 3.5 | 6,3 4.5
2,4 3.0 | 4,4 4.0 | 6,4 5.0
2,5 3.5 | 4,5 4.5 | 6,5 5.5
2,6 4.0 | 4,6 5.0 | 6,6 6.0
42 Sampling Distribution of the Sample Mean
Mean Probability
1.0 1/36
1.5 2/36
2.0 3/36
2.5 4/36
3.0 5/36
3.5 6/36
4.0 5/36
4.5 4/36
5.0 3/36
5.5 2/36
6.0 1/36
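The sampling distribution can be rebuilt by counting how many of the 36 samples of size 2 share each mean. This is a sketch assuming the population is the six faces of a fair die, sampled with replacement; the variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Mean of every ordered sample (a, b) drawn from the faces 1..6.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))

# Each of the 36 samples is equally likely.
dist = {mean: Fraction(c, 36) for mean, c in sorted(counts.items())}

for mean, p in dist.items():
    print(float(mean), p)  # fractions print in lowest terms, e.g. 6/36 as 1/6
```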
43 (b) Sampling distribution of the sample mean (figure; probability axis marked at 2/36, 4/36, 6/36)
44 Sampling Distribution of the sample mean for n = 5
45 Sampling Distributions of Means. Figure: distributions of average scores from throwing dice.
46 Central Limit Theorem: as the sample size gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population.
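A small simulation can illustrate the theorem with the dice population used earlier. This is only a sketch: the sample size of 30, the 5,000 repetitions, and the seed are arbitrary assumptions. For a fair die the population mean is 3.5 and the variance is 35/12, so the sample means should centre on 3.5 with variance close to (35/12)/n.

```python
import random
import statistics

def sample_means(n, reps, seed=0):
    """Means of `reps` samples, each of n throws of a fair die."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(reps)
    ]

means = sample_means(n=30, reps=5000)
print(statistics.fmean(means))      # should be near 3.5
print(statistics.pvariance(means))  # should be near (35/12)/30
```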
47 Probability Distributions. Table columns: Outcome, X, Number responding, p(X). Outcomes: SA, A, N, D, SD. Surviving row: SD, X = 1, 50 responding, p(X) = 0.1.
48 P(X = x) against X (figure)
49 Probability Distributions and Probability Density Functions. Figure: density function of a loading on a long, thin beam (axes: x, loading).
50 Probability Distributions and Probability Density Functions. Probability is determined from the area under f(x): P(a < X < b) is the area under f(x) between a and b.
51 Continuous Uniform Random Variable. Continuous uniform probability density function: f(x) = 1/(b - a) for a ≤ x ≤ b.
52 Uniform Distribution, or rectangular probability distribution. Area = width x height = (b - a) x 1/(b - a) = 1.
53 Example. The amount of gasoline sold daily at a service station is uniformly distributed with a minimum of 2,000 gallons and a maximum of 5,000 gallons. Find the probability that daily sales will fall between 2,500 and 3,000 gallons. Algebraically: what is P(2,500 ≤ X ≤ 3,000)?
54 Example. P(2,500 ≤ X ≤ 3,000) = (3,000 - 2,500) x 1/(5,000 - 2,000) = 0.1667: “there is about a 17% chance that between 2,500 and 3,000 gallons of gas will be sold on a given day”
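The width-times-height calculation for the gasoline example can be sketched as a small function. The function name and the clipping of the interval to [a, b] are assumptions added for illustration.

```python
def uniform_interval_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)  # only the part inside [a, b] carries probability
    if hi <= lo:
        return 0.0
    return (hi - lo) / (b - a)

# Daily sales uniform between 2,000 and 5,000 gallons.
p = uniform_interval_prob(2500, 3000, 2000, 5000)
print(round(p, 4))
```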
55 Cumulative Distribution Functions (figure: F(x) against x)
56 Probability Distributions and Probability Density Functions (figure: probability density function, f(x) against x)
57 Normal Probability Distribution: Characteristics. The distribution is symmetric and bell-shaped.
58 Normal Probability Distribution: Characteristics. Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right).
59 The mean of a distribution
The mean is not necessarily the 50th percentile of the distribution (that’s the median).
The mean is not necessarily the most likely value of the random variable (that’s the mode).
Figures: two probability distributions with different means (µ = 10 and µ = 20); two probability distributions with the same mean but different standard deviations (σ = 2 and σ = 4).
60 The normal distribution, with mean µ and variance σ². Areas under the normal distribution: 68.26% within µ ± 1σ, 95.46% within µ ± 2σ, 99.73% within µ ± 3σ.
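These areas can be checked numerically. The sketch below relies on the standard identity P(|Z| < k) = erf(k/√2) for a standard normal Z; the function name is an assumption. The computed values agree with the slide's percentages to within about a hundredth of a percentage point.

```python
import math

def within_k_sigma(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * within_k_sigma(k), 2))
```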
61 Normal Distribution. Standardizing a normal random variable: Z = (X - µ)/σ.
62 Standard Normal Distribution. A normal distribution whose mean is zero and standard deviation is one is called the standard normal distribution (µ = 0, σ = 1). As we shall see shortly, any normal distribution can be converted to a standard normal distribution with simple algebra. This makes calculations much easier.
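The "simple algebra" is the transformation z = (x - µ)/σ. The sketch below uses hypothetical values µ = 10, σ = 2 and x = 13 (assumptions for illustration only) and Python's `statistics.NormalDist` to show that the probability computed on the original scale matches the one computed on the standard normal scale.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Convert x from a Normal(mu, sigma) scale to the standard normal scale."""
    return (x - mu) / sigma

mu, sigma = 10.0, 2.0  # hypothetical mean and standard deviation
x = 13.0               # hypothetical value of interest

z = standardize(x, mu, sigma)
p_direct = NormalDist(mu, sigma).cdf(x)  # P(X < 13) on the original scale
p_via_z = NormalDist().cdf(z)            # P(Z < z) on the standard normal scale
print(z, p_direct, p_via_z)
```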
63 Normal Distribution: Example (figure with z axis)
64 Areas under Standardized Normal Distribution
66 Using Excel to Compute Standard Normal Probabilities (formula worksheet)
Row | A | B
1 | Probabilities: standard normal distribution |
2 | P(z < 1.00) | =NORMSDIST(1)
3 | P(0.00 < z < 1.00) | =NORMSDIST(1)-NORMSDIST(0)
4 | P(0.00 < z < 1.25) | =NORMSDIST(1.25)-NORMSDIST(0)
5 | P(-1.00 < z < 1.00) | =NORMSDIST(1)-NORMSDIST(-1)
6 | P(z > 1.58) | =1-NORMSDIST(1.58)
7 | P(z < -0.50) | =NORMSDIST(-0.5)
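For readers without Excel, the same six probabilities can be sketched in Python, with `statistics.NormalDist().cdf` playing the role of NORMSDIST. The dictionary layout and names are assumptions; this is an equivalent calculation, not part of the original worksheet.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal cumulative probability, like NORMSDIST

probabilities = {
    "P(z < 1.00)": phi(1),
    "P(0.00 < z < 1.00)": phi(1) - phi(0),
    "P(0.00 < z < 1.25)": phi(1.25) - phi(0),
    "P(-1.00 < z < 1.00)": phi(1) - phi(-1),
    "P(z > 1.58)": 1 - phi(1.58),
    "P(z < -0.50)": phi(-0.5),
}

for label, p in probabilities.items():
    print(label, round(p, 4))
```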