Lecture-6 Models for Data 1. Discrete Variables Engr. Dr. Attaullah Shah.


1 Lecture-6 Models for Data 1. Discrete Variables Engr. Dr. Attaullah Shah

2 Statistical Models In statistics, the distribution of a random variable in a data set is expressed by one or more equations. A simple model for the measurement X made by an instrument is X = θ + ε, where θ is the true value of what is being measured and ε is a measurement error. A plausible model is selected so that the data agree with the model. For this purpose it is important to know both the type of model and the distribution of the random variable (r.v.). A number of distributions are used for data. For discrete variables we use the hypergeometric, binomial, and Poisson distributions; for continuous variables we use the normal (Gaussian), exponential, and lognormal distributions, among others.

3 Discrete vs. Continuous Random Variables
DISCRETE: Values that can be counted and ordered. Gap between consecutive values. Examples: 1) Insurance claims filed in one day 2) Cars sold in one month 3) Employees who call in sick on a day
CONTINUOUS: Values that cannot be counted. On a continuous spectrum. Measured with a specific amount of precision. Examples: 1) Time to check out a customer 2) Weight of an outgoing shipment 3) Distance traveled by a truck in a single day 4) Price of a gallon of gas

4 Discrete Statistical Distribution A discrete distribution is one for which the random variable being considered can only take on certain specific values, rather than any value within some range. Most often the random variable is a count, so the possible values are 0, 1, 2, 3, and so on. It is conventional to denote a random variable by a capital X and a particular observed value by a lowercase x. A discrete distribution is then defined by a list of the possible values $x_1, x_2, x_3, \ldots$ for X and the probabilities $P(x_1), P(x_2), P(x_3), \ldots$ for these values, with $P(x_1) + P(x_2) + P(x_3) + \cdots = 1$. Often there is a specific equation for the probabilities, defined by a probability function $P(x) = \mathrm{Prob}(X = x)$, where P(x) is some function of x. The mean of a random variable is sometimes called the expected value and is usually denoted by either μ or E(X): $E(X) = \sum_i x_i P(x_i) = x_1 P(x_1) + x_2 P(x_2) + x_3 P(x_3) + \cdots$

5 The variance of a discrete distribution is equal to the sample variance that would be obtained for a very large sample from the distribution. The square root of the variance, σ, is the standard deviation of the distribution. Example: Two coins are tossed at a time; the possible outcomes are (HH, HT, TH, TT). If the random variable X is the number of heads, then p(x=0) = 1/4, p(x=1) = 2/4 = 1/2, p(x=2) = 1/4, and $E(X) = \sum x\,p(x) = 0(1/4) + 1(1/2) + 2(1/4) = 1/2 + 1/2 = 1$.
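The two-coin example can be checked numerically. The following is a minimal sketch, assuming fair coins (as the probabilities 1/4, 1/2, 1/4 imply); the variable names are illustrative and not part of the lecture.

```python
from fractions import Fraction

# Probability function for X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# A valid discrete distribution must have probabilities that sum to 1.
assert sum(pmf.values()) == 1

# Mean: sum of x * P(x). Variance: sum of (x - mean)^2 * P(x).
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
std_dev = float(variance) ** 0.5

print(mean, variance, std_dev)
```

Exact fractions are used so that the mean comes out as exactly 1 and the variance as exactly 1/2, with no floating-point rounding.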

6 The Hypergeometric Distribution The hypergeometric distribution arises when a random sample of size n is taken without replacement from a population of N units. If the population contains R units with a certain characteristic, then the probability that the sample will contain exactly x units with the characteristic is $P(x) = {}^{R}C_{x}\,{}^{N-R}C_{n-x} / {}^{N}C_{n}$, for x = 0, 1, …, min(n, R), where ${}^{a}C_{b}$ denotes the number of combinations of a objects taken b at a time. The mean and variance are $\mu = nR/N$ and $\sigma^2 = nR(N-R)(N-n) / [N^2 (N-1)]$.
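The mean and variance formulas can be checked against the probability function directly. This sketch uses illustrative values N = 20, R = 8, n = 5 that are assumptions for the demonstration, not figures from the lecture; `hypergeom_pmf` is a hypothetical helper name.

```python
from math import comb

def hypergeom_pmf(x, N, R, n):
    """P(X = x): x marked units in a sample of n drawn from N units, R of them marked."""
    return comb(R, x) * comb(N - R, n - x) / comb(N, n)

N, R, n = 20, 8, 5  # illustrative values only
xs = range(0, min(n, R) + 1)
probs = [hypergeom_pmf(x, N, R, n) for x in xs]

# Mean and variance computed by summing over the probability function.
mean = sum(x * p for x, p in zip(xs, probs))
var = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))

# Closed-form expressions from the slide.
mean_formula = n * R / N
var_formula = n * R * (N - R) * (N - n) / (N ** 2 * (N - 1))

print(mean, mean_formula)
print(var, var_formula)
```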

7 Hypergeometric Probability Distribution $P(x) = {}^{r}C_{x}\,{}^{N-r}C_{n-x} / {}^{N}C_{n}$, where N = total number of elements in the population; r = number of successes in the population (denoted R on the previous slide); N − r = number of failures in the population; n = number of trials (sample size); x = number of successes in the n trials; n − x = number of failures in the n trials.

8 Hypergeometric Probability Distribution Example Suppose we select 5 cards from an ordinary deck of playing cards. What is the probability of obtaining 2 or fewer hearts? Solution: N = 52, since there are 52 cards in a deck. r = 13, since there are 13 hearts in a deck. n = 5, since we randomly select 5 cards from the deck. x = 0 to 2, since our selection includes 0, 1, or 2 hearts. We plug these values into the hypergeometric formula as follows: p(x ≤ 2) = p(x=0) + p(x=1) + p(x=2) = 0.2215 + 0.4114 + 0.2743 = 0.9072 = 90.72%
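The card example can be reproduced with the standard library. A minimal sketch, where `p_hearts` is a hypothetical helper name introduced for illustration:

```python
from math import comb

N, r, n = 52, 13, 5  # deck size, hearts in the deck, cards drawn

def p_hearts(x):
    """Probability of exactly x hearts among the n cards drawn without replacement."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

probs = {x: p_hearts(x) for x in range(3)}  # x = 0, 1, 2
total = sum(probs.values())

for x, p in probs.items():
    print(x, round(p, 4))
print("P(X <= 2) =", round(total, 4))
```

The individual terms come out to about 0.2215 for no hearts, 0.4114 for one heart, and 0.2743 for two hearts, giving a total of about 0.9072.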

9 (a) Hypergeometric Distributions. Can you verify these curves?

10 The Binomial Distribution Suppose that it is possible to carry out a certain type of trial and that, when this is done, the probability of observing a positive result is always p for each trial, irrespective of the outcome of any other trial. Then if n trials are carried out, the probability of observing exactly x positive results is given by the binomial distribution: $P(x) = {}^{n}C_{x}\, p^{x} (1-p)^{n-x}$, for x = 0, 1, 2, …, n. For example, when a single die is thrown, the probability of getting a 1 is 1/6. If p = 1/6 then 1 − p = 5/6. If the die is thrown n = 50 times, what are the probabilities of x = 1, 2, and 3?
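The die question can be answered by evaluating the binomial probability function. This is a small sketch assuming a fair die and independent throws; `binom_pmf` is a hypothetical helper name.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 50, 1 / 6  # 50 throws, probability 1/6 of a 1 on each throw

for x in (1, 2, 3):
    print(x, binom_pmf(x, n, p))
```

Because the expected number of 1s in 50 throws is np ≈ 8.3, seeing only 1, 2, or 3 of them is unlikely, so each of these probabilities is small.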

11 Case Study: Inspecting water samples. The number X of turbidity pollution in ppm has approximately the binomial distribution with n = 10 and p = 0.1. Find the probability that X is 1 or 2 in a sample of 10.
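One way to work the case study is to add the binomial probabilities for X = 1 and X = 2. A minimal sketch, with `binom_pmf` as a hypothetical helper name:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.1

p1 = binom_pmf(1, n, p)  # 10 * 0.1 * 0.9**9
p2 = binom_pmf(2, n, p)  # 45 * 0.1**2 * 0.9**8
p_1_or_2 = p1 + p2

print(round(p1, 4), round(p2, 4), round(p_1_or_2, 4))
```

The two terms are about 0.3874 and 0.1937, so the probability of 1 or 2 is about 0.5811.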

12 Mean and Standard Deviation If X has the binomial distribution with n observations and probability p of success on each observation, then the mean and standard deviation of X are $\mu = np$ and $\sigma = \sqrt{np(1-p)}$.

13 Case Study: Inspecting water samples. The number X of turbidity pollution in ppm has approximately the binomial distribution with n = 10 and p = 0.1. Find the mean and standard deviation of this distribution. $\mu = np = (10)(0.1) = 1$: the probability of each sample being bad is one tenth, so we expect (on average) to get 1 bad one out of the 10 sampled. $\sigma = \sqrt{np(1-p)} = \sqrt{(10)(0.1)(0.9)} = \sqrt{0.9} \approx 0.95$.
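The mean and standard deviation for n = 10 and p = 0.1 can also be obtained by summing over the probability function, which gives a check on the closed-form expressions np and sqrt(np(1 − p)). A short sketch, with illustrative variable names:

```python
from math import comb, sqrt

n, p = 10, 0.1

# Binomial probabilities for x = 0, 1, ..., n.
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

# Mean and standard deviation computed directly from the distribution.
mean = sum(x * px for x, px in enumerate(pmf))
sd = sqrt(sum((x - mean) ** 2 * px for x, px in enumerate(pmf)))

print(mean, n * p)
print(sd, sqrt(n * p * (1 - p)))
```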

14 (b) Binomial Distributions

15 Binomial Probability Distribution
- A fixed number of observations (trials), n, e.g., 15 tosses of a coin; 20 patients; 1000 people surveyed
- A binary random variable, e.g., head or tail in each toss of a coin; defective or not defective light bulb. The two outcomes are generally called "success" and "failure"; the probability of success is p and the probability of failure is 1 – p
- Constant probability for each observation, e.g., the probability of getting a tail is the same each time we toss the coin

16 Binomial distribution Intuitive explanation of the binomial distribution formula: Take the example of 5 coin tosses. What's the probability that you flip exactly 3 heads in 5 coin tosses?
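The intuition behind the formula can be illustrated by listing every possible sequence of 5 tosses and counting those with exactly 3 heads. This sketch assumes a fair coin (p = 0.5), which the slide does not state explicitly.

```python
from itertools import product
from math import comb

# All 2**5 = 32 equally likely sequences of heads (H) and tails (T).
outcomes = list(product("HT", repeat=5))

# Keep the sequences that contain exactly 3 heads.
favourable = [o for o in outcomes if o.count("H") == 3]

prob_by_counting = len(favourable) / len(outcomes)
prob_by_formula = comb(5, 3) * 0.5 ** 3 * 0.5 ** 2

print(len(favourable), len(outcomes), prob_by_counting, prob_by_formula)
```

The number of favourable sequences equals the combinations term 5C3 = 10, and each sequence has probability (1/2)^5 = 1/32, so both routes give 10/32 = 0.3125.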

17 Example 2 As visitors exit the park on April 20, you ask a representative random sample of 6 visitors whether they experience pollen allergy in the park. If the true percentage of visitors who say yes is 55.1%, what is the probability that exactly 2 of them have the allergy and 4 of them do not?
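Treating each visitor's answer as an independent trial with p = 0.551, the question is a direct application of the binomial formula with n = 6 and x = 2. A minimal sketch, where `binom_pmf` is a hypothetical helper name:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 0.551  # 6 visitors, 55.1% answer yes

# 6C2 * 0.551**2 * 0.449**4
prob_two_yes = binom_pmf(2, n, p)

print(round(prob_two_yes, 4))
```

The result is roughly 0.185, so about 18.5% of such samples would contain exactly 2 visitors with the allergy.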

18 Binomial distribution: example If I toss a coin 20 times, what's the probability of getting 2 or fewer heads?
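"2 or fewer heads" means adding the binomial probabilities for 0, 1, and 2 heads. The sketch below assumes a fair coin, so every sequence of 20 tosses has the same probability (1/2)^20 and only the combination counts differ.

```python
from math import comb

n = 20  # number of tosses of a fair coin (p = 0.5 is an assumption)

# Number of sequences with 0, 1, or 2 heads: 1 + 20 + 190.
favourable = sum(comb(n, x) for x in range(3))

prob = favourable / 2 ** n

print(favourable, prob)
```

There are 211 favourable sequences out of 1,048,576, so the probability is about 0.0002.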

19 The Poisson Distribution One derivation of the Poisson distribution is as the limiting form of the binomial distribution as n tends to infinity and p tends to zero, with the mean μ = np remaining constant. The probability function is $P(x) = \mu^{x} e^{-\mu} / x!$, for x = 0, 1, 2, …, where μ represents the average number of occurrences in an interval, x represents the actual number of occurrences, and e is approximately 2.71828. The mean and variance are both equal to μ. In terms of events occurring in time, a Poisson distribution might arise for counts of the number of minor oil leakages in a region per month, or the number of cases per year of a rare disease in the same region. For events occurring in space, a Poisson distribution might arise for the number of rare plants found in randomly selected meter-square quadrats taken from a large area.
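The limiting relationship between the binomial and Poisson distributions can be seen numerically by holding μ = np fixed while n grows. This sketch uses μ = 2 and x = 3 as illustrative values (assumptions, not from the lecture); the helper names are hypothetical.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, mu):
    """Poisson probability of x occurrences when the mean is mu."""
    return mu ** x * exp(-mu) / factorial(x)

mu, x = 2.0, 3  # illustrative values only

# As n increases with p = mu / n, the binomial value approaches the Poisson value.
gaps = []
for n in (10, 100, 1000, 10000):
    gap = abs(binom_pmf(x, n, mu / n) - poisson_pmf(x, mu))
    gaps.append(gap)
    print(n, binom_pmf(x, n, mu / n), poisson_pmf(x, mu), gap)
```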

20 The Poisson Distribution

21 Example If new cases of avian flu in Asia are occurring at a rate of about 2 per month, then these are the probabilities that 0, 1, 2, 3, 4, … cases will occur in Asia in the next month:

22 Poisson Probability Table (μ = 2)
X    P(X)
0    .135
1    .27
2    .27
3    .18
4    .09
5    .036
…    …
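The table entries follow from the Poisson probability function with μ = 2 cases per month. A minimal sketch that regenerates them, with `poisson_pmf` as a hypothetical helper name:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Poisson probability of x occurrences when the mean is mu."""
    return mu ** x * exp(-mu) / factorial(x)

mu = 2  # average number of new cases per month

table = {x: poisson_pmf(x, mu) for x in range(6)}

for x, p in table.items():
    print(x, round(p, 3))
```

With μ = 2 the probabilities for X = 1 and X = 2 are equal (both about 0.27), because the ratio P(2)/P(1) is μ/2 = 1.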

23 Example: Poisson distribution Suppose that a rare disease has an incidence of 1 in 1000 person-years. Assuming that members of the population are affected independently, find the probability of k cases in a population of 10,000 (followed over 1 year) for k = 0, 1, 2. The expected value (mean) is μ = np = 0.001 × 10,000 = 10, so 10 new cases are expected in this population per year.
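With μ = 10, the probabilities for k = 0, 1, 2 follow from the Poisson probability function. A short sketch, with `poisson_pmf` as a hypothetical helper name:

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """Poisson probability of k cases when the expected number is mu."""
    return mu ** k * exp(-mu) / factorial(k)

mu = 0.001 * 10_000  # expected cases per year = n * p = 10

probs = [poisson_pmf(k, mu) for k in (0, 1, 2)]

for k, pk in zip((0, 1, 2), probs):
    print(k, pk)
```

All three probabilities are small (about 0.00005, 0.0005, and 0.0023), which is expected when the mean is 10 cases.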

24 Home Assignment:
Hypergeometric Distribution: 7 students were selected from a Statistical Environment class of 45 students that contains 15 PhD students. What is the probability of obtaining 3 or fewer PhD students?
Binomial Distribution: What is the probability of getting 1 if a fair die is thrown 100 times?
Poisson Distribution: A colony of 200 people was checked for pollen allergy, and it was found that only 3 persons had the disease. What is the probability that 10 people in a population of 1000 people will have the problem?

