
1 The Scientific Method: Probability and Inferential Statistics

2 Probability and Inferential Statistics
Scientific investigations sample values of a variable to make inferences (or predictions) about all of its possible values in the population. BUT there is always some doubt as to whether observed values = population values, e.g. Jersey cow serum iron concentrations.
Inferential statistics quantify the doubt:
► What are the chances of conclusions based on a sample of the population holding true for the population as a whole?
► Are the conclusions safe? Will the prediction hold in most observed situations?
Probability is defined as a relative frequency or proportion – the chance of something happening out of a defined number of opportunities.

3 For example, probability can be expressed:
subjectively, as a % expectation of an event – a cow has a 60% chance of calving tonight (based on experience, but subject to individual opinion)
as an a priori probability – based on the theoretical model defining the set of all possible outcomes, e.g. when a coin is tossed, the probability of obtaining a head is ½ or 0.5
as a defined (empirical) probability – the proportion of times an event will occur in a very large number of trials (or experiments) performed under similar conditions, e.g. the proportion of times a guinea pig will have a litter of more than three, based upon the observed frequency of this event
All of these approaches are related mathematically.
Probabilities can be expressed as a percentage (23%), a fraction / proportion (23/100) or a decimal (0.23) – as parts of a whole (= parts of a unitised number of opportunities).

4 Two rules govern probabilities
Addition rule – when two events are mutually exclusive (they cannot occur at the same time), the probability of either of them occurring is the sum of the probabilities of each event, e.g. 1/5 + 1/5 = 2/5 or 0.4 for picking either of two particular biscuits out of 5 types.
Multiplication rule – when two events are independent, the probability of both events occurring is the product of their individual probabilities, e.g. a Friesian cow inseminated on a particular day has a probability of 0.5 of calving 278 days later (the mean gestation period) – she either calves that day or she doesn't! If two Friesian cows are inseminated on the same day, the probability of both of them calving on the same day 278 days later is 0.5 × 0.5 = 0.25.
Probability distributions can derive from discrete or continuous data; a discrete random variable with only two possible values (e.g. male/female) is called a binary variable.
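A minimal sketch of the two rules in Python, using the biscuit and calving numbers quoted on the slide (nothing else is assumed):

# Addition rule: mutually exclusive events, P(A or B) = P(A) + P(B)
p_biscuit_1 = 1 / 5   # probability of picking one particular biscuit type out of 5
p_biscuit_2 = 1 / 5   # probability of picking a second particular type
print(p_biscuit_1 + p_biscuit_2)   # 0.4

# Multiplication rule: independent events, P(A and B) = P(A) * P(B)
p_calves_day_278 = 0.5             # one cow calves exactly 278 days after insemination
print(p_calves_day_278 * p_calves_day_278)   # 0.25 for two cows calving on the same day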

5 The binomial distribution portrays the frequency distribution of data relating to an "all or none" event – whether an animal displays or does not display a characteristic, e.g. pregnant / not pregnant – or to counts taking only certain discrete values, e.g. the number of spots on ladybirds in a sample (3, 5, 7, 9, 11, 15, 18, 21 etc!).
For a continuous variable, the probability that its value lies within a particular interval is given by the relevant area under the curve of the probability density function.
The NORMAL (or Gaussian) DISTRIBUTION is a theoretical distribution of a continuous random variable (x) whose properties can be described mathematically by the mean (μ) and standard deviation (σ).
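A short sketch of both ideas in Python using scipy.stats; the sample size, success probability, mean and standard deviation below are made-up illustrative values, not figures from the lecture:

from scipy.stats import binom, norm

# Binomial: probability of exactly k "successes" in n independent all-or-none trials,
# each with success probability p (e.g. k pregnant animals out of n examined)
n, p = 10, 0.3                       # hypothetical values
print(binom.pmf(3, n, p))            # P(exactly 3 out of 10)
print(binom.cdf(3, n, p))            # P(3 or fewer out of 10)

# Normal: the probability that a continuous variable lies in an interval equals the
# area under the probability density function between the interval's endpoints
mu, sigma = 6.34, 0.44               # hypothetical mean and sd
print(norm.cdf(7.0, mu, sigma) - norm.cdf(6.0, mu, sigma))   # P(6.0 < x < 7.0)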

6 [Figure: areas under the normal curve] The proportions of the values of x lying within ±1, ±1.96 and ±2.58 standard deviations of the mean are approximately 68%, 95% and 99% respectively; virtually all (about 99.7%) of the data values lie within 3 sd units either side of the mean.
In a perfectly symmetrical normal distribution, MEAN, MEDIAN and MODE have the same value.
[Figure: normal distributions with the same value of the standard deviation (σ) but different values of the mean (μ)]
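These proportions can be confirmed numerically; a minimal sketch (Python with scipy.stats, standard normal distribution assumed):

from scipy.stats import norm

# Proportion of a normal distribution lying within +/- z standard deviations of the mean
for z in (1.0, 1.96, 2.58, 3.0):
    proportion = norm.cdf(z) - norm.cdf(-z)
    print(f"within +/-{z} sd: {proportion:.4f}")
# prints roughly 0.6827, 0.9500, 0.9901 and 0.9973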

7 It is possible to make predictions about the likelihood of the mean value of a variable differing from another mean value – whether the difference is likely or unlikely to be due to chance alone.
This is the basis of significance testing: if the distribution of observed values approximates to the normal distribution, it becomes possible to compare the means of variables with the theoretical distribution and estimate whether their observed difference is greater than would be expected by chance if the variables are truly normally distributed, e.g. Student's t test.
Worked example: we carry out an experiment on guinea pigs to test the hypothesis that dietary lipid sources rich in ω3 polyunsaturated fatty acids improve coat condition. We compare the breaking strength of hairs from two groups of 10 guinea pigs, one fed a normal mix and one fed a diet supplemented with cod liver oil, recording the maximum weight their hair will support as tensile strength in g. We want to decide whether the mean strength of hairs from the control and experimental groups differs significantly at the end of the trial.

8 Calculating the t statistic
First we must calculate the sample mean, variance and standard deviation for each data set (control and test); the steps for doing this manually are best set out in a table. If the control group is referred to as a and the test group as b, then the t statistic is calculated as set out below.
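The formula itself appears as an image in the original slides; judging from the working on the following slides, it is presumably the standard pooled-variance two-sample form:

s^2 = \frac{\sum (X_a - \bar{X}_a)^2 + \sum (X_b - \bar{X}_b)^2}{n_a + n_b - 2}

t = \frac{\bar{X}_a - \bar{X}_b}{\sqrt{s^2 \left( \frac{1}{n_a} + \frac{1}{n_b} \right)}}, \qquad \text{degrees of freedom} = n_a + n_b - 2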

9 Calculating the t statistic
Hair tensile strength (g) of guinea pig control group a and test group b, with deviations from the group means:

Control a (Xa)   Xa − X̄a   (Xa − X̄a)²      Test b (Xb)   Xb − X̄b   (Xb − X̄b)²
6.4               0.06       0.004            8.1            0.62       0.384
6.7               0.36       0.130            6.4           −1.08       1.166
5.6              −0.74       0.548            6.8           −0.68       0.462
6.8               0.46       0.212            7.2           −0.28       0.078
6.0              −0.34       0.116            6.5           −0.98       0.960
6.3              −0.04       0.002            8.8            1.32       1.742
6.1              −0.24       0.058            6.7           −0.78       0.608
7.0               0.66       0.436            8.0            0.52       0.270
5.9              −0.44       0.193            8.4            0.92       0.846
6.6               0.26       0.067            7.9            0.42       0.176
∑ Xa = 63.4                  ∑ = 1.77         ∑ Xb = 74.8               ∑ = 6.69
na = 10, X̄a = 6.34                            nb = 10, X̄b = 7.48
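A sketch of the same bookkeeping in Python (data copied from the table above; the printed table rounds each squared deviation, so the totals agree only to two or three decimal places):

control_a = [6.4, 6.7, 5.6, 6.8, 6.0, 6.3, 6.1, 7.0, 5.9, 6.6]
test_b    = [8.1, 6.4, 6.8, 7.2, 6.5, 8.8, 6.7, 8.0, 8.4, 7.9]

def summarise(values):
    # Sample size, sample mean and sum of squared deviations from the mean
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)
    return n, mean, ss

print(summarise(control_a))   # (10, 6.34, ~1.77)
print(summarise(test_b))      # (10, 7.48, ~6.70; the slide table rounds this to 6.69)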

10 Calculating the t statistic
Calculating the variance, then the standard deviation, and finally the t statistic itself (the working is shown on the slide). The degrees of freedom of the data set = na + nb − 2 = 18. We can ignore the −ve sign of t! We then compare our calculated value of t with those in the table of critical values of t.
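Continuing the sketch above, a pooled-variance version of the calculation might look like the following. This assumes the pooled form given earlier, so the exact value should be checked against the slide's own working; scipy.stats.ttest_ind gives an equivalent ready-made answer.

import math
from scipy import stats

n_a, mean_a, ss_a = summarise(control_a)
n_b, mean_b, ss_b = summarise(test_b)

df = n_a + n_b - 2                                        # degrees of freedom = 18
pooled_var = (ss_a + ss_b) / df                           # pooled sample variance
se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))     # standard error of the difference in means
t = (mean_a - mean_b) / se_diff
print(abs(t), df)                                         # the sign of t can be ignored

# The same test in one call (equal variances assumed):
print(stats.ttest_ind(control_a, test_b, equal_var=True))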

11 Significance and confidence
The table of critical values gives the significance levels for the t statistic at 10%, 5%, 1% and 0.1% (from left to right).
If our value of t (2.78) exceeds the tabulated value of t for 18 df – which it does for p = 0.05, but not for p = 0.01 – we can say "the means are different at the 5% level of significance" and we can reject H0 (the null hypothesis of no difference between the two treatments).
The confidence level is simply 100 − (significance level), so alternatively we could say: "we can be 95% confident that there is a significant difference between the two means".
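The critical values themselves can be looked up in software rather than a printed table; a brief sketch (two-tailed test, 18 df, scipy.stats assumed):

from scipy import stats

df = 18
for significance in (0.10, 0.05, 0.01, 0.001):
    # Two-tailed critical value: half the significance level goes in each tail
    critical = stats.t.ppf(1 - significance / 2, df)
    print(f"{significance}: reject H0 if |t| > {critical:.3f}")
# roughly 1.734, 2.101, 2.878 and 3.922 for 18 df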

