Lecture 8: Probabilities and distributions. Probability is the quotient of the number of desired events k and the total number of events n. If it is impossible to count k and n, we can apply the stochastic definition of probability: the probability of an event j is approximately the relative frequency of j during n observations.
What is the probability of winning in Duży Lotek? The number of desired events is 1. The number of possible events is the number of combinations of 6 numbers out of 49. In general, we need the number of combinations of k events out of a total of N events. Bernoulli distribution.
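The count of combinations can be checked directly. This is a minimal sketch that assumes Python's standard `math.comb`; the variable names are illustrative and do not come from the lecture.

```python
from math import comb

# Number of ways to choose 6 numbers out of 49 when order does not matter.
possible_draws = comb(49, 6)

# Exactly one of these draws matches the ticket in all six numbers.
p_jackpot = 1 / possible_draws

print(possible_draws)  # 13983816
print(p_jackpot)
```

Under this definition, "winning" means matching all six numbers, which is the only desired event counted on the slide.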
What is the probability of winning in Duży Lotek? The answer above is wrong! The hypergeometric distribution gives P = 0.0186. We need the probability that, in a sample of K = n + k elements drawn from a sample universe of N elements, exactly n have a desired property and k do not.
In Multi Lotek 20 numbers are drawn out of a total of 80. What is the probability that you have exactly 10 numbers correct? N = 80, K = 20, n = 10, k = 10.
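Both lottery questions can be sketched with one hypergeometric function. The function name and argument names below are hypothetical. Two assumptions are made that the slides do not state explicitly: for Duży Lotek, "winning" is taken to mean at least 3 correct numbers (this reproduces P = 0.0186), and for Multi Lotek the ticket is taken to mark 10 numbers.

```python
from math import comb

def hypergeom_pmf(hits, marked, population, draws):
    """Probability that exactly `hits` of the `marked` elements appear
    among `draws` elements drawn without replacement from `population`."""
    return (comb(marked, hits) * comb(population - marked, draws - hits)
            / comb(population, draws))

# Duży Lotek: 6 of 49 numbers are drawn and the ticket marks 6 numbers.
# Assumption: a win is any ticket with at least 3 correct numbers.
p_any_win = sum(hypergeom_pmf(h, 6, 49, 6) for h in range(3, 7))
print(round(p_any_win, 4))  # 0.0186

# Multi Lotek: 20 of 80 numbers are drawn.
# Assumption: the ticket marks 10 numbers and all 10 must be among the 20 drawn.
p_ten_correct = hypergeom_pmf(10, 10, 80, 20)
print(p_ten_correct)
```

In the slide's notation, the second call corresponds to N = 80, K = 20, n = 10 drawn numbers that are on the ticket and k = 10 drawn numbers that are not.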
Assessing the number of infected persons and assessing total population size: capture-recapture methods. We take a sample of animals or plants and mark them. We then take a second sample and count the number of marked individuals. The frequency of marked animals in the second sample should equal their frequency within the total population. Assumptions: closed population, random catches, random dispersal, and marked animals do not differ in behaviour. Example: N real = 38.
The two-sample case: How many persons have a certain infectious disease? You take two samples and count the number of infected persons in the first sample (m1), the number in the second sample (m2), and the number of infected persons noted in both samples (k).
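The capture-recapture idea can be sketched as follows. If the share of the second sample that was already seen, k / m2, equals the share of the first sample in the whole population, m1 / N, then N is approximately m1 * m2 / k. This is commonly called the Lincoln-Petersen estimator; the function name and the counts below are hypothetical and not taken from the lecture.

```python
def estimate_population(m1, m2, k):
    """Estimate the total number N from two samples.

    m1: individuals found (or marked) in the first sample
    m2: individuals found in the second sample
    k:  individuals found in both samples
    """
    if k == 0:
        raise ValueError("no overlap between the samples: estimate is undefined")
    # Proportion argument: k / m2 = m1 / N, solved for N.
    return m1 * m2 / k

# Hypothetical counts, for illustration only.
print(estimate_population(50, 40, 10))  # 200.0
```

The estimate is only as good as the assumptions of a closed population and random sampling.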
In ecology we often need to compare the species composition of two habitats. Habitat A contains m species, habitat B contains l species, and k species occur in both. The species overlap is measured by the Soerensen index S. On its own, S does not tell us whether the overlap is large or small. To assess the expectation we construct a null model: both habitats contain species from a common species pool of n species. If the pool size n is known, we can estimate how many joint species k two random samples of size m and l out of n contain. The mathematical expectation gives the expected number of joint species. The probability distribution gives the probability of getting exactly k joint species.
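A minimal sketch of this null model follows. It assumes that each habitat is a random draw without replacement from the pool, so the number of joint species follows a hypergeometric distribution with expectation m * l / n. The Soerensen index is assumed to have its usual form 2k / (m + l). All function names and the example numbers are hypothetical.

```python
from math import comb

def soerensen(m, l, k):
    """Soerensen overlap of two species lists with m and l species, k shared."""
    return 2 * k / (m + l)

def expected_joint(m, l, n):
    """Expected number of joint species for random samples from a pool of n."""
    return m * l / n

def prob_joint(k, m, l, n):
    """Probability of exactly k joint species under the random null model."""
    if k < 0 or k > min(m, l):
        return 0.0
    return comb(m, k) * comb(n - m, l - k) / comb(n, l)

# Hypothetical example: pool of 100 species, habitats with 30 and 20 species.
n, m, l = 100, 30, 20
print(expected_joint(m, l, n))                                 # 6.0
print(sum(prob_joint(k, m, l, n) for k in range(15, l + 1)))   # P(k >= 15)
```

Summing `prob_joint` over all values at least as large as the observed k gives the probability of an overlap that large arising by chance alone.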
Ground beetle species of two poplar plantations and two adjacent wheat fields near Toruń (Ulrich et al. 2004, Annales Zool. Fenn.). Pool size: 90 to 110 species. There are many more species in common than expected by chance alone. The ecological interpretation is that ground beetles colonize fields and adjacent seminatural habitats in a similar manner. Ground beetles do not colonize according to ecological requirements (niches) but according to spatial neighbourhood.
Bayesian inference and maximum likelihood (Idź na całość, the Polish edition of the game show Let's Make a Deal)
The law of dependent probability: the theorem of Bayes. Thomas Bayes (1702-1761), Abraham de Moivre (1667-1754).
Total probability. [Figure: diagram showing A, B1, B2, B3 and N, labelled with P(B1), P(B2), P(B3) and P(A|B1), P(A|B2), P(A|B3).] Idź na całość: Assume we choose gate 1 (G1) at the first choice. We are looking for the probability p(G1|M3) that the car is behind gate 1 if we know that the moderator opened gate 3 (M3).
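The gate problem can be worked through with the law of total probability and Bayes' theorem. The sketch below rests on the usual assumptions of this game, which the slide does not spell out: the car is equally likely to be behind each gate, the moderator never opens the chosen gate or the gate with the car, and he picks at random when two gates are available to him. The variable names are illustrative.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind each of the three gates.
prior = {gate: Fraction(1, 3) for gate in (1, 2, 3)}

# P(M3 | car behind gate g), given that we chose gate 1.
likelihood_m3 = {
    1: Fraction(1, 2),  # moderator may open gate 2 or 3, picks at random
    2: Fraction(1),     # moderator is forced to open gate 3
    3: Fraction(0),     # moderator never reveals the car
}

# Total probability of the moderator opening gate 3.
p_m3 = sum(likelihood_m3[g] * prior[g] for g in prior)

# Bayes' theorem for each possible car position.
posterior = {g: likelihood_m3[g] * prior[g] / p_m3 for g in prior}

print(posterior[1], posterior[2])  # 1/3 2/3
```

Under these assumptions p(G1|M3) = 1/3, so switching to gate 2 doubles the chance of winning.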
Calopteryx splendens. We study the occurrence of the damselfly Calopteryx splendens at small rivers. We know from the literature that C. splendens occurs at about 10% of all rivers. Occurrence depends on water quality. Suppose we have five quality classes that occur in 10% (class I), 15% (class II), 27% (class III), 43% (class IV), and 5% (class V) of all rivers. The probability of finding Calopteryx in these five classes is 1% (class I), 7% (class II), 14% (class III), 31% (class IV), and 47% (class V). To which class does a river most probably belong if we find Calopteryx? p(class II|A) = 0.051, p(class III|A) = 0.183, p(class IV|A) = 0.647, p(class V|A) = 0.114. Indicator values.
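The posterior probabilities can be reproduced from the class frequencies and detection probabilities given in the example. This is a small sketch; the dictionary and variable names are illustrative.

```python
# Frequency of each water quality class among all rivers (prior).
priors = {"I": 0.10, "II": 0.15, "III": 0.27, "IV": 0.43, "V": 0.05}

# Probability of finding Calopteryx (event A) in a river of each class.
likelihoods = {"I": 0.01, "II": 0.07, "III": 0.14, "IV": 0.31, "V": 0.47}

# Total probability of finding Calopteryx at a randomly chosen river.
p_a = sum(priors[c] * likelihoods[c] for c in priors)

# Bayes' theorem: p(class | A) = p(A | class) * p(class) / p(A).
posteriors = {c: priors[c] * likelihoods[c] / p_a for c in priors}

for c, p in posteriors.items():
    print(f"p(class {c}|A) = {p:.3f}")
```

The same calculation gives p(class I|A) of about 0.005, so class IV is by far the most probable class for a river where Calopteryx is found.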
Bayes and forensics. The false positive fallacy (the error of the prosecutor): Let’s take a standard DNA test for identifying persons. The test has a precision of more than 99%. What is the probability that we identify the wrong person? The forensic version of Bayes' theorem.
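A rough numerical illustration of the false positive fallacy follows. All numbers are hypothetical and chosen only for illustration: a pool of 10,000 possible persons containing exactly one true source, a test that always matches the true source, and a false positive rate of 1% (read here as the complement of a 99% precision). None of these values comes from the lecture.

```python
def p_source_given_match(prior, p_match_if_source, p_match_if_not_source):
    """Bayes' theorem: probability that a matching person is the true source."""
    evidence = (p_match_if_source * prior
                + p_match_if_not_source * (1 - prior))
    return p_match_if_source * prior / evidence

# Hypothetical inputs (assumptions, not data from the lecture).
pool_size = 10_000          # persons who could be the source
prior = 1 / pool_size       # one true source in the pool
false_positive_rate = 0.01  # assumed 1% chance of a wrong match

p_right = p_source_given_match(prior, 1.0, false_positive_rate)
print(f"p(right person | match) = {p_right:.4f}")
print(f"p(wrong person | match) = {1 - p_right:.4f}")
```

With these assumed numbers a match points to the wrong person in roughly 99 of 100 cases, even though the test itself is wrong in only 1 of 100 applications.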
The error of the advocate: In the trial of the football star O. J. Simpson, one of his advocates (a Harvard professor) argued that Simpson had sometimes beaten his wife. However, only very few men who beat their wives later murder them (about 0.1%).
Maximum likelihood. Suppose you studied 50 patients in a clinical trial and detected the presence of a certain bacterial disease in 30 of them. What is the most probable frequency of this disease in the population? We look for the maximum value of the likelihood function.
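The maximum of the likelihood function can be located numerically. The sketch below assumes a binomial likelihood for 30 cases among 50 patients and searches a simple grid of candidate frequencies; the function name and the grid resolution are arbitrary choices for illustration.

```python
from math import comb

def likelihood(p, n=50, x=30):
    """Binomial likelihood of observing x cases among n patients
    if the true frequency of the disease is p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Candidate frequencies 0.000, 0.001, ..., 1.000.
grid = [i / 1000 for i in range(1001)]

# The maximum likelihood estimate is the candidate with the largest likelihood.
p_hat = max(grid, key=likelihood)
print(p_hat)  # 0.6
```

The grid search agrees with the analytical result obtained by setting the derivative of the likelihood to zero, which gives p = x / n = 30 / 50.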
Homework and literature. Refresh: probability; permutations, variations, combinations; Bernoulli event; Pascal triangle, binomial coefficients; dependent probability; independent probability; derivative and integral of power functions. Prepare for the next lecture: arithmetic, geometric, and harmonic mean; Cauchy inequality; statistical distribution; probability distribution; moments of distributions; error law of Gauß. Literature: http://www.brixtonhealth.com/CRCaseFinding.pdf