Probability and Induction


Probability Probability is a measure of the chances that something will happen. A fair coin has a 50% probability of landing heads. There is a 20% probability that it will rain tomorrow. There is a 37.6% chance that Horacio Cartes will win the presidency of Paraguay.

Frequency Probability is not the frequency that something happens. Frequency of coin landing heads = [# of heads ÷ # of total flips]

Frequency The frequency of heads over a large number of coin flips will be close to the probability of a coin landing heads. But: Frequency over a small number of flips may be very different from probability. Some events only happen once, like the 2013 election in Paraguay.
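The gap between frequency and probability can be seen in a small simulation. This is only a sketch: the function name `heads_frequency` and the choice of flip counts are made up for illustration, and the "coin" is Python's pseudo-random number generator rather than a real coin.

```python
import random


def heads_frequency(num_flips, p_heads=0.5, seed=None):
    """Return (# of heads) / (# of total flips) for simulated coin flips."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(num_flips) if rng.random() < p_heads)
    return heads / num_flips


if __name__ == "__main__":
    # Small runs can land far from 0.5; large runs tend to land close to it.
    for n in (10, 100, 10_000, 100_000):
        print(n, heads_frequency(n, seed=n))
```

With only 10 flips, a frequency of 0.3 or 0.7 is unremarkable; with 100,000 flips, the frequency should sit very near the probability of 0.5.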

What Is Probability? People disagree on this question. One idea is that it’s a type of uncertainty, which shows up in the bets I’m willing to make. A 50% probability of a coin landing heads means I’ll bet \$10 that it will land heads only if the bet pays \$10 or more when I win. A 20% probability of it raining tomorrow means I’ll bet \$10 that it will rain only if the bet pays \$40 or more.
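The two betting figures above follow from one break-even calculation. As a sketch, assume I win the payout with probability p and lose my stake with probability 1 - p; the bet breaks even when p × payout = (1 - p) × stake. The function name `break_even_payout` is invented for this illustration.

```python
from fractions import Fraction


def break_even_payout(stake, percent_chance):
    """Smallest winnings that make a bet of `stake` worth taking.

    Expected gain is p * payout - (1 - p) * stake, which is zero
    when payout = stake * (1 - p) / p.
    """
    p = Fraction(percent_chance, 100)
    return stake * (1 - p) / p


if __name__ == "__main__":
    print(break_even_payout(10, 50))  # coin landing heads
    print(break_even_payout(10, 20))  # rain tomorrow
```

Exact fractions are used instead of floating-point numbers so that 20% is treated as exactly 1/5.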

Formalizing Probability
We can use Pr(-) and our earlier logic to formalize probability claims:
Pr(A) = the probability of A happening
Pr(A & B) = the probability of A and B
Pr(A v B) = the probability of A or B
Pr(~B) = the probability of B not happening

Probability Axioms Probability theory is one of the simplest mathematical theories there is. There are only three basic laws (axioms). For all A, B:
1 ≥ Pr(A) ≥ 0
Pr(A v ~A) = 1
Pr(A v B) = Pr(A) + Pr(B) – Pr(A & B)
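To see the three axioms at work, here is a small check on one roll of a fair six-sided die. The die, the events A and B, and the helper `pr` are all assumptions chosen for illustration; events are modeled as sets of outcomes, so "&" is set intersection and "v" is set union.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))  # the six faces of one fair die


def pr(event):
    """Probability of an event, represented as a set of outcomes."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))


A = frozenset({2, 4, 6})  # "the roll is even"
B = frozenset({4, 5, 6})  # "the roll is greater than 3"
NOT_A = OUTCOMES - A      # "the roll is not even"

# Axiom 1: every probability lies between 0 and 1.
assert 0 <= pr(A) <= 1
# Axiom 2: A v ~A is certain.
assert pr(A | NOT_A) == 1
# Axiom 3: Pr(A v B) = Pr(A) + Pr(B) - Pr(A & B).
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)
```

Here Pr(A) = Pr(B) = 1/2 and Pr(A & B) = 1/3, so the third axiom gives Pr(A v B) = 1/2 + 1/2 - 1/3 = 2/3, which matches counting the four outcomes 2, 4, 5, 6 directly.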

Conjunction I won’t ask you to do any math, but you do need to remember a couple of facts for the final:
Pr(A & B) ≤ Pr(A)
Pr(A & B) ≤ Pr(B)
Note: “≤”, not “<”.

Conjunction Fallacy We learned this when we learned about the conjunction fallacy. (A & B) is never more probable than A, and never more probable than B: it is always either less probable or equally probable.

Question #2 2. Which of the following is most likely to happen?
a. There will not be a final exam in this class.
b. There will not be a final exam in this class, because the instructor has to leave the country.
c. Lingnan University closes and there will not be a final exam in this class.
d. There is not enough information to answer this question.


Special Cases Normally, (A & B) is less probable than A, as in the question on the exam. But sometimes they are equal, for instance when (A → B) or when B is always true. The probability that a thing x is a dog AND an animal is the same as the probability that it is a dog, because “x is a dog → x is an animal.”
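The dog example can be checked by counting. The tiny collection of things below is hypothetical, as is the helper `pr`; the only point is that every dog in it is also an animal, so "dog AND animal" picks out exactly the same things as "dog."

```python
from fractions import Fraction

# Hypothetical collection: each thing is described by the labels true of it.
THINGS = [
    {"dog", "animal"},
    {"dog", "animal"},
    {"cat", "animal"},
    {"rock"},
]


def pr(*labels):
    """Probability that a thing picked at random has every given label."""
    matches = sum(1 for thing in THINGS if set(labels) <= thing)
    return Fraction(matches, len(THINGS))


if __name__ == "__main__":
    print(pr("dog"), pr("dog", "animal"), pr("animal"))
```

In this collection Pr(dog & animal) equals Pr(dog), the special case, while Pr(dog & animal) is strictly less than Pr(animal), the normal case.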

For the Final So remember for the final: Pr(A & B) ≤ Pr(A) Pr(A & B) ≤ Pr(B) There is more than one question that tests these facts!

induction

Kinds of Inference C.S. Peirce, an American pragmatist philosopher, was the first to divide inferences into three types: deductive, abductive, and inductive.


Jars of Balls Imagine that we are reasoning about a jar full of a mix of red and black balls. Any ball can have each of three features (or its opposite):
J = it is in the jar
S = it is part of a random sample of balls taken from the jar
R = it is red

Deductive Arguments A deductive argument (also known as a logically valid argument) is an argument such that if the premises are true, then the conclusion must be true; the premises can’t all be true while the conclusion is false; if the conclusion is false, at least one of the premises must also be false.

Deduction Example Everything in the jar is red; this sample is taken from the jar; so everything in this sample is red. [Rule] All J’s are R’s. [Case] All S’s are J’s. Therefore, [Result] All S’s are R’s.

Notes on Terminology We have done a fair amount of deductive logic so far. The Final contains language like “construct a natural deduction from the premises ________ to the conclusion ________.” This just means “prove.” All the proof questions are optional.

Abductive Arguments An abductive argument, unlike a deductive argument, is ‘ampliative’: the conclusion goes beyond what is contained in the premises.

Abductive Arguments Abductive arguments are also called ‘inferences to the best explanation.’ We observe some phenomenon, think about all the different ways it could have been produced, and conclude that it was produced in the way that is most plausible, best fits with our other theories, etc.

Abduction Example Everything in the jar is red; everything in the sample is red; so the sample came from the jar. [Rule] All J’s are R’s. [Result] All S’s are R’s. Therefore, [Case] All S’s are J’s.

Generalized Abduction
X% of what’s in the jar is red; Y% of what’s in the sample is red; so the sample came from the jar. [Rule] X% of J’s are R’s. [Result] Y% of S’s are R’s. Therefore, [Case] All S’s are J’s.

Inductive Arguments Inductive arguments are also ‘ampliative’ in that the truth of their premises does not guarantee the truth of their conclusions.

Inductive Arguments Inductive arguments reason from what’s true of a sample, to what’s true of the population as a whole. Polling is an example of induction, as are inferences from past experience to future experience (since what’s past is only a sample of what happens).

Induction Example This sample is taken from the jar; everything in the sample is red; so everything in the jar is red. [Case] All S’s are J’s. [Result] All S’s are R’s. Therefore, [Rule] All J’s are R’s.

Generalized Induction
This sample is taken from the jar; Y% of what’s in the sample is red; so X% of what’s in the jar is red. [Case] All S’s are J’s. [Result] Y% of S’s are R’s. Therefore, [Rule] X% of J’s are R’s.

Enumerative Induction
A common form induction takes is what is known as enumerative induction:
a1 is F and G
a2 is F and G
…
an is F and G
Therefore, everything that is F is G.

Example of Enumerative Induction
Fruit #1 is a durian that smells bad. Fruit #2 is a durian that smells bad. … Fruit #563 is a durian that smells bad. Fruit #564 is a durian that smells bad. Therefore, All durians smell bad.

Need Not Infer to a Rule Fruit #1 is a durian that smells bad. Fruit #2 is a durian that smells bad. … Fruit #563 is a durian that smells bad. Fruit #564 is a durian that smells bad. Therefore, Fruit #565, if it’s a durian, will smell bad.

Strength and Weakness We often do not use the word “valid” to describe inductive arguments. An argument is deductively valid when IF the premises are true, THEN the conclusion must be true. Inductive arguments are not deductively valid.

Strength and Weakness Instead, we can say that some inductive arguments are strong and others are weak. An inductive argument A1, A2, …; therefore C is strong if the probability of its conclusion, given its premises, is very high, that is, close to 1: Pr(C / A1 & A2 & …) ≈ 1.

Strength Strength: assuming the premises are true, the probability of the conclusion is very high. This does not mean that any argument with a high-probability conclusion is inductively strong. Sometimes Pr(C) >> Pr(C / A)

Example For example, it’s highly probable that I will not win Mark 6. But this is not a strong inductive argument: I correctly guessed the first 5 numbers announced in Mark 6; therefore, I will not win Mark 6.

A Note on Terminology On the final, you will encounter both the phrases “inductively strong” and “inductively valid.” These phrases mean the same thing. It’s a personal preference to use one or the other.

“Generalized Deduction”
Another form that induction can take is not often discussed by philosophers: it’s the ‘generalized’ form of the deductive argument: [Rule] X% of J’s are R’s. [Case] All S’s are J’s. Therefore, [Result] Y% of S’s are R’s.

Example 90% of French people like to eat snails. Pierre is a French person. Therefore, Pierre likes to eat snails. Maybe this isn’t induction? It’s definitely neither deduction nor abduction.

samples

Sample In statistics, the people who we are studying are called the sample. (Or if I’m studying the outcomes of coin flips, my sample is the coin flips that I’ve looked at. Or if I’m studying penguins, it’s the penguins I’ve studied.)

Induction on Samples Our goal is to use induction to infer from the statistical properties of the sample to the statistical properties of the population. For example, we want to infer from the fact that in a sample of 100, those who drank wine lived on average 2 years longer, to the conclusion that on average, people in the population as a whole who drink wine live 2 years longer.

Statistical Claims Claims about the population are usually made like this: “We are 90% confident that people who drink wine live between 1.8 and 2.1 years longer than people who don’t drink wine.” They present a 90 (or 95 or 99) percent confidence interval.

Margin of Error Usually, we want a 95% confidence interval. Thus, we have a special name for 95% confidence intervals: “margins of error.” If someone says: “37% will vote for Cartes with a 3% margin of error” what they mean is that they are 95% sure that between 34% and 40% of people will vote for Cartes.
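A rough way to see where a figure like "3%" comes from is the standard normal-approximation formula for a sample proportion: margin = z × sqrt(p(1 - p) / n), with z ≈ 1.96 for 95% confidence. The sketch below assumes a simple random sample and a hypothetical sample size of 1,000 respondents; the slide does not say how many people were polled or which method the pollster used.

```python
import math

Z_95 = 1.96  # standard normal value for a two-sided 95% interval


def margin_of_error(sample_proportion, sample_size, z=Z_95):
    """Half-width of the approximate confidence interval for a proportion."""
    variance = sample_proportion * (1 - sample_proportion) / sample_size
    return z * math.sqrt(variance)


def confidence_interval(sample_proportion, sample_size, z=Z_95):
    """Return (low, high) bounds around the sample proportion."""
    moe = margin_of_error(sample_proportion, sample_size, z)
    return sample_proportion - moe, sample_proportion + moe


if __name__ == "__main__":
    # Hypothetical poll: 37% of 1,000 respondents say they will vote for Cartes.
    print(margin_of_error(0.37, 1000))
    print(confidence_interval(0.37, 1000))
```

Under these assumptions the margin comes out at roughly 0.03, which gives an interval of about 34% to 40%.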

Sample Size Determination
One question we should know the answer to is this: how many people do we need in the sample to determine the statistical properties of the population with a low margin of error?

Law of Large Numbers Luckily, we do know that more is always better. The “Law of Large Numbers” says that if you make a large number of observations, the results should be close to the expected value. (There is no “Law of Small Numbers”)

Average of Dice Rolls
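The Law of Large Numbers can be illustrated with simulated dice. This is a sketch using Python's pseudo-random generator, with an invented function name `average_of_rolls`; the expected value of one fair six-sided die is (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5.

```python
import random

EXPECTED_VALUE = 3.5  # mean of the faces 1 through 6


def average_of_rolls(num_rolls, seed=None):
    """Average of `num_rolls` simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)
    total = sum(rng.randint(1, 6) for _ in range(num_rolls))
    return total / num_rolls


if __name__ == "__main__":
    # The average wanders for small numbers of rolls and settles near 3.5.
    for n in (5, 50, 500, 5_000, 50_000):
        print(n, average_of_rolls(n, seed=n))
```

The exact printed averages depend on the seed, but the averages for the larger runs should be much closer to 3.5 than those for the smaller runs.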

Example Let’s think about a particular problem. Suppose we are having an election between Alegre and Cartes and we want to know how many people in the population plan to vote for Alegre. How many people do we need to ask?

Non-Random Samples The first thing we should realize is that it’s not going to do us any good to ask a non-random group of people. Suppose everyone who goes to ILoveAlegre.com is voting for Alegre. If I ask them, it will seem like 100% of the population will vote for Alegre, even if only 3% will really vote for him.
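The Alegre example can be acted out in a few lines. Everything here is an illustrative assumption: a made-up population of 10,000 voters of whom 3% support Alegre, and a website whose visitors are exactly those supporters.

```python
import random

POPULATION_SIZE = 10_000
ALEGRE_VOTERS = 300  # 3% of the hypothetical population

# True means "plans to vote for Alegre".
population = [i < ALEGRE_VOTERS for i in range(POPULATION_SIZE)]

# Assumption: the only people visiting the fan site are Alegre voters.
site_visitors = [vote for vote in population if vote]


def share_for_alegre(respondents):
    """Fraction of respondents who plan to vote for Alegre."""
    return sum(respondents) / len(respondents)


if __name__ == "__main__":
    rng = random.Random(0)
    random_sample = rng.sample(population, 1000)
    print("site poll:  ", share_for_alegre(site_visitors))
    print("random poll:", share_for_alegre(random_sample))
```

The site poll reports 100% support no matter how many visitors answer, while the random poll lands near the true 3%.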

Internet Polls (Important Critical Thinking Lesson: Internet polls are not trustworthy. They are biased toward people who have the internet, people who visit the site that the poll is on, and people who care enough to vote on a useless internet poll.)

Selection Bias Why do internet polls exist, if they aren’t accurate or trustworthy? Often because the people putting up the poll do not want accurate results. If I put up a poll on my blog about whether it’s wrong to deny the right of abode to helpers, then people who agree with me (the only people who read my blog) will vote. Then I will have fake “evidence” that I’m right.

Representative Samples
The opposite of a biased sample is a representative sample. A perfectly representative sample is one where if n% of the population is X, then n% of the sample is X, for every X. For example, if 10% of the population smokes, 10% of the sample smokes.

Random Sampling One way to get a representative sample is to randomly select people from the population, so that each has a fair and equal chance of ending up in the sample. For example, when we randomize our experiments, we randomly sample the participants to obtain our experimental group. (Ideally our participants are randomly sampled from the population at large.)

Problems with Random Sampling
Random sampling isn’t a cure-all, however. For example, if I randomly select 10 people from a (Western) country, on average I’ll get 5 men and 5 women. On average. But, on any particular occasion, I might select (randomly) 7 men and 3 women, or 4 men and 6 women.

Stratified Sampling One way to fix these problems would be to randomly sample 5 women and randomly sample 5 men. Then I would always have an even split between men and women, and my men would be randomly drawn from the group of men, while my women were randomly drawn from the group of women.
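The difference between simple random sampling and stratified sampling can be sketched as follows. The population of 500 men and 500 women and all the function names are hypothetical, chosen only to mirror the example above.

```python
import random
from collections import Counter

# Hypothetical population: each person is a (group, id) pair.
POPULATION = [("man", i) for i in range(500)] + [("woman", i) for i in range(500)]


def simple_random_sample(population, size, rng):
    """Draw `size` people, each equally likely to be chosen."""
    return rng.sample(population, size)


def stratified_sample(population, per_group, rng):
    """Draw `per_group` people at random from each group separately."""
    groups = {}
    for person in population:
        groups.setdefault(person[0], []).append(person)
    sample = []
    for members in groups.values():
        sample.extend(rng.sample(members, per_group))
    return sample


def group_counts(sample):
    """Count how many people in the sample belong to each group."""
    return Counter(group for group, _ in sample)


if __name__ == "__main__":
    rng = random.Random(1)
    for _ in range(5):
        simple = group_counts(simple_random_sample(POPULATION, 10, rng))
        strat = group_counts(stratified_sample(POPULATION, 5, rng))
        print("simple:", dict(simple), "stratified:", dict(strat))
```

The simple random samples can come out uneven (say 7 and 3), while the stratified samples are always split 5 and 5, with each half still chosen at random from its own group.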

How Many People? So to return to the question: how many people do we need to include in a poll or an experiment before we can infer to the population at large (with a high degree of confidence that an effect is in a narrow range)? It depends on the population size!

How Many People This is the number of randomly sampled respondents one needs, in a population of N, to get an answer with a 3%, 5%, or 10% margin of error. The important thing to notice is that as the population gets bigger and bigger, the corresponding samples don’t get that much bigger.
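The table for this slide is not reproduced in the transcript, but numbers of this kind can be approximated with a standard textbook formula. The sketch below uses Cochran's sample-size formula with a finite population correction, assuming 95% confidence (z = 1.96) and the most cautious proportion p = 0.5; these are common defaults and not necessarily the ones behind the original table.

```python
import math


def required_sample_size(population_size, margin_of_error, z=1.96, proportion=0.5):
    """Approximate number of random respondents needed for a given margin of error."""
    # Sample size for an effectively infinite population.
    n_infinite = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction: smaller populations need fewer respondents.
    n_adjusted = n_infinite / (1 + (n_infinite - 1) / population_size)
    return math.ceil(n_adjusted)


if __name__ == "__main__":
    for population_size in (100, 1_000, 10_000, 1_000_000, 100_000_000):
        sizes = [required_sample_size(population_size, e) for e in (0.03, 0.05, 0.10)]
        print(population_size, sizes)
```

Under these assumptions, the required sample for a 5% margin of error levels off below 400 respondents however large the population becomes, which is the pattern described above.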