Statistics can be thought of as a whole subject or discipline... It can be thought of as the methods used to collect, process and/or interpret data... It can be thought of as the collections of data gathered by those methods... It can also be thought of as specially calculated figures (e.g. averages) that characterize a collection...
"Statistics are like a bikini: what is revealed is interesting; what is concealed is crucial." - R. Taylor
Statistics is the science and art of making decisions based on quantitative evidence.
“The most fundamental principle of all in gambling is simply equal conditions, e.g. of opponents, of bystanders, of money, of situation, of the dice box, and of the die itself. To the extent to which you depart from that equality, if it is in your opponent’s favor, you are a fool, and if in your own, you are unjust.” Girolamo Cardano 1501 – 1576
PROPOSITION IV “Suppose now that I am playing against someone with the agreement that the first of us to win three times will take the stake. And suppose that I have already won twice and my opponent has already won once. I want to know how much of the money should fall to me if we do not wish to continue the game, but rather to divide equitably the money we are playing for.” Christiaan Huygens 1629 – 1695
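Huygens's question can be worked out numerically. The sketch below is one way to do it, assuming each remaining round is equally likely to go to either player (an assumption in the spirit of Cardano's "equal conditions", not something the quotation states). The function name `fair_share` is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def fair_share(i_need: int, opponent_needs: int) -> Fraction:
    """Fraction of the stake owed to me when I still need `i_need` wins
    and my opponent still needs `opponent_needs` wins.

    Assumes every round is a fair 50/50 contest (illustrative assumption).
    """
    if i_need == 0:          # I have already won the match
        return Fraction(1)
    if opponent_needs == 0:  # my opponent has already won the match
        return Fraction(0)
    half = Fraction(1, 2)
    # Average over the two equally likely outcomes of the next round.
    return (half * fair_share(i_need - 1, opponent_needs)
            + half * fair_share(i_need, opponent_needs - 1))


# First to three wins; I have won twice, my opponent once,
# so I need 1 more win and my opponent needs 2.
share = fair_share(1, 2)
print(share)  # 3/4
```

Under that assumption, three quarters of the stake should go to the player who is ahead.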
Descriptive Statistics: Involves organizing, summarizing, and displaying data.
Inferential Statistics: Involves using sample data to draw conclusions about a population.
The objective of descriptive statistics methods is to summarize a set of observations. The objective of inferential statistics methods is to make inferences (predictions, decisions) about a population based on information contained in a sample, and to quantify the level of uncertainty in our decisions.
Decide which part of the study represents the descriptive branch of statistics. What conclusions might be drawn from the study using inferential statistics? A large sample of men, aged 48, was studied for 18 years. For unmarried men, approximately 70% were alive at age 65. For married men, 90% were alive at age 65.
Descriptive statistics involves statements such as “For unmarried men, approximately 70% were alive at age 65” and “For married men, 90% were alive at 65.” A possible inference drawn from the study is that being married is associated with a longer life for men.
We will follow: Logic, Statistics (Descriptive), Probability, Statistics (Inferential)
An ARGUMENT is a sequence of statements, one of which is called the CONCLUSION. The other statements are PREMISES (assumptions). The argument presents the premises, collectively, as evidence that the conclusion is true. Example: If A is true then B is true. A is true. Therefore, B is true.
The CONCLUSION is that B is true. The PREMISES are If A is true then B is true and A is true. The premises support the conclusion that B is true. The word "therefore" is not part of the conclusion: It is a signal that the statement after it is the conclusion. The words thus, hence, so, and the phrases it follows that, we see that, and so on, also flag conclusions. The words suppose, let, given, assume, and so on, flag premises. A concrete argument of the form just given might be: If it is sunny, I will wear sandals. It is sunny. Therefore, I will wear sandals. Here, A is "it is sunny" and B is "I will wear sandals." We usually omit the words "is true." So, for example, the previous argument would be written If A then B. A. Therefore, B. The statement not A means A is false.
An argument is VALID if the conclusion must be true whenever the premises are true. If an argument is valid and its premises are true, the argument is SOUND. Cheese more than a billion years old is stale. The Moon is made of cheese. The Moon is more than a billion years old. Therefore, the Moon is stale cheese. VALID but NOT SOUND!
Valid argument forms:
A or not A. (Law of the excluded middle)
Not (A and not A).
A. Therefore, A or B.
A. B. Therefore, A and B.
A and B. Therefore, A.
Not A. Therefore, not (A and B).
A or B. Not A. Therefore, B. (Denying the disjunct)
Not (A and B). Therefore, (not A) or (not B). (De Morgan)
Not (A or B). Therefore, (not A) and (not B). (De Morgan)
If A then B. A. Therefore, B. (Affirming the antecedent, modus ponendo ponens, "affirming by affirming")
If A then B. Not B. Therefore, not A. (Denying the consequent, modus tollendo tollens, "denying by denying")
Invalid argument forms:
A or B. Therefore, A.
A or B. A. Therefore, not B. (Affirming the disjunct)
Not both A and B are true. Not A. Therefore, B.
If A then B. B. Therefore, A.
If A then B. Not A. Therefore, not B.
If A then B. C. Therefore, B.
If A then B. Not C. Therefore, not A.
If A then B. A. Therefore, C.
If A then B. Not B. Therefore, not C.
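Whether a form is valid can be checked mechanically by trying every combination of truth values: a form is valid exactly when no assignment makes all premises true and the conclusion false. The following is a minimal sketch of that truth-table check; the helper names `implies` and `find_counterexample` are hypothetical and chosen only for illustration.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of 'if p then q'."""
    return (not p) or q


def find_counterexample(premises, conclusion, n_vars=2):
    """Return the first assignment that makes every premise true and the
    conclusion false, or None if no such assignment exists (form is valid)."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return values
    return None


# If A then B. A. Therefore, B.  (valid)
modus_ponens = find_counterexample(
    premises=[lambda a, b: implies(a, b), lambda a, b: a],
    conclusion=lambda a, b: b,
)

# If A then B. B. Therefore, A.  (invalid)
affirming_consequent = find_counterexample(
    premises=[lambda a, b: implies(a, b), lambda a, b: b],
    conclusion=lambda a, b: a,
)

print(modus_ponens)          # None
print(affirming_consequent)  # (False, True)
```

The counterexample (A false, B true) shows why the second form fails: both premises hold, yet the conclusion does not.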
Nancy claims the death penalty is a good thing. But Nancy once set fire to a vacant warehouse. Nancy is evil. Therefore, the death penalty is a bad thing. This argument does not address Nancy's argument; it just says she must be wrong (about everything) because she is evil. Whether Nancy is good or evil is irrelevant: It has no bearing on whether her argument is sound. This is a fallacy of relevance: It establishes that Nancy is bad, then equates being bad and never being right. In symbols, the argument is: If A then B. A. Therefore C. (If somebody sets fire to a vacant warehouse, that person is evil. Nancy set fire to a vacant warehouse. Therefore, Nancy's opinion about the death penalty is wrong.) Ad hominem is Latin for "towards the person." An ad hominem argument attacks the person making the claim, rather than the person's reasoning. A variant of the ad hominem argument is "guilt by association."
Bob claims the death penalty is a good thing. But Bob's family business manufactures caskets. Bob benefits when people die, so his motives are suspect. Therefore, the death penalty is a bad thing. This argument does not address Bob's argument; it addresses Bob's motives. His motives are irrelevant: They have nothing to do with whether his argument for the death penalty is sound. This is related to an ad hominem argument. It, too, addresses the person, not the person's argument. However, rather than condemning Bob as evil, it impugns his motives in arguing for this particular conclusion.
Amy says people shouldn't smoke cigarettes in public because cigarette smoke has a strong odor. But Amy wears strong perfume all the time. Amy is clearly a hypocrite. Therefore, smoking in public is fine. This argument does not engage Amy's argument: It attacks her for the (in)consistency of her opinions in this matter and in some other matter. Whether Amy wears strong fragrances has nothing to do with whether her argument against smoking is sound.
Yes, I hit Billy. But Sally hit him first. This argument claims it is fine to do something wrong because somebody else did something wrong. The argument is of the form: If A then B. A. Therefore C. (In words: If Sally hit Billy, it's OK for Billy to hit Sally. Sally hit Billy. Therefore, it's OK for me to hit Billy.) Generally, the two-wrongs-make-a-right argument says that the justified wrong happened after the exculpatory wrong, or was less severe. For instance, Sally hit Billy first, or Sally hit Billy harder than I did, or Sally pulled a knife on Billy.
If you don't give me your lunch money, my big brother will beat you up. You don't want to be beaten up, do you? Therefore, you should give me your lunch money. This argument appeals to force: Accept my conclusion, or else. It is not a logical argument, but it can be quite persuasive nonetheless. It is an argument that if you do not accept the conclusion (and give me your lunch money), something bad will happen (you will get beaten), not an argument that the conclusion is correct. The form of the argument is: If A then B. B is bad. Therefore, not A. Here, A is "you don't give me your lunch money," B is "you will be beaten up."
Yes, I downloaded music illegally, but my girlfriend left me and I lost my job so I was broke and I couldn't afford to buy music and I was so sad that I was broke and that my girlfriend was gone that I really had to listen to 100 variations of She Caught the Katy. This argument justifies an action not by claiming that it is correct, but by an appeal to pity: extenuating circumstances of a sort. Ad misericordiam is Latin for "to pity." It is an appeal to compassion rather than to reason. Another example: Yes, I failed the final. But I need to get an A in the class or I [won't get into Business school] / [will lose my scholarship] / [will violate my academic probation] / [will lose my 4.0 GPA]. You have to give me an A!
Millions of people share copyrighted MP3 files and videos online. Therefore, sharing copyrighted music and videos is fine. This "bandwagon" argument claims that something is moral because it is common. Common and correct are not the same. Whether a practice is widespread has little bearing on whether it is legal or moral. That many people believe something is true does not make it true. Ad populum is Latin for "to the people." It equates the popularity of an idea with the truth of the idea: Everybody can't be wrong. Few teenagers have not made ad populum arguments: "But Mom, everybody is doing it!"
Bob: Sleeping a full 12 hours once in a while is a healthy pleasure. Samantha: If everybody slept 12 hours all the time, nothing would ever get done; the reduction in productivity would drive the country into bankruptcy. Therefore, nobody should sleep for 12 hours. Samantha attacked a different claim from the one Bob made: She attacked the assertion that it is good for everybody to sleep 12 hours every day. Bob only claimed that it was good once in a while.
Art: Teacher salaries should be increased to attract better teachers. Bette: Lengthening the school day would also improve student learning outcomes. Therefore, teacher salaries should remain the same. Art argues that increasing teacher salaries would attract better teachers. Bette does not address his argument: She simply argues that there are other ways of improving student learning outcomes. Art did not even use student learning outcomes as a reason for increasing teacher salaries. Even if Bette is correct that lengthening the school day would improve learning outcomes, her argument is sideways to Art's: It is a distraction, not a refutation. A red herring argument distracts the listener from the real topic. Red herring arguments are very common in political discourse.
All men should have the right to vote. Sally is not a man. Therefore, Sally should not necessarily have the right to vote. This is an example of equivocation, a fallacy facilitated by the fact that a word can have more than one meaning. This argument uses the word man in two different ways. In the first premise, the word means human while in the second, it means male. Generally, equivocation is considered a fallacy of relevance, but this example fits our definition of a fallacy of evidence. The logical form of this argument is: If A then B. Not C. Therefore, B is not necessarily true.
Trident (4/5) Trident® sugarless gum used to advertise that "4 out of 5 dentists surveyed recommend Trident® sugarless gum for their patients who chew gum." Yale University Graduates
In its broadest sense, Statistics is the science of drawing conclusions about the world from data. Data are observations (measurements) of some quantity or quality of something in the world. "Data" is a plural noun; the singular form is "datum." Our lives are filled with data: the weather, weights, prices, our state of health, exam grades, bank balances, election results, and so on. Data come in many forms, most of which are numbers, or can be translated into numbers for analysis.
There are several important questions to keep in mind when you evaluate quantitative evidence:
Are the data relevant to the question asked?
Was the data collection fair, or might there have been some conscious or unconscious BIAS that influenced the results or made some cases less likely to be observed?
Do the data make sense?
Qualitative Data: Consists of attributes, labels, or nonnumerical entries. Examples: Major, Place of birth, Eye color
Hot/Warm/Cold; Population density: low/medium/high; Height: short/medium/tall; Young/Middle-aged/Old; Social class: lower/middle/upper; Family size: fewer than 3, 3–5, more than 5; Rural/Urban area; Type of climate; Gender; Ethnicity; Zip code; Hair color; Country of origin
Quantitative Data: Numerical measurements or counts. Examples: Age, Weight of a letter, Temperature
Temperature in °C; Population density: people per square mile; Height in inches; Height in centimeters; Body mass index (BMI); Age in seconds; Income in dollars; Family size (number of people)
The base prices of several vehicles are shown in the table. Which data are qualitative data and which are quantitative data? (Source Ford Motor Company)
Quantitative Data (Base prices of vehicle models are numerical entries) Qualitative Data (Names of vehicle models are nonnumerical entries)
The fact that a category is labeled with a number does not make the variable quantitative! The real issue is whether arithmetic with the values makes sense.
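A small sketch can make the "does arithmetic make sense?" test concrete. The zip codes and heights below are made-up illustrative values, not data from the course. Averaging heights gives a meaningful summary; averaging zip codes produces a number that describes nothing, so a count of the most common label is the appropriate summary instead.

```python
from collections import Counter
from statistics import mean

# Hypothetical illustrative values (not real survey data).
zip_codes = ["94720", "10027", "94720", "60637"]  # labels that happen to be digits
heights_in = [64, 70, 67, 71]                     # heights in inches

# Quantitative variable: the arithmetic mean is meaningful.
average_height = mean(heights_in)

# Qualitative variable: count categories instead of averaging them.
most_common_zip, count = Counter(zip_codes).most_common(1)[0]

# Python will happily compute this, but the result is not a place,
# a typical zip code, or anything else interpretable.
meaningless_average = mean(int(z) for z in zip_codes)

print(average_height)          # 68
print(most_common_zip, count)  # 94720 2
```

Storing such labels as strings rather than integers is one simple way to keep accidental arithmetic from slipping into an analysis.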
Nominal level of measurement: Qualitative data only. Categorized using names, labels, or qualities. No mathematical computations can be made.
Ordinal level of measurement: Qualitative or quantitative data. Data can be arranged in order. Differences between data entries are not meaningful.
Two data sets are shown. Which data set consists of data at the nominal level? Which data set consists of data at the ordinal level? (Source: Nielsen Media Research)
Ordinal level (lists the rank of five TV programs. Data can be ordered. Difference between ranks is not meaningful.) Nominal level (lists the call letters of each network affiliate. Call letters are names of network affiliates.)
Interval level of measurement: Quantitative data. Data can be ordered. Differences between data entries are meaningful. Zero represents a position on a scale (not an inherent zero; zero does not imply “none”).
Ratio level of measurement: Similar to interval level. Zero entry is an inherent zero (implies “none”). A ratio of two data values can be formed. One data value can be expressed as a multiple of another.
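Temperature gives a quick illustration of why the inherent zero matters. The sketch below compares 20 °C with 10 °C (values chosen only for illustration): on the Celsius scale, an interval scale, the ratio is 2, but that does not mean "twice as hot". On the Kelvin scale, where zero is an inherent zero, the ratio is only about 1.035. Differences, by contrast, agree on both scales.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to Kelvin (zero Kelvin is an inherent zero)."""
    return celsius + 273.15


warm_c, cool_c = 20.0, 10.0  # illustrative temperatures in degrees Celsius

# Differences are meaningful on an interval scale and match across scales.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios depend on where zero sits, so the Celsius ratio is not interpretable.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c, round(difference_k, 2))  # 10.0 10.0
print(ratio_c, round(ratio_k, 3))            # 2.0 1.035
```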
Two data sets are shown. Which data set consists of data at the interval level? Which data set consists of data at the ratio level? (Source: Major League Baseball)
Interval level (Quantitative data. Can find a difference between two dates, but a ratio does not make sense.) Ratio level (Can find differences and write ratios.)
Level of measurement | Put data in categories | Arrange data in order | Subtract data values | Determine if one data value is a multiple of another
Nominal | Yes | No | No | No
Ordinal | Yes | Yes | No | No
Interval | Yes | Yes | Yes | No
Ratio | Yes | Yes | Yes | Yes
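The four levels are cumulative: each level permits everything the previous level permits, plus one more operation. One way to express that in code is shown below; the names `LEVELS`, `OPERATIONS`, and `allowed_operations` are hypothetical and used only for this sketch.

```python
# Levels of measurement, from least to most permissive.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# The operation that each successive level adds, in the same order.
OPERATIONS = ["categorize", "order", "subtract", "form ratios"]


def allowed_operations(level: str) -> list:
    """Return the operations that are meaningful at the given level."""
    rank = LEVELS.index(level.lower())  # raises ValueError for unknown levels
    return OPERATIONS[: rank + 1]


for level in LEVELS:
    print(level, "->", ", ".join(allowed_operations(level)))
```

For example, `allowed_operations("interval")` returns categorize, order, and subtract, but not ratios.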
One of the most problematic relationships. What really is a variable? What is a value? What are data? How are they related? Variable: Values (theoretical), Data (observed)
Variable: New York Yankees’ World Series Victories
Values: 1901, 1902, … (all possible years)
Data: 1923, 1927, 1928, …
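The distinction can be sketched in code: the values of a variable are described by a rule saying what could be observed, while the data are the particular values that actually were observed. The sketch below uses only the years shown on the slide; the function name `is_possible_value` is hypothetical, and the elided entries are deliberately left out rather than guessed.

```python
def is_possible_value(year) -> bool:
    """Rule describing the theoretical values of the variable:
    any whole-numbered year from 1901 onward, as listed on the slide."""
    return isinstance(year, int) and year >= 1901


# Observed data: only the first three entries shown on the slide.
observed_data = [1923, 1927, 1928]

# Every observed datum must be one of the variable's possible values...
all_valid = all(is_possible_value(year) for year in observed_data)

# ...but most possible values are never observed.
print(all_valid)                # True
print(is_possible_value(1901))  # True
print(1901 in observed_data)    # False
```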