Presentation on theme: "Elementary Statistics"— Presentation transcript:
1 Elementary Statistics. Chapter 1: Introduction to Statistics
2 Statistics (method of analysis): a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data. (Page 4 of text.)
3 What is Statistics? USA Today, December 10: The biggest study ever of the health effects of alcohol concludes that a drink a day can cut your risk of death by 20%… The researchers gave questionnaires to 490,000 men and women and then followed up nine years later, after 46,000 of them had died… [However], the benefits decreased as people drank more. Among those who averaged four or five drinks a day, the risk of death among men was 10% lower, while among women it was 7% lower.
4 What is Statistics? New York Times, September 17: Millions of Americans routinely ignore one of mom’s most important pieces of advice: Wash your hands after you go to the bathroom. This unsettling item of news was gathered in the only way possible--by actually watching what people do (or don’t do) in public restrooms. The researchers--if that’s what they should be called--hid in stalls or pretended to comb their hair while observing 6,333 men and women do their business in five cities… Just 60% of those using restrooms in Penn Station (New York City) washed up afterward.
5 Statistics is the science of data. It involves collecting, classifying, summarizing, organizing, analyzing, and interpreting data.
6 Population: A population is the complete collection of all elements (scores, people, measurements, and so on) to be studied. The collection is complete in the sense that it includes all subjects to be studied. A population is the totality of all subjects possessing certain common characteristics that are being studied. In a statistical study, the researcher must define the population being studied.
7 Sample: A sample is a subgroup or subset of the population.
8 Population and Sample: Potential advertisers value television’s well-known Nielsen ratings as a barometer of a TV show’s popularity among viewers. The Nielsen rating of a certain TV program is an estimate of the proportion of viewers, expressed as a percentage, who tune their sets to the program on a given night at a given time. A typical Nielsen survey consists of 165 families selected nationwide who regularly watch television. Suppose we are interested in the Nielsen ratings for the latest episode of ER. Identify the population of interest. Describe the sample.
9 Definitions. Census: the collection of data from every element in a population. Emphasize that a population is determined by the researcher, and a sample is a subcollection of that pre-determined group. For example, if I collect the ages from a section of elementary statistics students, that data would be a sample if I am interested in studying ages of all elementary statistics students. However, if I am studying only the ages of the specific section of elementary statistics, the data would be a population.
10 Parameters and Statistics: A parameter is a numerical measurement describing some characteristic of the population. Example: When Lincoln was first elected to the presidency, he received 39.82% of the 1,865,908 votes cast. If we consider the collection of all of those votes to be the population being considered, then 39.82% is a parameter, not a statistic. A statistic is a numerical measurement describing some characteristic of the sample. Example: Based on a sample of 877 surveyed executives, it was found that 45% of them would not hire anyone whose job application contained a typographical error.
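The parameter/statistic distinction can be sketched in Python. The vote total and the 39.82% share come from the slide; the reconstructed vote count (rounded from the percentage), the sample size of 1,000, and the seed are assumptions made only for illustration.

```python
import random

# From the slide: 1,865,908 votes cast, 39.82% of them for Lincoln.
TOTAL_VOTES = 1_865_908
LINCOLN_SHARE = 0.3982

# Assumption: approximate count of Lincoln votes, rebuilt from the rounded share.
lincoln_votes = round(TOTAL_VOTES * LINCOLN_SHARE)

# Parameter: describes the whole population (every vote cast).
parameter = lincoln_votes / TOTAL_VOTES

# Statistic: describes a sample. Here, a hypothetical simple random sample
# of 1,000 ballots; ballot indices below lincoln_votes count as Lincoln votes.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample_indices = rng.sample(range(TOTAL_VOTES), 1000)
statistic = sum(1 for i in sample_indices if i < lincoln_votes) / len(sample_indices)

print(f"parameter (population proportion): {parameter:.4f}")
print(f"statistic (sample proportion):     {statistic:.4f}")
```

The parameter is fixed once the population is defined, while the statistic changes from sample to sample.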
12 Definitions. Statistic (sample statistic): a numerical measurement describing some characteristic of a sample.
13 Population, Sample, and Inference: Are state lottery winners who win big payoffs likely to quit their jobs within one year of winning? No, according to a study published in the Journal of the Institute for Socioeconomic Studies (Sept. 1985). The researcher mailed questionnaires to over 2,000 lottery winners who won at least $50,000 between 1975 and [...]. Of the 576 who responded, only 11% had quit their jobs during the first year after striking it rich. In this study, identify: the population; the sample; the inference made about the population.
14 Data: Data are obtained by measuring some characteristic or property of the objects (usually people or things) of interest to us. A variable is a characteristic (property) that differs or varies from one observation to the next. All data (and, consequently, the variables we measure) are either quantitative or qualitative.
15 Qualitative/Quantitative Data: Quantitative data are observations measured on a natural numerical scale. Nonnumeric data that can only be classified into one of a group of categories are qualitative data. State whether each of the following variables measured on graduating high school students is quantitative or qualitative: National Honor Society member or not; Scholastic Assessment Test (SAT) score; number of colleges applied to; part-time job status.
16 Definitions. Quantitative data: numbers representing counts or measurements. Qualitative (or categorical or attribute) data: data that can be separated into different categories that are distinguished by some nonnumeric characteristics.
17 Definitions (examples). Quantitative data: the incomes of college graduates. Qualitative (or categorical or attribute) data: the genders (male/female) of college graduates.
18 Discrete vs Continuous Data: Discrete data result when the number of possible values is either a finite number or a “countable” number. Continuous (numerical) data result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps. Continuous data are measured rather than counted.
19 Definitions. Discrete data result when the number of possible values is either a finite number or a ‘countable’ number of possible values: 0, 1, 2, 3, . . .
20 Definitions. Discrete data result when the number of possible values is either a finite number or a ‘countable’ number of possible values: 0, 1, 2, 3, . . . Continuous (numerical) data result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps. Understanding the difference between discrete and continuous data will be important in Chapters 4 and 5. When measuring data that is continuous, the result will be only as precise as the measuring device being used.
21 Determine whether the given values are from a discrete or continuous data set. A statistics professor counts 3 absent students. A statistics professor finds that on the first test, the first paper is turned in [...] minutes after the test began. In a survey of 1068 Americans, 73 state that they own answering machines. A manufacturer of rechargeable calculator batteries finds that one batch consists of 850 good batteries and 7 that are defective.
22 Definitions. Nominal level of measurement: characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme (such as low to high). Example: survey responses yes, no, undecided.
23 Definitions. Ordinal level of measurement: involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. Example: course grades A, B, C, D, or F.
24 Definitions. Interval level of measurement: like the ordinal level, with the additional property that the difference between any two data values is meaningful. However, there is no natural zero starting point (where none of the quantity is present). Example: years 1000, 2000, 1776, and 1492.
25 Definitions. Ratio level of measurement: the interval level modified to include the natural zero starting point (where zero indicates that none of the quantity is present). For values at this level, differences and ratios are meaningful. Example: prices of college textbooks.
26 Levels of Measurement. Nominal - categories only. Ordinal - categories with some order. Interval - differences but no natural starting point. Ratio - differences and a natural starting point.
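The four levels can be illustrated with a short Python sketch showing which operations are meaningful at each level. The survey responses, grades, and years echo the slide examples; the specific response list, grade list, and the two textbook prices are hypothetical values chosen for illustration.

```python
# Nominal: names or categories only; counting and equality checks make sense,
# but ordering does not.
responses = ["yes", "no", "undecided", "yes"]
yes_count = responses.count("yes")

# Ordinal: categories can be ordered, but differences between them
# are not meaningful (B minus D has no numeric interpretation).
grade_order = ["F", "D", "C", "B", "A"]  # low to high
grades = ["B", "A", "D", "C"]
sorted_grades = sorted(grades, key=grade_order.index)

# Interval: differences are meaningful, but there is no natural zero,
# so ratios are not (the year 2000 is not "twice" the year 1000).
year_difference = 2000 - 1776

# Ratio: a natural zero exists, so both differences and ratios are meaningful.
prices = [50.0, 100.0]  # hypothetical textbook prices in dollars
price_ratio = prices[1] / prices[0]

print(yes_count, sorted_grades, year_difference, price_ratio)
```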
27 Uses of Statistics: Almost all fields of study benefit from the application of statistical methods.
29 Definitions. Self-selected survey (or voluntary response sample): one in which the respondents themselves decide whether to be included.
30 Abuses of Statistics: Bad Samples, Small Samples, Loaded Questions, Misleading Graphs
31 Figure 1-1: Salaries of People with Bachelor’s Degrees and with High School Diplomas. [Two bar graphs, (a) and (b), each showing $40,500 for a bachelor’s degree and $24,400 for a high school diploma; the vertical-axis tick labels run from $20,000 to $40,000 in one panel and from $10,000 to $40,000 in the other.]
32 We should analyze the numerical information given in the graph instead of being misled by its general shape.
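A rough Python sketch of why the shape of a bar graph can mislead, using the two salaries from Figure 1-1. The baselines are assumptions: one axis is taken to start at $20,000 (inferred from the figure's tick labels) and the other at zero. The helper name `drawn_height_ratio` is hypothetical.

```python
def drawn_height_ratio(a, b, baseline):
    """Ratio of the two bar heights as drawn when the vertical axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)

bachelor, high_school = 40_500, 24_400  # salaries shown in Figure 1-1

# Axis starting at zero: bar heights are proportional to the salaries themselves.
true_ratio = drawn_height_ratio(bachelor, high_school, baseline=0)

# Assumed truncated axis starting at $20,000: the taller bar looks far larger.
truncated_ratio = drawn_height_ratio(bachelor, high_school, baseline=20_000)

print(round(true_ratio, 2))       # 1.66
print(round(truncated_ratio, 2))  # 4.66
```

Under these assumptions the salaries differ by a factor of about 1.66, yet the truncated axis draws one bar about 4.66 times as tall as the other.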
33 Abuses of Statistics: Bad Samples, Small Samples, Loaded Questions, Misleading Graphs, Pictographs
34 Figure 1-2: Double the length, width, and height of a cube, and the volume increases by a factor of eight.
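The factor of eight follows from cubing the linear scale factor. A minimal sketch, with the function name `volume_scale_factor` chosen here for illustration:

```python
def volume_scale_factor(linear_factor, dimensions=3):
    """Factor by which volume grows when every side is scaled by `linear_factor`."""
    return linear_factor ** dimensions

def cube_volume(side):
    return side ** 3

# Doubling every side of a cube multiplies its volume by 2**3.
print(volume_scale_factor(2))           # 8
print(cube_volume(2) / cube_volume(1))  # 8.0
```

This is why a pictograph that doubles every dimension of a three-dimensional object to show a doubled quantity exaggerates the change.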
35 Abuses of Statistics: Bad Samples, Small Samples, Loaded Questions, Misleading Graphs, Pictographs, Precise Numbers, Distorted Percentages, Partial Pictures
36 “Ninety percent of all our cars sold in this country in the last 10 years are still on the road.”
37 Abuses of Statistics: Bad Samples, Small Samples, Loaded Questions, Misleading Graphs, Pictographs, Precise Numbers, Distorted Percentages, Partial Pictures, Deliberate Distortions
38 Definitions. Observational study: observing and measuring specific characteristics without attempting to modify the subjects being studied.
39 Definitions. Experiment: apply some treatment and then observe its effects on the subjects.
40 Designing an Experiment: Identify your objective. Collect sample data. Use a random procedure that avoids bias. Analyze the data and form conclusions.
41 Definitions. Confounding: occurs in an experiment when the effects from two or more variables cannot be distinguished from each other.
42 Definitions. Replication: used when an experiment is repeated on a sample of subjects that is large enough so that we can see the true nature of any effects (instead of being misled by erratic behavior of samples that are too small).
43 Definitions. Random sample: members of the population are selected in such a way that each has an equal chance of being selected.
44 Definitions. Random sample: members of the population are selected in such a way that each has an equal chance of being selected. Simple random sample (of size n): subjects selected in such a way that every possible sample of size n has the same chance of being chosen.
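The difference between the two definitions can be checked by enumeration. The sketch below uses a hypothetical four-member population and compares two selection schemes: one where all samples of size 2 are equally likely, and one (invented for illustration) that picks one of two fixed pairs by a coin flip. Both give every member the same chance of selection, but only the first is a simple random sample.

```python
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D"]  # hypothetical population
n = 2

# Scheme 1: every possible sample of size n is equally likely.
all_samples = list(combinations(population, n))
srs_scheme = {s: Fraction(1, len(all_samples)) for s in all_samples}

# Scheme 2 (illustrative): a coin flip chooses one of two fixed pairs.
pair_scheme = {("A", "B"): Fraction(1, 2), ("C", "D"): Fraction(1, 2)}

def inclusion_probability(scheme, member):
    """Probability that `member` ends up in the selected sample."""
    return sum(p for sample, p in scheme.items() if member in sample)

srs_inclusion = {m: inclusion_probability(srs_scheme, m) for m in population}
pair_inclusion = {m: inclusion_probability(pair_scheme, m) for m in population}

print(srs_inclusion)   # every member: 1/2
print(pair_inclusion)  # every member: 1/2, yet ("A", "C") can never be drawn
```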
45 Random Sampling - selection so that each member has an equal chance of being selected
46 Systematic Sampling - select some starting point and then select every kth element in the population
47 Convenience Sampling - use results that are readily available. (Cartoon: “Hey! Do you believe in the death penalty?”)
48 Stratified Sampling - subdivide the population into subgroups that share the same characteristic, then draw a sample from each stratum
49 Cluster Sampling - divide the population into sections (or clusters); randomly select some of those clusters; choose all members from selected clusters
50 Methods of Sampling: Random, Systematic, Convenience, Stratified, Cluster
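The five methods can be sketched in Python on a hypothetical population of the integers 1 to 100. The function names, the even/odd strata, the clusters of 20, and the random starting point for systematic sampling are all assumptions made for illustration, not part of the slides.

```python
import random

def simple_random_sample(population, n, rng):
    # Every possible sample of size n is equally likely.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Pick a starting point (here at random), then take every kth element.
    start = rng.randrange(k)
    return population[start::k]

def convenience_sample(population, n):
    # Use whatever is readily available: here, simply the first n elements.
    return population[:n]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    # Group members into strata, then draw a random sample from each stratum.
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters and keep every member of each one.
    chosen = rng.sample(clusters, n_clusters)
    return [item for cluster in chosen for item in cluster]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
convenience = convenience_sample(population, 10)
stratified = stratified_sample(population, lambda x: x % 2 == 0, 5, rng)
clusters = [population[i:i + 20] for i in range(0, 100, 20)]
clustered = cluster_sample(clusters, 2, rng)

for name, s in [("random", srs), ("systematic", systematic),
                ("convenience", convenience), ("stratified", stratified),
                ("cluster", clustered)]:
    print(name, sorted(s))
```

Note that stratified sampling draws some members from every stratum, while cluster sampling draws all members from only some clusters.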
51 Definitions. Sampling error: the difference between a sample result and the true population result; such an error results from chance sample fluctuations. Nonsampling error: sample data that are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly).
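Sampling error can be made concrete with a small simulation. This is a sketch under stated assumptions: the population is 10,000 synthetic values (hypothetical heights in cm drawn from a normal distribution), and the sample sizes, repetition count, and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 synthetic heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]
population_mean = statistics.fmean(population)  # the true population result

def sampling_error(n):
    """Sample mean minus population mean for one simple random sample of size n."""
    sample = rng.sample(population, n)
    return statistics.fmean(sample) - population_mean

# Repeat the sampling many times to see how large the chance fluctuations are.
small_errors = [abs(sampling_error(10)) for _ in range(200)]
large_errors = [abs(sampling_error(1000)) for _ in range(200)]

print(f"typical |error|, n=10:   {statistics.fmean(small_errors):.3f}")
print(f"typical |error|, n=1000: {statistics.fmean(large_errors):.3f}")
```

The data here are recorded correctly, so every deviation is sampling error from chance alone. A nonsampling error, such as a miscopied value or a biased selection rule, would not shrink just because the sample is larger.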