# Elementary Statistics

## Presentation on theme: "Elementary Statistics"— Presentation transcript:

Elementary Statistics
Chapter 1 Introduction to Statistics

Statistics Method of analysis
a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data (page 4 of text)

What is Statistics? USA Today, December 10, The biggest study ever of the health effects of alcohol concludes that a drink a day can cut your risk of death by 20%…The researchers gave questionnaires to 490,000 men and women and then followed up nine years later, after 46,000 of them had died…[However], the benefits decreased as people drank more. Among those who averaged four or five drinks a day, the risk of death among men was 10% lower, while among women it was 7% lower.

What is Statistics? New York Times, September 17, Millions of Americans routinely ignore one of mom’s most important pieces of advice: Wash your hands after you go to the bathroom. This unsettling item of news was gathered in the only way possible--by actually watching what people do (or don’t do) in public restrooms. The researchers--if that’s what they should be called--hid in stalls or pretended to comb their hair while observing 6,333 men and women do their business in five cities…Just 60% of those using restrooms in Penn Station (New York City) washed up afterward.

Statistics is the science of data. It involves collecting, classifying, summarizing, organizing, analyzing, and interpreting data.

Population A population is the complete collection of all elements (scores, people, measurements, and so on) to be studied. The collection is complete in the sense that it includes all subjects to be studied. A population is the totality of all subjects possessing certain common characteristics that are being studied. In a statistical study, the researcher must define the population being studied.

Sample A sample is a subgroup or subset of the population.

Population and Sample Potential advertisers value television’s well-known Nielsen ratings as a barometer of a TV show’s popularity among viewers. The Nielsen rating of a certain TV program is an estimate of the proportion of viewers, expressed as a percentage, who tune their sets to the program on a given night at a given time. A typical Nielsen survey consists of 165 families selected nationwide who regularly watch television. Suppose we are interested in the Nielsen ratings for the latest episode of ER. Identify the population of interest. Describe the sample.

Definitions Census
the collection of data from every element in a population

Emphasize that a population is determined by the researcher, and a sample is a subcollection of that predetermined group. For example, if I collect the ages from a section of elementary statistics students, those data would be a sample if I am interested in studying the ages of all elementary statistics students. However, if I am studying only the ages of that specific section of elementary statistics, the data would be a population.

Parameters and Statistics
A parameter is a numerical measurement describing some characteristic of the population. Example: when Lincoln was first elected to the presidency, he received 39.82% of the 1,865,908 votes cast. If we consider the collection of all of those votes to be the population being considered, then 39.82% is a parameter, not a statistic. A statistic is a numerical measurement describing some characteristic of the sample. Example: Based on a sample of 877 surveyed executives, it was found that 45% of them would not hire anyone whose job application contained a typographical error.
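The distinction can be sketched in a few lines of Python. This is only an illustration: the population below is a made-up list of 10,000 yes/no votes (not the actual election returns), and the names `population`, `parameter`, and `statistic` are hypothetical.

```python
import random

# Hypothetical population: 10,000 votes, 1 = voted for the candidate, 0 = did not.
population = [1] * 3982 + [0] * 6018

# Parameter: computed from every element of the population.
parameter = sum(population) / len(population)

# Statistic: computed from a sample drawn from that population.
rng = random.Random(1)  # fixed seed so the run is repeatable
sample = rng.sample(population, 500)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.4f}")  # 0.3982, fixed for this population
print(f"statistic = {statistic:.4f}")  # varies from sample to sample
```

The parameter never changes for a given population, while the statistic depends on which 500 votes happen to be drawn.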

Definitions Statistic
a numerical measurement describing some characteristic of a sample

Population, Sample, and Inference
Are state lottery winners who win big payoffs likely to quit their jobs within one year of winning? No, according to a study published in the Journal of the Institute for Socioeconomic Studies (Sept. 1985). The researcher mailed questionnaires to over 2,000 lottery winners who won at least \$50,000 between 1975 and Of the 576 who responded, only 11% had quit their jobs during the first year after striking it rich. In this study, identify: the population; the sample; the inference made about the population.

Data Data are obtained by measuring some characteristic or property of the objects (usually people or things) of interest to us. A variable is a characteristic (property) that differs or varies from one observation to the next. All data (and, consequently, the variables we measure) are either quantitative or qualitative.

Qualitative/Quantitative Data
Quantitative data are observations measured on a natural numerical scale. Nonnumeric data that can only be classified into one of a group of categories are qualitative data. State whether each of the following variables measured on graduating high school students is quantitative or qualitative: National Honor Society member or not; Scholastic Assessment Test (SAT) score; number of colleges applied to; part-time job status.

Definitions Quantitative data
numbers representing counts or measurements

Qualitative (or categorical or attribute) data
can be separated into different categories that are distinguished by some nonnumeric characteristic

Definitions Quantitative data
Example: the incomes of college graduates

Qualitative (or categorical or attribute) data
Example: the genders (male/female) of college graduates

Discrete vs Continuous Data
Discrete data result when the number of possible values is either a finite number or a “countable” number. Continuous (numerical) data result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps. Continuous data are measurable.

Definitions Discrete
data result when the number of possible values is either a finite number or a ‘countable’ number (0, 1, 2, 3, . . .)

Definitions Continuous
(numerical) data result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps

Understanding the difference between discrete and continuous data will be important in Chapters 4 and 5. When measuring continuous data, the result will be only as precise as the measuring device being used.

Determine whether the given values are from a discrete or continuous data set.
A statistics professor counts 3 absent students. A statistics professor finds that on the first test, the first paper is turned in minutes after the test began. In a survey of 1068 Americans, 73 state that they own answering machines. A manufacturer of rechargeable calculator batteries finds that one batch consists of 850 good batteries and 7 that are defective.

Definitions nominal level of measurement
characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme (such as low to high). Example: survey responses of yes, no, undecided

Definitions ordinal level of measurement
involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. Example: course grades A, B, C, D, or F

Definitions interval level of measurement
like the ordinal level, with the additional property that the difference between any two data values is meaningful. However, there is no natural zero starting point (where none of the quantity is present). Example: years 1000, 2000, 1776, and 1492

Definitions ratio level of measurement
the interval level modified to include the natural zero starting point (where zero indicates that none of the quantity is present). For values at this level, differences and ratios are meaningful. Example: Prices of college textbooks

Levels of Measurement
Nominal - categories only
Ordinal - categories with some order
Interval - differences but no natural starting point
Ratio - differences and a natural starting point
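The four levels can be summarized as a small lookup of which comparisons are meaningful at each one. This is a sketch for illustration only; the `LEVELS` table and the `meaningful` helper are hypothetical names, and the entries simply restate the definitions above.

```python
# Which comparisons are meaningful at each level of measurement.
LEVELS = {
    "nominal":  {"order": False, "difference": False, "ratio": False},
    "ordinal":  {"order": True,  "difference": False, "ratio": False},
    "interval": {"order": True,  "difference": True,  "ratio": False},
    "ratio":    {"order": True,  "difference": True,  "ratio": True},
}

def meaningful(level, operation):
    """Return True if the operation is meaningful at that level of measurement."""
    return LEVELS[level][operation]

# Years are interval data: differences make sense, ratios do not.
print(meaningful("interval", "difference"))  # True
print(meaningful("interval", "ratio"))       # False
# Textbook prices are ratio data: one price can be twice another.
print(meaningful("ratio", "ratio"))          # True
```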

Uses of Statistics Almost all fields of study benefit from the application of statistical methods

Definitions self-selected survey (or voluntary response sample)
one in which the respondents themselves decide whether to be included

Figure 1-1 Salaries of People with Bachelor’s Degrees and with High School Diplomas
[Two bar charts, (a) and (b), of the same salaries: \$40,500 for a bachelor’s degree and \$24,400 for a high school diploma. The two panels use different vertical scales.]

We should analyze the numerical information given in the graph instead of being misled by its general shape.
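A short calculation shows how much the vertical scale can change the visual impression of the two salaries in Figure 1-1. The sketch below assumes one panel starts its axis at \$0 and the other at \$20,000; the \$20,000 baseline is an assumption for illustration, and `bar_height_ratio` is a hypothetical helper.

```python
def bar_height_ratio(a, b, axis_start):
    """Ratio of the drawn bar heights when the vertical axis begins at axis_start."""
    return (a - axis_start) / (b - axis_start)

bachelor, high_school = 40_500, 24_400

# Axis starting at zero: bar heights match the real ratio of the salaries.
print(round(bar_height_ratio(bachelor, high_school, 0), 2))       # 1.66

# Axis starting at 20,000 (assumed): the taller bar looks far larger.
print(round(bar_height_ratio(bachelor, high_school, 20_000), 2))  # 4.66
```

The salaries differ by a factor of about 1.66, but on the truncated axis the bachelor's bar is drawn more than four and a half times as tall.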

Double the length, width, and height of a cube, and the volume increases by a factor of eight
Figure 1-2
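The factor of eight is easy to check numerically. In this minimal sketch, `cube_volume` is a hypothetical helper and the side length is arbitrary.

```python
def cube_volume(side):
    """Volume of a cube with the given side length."""
    return side ** 3

side = 1.5  # any positive side length gives the same factor
factor = cube_volume(2 * side) / cube_volume(side)
print(factor)  # 8.0
```

This is why a pictograph that doubles every dimension of a three-dimensional object to show a doubled quantity exaggerates the change.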

Misleading Graphs, Pictographs, Precise Numbers, Distorted Percentages, Partial Pictures

“Ninety percent of all our cars sold in this country in the last 10 years are still on the road.”

Misleading Graphs, Pictographs, Precise Numbers, Distorted Percentages, Partial Pictures, Deliberate Distortions

Definitions Observational Study
observing and measuring specific characteristics without attempting to modify the subjects being studied

Definitions Experiment
apply some treatment and then observe its effects on the subjects

Designing an Experiment
1. Identify your objective
2. Collect sample data
3. Use a random procedure that avoids bias
4. Analyze the data and form conclusions

Definitions Confounding
occurs in an experiment when the effects from two or more variables cannot be distinguished from each other

Definitions Replication
used when an experiment is repeated on a sample of subjects that is large enough so that we can see the true nature of any effects (instead of being misled by erratic behavior of samples that are too small)

Definitions Random Sample
members of the population are selected in such a way that each has an equal chance of being selected

Definitions Simple Random Sample (of size n)
subjects selected in such a way that every possible sample of size n has the same chance of being chosen
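A simple random sample can be drawn with Python's standard `random.sample`, which selects without replacement so that every subset of the requested size is equally likely. The population of 100 numbered members below is hypothetical.

```python
import random

rng = random.Random(42)           # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical population: members numbered 1..100
n = 10

# Every possible set of n members has the same chance of being chosen.
srs = rng.sample(population, n)
print(sorted(srs))
```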

Random Sampling - selection so that each member has an equal chance of being selected

Systematic Sampling - select some starting point and then select every kth element in the population
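As a rough sketch, systematic sampling amounts to slicing an ordered population with a step of k. Choosing the starting point at random among the first k elements is an assumption made here; the slide only says to select some starting point. The helper name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k elements, then take every kth element."""
    start = rng.randrange(k)  # index in 0..k-1
    return population[start::k]

rng = random.Random(0)
population = list(range(1, 101))  # hypothetical ordered population of 100 members
sample = systematic_sample(population, 10, rng)
print(sample)
```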

Convenience Sampling - use results that are readily available
Hey! Do you believe in the death penalty?

Stratified Sampling - subdivide the population into subgroups that share the same characteristic, then draw a sample from each stratum
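One way to sketch stratified sampling is to group the population by the shared characteristic and then draw a simple random sample within each group. The student population, the class-year strata, and the helper `stratified_sample` are all hypothetical, and drawing the same number from every stratum is just one possible design.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, rng):
    """Group members by stratum, then draw a simple random sample from each group."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {label: rng.sample(members, per_stratum)
            for label, members in strata.items()}

# Hypothetical population: 40 students, each tagged with a class year.
years = ["freshman", "sophomore", "junior", "senior"]
population = [(student_id, years[student_id % 4]) for student_id in range(40)]

rng = random.Random(3)
sample = stratified_sample(population, lambda member: member[1], 3, rng)
for year, members in sample.items():
    print(year, members)
```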

Cluster Sampling - divide the population into sections (or clusters); randomly select some of those clusters; choose all members from selected clusters
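Cluster sampling differs from stratified sampling in that whole groups are selected at random and every member of a chosen group is kept. A minimal sketch, assuming a hypothetical population of five classrooms with four students each:

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every member of the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical population: 5 classrooms of 4 students each.
clusters = {f"room{r}": [f"room{r}-student{s}" for s in range(4)] for r in range(5)}

rng = random.Random(5)
chosen, members = cluster_sample(clusters, 2, rng)
print(chosen)
print(len(members))  # 8: all 4 students from each of the 2 chosen rooms
```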

Methods of Sampling: Random, Systematic, Convenience, Stratified, Cluster

Definitions Sampling Error
the difference between a sample result and the true population result; such an error results from chance sample fluctuations

Nonsampling Error
sample data that are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly)
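Sampling error can be seen directly in a small simulation: repeated random samples from the same population give slightly different means, even when nothing is collected or recorded incorrectly. The population of heights below is simulated and purely hypothetical.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated heights in centimeters.
population = [rng.gauss(170, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Draw several random samples and record sample mean minus population mean.
errors = []
for _ in range(5):
    sample = rng.sample(population, 50)
    errors.append(statistics.mean(sample) - true_mean)

for error in errors:
    print(f"sampling error: {error:+.2f} cm")
```

Each printed value is a sampling error in the sense defined above. A nonsampling error, such as a miscalibrated measuring device, would not shrink with better random selection.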