INTRODUCTION TO STATISTICAL CONCEPTS
Objectives Definition of “statistics” Descriptive vs. Inferential Statistics Types of Descriptive Statistics Elements of Inferential Statistics Qualitative vs. Quantitative Data Data Collection Methods Inference errors from nonrandom samples
SURVEY A random sample of students taking a statistics class is asked, “What is your age?” Responses 23 27 25 26 22 23 21 24 30 45 21 20 25 19 35 23 25 20 31 26
DATA From this survey we get data: 22 35 21 26 23 27 25 20 20 19 25 24 25 26 23 21 23 30 45 31
INFORMATION In reality, data files are often very large –Much larger than this example Data is often stored in –Large computer databases –Printed records The key question is, “How do we extract useful information from this data?”
What is Statistics? Statistics is a way to get INFORMATION from DATA
Types of Statistics Two types of statistics: Descriptive and Inferential
DESCRIPTIVE STATISTICS Graphical Depictions of Data –Histograms (Bar Charts) –Pie Charts –Other Types of Charts/Graphs Numerical Descriptions/Measures of Data –Frequencies –Measures of Central Tendency –Measures of Variability
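As a rough illustration of the numerical measures listed on this slide, the sketch below summarizes the 20 ages from the survey slide using only Python's standard library. The function name `describe` and the particular set of measures are choices made for this example, not something prescribed by the slides.

```python
from collections import Counter
import statistics

# The 20 responses from the survey slide.
ages = [23, 27, 25, 26, 22, 23, 21, 24, 30, 45,
        21, 20, 25, 19, 35, 23, 25, 20, 31, 26]


def describe(values):
    """Return frequencies plus basic measures of central tendency and variability."""
    return {
        "frequencies": Counter(values),                # how often each age occurs
        "mean": statistics.mean(values),               # central tendency
        "median": statistics.median(values),           # central tendency
        "modes": statistics.multimode(values),         # most frequent value(s)
        "range": max(values) - min(values),            # variability
        "sample_std_dev": statistics.stdev(values),    # variability (n - 1 divisor)
    }


summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The single response of 45 pulls the mean (25.55) above the median (24.5), which is one reason to look at more than one measure of central tendency.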
Inferential Statistics Testing hypotheses Making inferences from surveys Giving ranges for estimates Predicting the value of one variable (e.g. sales) for given values of other variables (e.g. advertising dollars) Forecasting future values over time Quality control
Basic Statistical Concepts Population –A set of items (experimental units) under study Parameter (Variable) –A descriptive measure of the population that is of interest, e.g. the mean (Unknown -- Use Greek letter) (Random) Sample –A (random) subset chosen from the population Statistic –A descriptive measure that is calculated from the sample, e.g. the sample mean (Use regular letter)
Purpose of Inferential Statistics Making inferences about a population parameter based on information obtained from a sample statistic (With a Certain Degree of Confidence)
Example What is the average age of students taking the introductory statistics class at this university? POPULATION under study –All students taking the statistics course at this university We may not have access to all records Even if we did, this population is constantly changing with adds/drops PARAMETER of interest –Average age of all students taking the course Symbol -- μ We can never know for sure without looking at the entire database
EXAMPLE (Continued) Take a (random) SAMPLE –Obtain data from a random subset of the population -- i.e. randomly select 8 students taking the course and ask, “What is your age?” –Results might be: 23 22 19 35 21 25 25 26 Calculate a STATISTIC from the sampled data –The average age of the sample of the 8 students (a statistic computed from the sample, not the population) can be calculated. This is not the average age of the population but is our best estimate of it.
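The statistic in this example can be computed directly. The short sketch below takes the eight ages listed on the slide and calculates their mean; the variable names are illustrative only.

```python
import statistics

# Ages of the 8 randomly selected students, as listed on the slide.
sample_ages = [23, 22, 19, 35, 21, 25, 25, 26]

# The sample mean is a statistic: it describes the sample, not the population.
sample_mean = statistics.mean(sample_ages)

print(f"n = {len(sample_ages)}, sum = {sum(sample_ages)}, sample mean = {sample_mean}")
# n = 8, sum = 196, sample mean = 24.5
```

A different random sample of 8 students would generally give a different sample mean, which is why the result is treated as an estimate of the population parameter rather than its exact value.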
CONFIDENCE The average of the sample was 24.5 What are the chances that the exact true average of all (1000(?)) students taking statistics is 24.5? –The chance is effectively 0 –But it is our best single guess (point estimate) –Pretty sure the average is within the interval 23.5 to 25.5 –Even more sure it is in the interval 21.5 to 27.5
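One way to see why a wider interval earns more confidence is a small simulation. The sketch below is only an illustration under stated assumptions: the population of 1000 ages is hypothetical (built by repeating the 20 survey responses 50 times), and the helper `coverage_by_half_width` is invented for this example. It draws many random samples of 8 and counts how often the sample mean lands within 1 year and within 3 years of the true population mean, matching the two interval widths on the slide.

```python
import random
import statistics


def coverage_by_half_width(population, sample_size, half_widths, trials, rng):
    """For each half-width w, return the fraction of random samples whose
    interval (sample mean - w, sample mean + w) contains the population mean."""
    true_mean = statistics.mean(population)
    hits = {w: 0 for w in half_widths}
    for _ in range(trials):
        sample_mean = statistics.mean(rng.sample(population, sample_size))
        error = abs(sample_mean - true_mean)
        for w in half_widths:
            if error <= w:
                hits[w] += 1
    return {w: hits[w] / trials for w in half_widths}


# ASSUMPTION: a hypothetical population of 1000 students, made by repeating
# the 20 survey responses 50 times. The real population is unknown.
survey = [23, 27, 25, 26, 22, 23, 21, 24, 30, 45,
          21, 20, 25, 19, 35, 23, 25, 20, 31, 26]
population = survey * 50

rng = random.Random(0)  # fixed seed so the run is repeatable
result = coverage_by_half_width(population, sample_size=8,
                                half_widths=[1.0, 3.0], trials=2000, rng=rng)
for w, fraction in result.items():
    print(f"mean +/- {w}: contains the true mean in {fraction:.0%} of samples")
```

Because both half-widths are checked against the same samples, the wider interval can never cover the true mean less often than the narrower one. The exact percentages depend on the assumed population and should not be read as properties of the real class.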
How Large An Interval? How wide does the interval have to be before we are “reasonably sure” the interval contains the true average age of all students taking statistics? The answer to this question is one of the basic concepts of inferential statistics We will discuss this later in the course
Computing Arithmetic Statistical Values in this Course By hand/tables –It is important to know the concepts behind statistical computations and to be able to calculate basic statistical values by hand or use statistical tables in the analyses Computer (EXCEL) –Computer packages are a valuable aid for making tedious and/or complex calculations and for generating usable output
TYPES OF DATA Qualitative Data –Observation is nonnumeric What color is your car? Who is your favorite candidate for President? How would you rate your instructor? Quantitative Data –Observation is numeric What is your GPA? How far do you live from campus? What is your salary?
Collecting Data Data can be extracted from a public source –Wall Street Journal, Orange County Business Journal A designed experiment can be performed –Test cavity prevention – divide subjects into groups A survey can be taken –Presidential poll (phone, mail), TV program (Nielsen) Observational studies can be made –Observe output of workers on morning/evening shifts
Goal of Data Collection To obtain a “representative sample” that exhibits the characteristics of the entire population Most common approach – taking random samples where each experimental unit in the population theoretically has the same chance of being selected for the sample
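A simple random sample of this kind can be sketched with Python's standard `random` module. The roster of 1000 student IDs and the function name `simple_random_sample` are hypothetical, made up for this example; `random.Random.sample` draws without replacement, so every unit has the same chance of being selected and no unit is chosen twice.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n distinct units so that every unit is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(units, n)


# ASSUMPTION: a hypothetical roster of 1000 students taking the course.
roster = [f"student_{i:04d}" for i in range(1, 1001)]

# A fixed seed makes the draw repeatable; omit it for a fresh sample each run.
chosen = simple_random_sample(roster, 8, seed=42)
print(chosen)
```

In practice the hard part is usually obtaining a complete list of the population to draw from, not the drawing itself.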
Nonrandom Sampling Errors Selection bias –One subset of experimental units in the population has either no chance, less of a chance, or more of a chance of being selected than another subset Nonresponse bias –When data is unavailable or unattainable for certain experimental units in the population Measurement errors –Inaccuracies in getting/recording data; ambiguous questions on questionnaires, etc.
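Selection bias can be illustrated with a small made-up example. In the sketch below, everything about the population is an assumption for illustration: 600 day-section students aged 19 to 24 and 300 evening-section students aged 28 to 45. Sampling only from the evening section gives day students no chance of selection, and the resulting average overstates the population average.

```python
import random
import statistics

# ASSUMPTION: a hypothetical population split into two sections.
day_ages = [19, 20, 21, 22, 23, 24] * 100      # 600 younger day students
evening_ages = [28, 30, 33, 35, 40, 45] * 50   # 300 older evening students
population = day_ages + evening_ages

rng = random.Random(1)  # fixed seed so the run is repeatable

# Random sample: every student has the same chance of being selected.
random_sample = rng.sample(population, 8)

# Biased sample: only evening students can be selected (selection bias).
biased_sample = rng.sample(evening_ages, 8)

population_mean = statistics.mean(population)
random_mean = statistics.mean(random_sample)
biased_mean = statistics.mean(biased_sample)

print(f"population mean:    {population_mean:.2f}")
print(f"random sample mean: {random_mean:.2f}")
print(f"biased sample mean: {biased_mean:.2f}")
```

Here the biased sample mean is at least 28 no matter which evening students are drawn, while the population mean is about 26, so no amount of luck in the draw can correct the bias.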
Using Nonrandom Samples Unintentionally –Leads to unjustified or false conclusions Intentionally –Designed to skew results on purpose –Unethical statistical practice
REVIEW What is statistics? What is the difference between descriptive and inferential statistics? What are the elements of inferential statistics? What are the two types of data? What are four ways data are collected? What is the importance of using random samples?