David Kilgour http://www.avyg86.dsl.pipex.com/research.htm Statistics
Learning Objectives
  Understand and be able to distinguish different meanings and uses of the word ‘statistics’;
  Be able to describe the nature of statistics as a scientific discipline;
  Be aware of the fact that statistics can be presented and used in misleading ways, unintentionally or even intentionally;
  Appreciate the importance of statistics in science and in society in general;
  Appreciate the need for statistics in a business and finance environment.
Goals
After completing this note, you should be able to:
  Explain key definitions:
    Population vs. Sample
    Primary vs. Secondary Data
    Parameter vs. Statistic
    Descriptive vs. Inferential Statistics
  Describe key data collection methods
  Describe different sampling methods:
    Probability Samples vs. Nonprobability Samples
  Select a random sample using a random numbers table
  Identify types of data and levels of measurement
  Describe the different types of survey error
Meaning of Statistics
The word ‘statistics’ has a few different (but related) meanings, and is used in different ways, depending on the context. Most people have heard of statistics in the context of football, cricket or political opinion polls.
What Is Statistics?
  1. Collecting Data, e.g., Survey
  2. Presenting Data, e.g., Charts & Tables
  3. Characterizing Data, e.g., Average
Why? Data Analysis leads to Decision-Making
Some Application Areas
  Accounting: Auditing, Costing
  Finance: Financial Trends, Forecasting
  Management: Describe Employees, Quality Improvement
  Marketing: Consumer Preferences, Marketing Mix Effects
Statistical Methods
  Descriptive Statistics
  Inferential Statistics
Descriptive Statistics
  1. Involves: Collecting Data, Presenting Data, Characterizing Data
  2. Purpose: Describe Data
[Figure: chart of quarterly values in £ (Q1 to Q4), summarized by $\bar{X} = 30.5$ and $S^2 = 113$]
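As a minimal sketch of "characterizing data", the snippet below computes a sample mean and a sample variance with Python's standard `statistics` module. The quarterly figures are hypothetical and are not the data behind the slide's chart, so the results will not match the values shown there.

```python
import statistics

# Hypothetical quarterly figures in £ (illustrative only, not the slide's data).
quarterly_values = [20, 30, 40, 50]

sample_mean = statistics.mean(quarterly_values)          # arithmetic mean
sample_variance = statistics.variance(quarterly_values)  # divides by n - 1

print(f"mean = {sample_mean}, variance = {sample_variance:.2f}")
```

`statistics.variance` uses the n - 1 divisor appropriate for a sample; `statistics.pvariance` would be the choice if the four values were the whole population.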
Inferential Statistics
  1. Involves: Estimation, Hypothesis Testing
  2. Purpose: Make Decisions About Population Characteristics
Key Terms
  1. Population (Universe): All Items of Interest
  2. Sample: Portion of Population
  3. Parameter: Summary Measure about Population
  4. Statistic: Summary Measure about Sample
Definitions
  Data: facts or information that is relevant or appropriate to a decision maker
  Population: the totality of objects under consideration
  Sample: a portion of the population that is selected for analysis
  Parameter: a summary measure (e.g., mean) that is computed to describe a characteristic of the population
  Statistic: a summary measure (e.g., mean) that is computed to describe a characteristic of the sample
Statistical Studies
  Enumerative Study
  Analytical Study
Enumerative Study
  1. Involves Decision Making about a Population
  2. Frame is Listing of All Population Units, e.g., Names in Telephone Book
  3. e.g., Political Poll
Analytical Study
  1. Involves Action on a Process
  2. Improves Future Performance
  3. No Identifiable Universe or Frame
  4. e.g., Manufacturing Process
[Diagram: Input (People, Equipment, Material, Information) -> Process (Product Focused, Process Focused) -> Output (Goods, Services), with Feedback]
Statistical Computer Packages
  1. Typical Software: SAS, SPSS, MINITAB, Excel, Matlab
  2. Need Statistical Understanding: Assumptions, Limitations
Thinking Challenge
"Our market share far exceeds all competitors!" - VP
[Figure: chart comparing market share for X, Y, and Us, with the axis running from 30% to 36%]
Problem: the axis has no zero point. A pie chart might be better.
Why a Manager Needs to Know about Statistics
To know how to:
  properly present information
  draw conclusions about populations based on sample information
  improve processes
  obtain reliable forecasts
Key Definitions
  A population (universe) is the collection of all items or things under consideration.
  A sample is a portion of the population selected for analysis.
  A parameter is a summary measure that describes a characteristic of the population.
  A statistic is a summary measure computed from sample data; it describes the sample and is used to estimate the corresponding characteristic of the population.
Population vs. Sample
  Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
  Sample: b c g i n o r u y
Measures used to describe the population are called parameters.
Measures computed from sample data are called statistics.
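The distinction can be sketched in a few lines of Python. The population here is hypothetical: 26 numeric values standing in for the 26 items above, from which 9 are drawn at random. The mean of all 26 values is a parameter; the mean of the 9 sampled values is a statistic.

```python
import random
import statistics

# Hypothetical population: one numeric value per item (illustrative only).
population = list(range(1, 27))  # 26 items, mirroring the 26 letters above

rng = random.Random(0)              # fixed seed so the run is repeatable
sample = rng.sample(population, 9)  # 9 items drawn without replacement

population_mean = statistics.mean(population)  # parameter
sample_mean = statistics.mean(sample)          # statistic

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is fixed for a given population, while the statistic changes with each sample drawn.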
Two Branches of Statistics
  Descriptive statistics: collecting, summarizing, and describing data
  Inferential statistics: drawing conclusions and/or making decisions concerning a population based only on sample data
Descriptive Statistics
  Collect data, e.g., Survey
  Present data, e.g., Tables and graphs
  Characterize data, e.g., Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Inferential Statistics
Drawing conclusions and/or making decisions concerning a population based on sample results.
  Estimation, e.g., Estimate the population mean weight using the sample mean weight
  Hypothesis testing, e.g., Test the claim that the population mean weight is 120 pounds
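Both examples can be sketched with Python's standard library. The weights below are made up for illustration, and the one-sample t statistic is one common choice of test statistic for a claim about a mean; the slide itself does not name a specific test.

```python
import math
import statistics

# Hypothetical sample of weights in pounds (made up for illustration).
weights = [118.0, 122.0, 125.0, 119.0, 121.0, 124.0, 117.0, 126.0]

n = len(weights)
sample_mean = statistics.mean(weights)     # point estimate of the population mean
sample_sd = statistics.stdev(weights)      # sample standard deviation (n - 1 divisor)
standard_error = sample_sd / math.sqrt(n)  # estimated spread of the sample mean

# Hypothesis testing: how far is the sample mean from the claimed 120 pounds,
# measured in standard errors?
claimed_mean = 120.0
t_statistic = (sample_mean - claimed_mean) / standard_error

print(f"estimate = {sample_mean}, t = {t_statistic:.3f}")
```

To reach a decision, the t statistic would be compared against a t distribution with n - 1 degrees of freedom at a chosen significance level.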
Why We Need Data
  To provide input to survey
  To provide input to study
  To measure performance of service or production process
  To evaluate conformance to standards
  To assist in formulating alternative courses of action
Data Sources
  Primary (Data Collection): Observation, Survey, Experimentation
  Secondary (Data Compilation): Print or Electronic
Reasons for Drawing a Sample
  Less time consuming than a census
  Less costly to administer than a census
  Less cumbersome and more practical to administer than a census of the targeted population
Types of Samples Used
  Nonprobability Sample: items included are chosen without regard to their probability of occurrence
  Probability Sample: items in the sample are chosen on the basis of known probabilities
Types of Samples Used (continued)
  Non-Probability Samples: Judgement, Chunk, Quota, Convenience
  Probability Samples: Simple Random, Systematic, Stratified, Cluster
Probability Sampling
Items in the sample are chosen based on known probabilities.
  Probability Samples: Simple Random, Systematic, Stratified, Cluster
Simple Random Samples
  Every individual or item from the frame has an equal chance of being selected
  Selection may be with replacement or without replacement
  Samples obtained from table of random numbers or computer random number generators
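A minimal sketch of both selection modes using a computer random number generator (Python's standard `random` module). The frame of 100 ID numbers and the sample size of 10 are assumptions made for illustration.

```python
import random

# Hypothetical frame: ID numbers 1..100 (illustrative only).
frame = list(range(1, 101))
rng = random.Random(42)  # fixed seed so the run is repeatable

without_replacement = rng.sample(frame, k=10)  # no item can appear twice
with_replacement = rng.choices(frame, k=10)    # an item may be drawn again

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

`random.sample` guarantees distinct items, while `random.choices` draws each item independently, so repeats are possible.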
Systematic Samples
  Decide on sample size: n
  Divide frame of N individuals into groups of k individuals: k = N/n
  Randomly select one individual from the 1st group
  Select every kth individual thereafter
[Figure: frame of N = 64 individuals divided into n = 8 groups of k = 8, with one individual selected from the first group]
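The steps above can be sketched as follows, using the slide's figures N = 64 and n = 8. The function name `systematic_sample` and the numbered frame are hypothetical, and the sketch assumes N is a multiple of n.

```python
import random

def systematic_sample(frame, n, rng):
    """Pick one random start in the first group of k items, then every k-th item."""
    k = len(frame) // n       # group size, k = N / n
    start = rng.randrange(k)  # random position within the first group
    return frame[start::k][:n]

frame = list(range(1, 65))    # hypothetical frame with N = 64
sample = systematic_sample(frame, n=8, rng=random.Random(1))

print(sample)
```

Only the starting point is random; once it is fixed, the remaining selections are fully determined by k.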
Stratified Samples
  Divide population into two or more subgroups (called strata) according to some common characteristic
  A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes
  Samples from subgroups are combined into one
[Figure: population divided into 4 strata, with a sample drawn from each]
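A sketch of proportional allocation under stated assumptions: the four strata, their sizes (40, 30, 20, 10), and the helper name `stratified_sample` are all hypothetical. The sizes are chosen so the proportional shares are whole numbers; in general, rounding can make the total differ slightly from n.

```python
import random

def stratified_sample(strata, n, rng):
    """Proportional allocation: each stratum contributes n * (stratum size / N) items."""
    total = sum(len(items) for items in strata.values())
    sample = []
    for items in strata.values():
        share = round(n * len(items) / total)    # rounding may shift the total slightly
        sample.extend(rng.sample(items, share))  # simple random sample within the stratum
    return sample

# Hypothetical population split into 4 strata of sizes 40, 30, 20, 10.
strata = {
    "A": [f"A{i}" for i in range(40)],
    "B": [f"B{i}" for i in range(30)],
    "C": [f"C{i}" for i in range(20)],
    "D": [f"D{i}" for i in range(10)],
}
sample = stratified_sample(strata, n=10, rng=random.Random(7))

print(sample)
```

With these sizes the sample contains 4, 3, 2, and 1 items from strata A to D, so every stratum is represented.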
Cluster Samples
  Population is divided into several “clusters,” each representative of the population
  A simple random sample of clusters is selected
  All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique
[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]
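Both options can be sketched as below. The 16 clusters match the slide's figure, but the cluster size of 10, the choice of 4 clusters, and the within-cluster subsample of 3 are assumptions made for illustration.

```python
import random

# Hypothetical population of 160 items split into 16 clusters of 10 items each.
clusters = [list(range(c * 10, c * 10 + 10)) for c in range(16)]

rng = random.Random(3)
chosen_clusters = rng.sample(clusters, k=4)  # simple random sample of clusters

# Option 1: use every item in the selected clusters.
sample_all = [item for cluster in chosen_clusters for item in cluster]

# Option 2: draw a simple random sample of items within each selected cluster.
sample_two_stage = [
    item for cluster in chosen_clusters for item in rng.sample(cluster, k=3)
]

print(len(sample_all), len(sample_two_stage))
```

The randomization happens at the cluster level first, which is what makes the method cheaper to administer when clusters are geographically or administratively grouped.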
Advantages and Disadvantages
  Simple random sample and systematic sample:
    Simple to use
    May not be a good representation of the population’s underlying characteristics
  Stratified sample:
    Ensures representation of individuals across the entire population
  Cluster sample:
    More cost effective
    Less efficient (need larger sample to acquire the same level of precision)
Types of Data
  Categorical (defined categories). Examples: Marital Status, Political Party, Eye Colour
  Numerical, discrete (counted items). Examples: Number of Children, Defects per hour
  Numerical, continuous (measured characteristics). Examples: Weight, Voltage
Levels of Measurement and Measurement Scales (from highest to lowest)
  Ratio Data: differences between measurements, true zero exists (highest level, strongest form of measurement)
  Interval Data: differences between measurements but no true zero
  Ordinal Data: ordered categories (rankings, order, or scaling)
  Nominal Data: categories with no ordering or direction (lowest level, weakest form of measurement)
Evaluating Survey Worthiness
  What is the purpose of the survey?
  Is the survey based on a probability sample?
  Coverage error: is the frame appropriate?
  Nonresponse error: follow up
  Measurement error: good questions elicit good responses
  Sampling error: always exists
Types of Survey Errors
  Coverage error or selection bias: exists if some groups are excluded from the frame and have no chance of being selected
  Nonresponse error or bias: people who do not respond may be different from those who do respond
  Sampling error: variation from sample to sample will always exist
  Measurement error: due to weaknesses in question design, respondent error, and interviewer’s effects on the respondent
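Sampling error is the one error type that is easy to illustrate by simulation. In the sketch below, the population of 1,000 values, the sample size of 30, and the five repetitions are all hypothetical; the point is only that each random sample gives a different mean, none exactly equal to the population mean.

```python
import random
import statistics

# Hypothetical population of 1,000 values (illustrative only).
population = list(range(1, 1001))
population_mean = statistics.mean(population)  # 500.5

rng = random.Random(2024)  # fixed seed so the run is repeatable
sample_means = [statistics.mean(rng.sample(population, 30)) for _ in range(5)]

for m in sample_means:
    print(f"sample mean = {m:.1f}, sampling error = {m - population_mean:+.1f}")
```

Larger samples would tend to shrink these differences, but they do not remove them, which is why sampling error "always exists".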
Types of Survey Errors (continued)
  Coverage error: excluded from frame
  Nonresponse error: follow up on nonresponses
  Sampling error: random differences from sample to sample
  Measurement error: bad or leading question
Summary
  Reviewed why a manager needs to know statistics
  Introduced key definitions:
    Population vs. Sample
    Primary vs. Secondary data types
    Categorical vs. Numerical data
    Time Series vs. Cross-Sectional data
  Examined descriptive vs. inferential statistics
  Described different types of samples
  Reviewed data types and measurement levels
  Examined survey worthiness and types of survey errors