Prepared by Lloyd R. Jaisingh


A PowerPoint Presentation Package to Accompany Applied Statistics in Business & Economics, 4th edition, by David P. Doane and Lori E. Seward

Chapter 2: Data Collection
Chapter Contents
2.1 Definitions
2.2 Level of Measurement
2.3 Sampling Concepts
2.4 Sampling Methods
2.5 Data Sources
2.6 Surveys

Chapter Learning Objectives
LO2-1: Use basic terminology for describing data and samples.
LO2-2: Explain the distinction between numerical and categorical data.
LO2-3: Explain the difference between time series and cross-sectional data.
LO2-4: Recognize levels of measurement in data and ways of coding data.
LO2-5: Recognize a Likert scale and know how to use it.

Chapter Learning Objectives (continued)
LO2-6: Use the correct terminology for samples and populations.
LO2-7: Explain the common sampling methods and how to implement them.
LO2-8: Find everyday print or electronic data sources.
LO2-9: Describe basic elements of survey design, survey types, and sources of error.

2.1 Definitions
LO2-1: Use basic terminology for describing data and samples.
Observations, Variables, Data Sets
- Observation: a single member of a collection of items that we want to study, such as a person, firm, or region.
- Variable: a characteristic of the subject or individual, such as an employee’s income or an invoice amount.
- Data set: all the values of all of the variables for all of the observations we have chosen to observe.

2.1 Definitions
Table 2.2: Number of Variables and Typical Tasks
Data Set | Variables | Example | Typical Tasks
Univariate | One | Income | Histograms, descriptive statistics, frequency tallies
Bivariate | Two | Income, Age | Scatter plots, correlations, regression modeling
Multivariate | More than two | Income, Age, Gender | Multiple regression, data mining, econometric modeling

Data Types
LO2-2: Explain the distinction between numerical and categorical data. (Figure 2.1)
Note: Ambiguity is introduced when continuous data are rounded to whole numbers. Be cautious.

Time Series versus Cross-Sectional Data
LO2-3: Explain the difference between time series and cross-sectional data.
Time Series Data
- Each observation in the sample represents a different equally spaced point in time (e.g., years, months, days).
- Periodicity may be annual, quarterly, monthly, weekly, daily, hourly, etc.
- We are interested in trends and patterns over time (e.g., personal bankruptcies from 1980 to 2008).

Time Series versus Cross-Sectional Data (LO2-3)
Cross-Sectional Data
- Each observation represents a different individual unit (e.g., person) at the same point in time (e.g., monthly VISA balances).
- We are interested in variation among observations or in relationships.
- We can combine the two data types to get pooled cross-sectional and time series data.

2.2 Level of Measurement
LO2-4: Recognize levels of measurement in data and ways of coding data.

2.2 Level of Measurement (LO2-4)
Levels of Measurement
Level of Measurement | Characteristics | Example
Nominal | Categories only | Eye color (blue, brown, green, etc.)
Ordinal | Rank has meaning; no clear meaning to distance | Rarely, never
Interval | Distance has meaning | Temperature (57° Celsius)
Ratio | Meaningful zero exists | Accounts payable ($21.7 million)

2.2 Level of Measurement (LO2-4)
Nominal Measurement
- Nominal data merely identify a category.
- Nominal data are qualitative, attribute, categorical, or classification data and can be coded numerically (e.g., 1 = Apple, 2 = Compaq, 3 = Dell, 4 = HP).
- The only permissible mathematical operations are counting (e.g., frequencies) and simple statistics such as the mode.
Ordinal Measurement
- Ordinal data codes can be ranked (e.g., 1 = Frequently, 2 = Sometimes, 3 = Rarely, 4 = Never).

2.2 Level of Measurement (LO2-4)
Ordinal Measurement
- Distance between codes is not meaningful (e.g., the distance between 1 and 2, between 2 and 3, or between 3 and 4 lacks meaning).
- Many useful statistical tests exist for ordinal data.
- Ordinal data are especially useful in social science, marketing, and human resource research.
Interval Measurement
- Interval data not only can be ranked but also have meaningful intervals between scale points (e.g., the difference between 60°F and 70°F is the same as the difference between 20°F and 30°F).

2.2 Level of Measurement (LO2-4)
Interval Measurement
- Since intervals between numbers represent distances, mathematical operations can be performed (e.g., average).
- The zero point of an interval scale is arbitrary, so ratios are not meaningful (e.g., 60°F is not twice as warm as 30°F).
Ratio Measurement
- Ratio data have all the properties of the nominal, ordinal, and interval data types and also possess a meaningful zero (absence of the quantity being measured).
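
The point about arbitrary zeros can be checked with a little arithmetic. The sketch below is an illustration that is not part of the original slides: it converts the Fahrenheit readings to kelvins (a temperature scale with a true zero) and compares the ratios. The function name is hypothetical.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins (a scale with a true zero)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

# Differences are meaningful on an interval scale: both gaps are 10 degrees.
gap_warm = 70.0 - 60.0
gap_cold = 30.0 - 20.0

# The raw Fahrenheit ratio suggests "twice as warm" ...
naive_ratio = 60.0 / 30.0
# ... but on a scale with a true zero the ratio is only about 1.06.
kelvin_ratio = fahrenheit_to_kelvin(60.0) / fahrenheit_to_kelvin(30.0)

print(gap_warm == gap_cold, naive_ratio, round(kelvin_ratio, 3))
```

The equal gaps show why averages and differences are legitimate for interval data, while the changed ratio shows why "twice as warm" is not.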

2.2 Level of Measurement (LO2-4)
Ratio Measurement
- Because ratio data have a meaningful zero point, ratios of data values are meaningful (e.g., a $20 million profit is twice as much as a $10 million profit).
- Zero does not have to be observable in the data; it is an absolute reference point.

2.2 Level of Measurement
LO2-5: Recognize a Likert scale and know how to use it.
Likert Scales
- A special case of interval data frequently used in survey research.
- The coarseness of a Likert scale refers to the number of scale points (typically 5 or 7).

2.2 Level of Measurement (LO2-5)
Likert Scales (examples)
“College-bound high school students should be required to study a foreign language.” (check one)
[ ] Strongly Agree  [ ] Somewhat Agree  [ ] Neither Agree Nor Disagree  [ ] Somewhat Disagree  [ ] Strongly Disagree
How would you rate your marketing instructor? (check one)
[ ] Terrible  [ ] Poor  [ ] Adequate  [ ] Good  [ ] Excellent

2.2 Level of Measurement (LO2-4)
Use the following procedure to recognize data types. Ask the questions in order and stop at the first “Yes.”
Question | If “Yes”
Q1. Is there a meaningful zero point? | Ratio data (statistical operations are allowed)
Q2. Are intervals between scale points meaningful? | Interval data (common statistics allowed, e.g., means and standard deviations)
Q3. Do scale points represent rankings? | Ordinal data (restricted to certain types of nonparametric statistical tests)
Q4. Are there discrete categories? | Nominal data (only counting allowed, e.g., finding the mode)
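
The four questions amount to a short decision chain. A minimal sketch follows, assuming the answers are supplied as True/False values; the function and parameter names are hypothetical and do not come from the slides.

```python
def level_of_measurement(meaningful_zero, meaningful_intervals, ranked, categories):
    """Return the data type for the first question answered 'yes' (True)."""
    if meaningful_zero:         # Q1
        return "ratio"
    if meaningful_intervals:    # Q2
        return "interval"
    if ranked:                  # Q3
        return "ordinal"
    if categories:              # Q4
        return "nominal"
    return "unclassified"       # none of the four questions applies

# Examples drawn from the levels-of-measurement slides.
accounts_payable = level_of_measurement(True, True, True, True)
temperature_f = level_of_measurement(False, True, True, True)
frequency_code = level_of_measurement(False, False, True, True)
eye_color = level_of_measurement(False, False, False, True)

print(accounts_payable, temperature_f, frequency_code, eye_color)
```

Because the checks run from the most demanding level to the least, a variable is always assigned the highest level it qualifies for.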

2.2 Level of Measurement (LO2-4)
Changing Data By Recoding
- To simplify data, or when exact data magnitude is of little interest, ratio data can be recoded downward into ordinal or nominal measurements (but not conversely).
- For example, recode systolic blood pressure as “normal” (under 130), “elevated” (130 to 140), or “high” (over 140).
- The recoded data are ordinal (ranking is preserved), but intervals are unequal and some information is lost.
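
The blood pressure recoding can be sketched as a small function. The cut points are the ones stated on the slide; the function name and the sample readings are made up for illustration.

```python
def recode_systolic(reading):
    """Recode a systolic blood pressure reading (ratio data) into an ordinal label."""
    if reading < 130:
        return "normal"      # under 130
    if reading <= 140:
        return "elevated"    # 130 to 140
    return "high"            # over 140

readings = [118, 130, 135, 140, 152]   # hypothetical readings
labels = [recode_systolic(r) for r in readings]
print(labels)
```

The labels can still be ranked, but the original readings cannot be recovered from them, which is the information loss the slide mentions.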

2.3 Sampling Concepts
LO2-6: Use the correct terminology for samples and populations.
Sample or Census
- A sample involves looking only at some items selected from the population.
- A census is an examination of all items in a defined population.
- Why can’t the United States Census survey every person in the population? Mobility, undocumented workers, budget constraints, incomplete responses, etc.

2.3 Sampling Concepts (LO2-6)
Situations Where a Sample or Census May Be Preferred
Sample | Census
Infinite population | Small population
Destructive testing | Large sample size
Timely results | Database exists
Accuracy | Legal requirements
Cost |
Sensitive information |

2.3 Sampling Concepts (LO2-6)
Parameters and Statistics
- Statistics are computed from a sample of n items, chosen from a population of N items.
- Statistics can be used as estimates of parameters found in the population.
- Symbols are used to represent population parameters and sample statistics.

2.3 Sampling Concepts (LO2-6)
Rule of Thumb: A population may be treated as infinite when N is at least 20 times n (i.e., when N/n ≥ 20).
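
The rule of thumb is a one-line check. A minimal sketch, with a hypothetical function name and made-up example sizes:

```python
def can_treat_as_infinite(population_size, sample_size):
    """Apply the rule of thumb N/n >= 20 (written as N >= 20n to avoid division)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_size >= 20 * sample_size

large_population = can_treat_as_infinite(2000, 100)   # N/n = 20
small_population = can_treat_as_infinite(78, 20)      # N/n = 3.9
print(large_population, small_population)
```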

2.3 Sampling Concepts (LO2-6)
Target Population
- The population must be carefully specified and the sample must be drawn scientifically so that the sample is representative.
- The target population is the population we are interested in (e.g., U.S. gasoline prices).
- The sampling frame is the group from which we take the sample (e.g., 115,000 stations).
- The frame should not differ from the target population.

2.4 Sampling Methods
LO2-7: Explain the common sampling methods and how to implement them.
Random Sampling
- Simple random sample: Use random numbers to select items from a list (e.g., VISA cardholders).
- Systematic sample: Select every kth item from a list or sequence (e.g., restaurant customers).
- Stratified sample: Select randomly within defined strata (e.g., by age, occupation, gender).
- Cluster sample: Like stratified sampling except strata are geographical areas (e.g., zip codes).

2.4 Sampling Methods (LO2-7)
Non-random Sampling
- Judgment sample: Use expert knowledge to choose “typical” items (e.g., which employees to interview).
- Convenience sample: Use a sample that happens to be available (e.g., ask co-worker opinions at lunch).
- Focus groups: In-depth dialog with a representative panel of individuals (e.g., iPod users).

2.4 Sampling Methods (LO2-7)
With or Without Replacement
- If we allow duplicates when sampling, then we are sampling with replacement.
- Duplicates are unlikely when n is much smaller than N.
- If we do not allow duplicates when sampling, then we are sampling without replacement.
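
The two schemes can be contrasted with Python's standard `random` module, which is not mentioned in the slides and is used here only as an illustration. The seed, the population of ten labels, and the sample size are arbitrary assumptions.

```python
import random

rng = random.Random(2024)          # fixed seed (arbitrary) so runs are repeatable
population = list(range(1, 11))    # item labels 1..10
n = 5

# With replacement: the same item may be drawn more than once.
with_replacement = rng.choices(population, k=n)

# Without replacement: every drawn item is distinct.
without_replacement = rng.sample(population, n)

print(with_replacement, without_replacement)
```

With a population this small relative to n, duplicates in the first sample are quite possible; they become rare as N grows relative to n.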

2.4 Sampling Methods (LO2-7)
Computer Methods
- Excel, Option A: Enter the Excel function =RANDBETWEEN(1,875) into 10 spreadsheet cells. Press F9 to get a new sample.
- Excel, Option B: Enter the function =INT(1+875*RAND()) into 10 spreadsheet cells. Press F9 to get a new sample.
- Internet: The website www.random.org will give you many kinds of excellent random numbers (integers, decimals, etc.).
- Minitab: Use Minitab’s Random Data menu with the Integer option.
These are pseudo-random generators because even the best algorithms eventually repeat themselves.
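
For readers working outside Excel, the two spreadsheet options have close analogues in Python's standard `random` module (itself a pseudo-random generator). This is a sketch offered for comparison, not a method from the slides.

```python
import random

rng = random.Random()   # unseeded, so each run gives a new sample (like pressing F9)

# Analogue of =RANDBETWEEN(1,875): integers from 1 to 875, both ends included.
option_a = [rng.randint(1, 875) for _ in range(10)]

# Analogue of =INT(1+875*RAND()): scale a uniform value in [0, 1) and truncate.
option_b = [int(1 + 875 * rng.random()) for _ in range(10)]

print(option_a)
print(option_b)
```

Both lists may contain duplicates, so this is sampling with replacement.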

2.4 Sampling Methods (LO2-7)
Row-Column Data Arrays
- When the data are arranged in a rectangular array, an item can be chosen at random by selecting a row and column.
- For example, in a 4 x 3 array, select a random column between 1 and 3 and a random row between 1 and 4.
- This way, each item has an equal chance of being selected.
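
The row-and-column selection can be sketched as follows. The 4 x 3 array of labels is made up for illustration; only the dimensions come from the slide.

```python
import random

rng = random.Random()

# A hypothetical 4 x 3 array: 4 rows, 3 columns.
array = [
    ["A1", "A2", "A3"],
    ["B1", "B2", "B3"],
    ["C1", "C2", "C3"],
    ["D1", "D2", "D3"],
]

row = rng.randint(1, 4)       # random row between 1 and 4
col = rng.randint(1, 3)       # random column between 1 and 3
item = array[row - 1][col - 1]  # convert 1-based positions to 0-based indexes

print(row, col, item)
```

Each of the 12 cells is chosen with probability 1/4 x 1/3 = 1/12, which is the equal chance the slide describes.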

2.4 Sampling Methods (LO2-7)
Randomizing a List
- In Excel, use the function =RAND() beside each row to create a column of random numbers between 0 and 1.
- Copy and paste these numbers into the same column using Paste Special > Values in order to paste only the values and not the formulas.
- Sort the spreadsheet on the random number column.
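
The same sort-on-a-random-key idea can be written in Python. The list of names below is hypothetical, and the standard `random` module stands in for Excel's =RAND().

```python
import random

rng = random.Random()
names = ["Ames", "Baker", "Chen", "Diaz", "Evans", "Ford"]   # hypothetical list

# Step 1: put a random number beside each row (the =RAND() column).
keyed = [(rng.random(), name) for name in names]

# Step 2: sort on the random number column only.
keyed.sort(key=lambda pair: pair[0])

# Step 3: read the names back in their new, randomized order.
randomized = [name for _, name in keyed]
print(randomized)
```

Because the random numbers here are stored values rather than live formulas, no equivalent of Paste Special > Values is needed. Python's `random.shuffle` achieves the same result in a single call.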

2.4 Sampling Methods (LO2-7)
Systematic Sampling
- Sample by choosing every kth item from a list, starting from a randomly chosen entry on the list.
- For example, starting at item 2, we sample every 4th item to obtain a sample of n = 20 items from a list of N = 78 items.
- Note that N/n = 78/20 ≈ 4 (periodicity).
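
The worked example above can be reproduced in a few lines. This is a sketch with a hypothetical function name; the list of 78 numbered items stands in for whatever list is being sampled.

```python
def systematic_sample(items, k, start):
    """Take every k-th item, beginning at the 1-based position `start`."""
    if k < 1 or not 1 <= start <= k:
        raise ValueError("need k >= 1 and 1 <= start <= k")
    return items[start - 1::k]

items = list(range(1, 79))                       # N = 78 numbered items
sample = systematic_sample(items, k=4, start=2)  # start at item 2, every 4th item

print(len(sample), sample[:3], sample[-1])       # 20 [2, 6, 10] 78
```

In practice the starting position would itself be drawn at random from 1 to k. Note that the sample size depends on the start: with k = 4 and N = 78, starting at item 3 or 4 yields only 19 items.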

2.4 Sampling Methods (LO2-7)
Stratified Sampling
- Utilizes prior information about the population.
- Applicable when the population can be divided into relatively homogeneous subgroups of known size (strata).
- A simple random sample of the desired size is taken within each stratum.
- For example, from a population containing 55% males and 45% females, randomly sample 110 males and 90 females (n = 200).
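
A minimal sketch of the 55%/45% example, assuming proportional allocation and a made-up population of 1,000 people (550 males, 450 females). The function name and member labels are hypothetical.

```python
import random

def stratified_sample(strata, n, rng):
    """Allocate n across strata in proportion to stratum size, then sample each."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = round(n * len(members) / total)      # proportional allocation
        sample[name] = rng.sample(members, share)    # simple random sample in stratum
    return sample

rng = random.Random(7)   # arbitrary seed for repeatability
strata = {
    "male": [f"M{i}" for i in range(550)],
    "female": [f"F{i}" for i in range(450)],
}
sample = stratified_sample(strata, 200, rng)
counts = {name: len(chosen) for name, chosen in sample.items()}
print(counts)
```

Rounding works out exactly here; with other proportions the rounded shares may not sum to n and would need a small adjustment.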

2.4 Sampling Methods (LO2-7)
Cluster Sample
- Strata consist of geographical regions.
- One-stage cluster sampling: the sample consists of all elements in each of k randomly chosen subregions (clusters).
- Two-stage cluster sampling: first choose k subregions (clusters), then choose a random sample of elements within each cluster.

2.4 Sampling Methods (LO2-7)
Cluster Sample
Here is an example of 4 elements sampled from each of 3 randomly chosen clusters (two-stage cluster sampling).
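
Two-stage cluster sampling with 3 clusters and 4 elements per cluster can be sketched as below. The six regions and their ten elements each are invented for illustration, as is the function name.

```python
import random

def two_stage_cluster_sample(clusters, k, m, rng):
    """Stage 1: choose k clusters at random. Stage 2: choose m elements in each."""
    chosen = rng.sample(sorted(clusters), k)
    return {name: rng.sample(clusters[name], m) for name in chosen}

rng = random.Random(11)   # arbitrary seed for repeatability
clusters = {
    f"region_{r}": [f"r{r}_item{i}" for i in range(10)]
    for r in range(1, 7)
}
sample = two_stage_cluster_sample(clusters, k=3, m=4, rng=rng)

for name, elements in sample.items():
    print(name, elements)
```

One-stage cluster sampling would keep stage 1 and simply return every element of each chosen cluster.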

2.4 Sampling Methods (LO2-7)
Judgment Sample
- A non-probability sampling method that relies on the expertise of the sampler to choose items that are representative of the population.
- Can be affected by subconscious bias (i.e., non-randomness in the choice).
- Quota sampling is a special kind of judgment sampling, in which the interviewer chooses a certain number of people in each category.

2.4 Sampling Methods (LO2-7)
Convenience Sample
- Take advantage of whatever sample is available at that moment. A quick way to sample.
Focus Groups
- A panel of individuals chosen to be representative of a wider population, formed for open-ended discussion and idea gathering.

2.5 Data Sources
LO2-8: Find everyday print or electronic data sources.
- One goal of a statistics course is to help you learn where to find data that might be needed.
- Fortunately, many excellent sources are widely available. Some sources are given in the following table.

2.6 Surveys
LO2-9: Describe basic elements of survey design, survey types, and sources of error.
Basic Steps of Survey Research
Step 1: State the goals of the research.
Step 2: Develop the budget (time, money, staff).
Step 3: Create a research design (target population, frame, sample size).
Step 4: Choose a survey type and method of administration.

2.6 Surveys (LO2-9)
Basic Steps of Survey Research (continued)
Step 5: Design a data collection instrument (questionnaire).
Step 6: Pretest the survey instrument and revise as needed.
Step 7: Administer the survey (follow up if needed).
Step 8: Code the data and analyze it.

2.6 Surveys (LO2-9)
Survey Types: Mail, Telephone, Interviews, Web, Direct observation
Survey Guidelines: Planning, Design, Quality, Pilot test, Buy-in, Expertise

2.6 Surveys (LO2-9)
Questionnaire Design
- Use a lot of white space in layout.
- Begin with short, clear instructions.
- State the survey purpose.
- Assure anonymity.
- Instruct on how to submit the completed survey.
- Break survey into naturally occurring sections.
- Let respondents bypass sections that are not applicable (e.g., “if you answered no to Question 7, skip directly to Question 15”).

2.6 Surveys (LO2-9)
Questionnaire Design (continued)
- Pretest and revise as needed.
- Keep as short as possible.
Types of Questions
- Open-ended
- Fill-in-the-blank
- Check boxes
- Ranked choices
- Pictograms
- Likert scale

2.6 Surveys (LO2-9)
Question Wording
The way a question is asked has a profound influence on the response. For example:
- Shall state taxes be cut?
- Shall state taxes be cut, if it means reducing highway maintenance?
- Shall state taxes be cut, if it means firing teachers and police?

2.6 Surveys (LO2-9)
Question Wording
- Make sure you have covered all the possibilities. For example: Are you married? [ ] Yes [ ] No
- Overlapping classes or unclear categories are a problem. What if your father is deceased or is 45 years old?
  How old is your father? [ ] 35 – 45 [ ] 45 – 55 [ ] 55 – 65 [ ] 65 or older
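
The overlap problem in the age question can be made concrete by counting how many classes a given answer fits. In the sketch below, the `original` classes are those on the slide; the `revised` classes are one possible repair suggested here as an assumption, not taken from the slides, and the function name is hypothetical.

```python
def matching_classes(age, classes):
    """Return the labels of every class whose inclusive range contains age."""
    return [
        label
        for label, low, high in classes
        if age >= low and (high is None or age <= high)
    ]

# Classes as printed on the slide: shared endpoints overlap.
original = [("35-45", 35, 45), ("45-55", 45, 55), ("55-65", 55, 65), ("65 or older", 65, None)]

# One possible repair (an assumption): non-overlapping classes that cover every age.
revised = [
    ("Under 35", 0, 34), ("35-44", 35, 44), ("45-54", 45, 54),
    ("55-64", 55, 64), ("65 or older", 65, None),
]

print(matching_classes(45, original))   # fits two classes
print(matching_classes(30, original))   # fits no class
print(matching_classes(45, revised))    # fits exactly one class
```

A well-formed question yields exactly one matching class for every possible answer. The deceased-father case cannot be fixed by adjusting ranges; it needs a separate "not applicable" option.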

2.6 Surveys (LO2-9)
Coding and Data Screening
- Responses are usually coded numerically (e.g., 1 = male, 2 = female).
- Missing values are typically denoted by special characters (e.g., blank, “.” or “*”).
- Discard questionnaires that are flawed or missing many responses.
- Watch for multiple responses, outrageous or inconsistent replies, or out-of-range answers.
- Follow up if necessary, and always document your data-coding decisions.
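
A minimal sketch of coding and screening one survey item. The coding scheme and missing-value markers come from the slide; the function name, status labels, and raw responses are hypothetical.

```python
MISSING_MARKERS = {"", ".", "*"}           # blank, "." or "*" denote a missing value
GENDER_CODES = {"male": 1, "female": 2}    # numeric coding scheme from the slide

def code_response(raw, codes):
    """Return (code, status), where status is 'ok', 'missing', or 'invalid'."""
    text = raw.strip().lower()
    if text in MISSING_MARKERS:
        return None, "missing"
    if text in codes:
        return codes[text], "ok"
    return None, "invalid"                 # flag for follow-up rather than guessing

responses = ["Male", "female", "", "*", "yes"]   # hypothetical raw answers
coded = [code_response(r, GENDER_CODES) for r in responses]
print(coded)
```

Keeping the status alongside the code leaves a record of each screening decision, in line with the advice to document data-coding choices.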