1
Business Statistics: A Decision-Making Approach, 7th Edition
Chapter 1: The Where, Why, and How of Data Collection
2
What is Statistics?
Statistics is the development and application of methods to collect, analyze, and interpret data. Modern statistical methods involve the design and analysis of experiments and surveys, the quantification of biological, social, and scientific phenomena, and the application of statistical principles to understand more about the world around us.
Statistics is a discipline concerned with:
  designing experiments and other data collection,
  summarizing information to aid understanding,
  drawing conclusions from data, and
  estimating the present or predicting the future.
3
Population vs. Sample
[Figure: a population of items labeled a through z, and a sample drawn from it consisting of b, c, g, i, n, o, r, u, y]
4
Populations and Samples
A Population is the set of all items or individuals of interest
  Examples:
    All likely voters in the next election
    All parts produced today
    All sales receipts for November
A Sample is a subset of the population
  Examples:
    1000 voters selected at random for interview
    A few parts selected for destructive testing
    Every 100th receipt selected for audit
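The receipt example above (every 100th receipt) can be sketched in a few lines of Python. This is only an illustration: the receipt numbering and the helper name `every_kth` are assumptions, not part of the slides.

```python
def every_kth(items, k, start=0):
    """Return every k-th item of `items`, beginning at index `start`."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return items[start::k]


# Hypothetical population: 1,000 November receipts, numbered 1 to 1000.
receipts = list(range(1, 1001))

# Select every 100th receipt for audit, starting with receipt number 100.
audit_sample = every_kth(receipts, k=100, start=99)
print(audit_sample)
```

The sample here is a subset of the population: 10 receipts out of 1,000.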
5
Why Sample?
Less time consuming than a census
Less costly to administer than a census
It is possible to obtain statistical results of sufficiently high precision based on samples.
6
Sampling Techniques
Nonstatistical Sampling
  Convenience
  Judgment
Statistical Sampling
  Simple Random
  Systematic
  Cluster
  Stratified
Not interested in……
7
Statistical Sampling (Probability Sampling)
Items of the sample are chosen based on known or calculable probabilities
  Simple Random
  Stratified
  Systematic
  Cluster
Please read the book
8
Example of Random Sampling
This example suggests how statistical sampling techniques can be used to gather data on employees' preferences for scheduling vacation times. Simple random sampling could be used by assigning each employee a number and then using a random number generator to select employees.
Tools: a random number table or the Excel ToolPak
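As a rough sketch of the employee example, the selection could also be done with Python's standard `random` module. The workforce size (250), the sample size (20), and the seed are assumptions made up for illustration.

```python
import random

# Hypothetical workforce: each employee is assigned a number from 1 to 250.
employee_ids = list(range(1, 251))

# A seeded generator makes the draw reproducible; the seed value is arbitrary.
rng = random.Random(2024)

# Draw 20 distinct employee numbers; sample() selects without replacement.
survey_group = rng.sample(employee_ids, k=20)
print(sorted(survey_group))
```

Each selected number identifies one employee to ask about vacation scheduling preferences.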
9
Simple Random Sampling
Every possible sample of a given size has an equal chance of being selected
Selection may be with replacement or without replacement
The sample can be obtained using a table of random numbers or computer random number generator
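The difference between the two selection modes can be sketched with Python's standard library. The five-item population and the seed are illustrative assumptions.

```python
import math
import random

population = ["a", "b", "c", "d", "e"]  # hypothetical population of 5 items
rng = random.Random(7)                  # arbitrary seed for reproducibility

# Without replacement: an item can appear in the sample at most once.
without_replacement = rng.sample(population, k=3)

# With replacement: each draw is made from the full population,
# so the same item may be selected more than once.
with_replacement = rng.choices(population, k=3)

# Number of distinct samples of size 3 when drawing without replacement.
# Under simple random sampling each of them is equally likely.
n_possible = math.comb(len(population), 3)  # 10

print(without_replacement, with_replacement, n_possible)
```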
10
Types of Statistics
Descriptive statistics
Mathematical methods (such as mean, median, standard deviation) that summarize and interpret some of the properties of a set of data (sample) but do not infer the properties of the population from which the sample was drawn.
Inferential statistics
Mathematical methods (such as hypothesis testing) that employ probability theory for inferring the properties of a population from the analysis of the properties of a set of data (sample) drawn from it. Inferential statistics is also concerned with the precision and reliability of the inferences it helps draw.
11
Descriptive Statistics
Collect data
  e.g., Survey, Observation, Experiments
Present data
  e.g., Charts and graphs
Characterize data
  e.g., Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
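A minimal sketch of the "characterize data" step, using Python's standard `statistics` module. The weights are invented for illustration only.

```python
import statistics

# Hypothetical sample of six weights, in pounds.
weights = [118, 125, 121, 130, 116, 122]

sample_mean = statistics.mean(weights)      # sum of values divided by n
sample_median = statistics.median(weights)  # middle of the sorted values
sample_stdev = statistics.stdev(weights)    # sample standard deviation (n - 1)

print(sample_mean, sample_median, round(sample_stdev, 2))
```

These numbers describe the sample itself; they make no claim about the population it came from.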
12
Inferential Statistics
Making statements about a population by examining sample results
Sample statistics (known) -> Inference -> Population parameters (unknown, but can be estimated from sample evidence)
13
Inferential Statistics
Drawing conclusions and/or making decisions concerning a population based on sample results.
Estimation
  e.g., Estimate the population mean weight using the sample mean weight
Hypothesis Testing
  e.g., Use sample evidence to test the claim that the population mean weight is 120 pounds
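The two weight examples can be sketched numerically. The slides do not say which test to use, so the one-sample t statistic below is an assumption (one common choice), and the sample weights are made up.

```python
import math
import statistics

# Hypothetical sample of eight weights, in pounds.
weights = [118, 125, 121, 130, 116, 122, 127, 119]
n = len(weights)

# Estimation: the sample mean is the point estimate of the population mean.
x_bar = statistics.mean(weights)

# Hypothesis testing: measure how far the sample mean lies from the
# claimed population mean of 120 pounds, in standard-error units.
claimed_mean = 120
s = statistics.stdev(weights)        # sample standard deviation
standard_error = s / math.sqrt(n)
t_stat = (x_bar - claimed_mean) / standard_error

print(x_bar, round(t_stat, 3))
```

In a full test, `t_stat` would be compared against a t distribution with `n - 1` degrees of freedom to decide whether the sample evidence contradicts the claim.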
14
Tools for Collecting Data
Data Collection Methods
  Experiments
  Written questionnaires
  Telephone surveys
  Direct observation and personal interview
15
Survey Design Steps
Define the issue
  what are the purpose and objectives of the survey?
Define the population of interest
Develop survey questions
  make questions clear and unambiguous
  use universally-accepted definitions
  limit the number of questions
16
Survey Design Steps (continued)
Pre-test the survey
  pilot test with a small group of participants
  assess clarity and length
Determine the sample size and sampling method
Select sample and administer the survey
17
Types of Questions
Closed-end Questions
  Select from a short list of defined choices
  Example: Major: __business __liberal arts __science __other
Open-end Questions
  Respondents are free to respond with any value, words, or statement
  Example: What did you like best about this course?
Demographic Questions
  Questions about the respondents’ personal characteristics
  Example: Gender: __Female __Male
18
Data (variable) Types
Qualitative (Categorical)
  Examples: Marital Status, Political Party, Eye Color (Defined categories)
Quantitative (Numerical)
  Discrete
    Examples: Number of Children, Defects per hour (Counted items)
  Continuous
    Examples: Weight, Voltage (Measured characteristics)
19
Qualitative vs. Quantitative Variables (Data)
Qualitative variables (data) take on values that are names or labels. Example: the color of a ball (e.g., red, green, blue) or the breed of a dog (e.g., collie, shepherd, terrier).
Quantitative variables are numerical. They represent a measurable quantity. Example: the number of students at CSUB or the number of people in Bakersfield.
20
Discrete vs. Continuous Variables (Data)
Quantitative variables can be further classified as discrete or continuous. If a variable can take on any value between its minimum value and its maximum value, it is called a continuous variable; otherwise, it is called a discrete variable.
Example: The fire department mandates that all fire fighters must weigh between 150 and 250 pounds. The weight of a fire fighter is an example of a continuous variable, since a fire fighter's weight could take on any value between 150 and 250 pounds.
21
Discrete vs. Continuous Variables (Data)
Example: Suppose we flip a coin repeatedly and count the number of heads. The number of heads could be any integer value between 0 and plus infinity. However, it could not be just any number between 0 and plus infinity; we could not, for example, get 2.3 heads. Therefore, the number of heads must be a discrete variable.
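The coin example can be simulated as a quick check. The number of flips (10) and the seed are arbitrary assumptions.

```python
import random

rng = random.Random(1)  # arbitrary seed for reproducibility

# Flip a fair coin 10 times and count the heads.
flips = [rng.choice(["H", "T"]) for _ in range(10)]
heads = flips.count("H")

# A count is always a whole number, so `heads` is a discrete variable.
print(flips, heads, isinstance(heads, int))
```

However many flips are simulated, `heads` is always an integer, never a value such as 2.3.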
22
Data Measurement Levels
Ratio/Interval Data: Highest Level (Complete Analysis); Measurements
Ordinal Data: Higher Level (Mid-level Analysis); Rankings, Ordered Categories
Nominal Data: Lowest Level (Basic Analysis); Categorical Codes, ID Numbers, Category Names
23
Nominal
Nominal refers to categorical data such as the name of your school, the type of car you drive, or the name of a book. This one is easy to remember because nominal sounds like name (they have the same Latin root).
24
Ordinal
Ordinal refers to quantities that have a natural ordering. Examples include the ranking of favorite sports, the order of people's places in a line, the order of runners finishing a race or, more often, the choice on a rating scale from 1 to 5. With ordinal data you cannot state with certainty whether the intervals between each value are equal. For example, we often use rating scales (Likert-scale questions). On a 10-point scale, the difference between a 9 and a 10 is not necessarily the same as the difference between a 6 and a 7. This is also an easy one to remember: ordinal sounds like order.
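One way to work with ordinal responses is to use only their order, for example by taking a median rank. The five-point scale labels and the responses below are hypothetical, and this is a sketch rather than a prescribed method.

```python
import statistics

# Hypothetical rating scale, listed from lowest to highest.
scale = ["poor", "fair", "good", "very good", "excellent"]
rank = {label: position for position, label in enumerate(scale, start=1)}

# Hypothetical survey responses.
responses = ["good", "excellent", "fair", "good", "very good"]

# Ordering the responses is meaningful because the categories are ranked.
ordered = sorted(responses, key=rank.get)

# The median depends only on order, not on the spacing between categories.
median_rank = statistics.median_low(rank[r] for r in responses)
median_label = scale[median_rank - 1]

print(ordered, median_label)
```

The ranks 1 to 5 serve only as position markers; nothing here assumes the gap between "fair" and "good" equals the gap between "good" and "very good".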
25
Interval
Interval data is like ordinal data, except that the intervals between values are equal. The most common example is temperature in degrees Fahrenheit. The difference between 29 and 30 degrees is the same magnitude as the difference between 78 and 79 (although I know I prefer the latter). The attitudinal scales and Likert questions you usually see on a survey are rarely interval, although many points on such a scale are likely separated by equal intervals.
26
Ratio
Ratio data is interval data with a natural zero point. For example, time is ratio since 0 time is meaningful. The Kelvin temperature scale also has a zero point (absolute zero), and the steps on each of these scales are equal in magnitude.
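A short numeric sketch of why a natural zero matters: ratios of Fahrenheit readings are not meaningful, while ratios of Kelvin readings are. The two temperatures are arbitrary examples, and `fahrenheit_to_kelvin` is a helper written for this illustration.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert a temperature from degrees Fahrenheit to kelvins."""
    return (deg_f - 32) * 5 / 9 + 273.15


cool_f, warm_f = 40.0, 80.0  # arbitrary example temperatures

# On the interval scale the arithmetic ratio is 2.0, but 80 F is not
# "twice as hot" as 40 F, because 0 F is not a true zero.
ratio_f = warm_f / cool_f

# On the ratio scale (Kelvin) the ratio is physically meaningful.
ratio_k = fahrenheit_to_kelvin(warm_f) / fahrenheit_to_kelvin(cool_f)

print(ratio_f, round(ratio_k, 3))
```

The Kelvin ratio comes out close to 1.08, far from the 2.0 suggested by the Fahrenheit numbers.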
27
Data Types
Time Series Data
  Ordered data values observed over time
Cross Section Data
  Data values observed at a fixed point in time
28
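Data Types
Sales (in $1000’s)
             2003   2004   2005   2006
Atlanta       435    460    475    490
Boston        320    345    375    395
Denver        260    270    285    280
Cleveland     405    390    410   (only three values survive in the source; their year alignment is uncertain)
Time Series Data: one city's sales across the years (a row of the table)
Cross Section Data: all cities' sales in a single year (a column of the table)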
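The two views of the sales table can be sketched in Python. Only the three cities with a complete set of four values (Atlanta, Boston, Denver) are used; the variable names are illustrative.

```python
years = [2003, 2004, 2005, 2006]

# Sales in $1000's, one list per city, aligned with `years`.
sales = {
    "Atlanta": [435, 460, 475, 490],
    "Boston": [320, 345, 375, 395],
    "Denver": [260, 270, 285, 280],
}

# Time series data: one city observed over time (a row of the table).
atlanta_series = dict(zip(years, sales["Atlanta"]))

# Cross section data: all cities at one fixed point in time (a column).
column = years.index(2004)
cross_section_2004 = {city: values[column] for city, values in sales.items()}

print(atlanta_series)
print(cross_section_2004)
```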