Chapter 4 Statistics. 4.1 – What is Statistics? Definition 4.1.1 Data are observed values of random variables. The field of statistics is a collection.

Chapter 4 Statistics

4.1 – What is Statistics? Definition 4.1.1 Data are observed values of random variables. The field of statistics is a collection of methods for estimating distributions and parameters of random variables through the collection and analysis of data.

4.1 – What is Statistics? Definition 4.1.2 The population is the set of all objects of interest in a statistical study. A sample is a subset of the population. Definition 4.1.3 Data are information that has been collected. The field of statistics is a collection of methods for drawing conclusions about a population by collecting and analyzing data from a sample.

Types of Data Definition 4.1.4 A parameter is a number calculated using information from every member of a population. A statistic is calculated using information from a sample. Definition 4.1.5 Quantitative data consist of numbers. Qualitative data are nonnumeric information that can be separated into different categories.

Types of Data Definition 4.1.6 Discrete data are observed values of a discrete random variable. They are numbers that have a finite or countable set of values. Continuous data are observed values of a continuous random variable. They are numbers that can take any value within some range.

Levels of Measurement Definition 4.1.7
– Data are at the nominal level of measurement if they consist of only names, labels, or categories. They cannot be ordered (such as smallest to largest) in a meaningful way.
– Data are at the ordinal level of measurement if they can be ordered in a meaningful way, but differences between data values cannot be calculated or are meaningless.
– Data are at the interval level of measurement if they can be ordered in a meaningful way and differences between data values are meaningful.
– Data are at the ratio level of measurement if they are at the interval level, ratios of data values are meaningful, and there is a meaningful zero starting point.

Types of Studies Definition 4.1.8
– In an observational study, data are obtained in a way such that the members of the sample are not changed, modified, or altered in any way.
– In an experiment, something is done to the members of the sample and the resulting effects are recorded. The “something” that is done is called a treatment.

Types of Observational Studies Definition 4.1.9
– In a cross-sectional study, data are collected at one specific point in time.
– In a retrospective study, data are collected from studies done in the past.
– In a prospective study, data are collected by observing a sample for some time into the future.

Blocks Definition 4.1.10 A block is a subset of the population with a similar characteristic. Different blocks of a population have different characteristics that may affect the variable of interest differently. A randomized block design is a type of experiment where:
1. The population is divided into blocks.
2. Members from each block are randomly chosen to receive the treatment.
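The two steps of a randomized block design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block names, the member labels, the function `randomized_block_assignment`, and the idea of calling the untreated members a "control" group are all hypothetical and do not come from the slides.

```python
import random

def randomized_block_assignment(blocks, n_treated, seed=None):
    """Randomly choose n_treated members of each block to receive the treatment."""
    rng = random.Random(seed)
    assignment = {}
    for name, members in blocks.items():
        treated = set(rng.sample(members, n_treated))
        assignment[name] = {
            "treatment": sorted(treated),
            # Members not chosen are labeled "control" here (an assumption).
            "control": sorted(m for m in members if m not in treated),
        }
    return assignment

# Step 1: the population is divided into blocks (hypothetical age groups).
blocks = {
    "under_40": ["A", "B", "C", "D"],
    "40_and_over": ["E", "F", "G", "H"],
}

# Step 2: members of each block are randomly chosen for the treatment.
result = randomized_block_assignment(blocks, n_treated=2, seed=1)
for name, groups in result.items():
    print(name, groups)
```

Randomizing separately inside each block keeps the blocks balanced between treated and untreated members, so a difference between blocks is less likely to be mistaken for a treatment effect.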

Sampling Techniques Definition 4.1.11
– A convenience sample is a sample that is very easy to get.
– A voluntary response sample is obtained when members of the sample decide whether to participate or not.
– A systematic sample is obtained by arranging the population in some order, then selecting a starting point, and then selecting every kth member (such as every 20th).

Sampling Techniques
– A cluster sample is obtained by dividing the population into subsets (or clusters) where the members of each cluster have a common characteristic, then randomly choosing some of the clusters, and surveying every member of the chosen clusters.
– A stratified sample is obtained by dividing the population into subsets and then randomly choosing some members from each of the subsets.
– A multistage sample is obtained by successively applying a variety of sampling techniques. At each stage the sample becomes smaller, and at the last stage, a cluster sample is chosen.
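The systematic, stratified, and cluster techniques differ mainly in where the randomness enters. The sketch below shows one possible reading of each definition. The population of 100 numbered members, the four groups of 25, and all function names are hypothetical choices made for illustration.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random starting point, then take every k-th member."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Randomly choose some members from every subset."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for name in chosen for m in clusters[name]]

rng = random.Random(0)
population = list(range(100))  # hypothetical members numbered 0..99
# Hypothetical subsets: members 0-24, 25-49, 50-74, 75-99.
groups = {g: [x for x in population if x // 25 == g] for g in range(4)}

sys_s = systematic_sample(population, 20, rng)
strat_s = stratified_sample(groups, 3, rng)
clus_s = cluster_sample(groups, 2, rng)
print(len(sys_s), len(strat_s), len(clus_s))
```

With these settings the systematic sample has 5 members, the stratified sample has 3 members from each of the 4 groups, and the cluster sample contains all 25 members of each of 2 groups.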

Random Samples Definition 4.1.12
– A random sample is chosen in a way such that every individual member of the population has the same probability of being chosen.
– A simple random sample of size n is chosen in a way such that every group of size n has the same probability of being chosen.
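The difference between the two definitions is easy to miss, so here is a small enumeration that may help. It uses a hypothetical population of 12 members and a systematic design with a random starting point: every individual is equally likely to be chosen, yet most groups of size 3 can never occur, so the design gives a random sample but not a simple random sample.

```python
from itertools import combinations

population = list(range(12))  # hypothetical members numbered 0..11
k = 4                         # take every 4th member, so samples have size 3

# All samples the systematic design can produce, one per starting point.
systematic_samples = [tuple(population[start::k]) for start in range(k)]

# Each member appears in exactly one of the k equally likely samples,
# so every individual has probability 1/k of being chosen.
appearances = {m: sum(m in s for s in systematic_samples) for m in population}

# A simple random sample of size 3 must be able to produce every group of 3.
all_groups = list(combinations(population, 3))
print(len(systematic_samples), "possible systematic samples")
print(len(all_groups), "possible groups of size 3")
```

In Python, `random.sample(population, n)` draws without replacement so that every group of size n is equally likely, which matches the definition of a simple random sample.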

4.2 – Summarizing Data Example 4.2.3 Shown below are the waiting times of 30 customers at a supermarket checkout stand. Relative frequency distribution

Histograms The “shape” of a relative frequency histogram is an approximation of the graph of the p.d.f. (or p.m.f.) of the underlying random variable.
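A relative frequency distribution is the table behind such a histogram: each class gets the fraction of data values that fall in it. The sketch below shows one way to build it. The waiting times are hypothetical (the data from Example 4.2.3 did not survive in this transcript), and the class width of 1 minute and the function name are assumptions.

```python
from collections import Counter

def relative_frequencies(data, width):
    """Map each class [lower, lower + width) to its share of the data."""
    counts = Counter(int(x // width) * width for x in data)
    n = len(data)
    return {(lower, lower + width): counts[lower] / n for lower in sorted(counts)}

# Hypothetical waiting times in minutes (NOT the data from Example 4.2.3).
times = [0.2, 0.7, 1.1, 1.8, 2.4, 2.9, 3.5, 4.1, 0.5, 1.6]
table = relative_frequencies(times, width=1)

for (lower, upper), freq in table.items():
    print(f"[{lower}, {upper}): {freq:.2f}")
```

The relative frequencies always sum to 1, which is what lets the histogram's shape be compared with a p.d.f. or p.m.f.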

Summary Statistics Definition 4.2.1 Let $\{x_1, x_2, \ldots, x_n\}$ be a set of quantitative data collected from a sample of the population.
1. mean of the data:
2. variance of the data:
3. standard deviation of the data:
4. range of the data: (max value) – (min value)
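The formulas on this slide did not survive in the transcript, so the sketch below relies on Python's standard `statistics` module rather than on the slide's own expressions. One assumption to note: `statistics.variance` and `statistics.stdev` divide by n − 1 (the usual convention for data from a sample), which may or may not be the divisor the slide used. The data values are hypothetical.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance, divides by n - 1 (assumed convention)
std_dev = statistics.stdev(data)      # square root of the sample variance
data_range = max(data) - min(data)    # (max value) - (min value)

print(f"mean = {mean}, variance = {variance:.2f}, "
      f"standard deviation = {std_dev:.3f}, range = {data_range}")
```

If the population form with divisor n is wanted instead, `statistics.pvariance` and `statistics.pstdev` are the corresponding functions.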

Example 4.2.4

Percentiles Definition 4.2.2 Let p be a number between 0 and 1. The (100p)th percentile of a set of quantitative data is a number, denoted $\pi_p$, that is greater than (100p)% of the data values.
– The 25th, 50th, and 75th percentiles are called the first, second, and third quartiles and are denoted $p_1 = \pi_{0.25}$, $p_2 = \pi_{0.50}$, and $p_3 = \pi_{0.75}$, respectively.
– The 50th percentile is also called the median of the data and is denoted $m = p_2$.
– The mode of the data is the data value that occurs most frequently.
– The 5-number summary of a set of data consists of the minimum value, $p_1$, $p_2$, $p_3$, and the maximum value.
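As a rough illustration of these definitions, the sketch below computes quartiles, the mode, and a 5-number summary with Python's `statistics` module. Several conventions exist for interpolating percentiles, and the rule used in the slides is not preserved in this transcript, so `statistics.quantiles` with its default "exclusive" method is an assumption and may give slightly different values than the slides' procedure. The data and the function name `five_number_summary` are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (min, first quartile, median, third quartile, max)."""
    # n=4 splits the data into quarters; the default method is "exclusive".
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, q2, q3, max(data)

data = [1, 2, 3, 4, 5, 6, 7]        # hypothetical data
summary = five_number_summary(data)
mode = statistics.mode([1, 2, 2, 3])  # most frequent value

print(summary)
print(mode)
```

For this data the second quartile equals `statistics.median(data)`, which agrees with the statement that the median is the 50th percentile.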

Calculating Percentiles

Example 4.2.5 Calculate the first quartile, $p_1 = \pi_{0.25}$. Calculate the median, $m = p_2 = \pi_{0.5}$.

Example 4.2.5 5-number summary: 0, 0.5, 1.8, 2.9, 7.3. Box Plot

4.4 – Sampling Distributions

Sample Proportion

Sampling Distribution of the Proportion

Example 4.4.3 By examining the spending habits of one particular consumer, a credit card company observes that during the course of normal transactions 37% of the charges exceed $150. Out of 50 charges made in one particular month, 27 exceeded $150. Does it appear that these charges were made in the course of normal transactions?

Example 4.4.3
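The worked solution for this example is not preserved in the transcript. One plausible way to approach the question, sketched below, is to ask how unusual 27 or more charges over $150 out of 50 would be if the true proportion were 0.37. The sketch computes both a normal approximation for the sample proportion and the exact binomial tail probability. Treating the 50 charges as independent trials is an assumption, and this may not be the method the slides used.

```python
from math import comb, sqrt
from statistics import NormalDist

p, n, x = 0.37, 50, 27      # values taken from Example 4.4.3

p_hat = x / n               # observed sample proportion
se = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
z = (p_hat - p) / se        # how many standard deviations above p

# Normal approximation to P(sample proportion >= p_hat).
normal_tail = 1 - NormalDist().cdf(z)

# Exact binomial probability P(X >= 27), assuming independent charges.
exact_tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

print(f"p_hat = {p_hat:.2f}, z = {z:.2f}")
print(f"normal tail = {normal_tail:.4f}, exact tail = {exact_tail:.4f}")
```

A small tail probability would suggest that a month with 27 such charges is unusual under normal spending habits, though the threshold for "unusual" is a judgment call.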

Sample Mean

Sampling Distribution of the Mean

4.5 – Confidence Intervals for a Proportion

Critical Values

Confidence Interval

Different forms

Requirements
1. The sample must be random.
2. The conditions for a binomial distribution must be satisfied (at least approximately).
3. There are at least 5 successes and at least 5 failures observed in the n trials.

Example 4.5.2 Suppose 383 out of 735 surveyed voters support a particular political candidate. Calculate a 95% confidence interval estimate for the proportion of all voters who support the candidate.
1. Define the population proportion being estimated: p = The proportion of all voters who support the candidate
2. Calculate the sample proportion

Example 4.5.2

Correct interpretation – We are 95% confident that the value of p is between 0.485 and 0.557.
Meaning – If we were to survey many different samples of voters and calculate the corresponding 95% confidence interval using the statistics from each sample, then about 95% of the intervals would contain the true value of p.
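The interval in Example 4.5.2 can be reproduced with a short calculation. The formula slides are not preserved in this transcript, so the sketch below assumes the standard large-sample interval, sample proportion plus or minus the critical value times sqrt(p̂(1 − p̂)/n); it does return the interval 0.485 to 0.557 quoted above. The function name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample z-interval for a population proportion (assumed formula)."""
    p_hat = successes / n
    # Critical value: cuts off an area of (1 - confidence)/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(383, 735)
print(f"{low:.3f} to {high:.3f}")  # prints "0.485 to 0.557"
```

Raising `confidence` (for example to 0.99) increases the critical value and so widens the interval.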

4.6 – Confidence Intervals for a Mean

Z-Interval

T-Interval

Requirements
1. The sample is random.
2. The population is normally distributed or n > 30.

Which Type of Interval?

Example 4.6.3

2. Find the critical value: α = 0.01 and n = 15
3. Calculate the margin of error:
4. Calculate the confidence interval:

4.7 – Confidence Intervals for a Variance

Confidence Intervals for a Variance Requirements
1. The sample is random.
2. The population is normally distributed.

Example 4.7.2

4.8 – Confidence Intervals for Differences

2-Proportion Z-Interval Requirements
1. Both samples are random and independent.
2. Each sample contains at least 5 successes and 5 failures.

2-Sample T-Interval

Equal Variances

Non-equal Variances

Requirements
1. Both samples are random and independent.
2. Both populations are normally distributed or both sample sizes are greater than 30.

4.9 – Sample Size

4.10 – Assessing Normality

Normal Quantile Plot

Example 4.10.2 The second row of the table below gives the average daily temperatures in the month of November for the city of Lincoln, NE for nine different years (data collected by Brandon Metcalf, 2009). Determine if the population of all such temperatures is normally distributed.

Example 4.10.2 Roughly a straight line – Population is normal

Straight Line

Fuzzy Central Limit Theorem If the population is influenced by many small, random, unrelated effects, then the population may be normally distributed.
