QNT 531 Advanced Problems in Statistics and Research Methods

Name: QNT 531 Advanced Problems in Statistics and Research Methods
Uploaded: 2018-01-11T03:45:36+00:00
Duration: PTM17S4
Channel: Theresa Osborne
Description: QNT 531 Advanced Problems in Statistics and Research Methods

QNT 531 Advanced Problems in Statistics and Research Methods
WORKSHOP 2 By Dr. Serhat Eren, University of Phoenix

SECTION 2 ANALYSIS OF VARIANCE AND EXPERIMENTAL DESIGN

SECTION 2 SECTION OBJECTIVES
An Introduction to Analysis of Variance
Analysis of Variance: Testing for the Equality of k Population Means
Multiple Comparison Procedures
An Introduction to Experimental Design
Completely Randomized Designs
Randomized Block Design
Factorial Experiment

SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
One-Way Designs: The Basics
A factor is a variable that can be used to differentiate one group or population from another. It is a variable that may be related to the variable of interest. A level is one of several possible values or settings that the factor can assume. The response variable is a quantitative variable that you are measuring or observing.

These are all examples of one-way or completely randomized designs. An experiment has a one-way or completely randomized design if there are several different levels of one factor being studied and the objects or people being observed/measured are randomly assigned to one of the levels of the factor.

The term one-way refers to the fact that the groups differ with regard to the one factor being studied. The term completely randomized refers to the fact that individual observations are assigned to the groups in a random manner.

Understanding the Total Variation
Analysis of variance (ANOVA) is the technique used to analyze the variation in the data to determine whether the means of more than two populations are equal. A treatment is a particular setting or combination of settings of the factor(s).

The grand mean or the overall mean is the sample average of all the observations in the experiment. It is labeled $\bar{\bar{x}}$ (x-bar-bar). Now we can rewrite the variance calculations as follows:

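The total variation or sum of squares total (SST) is a measure of the variability in the entire data set considered as a whole. SST is calculated as follows: $SST = \sum_{j}\sum_{i}(x_{ij} - \bar{\bar{x}})^2$, where $x_{ij}$ is the $i$th observation in treatment group $j$ and $\bar{\bar{x}}$ is the grand mean.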

Components of Total Variation
The between groups variation is also called the sum of squares between or the sum of squares among, and it measures how much of the total variation comes from actual differences in the treatments. The dot plot shown in Figure 14.3 displays the sample average for each of the four time treatments. These are called treatment means.

A treatment mean is the average of the response variable for a particular treatment. Between Groups Variation measures how different the individual treatment means are from the overall grand mean. It is often called the sum of squares between or the sum of squares among (SSA).

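The formula for sum of squares among (SSA) is: $SSA = \sum_{j} n_j(\bar{x}_j - \bar{\bar{x}})^2$, where $n_j$ is the number of observations in treatment $j$, $\bar{x}_j$ is its treatment mean, and $\bar{\bar{x}}$ is the grand mean.

Within groups variation measures the variability in the measurements within the groups. It is often called sum of squares within or sum of squares error (SSE).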
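As a rough illustration of how the total variation splits into its between groups and within groups parts, the sketch below computes SST, SSA, and SSE for a small one-way design. The data values and the names `groups`, `sst`, `ssa`, and `sse` are made up for this example; SSE is taken here as the sum of squared deviations of each observation from its own treatment mean, which is the usual definition.

```python
# Illustrative data only: three treatment groups with made-up responses.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [9.0, 10.0, 11.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Grand mean: average of every observation in the experiment.
all_obs = [x for obs in groups.values() for x in obs]
grand_mean = mean(all_obs)

# Treatment means: average response within each treatment.
treatment_means = {name: mean(obs) for name, obs in groups.items()}

# SST: squared deviations of every observation from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_obs)

# SSA: squared deviations of treatment means from the grand mean,
# weighted by the number of observations in each treatment.
ssa = sum(
    len(obs) * (treatment_means[name] - grand_mean) ** 2
    for name, obs in groups.items()
)

# SSE: squared deviations of each observation from its own treatment mean.
sse = sum(
    (x - treatment_means[name]) ** 2
    for name, obs in groups.items()
    for x in obs
)

print(f"SST = {sst:.2f}, SSA = {ssa:.2f}, SSE = {sse:.2f}")
print(f"SSA + SSE = {ssa + sse:.2f}")
```

For a one-way design the two components account for all of the total variation, so SSA + SSE should match SST up to rounding.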

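The Mean Square Terms in the ANOVA Table
The mean square among is labeled MSA, the mean square error is labeled MSE, and the mean square total is labeled MST. Each mean square is a sum of squares divided by its degrees of freedom. The formulas for the mean squares are: $MSA = SSA/(c - 1)$, $MSE = SSE/(n - c)$, and $MST = SST/(n - 1)$, where $c$ is the number of levels of the factor and $n$ is the total number of observations.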

Testing the Hypothesis of Equal Means
In general, the null and alternative hypotheses for a one-way designed experiment are shown below:
Ho: All of the population means are equal.
HA: At least one of the population means is different from the others.

The F test statistic is calculated by taking the ratio of two sample variances: $F = s_1^2 / s_2^2$. In ANOVA, MSA and MSE are our two sample variances, so the F statistic is calculated as: $F = MSA / MSE$.
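A minimal sketch of the F calculation for a one-way design follows. The function name `one_way_f` and the data are hypothetical, and the degrees of freedom (c - 1 for MSA and n - c for MSE, with c groups and n observations in total) are the standard ones for a one-way ANOVA rather than something stated on the slide.

```python
def one_way_f(groups):
    """Return (MSA, MSE, F) for a list of groups of observations."""
    all_obs = [x for g in groups for x in g]
    n = len(all_obs)   # total number of observations
    c = len(groups)    # number of treatment groups
    grand_mean = sum(all_obs) / n
    means = [sum(g) / len(g) for g in groups]

    # Between groups and within groups sums of squares.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    # Mean squares: each sum of squares divided by its degrees of freedom.
    msa = ssa / (c - 1)
    mse = sse / (n - c)
    return msa, mse, msa / mse


# Illustrative data only.
msa, mse, f_stat = one_way_f([
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0],
])
print(f"MSA = {msa:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
```

A large F means the treatment means vary more than the within groups variability would explain. To reach a decision, the computed value would be compared with a critical value from an F table with (c - 1, n - c) degrees of freedom.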

SECTION 2 ASSUMPTIONS OF ANOVA
The three major assumptions of ANOVA are as follows:
1. The errors are random and independent of each other.
2. Each population has a normal distribution.
3. All of the populations have the same variance.

SECTION 2 ANALYSIS OF DATA FROM BLOCKED DESIGNS
A block is a group of objects or people that have been matched. An object or person can be matched with itself, meaning that repeated observations are taken on that object or person and these observations form a block. If the realities of data collection lead you to use blocks, then you must take this into account in your analysis. Your experimental design is called a randomized block design. Instead of using a one-way ANOVA you must use a block ANOVA.

An experiment has a randomized block design if several different levels of one factor are being studied and the objects or people being observed/measured have been matched. Each object or person is randomly assigned to one of the c levels of the factor.

Partitioning the Total Variation
Like the approach we took with data from a one-way design, the idea is to take the total variability as measured by SST and break it down into its components. With a block design there is one additional component: the variability between the blocks. It is called the sum of squares blocks and is labeled SSBL.

The sum of squares blocks measures the variability between the blocks. It is labeled SSBL. For a block design, the variation we see in the data is due to one of three things: the level of the factor, the block, or the error.

Thus, the total variation is divided into three components: SST = SSA + SSBL + SSE
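The three-way split can be checked numerically. The sketch below assumes one observation per block and factor level, uses made-up data, and introduces the names `data`, `b`, and `c` purely for illustration. The error term is computed from the residual left after removing the block and level effects, not by subtraction, so the final comparison is a genuine check.

```python
# Illustrative data only: rows are blocks, columns are the c levels of the factor.
data = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 18.0],
    [9.0, 14.0, 16.0],
    [12.0, 15.0, 17.0],
]
b = len(data)       # number of blocks
c = len(data[0])    # number of factor levels

grand_mean = sum(sum(row) for row in data) / (b * c)
block_means = [sum(row) / c for row in data]
level_means = [sum(data[i][j] for i in range(b)) / b for j in range(c)]

# Total variation around the grand mean.
sst = sum((data[i][j] - grand_mean) ** 2 for i in range(b) for j in range(c))

# Variation due to the factor levels.
ssa = b * sum((m - grand_mean) ** 2 for m in level_means)

# Variation between the blocks.
ssbl = c * sum((m - grand_mean) ** 2 for m in block_means)

# Error: what remains after removing the level and block effects.
sse = sum(
    (data[i][j] - block_means[i] - level_means[j] + grand_mean) ** 2
    for i in range(b)
    for j in range(c)
)

print(f"SST = {sst:.3f}")
print(f"SSA + SSBL + SSE = {ssa + ssbl + sse:.3f}")
```

Any variability that the blocks absorb into SSBL is removed from SSE, which is why ignoring the blocks and running a one-way ANOVA would overstate the error variation.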

Using the ANOVA Table in a Block Design
The ANOVA table for such a block design looks just like the ANOVA table for a one-way design with one additional row for the blocks.

SECTION 2 ANALYSIS OF DATA FROM TWO-WAY DESIGNS
Motivation for a Factorial Design Model
An experimental design is called a factorial design with two factors if there are several different levels of two factors being studied. The first factor is called factor A and there are r levels of factor A. The second factor is called factor B and there are c levels of factor B.

The design is said to have equal replication if the same number of objects or people being observed/measured are randomly selected from each population. Each population is defined by a specific level of each of the two factors. Each observation is called a replicate. There are n' replicates observed from each population, so there are n = n'rc observations in total.

Partitioning the Variation
The sum of squares due to factor A is labeled SSA. It measures the squared differences between the mean of each level of factor A and the grand mean. The sum of squares due to factor B is labeled SSB. It measures the squared differences between the mean of each level of factor B and the grand mean.

The sum of squares due to the interacting effect of A and B is labeled SSAB. It measures the effect of combining factor A and factor B. The sum of squares error is labeled SSE. It measures the variability in the measurements within the groups. Thus, the total variation is divided into four components: SST = SSA + SSB + SSAB + SSE
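The four components can be computed directly for a small factorial design with equal replication. The data and the names `cells` and `n_rep` (standing in for n') are hypothetical, and the formulas are the standard ones for a balanced two-factor design rather than ones reproduced from the slides.

```python
# Illustrative data only: cells[i][j] holds the n' replicates for
# level i of factor A and level j of factor B.
cells = [
    [[8.0, 10.0], [12.0, 14.0]],
    [[9.0, 11.0], [20.0, 22.0]],
]
r = len(cells)              # levels of factor A
c = len(cells[0])           # levels of factor B
n_rep = len(cells[0][0])    # replicates per cell (n')
n = n_rep * r * c           # total observations


def mean(values):
    return sum(values) / len(values)


all_obs = [x for i in range(r) for j in range(c) for x in cells[i][j]]
grand_mean = mean(all_obs)
cell_means = [[mean(cells[i][j]) for j in range(c)] for i in range(r)]
a_means = [mean([x for j in range(c) for x in cells[i][j]]) for i in range(r)]
b_means = [mean([x for i in range(r) for x in cells[i][j]]) for j in range(c)]

sst = sum((x - grand_mean) ** 2 for x in all_obs)
ssa = c * n_rep * sum((m - grand_mean) ** 2 for m in a_means)
ssb = r * n_rep * sum((m - grand_mean) ** 2 for m in b_means)

# Interaction: what the cell mean shows beyond the two main effects.
ssab = n_rep * sum(
    (cell_means[i][j] - a_means[i] - b_means[j] + grand_mean) ** 2
    for i in range(r)
    for j in range(c)
)

# Error: variability of replicates around their own cell mean.
sse = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(r)
    for j in range(c)
    for x in cells[i][j]
)

print(f"SST = {sst:.2f}")
print(f"SSA + SSB + SSAB + SSE = {ssa + ssb + ssab + sse:.2f}")
```

With equal replication the four pieces add up to SST exactly. With unequal cell sizes this simple decomposition no longer holds in general.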

Using the ANOVA Table in a Two-Way Design
The ANOVA table for such a design looks just like the ANOVA table for a one-way design with two additional rows, one for factor B and one for the interaction of A and B.

In a two-way ANOVA, three hypothesis tests should be done.

To test the hypothesis of no difference due to factor A we would have the following null and alternative hypotheses:
Ho: There is no difference in the population means due to factor A.
HA: There is a difference in the population means due to factor A.

To test the hypothesis of no difference due to factor B we would have the following null and alternative hypotheses:
Ho: There is no difference in the population means due to factor B.
HA: There is a difference in the population means due to factor B.

To test the hypothesis of no difference due to the interaction of factors A and B we would have the following null and alternative hypotheses:
Ho: There is no difference in the population means due to the interaction of factors A and B.
HA: There is a difference in the population means due to the interaction of factors A and B.

Understanding the Interaction Effect
The easiest way to understand this effect is to look at a graph of the sample averages for each of the possible combinations of the two factors. The line graph shown in Figure 14.7 displays the 20 sample means for airspace.

From this graph you can see that the mean airspace decreases the longer the box sits on the shelf, regardless of the position in the hardroll from which the box was made. The airspace behavior is affected by the interaction of the time on the shelf and the position in the hardroll from which the box was made.

If there were no interaction effect, the lines connecting the sample means would be parallel as in Figure 14.8.
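The parallel-lines idea can also be checked numerically. The sketch below is a hypothetical helper, not part of the course material, and assumes a table of cell means with equal replication. It subtracts the row and column effects from each cell mean, and if every remainder is zero, the lines connecting the sample means are parallel.

```python
def interaction_effects(cell_means):
    """Return cell mean - row mean - column mean + grand mean for each cell."""
    r = len(cell_means)
    c = len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (r * c)
    row_means = [sum(row) / c for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(r)) / r for j in range(c)]
    return [
        [cell_means[i][j] - row_means[i] - col_means[j] + grand for j in range(c)]
        for i in range(r)
    ]


# Illustrative cell means only: rows are levels of one factor,
# columns are levels of the other.
parallel = [[10.0, 8.0, 6.0], [12.0, 10.0, 8.0]]   # second row is the first plus 2
not_parallel = [[10.0, 8.0, 6.0], [12.0, 9.0, 4.0]]

for name, table in (("parallel", parallel), ("not parallel", not_parallel)):
    effects = interaction_effects(table)
    largest = max(abs(e) for row in effects for e in row)
    print(f"{name}: largest interaction effect = {largest:.3f}")
```

In sample data the remainders are rarely exactly zero, so the test based on SSAB is what decides whether the departure from parallel lines is larger than chance would explain.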

SECTION 2 MULTIPLE COMPARISON PROCEDURE
When we use analysis of variance to test whether the means of k populations are equal, rejection of the null hypothesis allows us to conclude only that the population means are not all equal. In some cases we will want to go a step further and determine where the differences among means occur.

The purpose of this section is to introduce two multiple comparison procedures that can be used to conduct statistical comparisons between pairs of population means.

2.3.1 FISHER’S LSD
Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means. In this case, Fisher’s least significant difference (LSD) procedure can be used to determine where the differences occur.

Confidence Interval Estimate of the Difference Between Two Population Means Using Fisher’s LSD Procedure:
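A common textbook statement of this interval is sketched below. The symbols are notation assumed here for illustration: $\bar{x}_i$ and $\bar{x}_j$ are the two sample means, $n_i$ and $n_j$ are the corresponding sample sizes, MSE is the mean square error from the ANOVA table, and $t_{\alpha/2}$ is taken from the t distribution with $n_T - k$ degrees of freedom, where $n_T$ is the total number of observations and $k$ is the number of populations.

```latex
\bar{x}_i - \bar{x}_j \pm \mathrm{LSD},
\qquad
\mathrm{LSD} = t_{\alpha/2}\sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}
```

If the resulting interval contains zero, we cannot conclude that the two population means differ. If it does not contain zero, we conclude that there is a difference between them.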

TYPE I ERROR RATES
We showed how Fisher’s LSD procedure can be used in such cases to determine where the differences occur. Technically, it is referred to as a protected or restricted LSD test because it is employed only if we first find a significant F value by using analysis of variance.

To see why this distinction is important in multiple comparison tests, we need to explain the difference between a comparisonwise Type I error rate and an experimentwise Type I error rate.

For example, in the NCP example Fisher’s LSD procedure was used to make three pairwise comparisons.

In each case, we used a level of significance of α = 0.05. Therefore, for each test, if the null hypothesis is true, the probability that we will make a Type I error is α = 0.05; hence, the probability that we will not make a Type I error on each test is 1 − 0.05 = 0.95.

In discussing multiple comparison procedures we refer to this probability of a Type I error (α = 0.05) as the comparisonwise Type I error rate; comparisonwise Type I error rates indicate the level of significance associated with a single pairwise comparison. Let us now consider a slightly different question. What is the probability that in making three pairwise comparisons, we will commit a Type I error on at least one of the three tests?

To answer this question, note that the probability that we will not make a Type I error on any of the three tests is (.95)(.95)(.95) = .8574. Therefore, the probability of making at least one Type I error is 1 − .8574 = .1426.

Thus, when we use Fisher’s LSD procedure to make all three pairwise comparisons, the Type I error rate associated with this approach is not .05, but actually .1426; we refer to this error rate as the overall or experimentwise Type I error rate.
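The same arithmetic generalizes to any number of pairwise comparisons. The short sketch below uses a hypothetical helper name and, like the calculation above, treats the individual tests as independent so that their probabilities can be multiplied.

```python
def experimentwise_error_rate(alpha, num_comparisons):
    """Probability of at least one Type I error across the comparisons,
    treating the tests as independent (as in the calculation above)."""
    return 1.0 - (1.0 - alpha) ** num_comparisons


# Three pairwise comparisons at a comparisonwise rate of 0.05.
rate = experimentwise_error_rate(0.05, 3)
print(round(rate, 4))  # 0.1426

# The experimentwise rate grows quickly as more comparisons are made.
for m in (1, 3, 6, 10):
    print(m, round(experimentwise_error_rate(0.05, m), 4))
```

With k populations there are k(k − 1)/2 possible pairwise comparisons, so the experimentwise rate rises quickly as more populations are compared.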
