Analyze Phase “X” Sifting

Now we will continue in the Analyze Phase with “X Sifting” – determining what impact the inputs have on our process.

“X” Sifting
Agenda for the Analyze Phase: Welcome to Analyze; “X” Sifting (Multi-Vari Analysis, Classes and Causes); Inferential Statistics; Intro to Hypothesis Testing; Hypothesis Testing ND P1; Hypothesis Testing ND P2; Hypothesis Testing NND P1; Hypothesis Testing NND P2; Wrap Up & Action Items.
The core fundamentals of this phase are Multi-Vari Analysis and Classes and Causes.

Multi-Vari Studies
In the Define Phase we used Process Mapping to identify all the possible “X’s” on the horizon. In the Measure Phase we used the X-Y Matrix, FMEA and Process Map to narrow our investigation to the probable “X’s”.
[Figure: funnel of X’s. The many X’s when we first start (the trivial many); the many X’s after we think about Y = f(X) + e; the quantity of X’s when we apply leverage (the vital few); the quantity of X’s remaining after DMAIC. The X’s keep reducing as you work the project.]
In the Analyze Phase we start to “dis-assemble” the data to determine what it tells us. This is the fun part.

Multi-Vari Definition
Multi-Vari Studies – a tool that graphically displays patterns of variation. Multi-Vari Studies are used to identify possible X’s or families of variation. These families of variation can hide within a subgroup, between subgroups, or over time. The Multi-Vari Chart helps in screening factors by using graphical techniques to logically subgroup discrete X’s (Independent Variables) plotted against a continuous Y (Dependent Variable). By looking at the pattern of the graphed points, conclusions are drawn about the largest family of variation. The Multi-Vari Chart can also be used to assess capability, stability and graphical relationships between X’s and Y’s.

Purpose
A picture can be worth a thousand words… or numbers.
A Multi-Vari Chart illustrates analysis of variance data graphically. Multi-Vari Charts are useful in visualizing two-way interactions. They reveal information such as the effect of work shift on Y’s, the impact of specific machinery or material on Y’s, the effect of noise factors on Y’s, etc. At this point in DMAIC Multi-Vari Charts are intended to be used as a passive study, but later in the process they can be used as a graphical representation where factors were intentionally changed. The only caveat with using MINITAB™ to graph the data is the data must be balanced. Each source of variation must have the same number of data points across time.

Multi-Vari Example To put Multi-Vari studies in practice follow an example of an injection molding process. You are probably asking yourself what is Injection Molding? Well basically an injection molding machine takes hard plastic pellets and melts them into a fluid. This fluid is then injected into a mold or die, under pressure, to create products such as piping and computer cases.

Method
Sampling Plans should encompass all three types of variation: Within, Between and Temporal.
1. Create Sampling Plan
2. Gather Passive Data
3. Graph Data
4. Check to see if Variation is Exposed (if not, go back and collect additional data)
5. Interpret Results
Typically we start with a data collection sheet that makes sense based on our knowledge of the process. Then follow the steps. If we only see minor variation in the sample it is time to go back to collect additional data. When your data collection represents at least 80% of the variation within the process then you should have enough information to evaluate the graph. Remember for a Multi-Vari Analysis to work the output must be Continuous and the sources of variation Discrete.

Sources of Variation
Within Unit or Positional: within piece variation related to the geometry of the part; variation across a single unit containing many individual parts, such as a wafer containing many computer processors; location in a batch process such as plating.
Between Unit or Cyclical: variation among consecutive pieces; variation among groups of pieces; variation among consecutive batches.
Temporal or over time: Shift-to-Shift; Day-to-Day; Week-to-Week.
Within Unit, Between Unit and Temporal are the classic causes of variation. A unit can be a single piece or a grouping of pieces depending on whether they were created at unique times. Multi-Vari Analysis can be performed on other processes; simply identify the categorical sources of variation you are interested in.

Machine Layout & Variables
[Figure: injection molding machine layout showing cavities #1 to #4 and the variables Die Release, Ambient Temp, Injection Pressure Per Cavity, Master Injection Pressure, Fluid Level, % Oxygen and Distance to Tank.]
In this example there are four widgets created with each die cycle. Therefore a unit is four widgets created at that unique time. Within Unit Variation is measured by differences in the four widgets from a single die cycle. For example we could measure the wall thickness for each of the four widgets. Between Unit Variation is measured by differences from sequential die cycles. An example of Between Unit Variation is comparing the average wall thickness from die cycle to die cycle. Temporal Variation is measured over some meaningful time period. For example we would compare the average of all the data collected in one time period with another; say the 8 o’clock hour to the 10 o’clock hour.

Sampling Plan
[Figure: sampling tree. Monday, Wednesday, Friday; Die Cycle #1 to #3 within each day; Cavity #1 to #4 within each die cycle.]
To continue with this example the Multi-Vari sampling plan will be to gather data for 3 die cycles on 3 different days for 4 widgets inside the mold. 3 x 3 x 4 equals 36 data points. If you find this initial sampling plan does not show the variation of interest it will be necessary to continue sampling or to make changes to the sampling plan.
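As a rough illustration, the sampling plan above can be laid out as a blank data collection sheet. This is only a sketch: the day, die cycle and cavity labels come from the example, while the row layout and the names `days`, `die_cycles`, `cavities` and `sampling_plan` are assumptions made for illustration.

```python
from itertools import product

# Labels taken from the example; the layout of each row is an assumption.
days = ["Monday", "Wednesday", "Friday"]  # Temporal
die_cycles = [1, 2, 3]                    # Between Unit
cavities = [1, 2, 3, 4]                   # Within Unit

# One row per measurement to be collected, in day / die cycle / cavity order.
sampling_plan = [
    {"day": day, "die_cycle": cycle, "cavity": cavity}
    for day, cycle, cavity in product(days, die_cycles, cavities)
]

print(len(sampling_plan))  # 3 x 3 x 4 = 36 rows
```

Because every day has the same die cycles and every die cycle has the same cavities, the plan is balanced by construction.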

Within Unit
[Figure: sampling tree with Within-Unit Encoding, highlighting Cavity #1 to #4 within Die Cycle #1.]
Comparing individual data points within a die cycle is Within Unit Variation. Examples of measurement could be wall thickness, diameter or uniformity of thickness to name a few.

Unit to Unit
[Figure: sampling tree with Between-Unit Encoding, highlighting Die Cycle #1 to #3 within a day.]
Comparing the averages from each die cycle is called Between Unit Variation.

Temporal
[Figure: sampling tree with Temporal Encoding, highlighting Monday, Wednesday and Friday.]
Comparing the average of all the data within a day and plotting 3 time periods is known as Temporal Variation.

Using Multi-Vari to Narrow X’s
List potential X’s and assign them to one of the families of variation. This information can be pulled from the X-Y Matrix created in the Measure Phase. If an X spans more than one family, assign percentages to the supposed split, using your best judgment. Do not assume the true X’s causing variation have to be among those in the list.

Using Multi-Vari to Narrow X’s
These steps are useful when using Multi-Vari to narrow X’s:
1. Graph the data from the process in Multi-Vari form.
2. Identify the largest family of variation.
3. Establish statistical significance through the appropriate statistical testing.
4. Focus further effort on the X’s associated with the family of largest variation.
Remember the goal is not only to figure out what it is but also what it is not!

Data Worksheet
Now let’s take the Multi-Vari concept and utilize MINITAB™. First open the MINITAB™ Project “Analyze Data sets.mpj” and select the worksheet “MVInjectionMold.mtw”. Now create the Multi-Vari Chart in MINITAB™. After you create the graph as indicated take a few minutes to create graphs using a different order. Always use the graph that shows the variation in the easiest manner to interpret.

Run Multi-Vari Here is the graph that should have been generated.

Identify The Largest Family of Variation
To find an example of Within Unit Variation look at Unit 1 in the second time period. Notice the spread of data is 0.07. Now let’s try to find Between Unit Variation. Compare the averages of the units within a time period. All three time periods appear similar so looking at the first time period it appears the spread of the data is 0.18 units. To determine Temporal Variation, compare the averages between time periods. It appears time period 3 and 2 have a difference of 0.06. Notice the shifting from unit to unit is not consistent but it certainly jumps up and down. The question at this point should be: Does this graph represent the problem I am working on? Do I see at least 80% of the variation? Read the units off the Y axis or look in the worksheet. Notice the spread of the data is 0.22 units. If the usual spread of the data is 0.25 units then this data set represents 88% of the usual variation which tells us our sampling plan was sufficient to detect the problem.
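The comparison described above can be sketched numerically. The readings below are hypothetical, not the contents of “MVInjectionMold.mtw”, and the names `readings`, `spread` and `coverage` are assumptions made for illustration; the idea is simply to take the range within each unit, the range of unit averages within a time period, and the range of time period averages.

```python
from statistics import mean

# Hypothetical readings, NOT the MVInjectionMold.mtw data.
# Keys are (time_period, unit); each list holds the four cavity readings.
readings = {
    (1, 1): [1.00, 1.02, 1.01, 1.03],
    (1, 2): [1.10, 1.12, 1.11, 1.13],
    (1, 3): [0.95, 0.97, 0.96, 0.98],
    (2, 1): [1.02, 1.04, 1.03, 1.05],
    (2, 2): [1.12, 1.14, 1.13, 1.15],
    (2, 3): [0.97, 0.99, 0.98, 1.00],
}

def spread(values):
    """Range (max minus min) of a collection of numbers."""
    values = list(values)
    return max(values) - min(values)

periods = sorted({period for period, _ in readings})

# Within Unit: largest range among the readings of any single unit.
within = max(spread(vals) for vals in readings.values())

# Between Unit: largest range of unit averages inside one time period.
unit_means = {key: mean(vals) for key, vals in readings.items()}
between = max(
    spread(m for (p, _), m in unit_means.items() if p == period)
    for period in periods
)

# Temporal: range of the averages of all data in each time period.
period_means = [
    mean(x for (p, _), vals in readings.items() if p == period for x in vals)
    for period in periods
]
temporal = spread(period_means)

spreads = {"within unit": within, "between unit": between, "temporal": temporal}
largest = max(spreads, key=spreads.get)

def coverage(observed_spread, usual_spread):
    """Fraction of the usual process spread captured by the sample."""
    return observed_spread / usual_spread

print(largest)                           # family with the largest spread
print(round(coverage(0.22, 0.25), 2))    # the 0.22 vs 0.25 case from the text
```

With these made-up numbers the unit-to-unit spread dominates, and the coverage calculation reproduces the 88% figure quoted for a sample spread of 0.22 against a usual spread of 0.25.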

Root Cause Analysis
Focus further effort on the X’s associated with the family of greatest variation.
Die Cycle to Die Cycle – Something is Changing!
After the analysis we now know the largest source of variation is occurring die cycle to die cycle. So we can focus our effort on those X’s we suspect have the greatest impact. In this case the pattern of variation is not consistent within the small scope of data we have gathered. Additional data may be required or this process may be ready for experimentation.

Call Center Example
Let’s try another example; open the MINITAB™ worksheet “CallCenter.mtw”. This example is a transactional application of the tool. A company with two call centers wants to compare two methods of handling calls at each location at different times of the day. One method involves a team to resolve customer issues, and the other method requires a single subject-matter expert to handle the call alone.
Output (Y): Call Time
Inputs (X): Call Center (GA, NV); Time of Day (10:00, 13:00, 17:00); Method (Expert, Team)

Call Center Example
Which is causing the greatest variation… Time? Method? Location?

Call Center Example
Is the largest source of variation more or less obvious? Notice the Multi-Vari graph plotted depends on the order in which the variable column names are entered into MINITAB™.

Call Center Example
It is not as easy to draw conclusions from this example because of the source of the data. With the injection molding process we know we are making the same parts over and over. However in this example of a call center there is no control over the nature of calls coming in so a single Outlier could affect your judgment.

Call Center Example To display individual data points click the “Options…” button. This helps to see the quantity of data and to identify unusually long or short calls. It is not necessary to force fit any one tool to your project. For transactional projects Multi-Vari may be difficult to interpret purely graphically. We will re-visit this data set later when working through Hypothesis Testing.

Multi-Vari Exercise
Exercise objective: To practice Six Sigma techniques learned to date in your teams.
1. Open file named “MVA Cell Media.MTW”.
2. Perform Capability Analysis; use the column labeled volume. There is only an upper specification limit of 500 ml.
   Are the data Normal? _______
   Is the process Capable? _______
3. What is the issue that needs work in terms of Six Sigma terminology?
   Shift Mean? _______
   Reduce variation? _______
   Combination of Mean and variation? _______
   Change specifications? _______

MVA Solution: Check for Normality… Is that normal?
Do you recall the reason why Normality is an issue? Normality is required if you intend to use the information as a predictive tool. Early in the Six Sigma process there is no reason to assume your data will be Normal. Remember, if it is not Normal it usually makes finding potential causes easier. Let’s work the problem now. First check the data for Normality. Since the P-value is greater than 0.05 the data are considered Normal.

MVA Solution
Another method to check Normality is the graphical summary command in MINITAB™. Having a graphical summary is quite nice since it provides a picture of the data as well as the summary statistics. Notice that the P-value in this window is the same as the previous one. Notice even though the data are Normal the distribution is quite wide. If you had a process where you were filling bottles would you not expect the process to be Normal?

MVA Solution
Now it is time to perform the Process Capability. For subgroup size enter 12 since all 12 bottles are filled at the same time. Also, use 500 milliliters as the upper spec limit in order to see how bad the Capability was from a manufacturer’s perspective. Under the “Options…” tab you can select the “Benchmark Z’s (sigma level)” of the process or you can leave the default as “Capability stats (Cp, Pp)”. Just for fun you can run MINITAB™ to generate the Capability Analysis using 500 as the upper spec limit then run it again as the lower spec limit to see what happens to the statistics.

MVA Solution
REDUCE VARIATION!! - then shift Mean
Is this process in trouble? The answer is yes since the Z bench value is negative! That is very bad. To correct this problem the process has to be set in such a manner that none of the bottles are ever under-filled while trying to minimize the amount of overfill. The answer to step three of this exercise is a combination of reducing variation and shifting the Mean. The Mean cannot be shifted however until the variation is reduced dramatically.
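To see why a negative Z value is such bad news, here is a minimal sketch for a process with only an upper specification limit, assuming the data are Normal. The Mean of 505 and Standard Deviation of 10 are hypothetical, not the values from “MVA Cell Media.MTW”, and the function names are made up for illustration. MINITAB™ reports capability using its own within-subgroup and overall estimates of Standard Deviation, which this sketch does not reproduce.

```python
from statistics import NormalDist

def z_upper(process_mean, std_dev, usl):
    """Number of Standard Deviations between the Mean and the upper spec limit."""
    return (usl - process_mean) / std_dev

def fraction_above(process_mean, std_dev, usl):
    """Expected fraction of output above the upper spec limit (Normal assumed)."""
    return 1.0 - NormalDist(process_mean, std_dev).cdf(usl)

# Hypothetical process: Mean 505 ml, Standard Deviation 10 ml, USL 500 ml.
z = z_upper(505.0, 10.0, 500.0)
out_of_spec = fraction_above(505.0, 10.0, 500.0)

print(z)                      # negative: the Mean sits above the spec limit
print(round(out_of_spec, 4))  # more than half of the output exceeds the limit
```

With a single spec limit, a negative Z means the Mean itself is on the wrong side of the limit, so more than 50% of the output is out of specification.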

MVA Solution
Perform a Multi-Vari Analysis
The order in which you enter the factors will produce different graphs. The “classical” method is to use Within, Between and over-time (Temporal) order.

MVA Solution
What is the largest source of variation?
The graph shows variation within a unit is consistent across all the data. The variation between units also looks consistent across all the data. What seems to stand out is the machine may be set up differently from first shift to second. That should be easy to fix! Within Unit Variation is the largest, Temporal is the next largest (and probably easiest to fix) and Between Unit Variation comes in last. So to fix this process your game plan should be based on the information in the Excel file and involve additional information you have about the process.
This example was based on a real process where the nasty culprit was actually the location of the in-line scale. No one wanted to believe a high-priced scale could be generating significant variation. The in-line scale weighed the bottles and either sent them forward to ship or rejected them to be topped off. The wind generated by the positive pressure in the room blew across the scale making the weights recorded fluctuate unacceptably. The filling machine was actually quite good although there were a few adjustments made once the variation from the scale was fixed. Once the variation in the data was reduced they were able to shift the Mean closer to the specification of 500 ml.

Data Collection Sheet
Remember the data used in the Multi-Vari Analysis must be balanced for MINITAB™ to generate the graphic properly. The injection molding data collection sheet was created to include 3 time periods, 3 units per time period and 4 widgets per die cycle, for a total of 36 rows of data (3 x 3 x 4).

Data Collection Sheet
The data sheet is now balanced, meaning there is an equal number of data points for each condition in the data table, and is ready for data to be entered. If you were to label the units 1 – 9 instead of 1 – 3 per time period, MINITAB™ would generate an error message and would not be able to create the graphic. Think in terms of generic units instead of being specific in labeling.
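One way to picture the labeling issue is to count the rows for every combination of factor levels. The sketch below treats “balanced” as “every combination of time period, unit and widget has the same number of rows”, which is an assumption about what the software checks; the helper `is_balanced` and the column names are hypothetical.

```python
from collections import Counter
from itertools import product

def is_balanced(rows, factors):
    """True when every combination of factor levels has the same, non-zero row count."""
    levels = [sorted({row[f] for row in rows}) for f in factors]
    counts = Counter(tuple(row[f] for f in factors) for row in rows)
    per_cell = [counts.get(combo, 0) for combo in product(*levels)]
    return min(per_cell) > 0 and len(set(per_cell)) == 1

factors = ["time_period", "unit", "widget"]

# Generic labels: units are numbered 1 to 3 inside every time period.
generic = [
    {"time_period": t, "unit": u, "widget": w}
    for t, u, w in product(range(1, 4), range(1, 4), range(1, 5))
]

# Specific labels: the same 36 rows, but units are numbered 1 to 9 overall.
specific = [
    {"time_period": t, "unit": (t - 1) * 3 + u, "widget": w}
    for t, u, w in product(range(1, 4), range(1, 4), range(1, 5))
]

print(is_balanced(generic, factors))   # every cell is filled
print(is_balanced(specific, factors))  # units 4 to 9 never appear in time period 1
```

Both sheets hold the same 36 measurements; only the unit labels differ. With labels 1 to 9, most time period and unit combinations are empty, which is why generic labels are the safer choice.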

Classes of Distributions
Multi-Vari is a tool to help screen X’s by visualizing three primary sources of variation. Later we will perform Hypothesis Tests based on our findings. At this point we will review classes and causes of distributions that can also help us screen X’s to perform Hypothesis Tests.
Normal Distribution
Non-normality – 4 Primary Classifications: Skewness, Multiple Modes, Kurtosis, Granularity

The Normal (Z) Distribution
Characteristics of the Normal Distribution (Gaussian curve) are:
It is considered to be the most important distribution in statistics.
The total area under the curve is equal to 1.
The distribution is mounded and symmetric; it extends indefinitely in both directions approaching but never touching the horizontal axis.
All processes will exhibit a Normal curve shape if you have pure random variation (white noise).
The Z distribution has a Mean of 0 and a Standard Deviation of 1.
The Mean divides the area in half, 50% on one side and 50% on the other side.
The Mean, Median and Mode are at the same data point.
[Figure: Normal curve with the horizontal axis marked from -6 to +6.]
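A few of these characteristics can be checked numerically. This is a small sketch using Python’s standard library; the helper name `area_within` is an assumption made for illustration.

```python
from statistics import NormalDist

# The Z distribution: Mean of 0 and Standard Deviation of 1.
z = NormalDist(mu=0.0, sigma=1.0)

# The Mean divides the area in half.
area_below_mean = z.cdf(0.0)

def area_within(k):
    """Area under the curve within k Standard Deviations of the Mean."""
    return z.cdf(k) - z.cdf(-k)

print(area_below_mean)           # 0.5
print(round(area_within(1), 4))  # about 68% within +/- 1
print(round(area_within(2), 4))  # about 95% within +/- 2
print(round(area_within(3), 4))  # about 99.7% within +/- 3
```

The area approaches, but never quite reaches, 1 as the interval widens, which matches the curve extending indefinitely in both directions.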

Normal Distribution: Why do we care?
ONLY IF we need accurate estimates of Mean and Standard Deviation. Our theoretical distribution should MOST accurately represent our sample distribution in order to make accurate inferences about our population. This Normal Curve is NOT a plot of our observed data!!! This theoretical curve is estimated based on our data’s Mean and Standard Deviation. Many of the available Hypothesis Tests assume a Normal Distribution. If the assumption is not satisfied we cannot use them to infer anything about the future. However, just because a distribution of sample data looks Normal does not mean the variation cannot be reduced and a new Normal Distribution created.

Non-Normal Distributions
1 Skewed
2 Kurtosis
3 Multi-Modal
4 Granularity
Data may follow Non-normal Distributions for a variety of reasons, or there may be multiple sources of variation causing data that would otherwise be Normal to appear not Normal.

Skewness Classification
Potential Causes of Skewness
1-1 Natural Limits
1-2 Artificial Limits (Sorting)
1-3 Mixtures
1-4 Non-Linear Relationships
1-5 Interactions
1-6 Non-Random Patterns Across Time
[Figure: histograms showing a Right Skew and a Left Skew; the vertical axis is Frequency.]
When a distribution is not symmetrical it is Skewed. Generally the longest tail of a Skewed Distribution points in the direction of the Skew.

Mixed Distributions
1-3 Mixed Distributions occur when data comes from multiple sources that are supposed to be the same yet are not.
[Figure: Sample A + Sample B = Combined. Sample A: Machine A, Operator A, Payment Method A, Interviewer A. Sample B: Machine B, Operator B, Payment Method B, Interviewer B.]
Note both distributions that formed the combined Skewed Distribution started out as Normal Distributions.
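The effect of mixing two Normal sources can be worked out from the moments of the mixture. The sketch below is only an illustration: the two machines, their Means, Standard Deviations and the 70/30 split are hypothetical, and `mixture_skewness` is a made-up helper name.

```python
def mixture_skewness(components):
    """Skewness of a mixture of Normal Distributions.

    components: list of (weight, mean, std_dev) tuples; weights must sum to 1.
    """
    overall_mean = sum(w * mu for w, mu, _ in components)
    variance = sum(
        w * (sd ** 2 + (mu - overall_mean) ** 2) for w, mu, sd in components
    )
    # Third central moment of a Normal component shifted by d is d^3 + 3*d*sd^2.
    third_moment = sum(
        w * ((mu - overall_mean) ** 3 + 3 * (mu - overall_mean) * sd ** 2)
        for w, mu, sd in components
    )
    return third_moment / variance ** 1.5

# Hypothetical sources: Machine A supplies 70% of the data, Machine B 30%.
machine_a_only = [(1.0, 10.0, 1.0)]
combined = [(0.7, 10.0, 1.0), (0.3, 13.0, 1.0)]

print(mixture_skewness(machine_a_only))      # a single Normal source: 0
print(round(mixture_skewness(combined), 2))  # positive: right Skewed
```

Each source is perfectly symmetric on its own, yet the unequal mix has a long tail toward the smaller source, here to the right.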

1-4 Non-Linear Relationships
Non-Linear Relationships occur when a given change in X does not produce the same change in Y across the range of X.
[Figure: Y plotted against X (X from 1 to 5), with the Marginal Distribution of X and the Marginal Distribution of Y.]
Even if your Input (X) is Normally Distributed about a Mean, the Output (Y) may not be Normally Distributed.
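A quick sketch of this effect, assuming a Normal input with Mean 3 and Standard Deviation 0.5 and an assumed relationship Y = X squared. Both the input distribution and the relationship are hypothetical choices for illustration, as are the names used.

```python
from statistics import NormalDist, mean

def moment_skewness(data):
    """Moment coefficient of Skewness: third central moment over sigma cubed."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Assumed input: X is Normal with Mean 3 and Standard Deviation 0.5.
x_dist = NormalDist(mu=3.0, sigma=0.5)

# Evenly spaced quantiles stand in for a large, well-behaved sample of X.
n = 1000
xs = [x_dist.inv_cdf((i + 0.5) / n) for i in range(n)]

# Assumed non-linear relationship between input and output.
ys = [x ** 2 for x in xs]

skew_x = moment_skewness(xs)
skew_y = moment_skewness(ys)

print(abs(skew_x) < 1e-6)  # the input is symmetric
print(skew_y > 0)          # the output is right Skewed
```

The input is symmetric, but because Y rises faster at high X than at low X, the upper tail of Y is stretched and the output comes out right Skewed.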

1-5 Interactions
Interactions occur when two inputs interact with each other to have a larger impact on Y than either would by themselves.
[Figure: Interaction Plot for Process Output. Room Temperature versus Aerosol Hairspray (Spray, No Spray), With Fire and No Fire.]
If you find two inputs have a large impact on Y together but would not affect Y by themselves, this is called an Interaction. For instance if you spray an aerosol can in the direction of a flame what would happen to room temperature? What do you see regarding the following distributions?

1-6 Time Relationships / Patterns
Time relationships occur when the distribution is dependent on time. They are often seen when tooling requires “warming up”, and with tool wear, chemical bath depletion, ambient temperature effects on tooling, stock prices, etc.
[Figure: Y plotted against Time, with the Marginal Distribution of Y.]

Non-Normal Right (Positive) Skewed
To measure Skewness we use Descriptive Statistics. The moment coefficient of Skewness will be close to zero for symmetric distributions, negative for left Skewed distributions and positive for right Skewed distributions.
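For reference, a minimal sketch of the moment coefficient of Skewness. The three small data sets are made up for illustration, and statistical packages may apply a small-sample adjustment, so their reported values can differ slightly from this plain version.

```python
from statistics import mean

def moment_skewness(data):
    """Third central moment divided by the cube of the Standard Deviation."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Made-up data sets for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]         # long tail toward large values
left_skewed = [-10, -3, -2, -2, -1, -1]    # long tail toward small values

print(moment_skewness(symmetric))         # 0 for symmetric data
print(moment_skewness(right_skewed) > 0)  # positive for a right Skew
print(moment_skewness(left_skewed) < 0)   # negative for a left Skew
```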

Kurtosis
Kurtosis refers to the shape of the tails. Different combinations of distributions cause the resulting overall shapes. The next classification of Non-normal Data is Kurtosis. There are two types of Kurtosis: Leptokurtic and Platykurtic. Leptokurtic distributions are generally peaked with long tails while Platykurtic distributions are flat with short tails.

Platykurtic
Multiple Means shifting over time produce a plateau in the data.
Causes:
2-1 Mixtures (Combined Data from Multiple Processes): Multiple Set-Ups, Multiple Batches, Multiple Machines, Tool Wear (over time)
2-2 Sorting or Selecting: Scrapping product that falls outside the spec limits
2-3 Trends or Patterns: Lack of Independence in the data (example: tool wear, chemical bath)
2-4 Non-Linear Relationships: Chemical Systems
A negative coefficient of Kurtosis indicates a Platykurtic distribution.

Leptokurtic
Distributions overlaying each other that have very different variance can cause a Leptokurtic distribution.
Causes:
2-1 Mixtures (Combined Data from Multiple Processes): Multiple Set-Ups, Multiple Batches, Multiple Machines, Tool Wear (over time)
2-2 Sorting or Selecting: Scrapping product that falls outside the spec limits
2-3 Trends or Patterns: Lack of Independence in the data (example: tool wear, chemical bath)
2-4 Non-Linear Relationships: Chemical Systems
A positive Kurtosis value indicates a Leptokurtic distribution.
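The sign convention can be illustrated with a simple excess Kurtosis calculation (fourth moment ratio minus 3, so a Normal Distribution scores near zero). The two data sets are made up for illustration, and statistical packages may use an adjusted formula, so their reported values can differ.

```python
from statistics import mean

def excess_kurtosis(data):
    """Fourth central moment over the squared variance, minus 3."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3

# Made-up data sets for illustration.
flat = list(range(1, 11))                        # evenly spread, short tails
peaked = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]     # tight center, long tails

print(round(excess_kurtosis(flat), 3))  # negative: Platykurtic
print(excess_kurtosis(peaked) > 0)      # positive: Leptokurtic
```

The second data set mimics two overlaid sources with very different variance: most values sit close to the center while a few sit far out in the tails.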

Reasons for Multiple Modes:
3-1 Mixtures of distributions (most likely)
3-2 Lack of independence – trends or patterns
3-3 Catastrophic failures (example: testing voltage on a motor and the motor shorts out so we get a zero reading)
Now that’s my kind of mode!!
Multiple Modes have such dramatic combinations of underlying sources that they show distinct Modes. They might have shown as Platykurtic, but the sources are far enough apart to see separation. Celebrate! These are usually the easiest causes to identify.

Bimodal Distributions
2 Different Distributions: 2 different machines, 2 different operators, 2 different administrators.
This is an example of a Bi-Modal Distribution. Interestingly each peak is actually a Normal Distribution but when the data is viewed as a group it is obviously not Normal.

Extreme Bi-Modal (Outliers)
If you see an extreme Outlier it usually has its own cause or own source of variation. It is relatively easy to isolate the cause by looking on the X axis of the Histogram.

Bi-Modal – Multiple Outliers
Having multiple Outliers is more difficult to correct. It typically means multiple inputs are involved.

Granular data is easy to see in a Dot Plot.
Use Caution! It looks “Normal” but it is only symmetric and not Continuous.
Causes:
4-1 Measurement system resolution (Gage R&R)
4-2 Categorical (step-type function) data
Now let’s take a moment to notice the P-value in the Normal Probability Plot; it is definitely smaller than 0.05! There simply is not enough resolution in the data.

Normal Example
Notice the contrast to the previous page!

Conclusions Regarding Distributions
Non-normal Distributions are not BAD!!!
Non-normal Distributions can give more Root Cause information than Normal data (the nature of why…).
Understanding what the data is telling us is KEY!!!
What do you want to know??? Find the key….

Summary
At this point you should be able to:
Perform a Multi-Vari Analysis
Create and interpret a Multi-Vari Graph
Identify when a Multi-Vari Analysis is applicable
Interpret how Skewed data looks
Explain how data distributions become Non-normal when they are really Normal
