Applied Statistics for Advanced Applications



Presentation on theme: "Applied Statistics for Advanced Applications"— Presentation transcript:

1 Applied Statistics for Advanced Applications
Thoratec Workshop in Applied Statistics for QA/QC, Mfg, and R+D
Part 3 of 3: Advanced Applications
Instructor: John Zorich
Part 3 was designed for students who have taken Part 1 and Part 2 of these workshops, or who have had a college-level statistics course.

2 John Zorich's Qualifications:
20 years as a "regular" employee in the medical device industry (R&D, Mfg, Quality)
ASQ Certified Quality Engineer (since 1996)
Statistical consultant+instructor (since 1999) for many companies, including Siemens Medical, Boston Scientific, Stryker, and Novellus
Instructor in applied statistics for Ohlone College (CA), Pacific Polytechnic Institute (CA), and KEMA/DEKRA
Past instructor in applied statistics for UC Santa Cruz Extension, ASQ Silicon Valley Biomedical Group, & TUV
Publisher of 9 commercial, formally validated, statistical application Excel spreadsheets that have been purchased by over 80 companies, worldwide. Applications include: Reliability, Normality Tests & Normality Transformations, Sampling Plans, SPC, Gage R&R, and Power.
You're invited to "connect" with me on LinkedIn.

3 Self-teaching & Reference Texts
RECOMMENDED by John Zorich:
Dovich: Quality Engineering Statistics
Dovich: Reliability Statistics
Juran: Juran's Quality [Control] Handbook
Natrella: Experimental Statistics (recently re-published)
NIST: Engineering Statistics Internet Handbook (available online)
Pyzdek: Quality Engineering Bible
Taylor: Guide to Acceptance Sampling
Tobias & Trindade: Applied Reliability
Wheeler: Understanding Statistical Process Control
Zimmerman: Statistical Quality Control Using Excel
AIAG: Statistical Process Control (SPC)

4 Main Topics in Today's Workshop
Reliability Plotting
Statistical Analysis of Gages
QC Sampling Plans
Statistical Process Control (SPC)
Process Capability Indices
This is a lot to cover in 1 day, but your studying the "Student" files at home, and Instructor accessibility by e-mail, complete the course.

5 Reliability Plotting

6 (review of topic from part 2 of this course) Definitions of “Failure” and “Reliability”
In many of the slides in this section of the class, the words "Failure" and "Reliability" are used. "Failure" means that an individual component or product put on test or under inspection has either not passed specification or has literally failed (e.g., broke, separated, or burst -- it may have passed spec but then been stressed past spec until it eventually failed); which meaning is intended is obvious (or should be !!) in each situation. "Failure Rate" refers to the % of a lot or sample that has failed in testing so far (that is, up to a given stress level). "Reliability" means the % of the lot that does not exhibit "failure" at or below a specific stress level (Reliability = 100% minus the Failure Rate).

7 Reliability Plotting
Typically, reliability data are not linear. Methods of extrapolating curved lines are not recommended because they are "not well understood" (Pyzdek), and their mathematics are not widely discussed (e.g., not in Juran's Quality Handbook).
[Chart: Cumulative % ( = 1 – Reliability) plotted against the DATA, with the Specification marked.]

8 (review of topic from part 2 of this course) Definition of " F "
Reliability textbooks provide various transformations of the % Cumulative values, so that all data, even the "100%" point, can be plotted onto the Y-axis. In textbooks on Reliability Statistics, the transformation suggested is typically the "F" value. To calculate "F" for a given set of data, first sort all the values, then give each value a "rank" number (the "rank" of the value with the lowest magnitude = 1, the next = 2, and so on). A commonly used formula for F is...
F = Median Rank = ( Rank – 0.3 ) / ( SampleSize + 0.4 )
A "more accurate and theoretically justified" calculation (per one of the authors of Applied Reliability) is (using Excel)...
F = BETAINV( 0.5, Rank, SampleSize – Rank + 1 )
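For readers who want to check these two "F" calculations outside of Excel, here is a minimal Python sketch (the data values are made up, and numpy/scipy are assumed to be available):

import numpy as np
from scipy.stats import beta

data = np.array([12.1, 13.4, 13.9, 14.6, 15.2])   # made-up, already-sorted measurements
n = len(data)
ranks = np.arange(1, n + 1)

f_approx = (ranks - 0.3) / (n + 0.4)              # the commonly used median-rank approximation
f_exact = beta.ppf(0.5, ranks, n - ranks + 1)     # Excel equivalent: BETAINV(0.5, Rank, n - Rank + 1)
print(np.round(f_approx, 4))
print(np.round(f_exact, 4))

The two versions agree to within a fraction of a percent, which is why the simpler approximation is so widely used.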

9 Reliability Plotting
...requires a transformation that gives a straight line (by Linear Regression) that can be extrapolated to the Specification value. The "confidence limit" (the hollow triangle) on the extrapolated point (the solid triangle) can be calculated by a simple but long formula found in advanced texts; it is provided in modified form on the next slide.
[Chart: Transformed "F" vs. Transformed DATA, showing the Transformed Specification, the extrapolated point (solid triangle), and its 95% confidence limit (hollow triangle).]

10 Formula for calculating the 1-sided confidence limit on the plotted Y-value at a single point on a linear regression line (i.e., the transformed Y-value for the hollow triangle on the previous slide)
(note: the generalized linear regression equation is Yei = a + b • Xi)
Confidence limit = Ysl +/− t × See × [ (1 / N) + ((Xsl − Xavg)^2) / (Sum((Xi − Xavg)^2)) ]^0.5
where...
Ysl = Y-axis transformed "F" value corresponding to the Specification Limit (i.e., the transformed Y-value for the solid triangle on the chart on the previous slide)
+/− = use "+" if out-of-specification is below the spec limit; otherwise use "−"
t = one-sided t-Table value at alpha = 1 – Confidence and df = N – 2; using Excel, t = TINV( 2 × (1 – Confidence), N – 2 )
See = Std Error of Estimate = [ Sum((Yei – Yi)^2) / (N – 2) ]^0.5
Yi = transformed plotted "F" value corresponding to a plotted Xi
N = number of X,Y points plotted on the chart (not the same as sample size)
Xsl = transformed Specification Limit
Xavg = average of the transformed X values of the plotted X,Y points (in some cases, this is not the average of the transformed raw data; do not include the specification limit in this average)
Xi = each of the transformed X values of the plotted X,Y points
"Sum" here means to add up each (Xi – Xavg)^2, from i = 1 thru i = N
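The same calculation can be scripted. Below is a hedged Python sketch of the formula above, using made-up transformed data; numpy and scipy are assumed, and the sign convention ("+" when out-of-spec is below the spec limit) follows the slide:

import numpy as np
from scipy.stats import t as t_dist

x = np.array([0.10, 0.15, 0.22, 0.31, 0.40, 0.55])     # transformed X values of the plotted points
y = np.array([-1.28, -0.84, -0.25, 0.25, 0.84, 1.28])  # transformed "F" values (Yi)
x_sl = 0.05                                             # transformed specification limit (Xsl)
confidence = 0.95
n_pts = len(x)                                          # N = number of plotted points

b, a = np.polyfit(x, y, 1)                              # slope b and intercept a of Yei = a + b*Xi
see = np.sqrt(np.sum((a + b * x - y) ** 2) / (n_pts - 2))   # standard error of estimate
y_sl = a + b * x_sl                                     # extrapolated (solid-triangle) value
t_val = t_dist.ppf(confidence, n_pts - 2)               # one-sided t at alpha = 1 - Confidence
half_width = t_val * see * np.sqrt(1 / n_pts + (x_sl - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2))
y_limit = y_sl + half_width                             # "+" because out-of-spec is below the spec limit
print(y_sl, y_limit)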

11 Examples of useful X-axis Transformations ( using Excel Formulas )
In the formulas below, A, B, D, and E are constants (negative or positive, whole numbers or fractional) chosen to help linearize the reliability plot.
= 1 / X
= SQRT( X )
= ASINH( SQRT( X ) )
= SQRT( X + A )
= ( ( X ^ B ) – 1 ) / B << this is called the Box-Cox Transformation
= LN( X + D )
= 1 / ( X + E )
The following can be used only with X values between 0 and 1:
= LN( X / ( 1 – X ) ) << this is called the Logit Transformation
= ASIN( SQRT( X ) )
= 0.5 * LN( ( 1 + X ) / ( 1 – X ) ) << this is called the Fisher Transformation

12 Examples of useful Y-axis Transformations ( using Excel Formulas )
In the formulas below, "F" = the calculated Median Rank and "C" = the user-chosen "shape parameter" constant; ** = formula has been standardized by setting other shape parameters to a value of 1.000
= NORMSINV( F ) << this is the "Normal" (Z-table) transformation
= NORMSINV( 1 − (1 − F) ^ (1 / C) ) << Power Normal
= EXP( NORMSINV( F ) ) << Three-Parameter LogNormal **
= EXP( NORMSINV( 1 − (1 − F) ^ (1 / C) ) ) << Power LogNormal **
= LN( LN( 1 / (1 − F) ) ) << Smallest Extreme Value
= LN( 1 / (1 − F) ) ^ (1 / C) << Weibull ** ( = "Exponential" when C = 1 )
= LN( F / (1 − F) ) << Logistic
= LN( 1 / ( LN( 1 / F ) ) ) << Largest Extreme Value
= TAN( PI() * (F − 0.5) ) << Cauchy distribution
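As an illustration only (made-up median ranks; scipy assumed), two of these Y transformations look like this in Python -- NORMSINV corresponds to scipy's norm.ppf:

import numpy as np
from scipy.stats import norm

f = np.array([0.061, 0.148, 0.235, 0.322, 0.409, 0.500])   # made-up median-rank "F" values
c = 2.0                                                     # user-chosen shape parameter

y_normal = norm.ppf(f)                             # = NORMSINV(F), the "Normal" (Z-table) transform
y_weibull = np.log(1.0 / (1.0 - f)) ** (1.0 / c)   # = LN(1/(1-F))^(1/C), the Weibull transform
print(np.round(y_normal, 3))
print(np.round(y_weibull, 3))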

13 Reliability Plotting
Reliability plotting allows calculation of confidence and reliability based on:
-- small sample sizes
-- unfinished experiments
-- data that can't be normalized
-- data from different populations
-- data with many duplicates
...none of which is possible with K-tables, as we shall see next.
[Chart: Transformed "F" vs. Transformed DATA.]

14 Actual data from presenter's client...

15 Is "almost" good enough for critical products?
continued from previous slide... In reliability statistics textbooks, a plot like this, or one that is not even as straight as this, is sometimes shown as an example of a "Normal" distribution; but... even tho this data does "pass" the best "tests" for Normality (Anderson-Darling A2*, Cramer-von Mises W2*, and Shapiro-Francia W'), with test p-values all > 0.425, and even tho the correlation coefficient is very high, this plot is slightly curved; therefore this data is not truly Normal (it is almost Normal). Is "almost" good enough for critical products? This is the Excel equivalent of a Normal Probability Plot (data is "Normal" if it shows as a straight line on this plot).

16 continued from previous slide...
The "inverse" ( = 1 / X ) transformation gives a much straighter line on "Normal Probability Plotting" paper, and so the distribution is "Inverse Normal" rather than "Normal".

17 Using the 12-pt data set from the previous slide...
Using Reliability Plotting to extrapolate the transformed data to the transformed spec ( 1 / X = 1 / 5.5 ≈ 0.182 ), we have this result: Z(F) = – = % failure rate = % reliability at 95% confidence (the solid triangle is the extrapolated value; the hollow triangle is the upper 1-tailed 95% confidence limit). By comparison, Normal K-tables yielded slightly less than 99.9% reliability. In John Zorich's view, Reliability Plotting is more accurate than Normal K-tables. In this case, we obtained a "better" result; but the reverse may occur on a different data set.

18 Reliability Plotting: EXACT vs. INTERVAL
If you have replicate measurements in your data set (e.g., 4 data points each = 0.35), and if you plot each of the individual exact data points, the resulting line may be inappropriate, because "Linear Regression" (the mathematical tool used in Reliability Plotting) does not perform well when replicates are present (especially when the replicates are near one or the other end of the straight line). Instead of using your individual "exact" data, it may be better to pool identical values (or very similar values) into groups ("intervals"), the way you do for a histogram, and then calculate the cumulative % for each of the cumulative groups.
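A minimal sketch of that pooling step, assuming Python/numpy and made-up data with replicates (the interval edges are chosen the way you would choose histogram bins):

import numpy as np

data = np.array([0.30, 0.35, 0.35, 0.35, 0.35, 0.40, 0.40, 0.45, 0.50, 0.50])  # made-up
edges = np.arange(0.275, 0.526, 0.05)               # interval boundaries, as for a histogram
counts, _ = np.histogram(data, bins=edges)
cum_pct = 100.0 * np.cumsum(counts) / len(data)     # cumulative % at each interval's upper edge

for upper, pct in zip(edges[1:], cum_pct):
    if pct > 0:                                     # do not plot zero values
        print(f"up to {upper:.3f}: {pct:.1f}% cumulative")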

19 Reliability Plotting: EXACT vs. INTERVAL
[Table: the EXACT data values and their corresponding %F median ranks.]

20 Reliability Plotting: EXACT vs. INTERVAL (%F and %Cumulative are calculated differently)
[Table: EXACT values with their %F, alongside INTERVAL values with their %Cumulative.]
In cases such as this, plotting INTERVAL values will produce a more accurate line than plotting EXACT values, but you may need to "censor" the largest value. See next slide for this data plotted.

21 Reliability Plotting: EXACT vs. INTERVAL (continued from previous slide)

22 Burst strength ( actual data !!)
60 devices tested; the minimum spec was psi. The (sorted) raw data was...

23 Burst strength, using Reliability Plotting.xls, plotting Z(F) vs. X(untransformed) (this is equivalent to Normal Probability Plotting paper)
Because this data does NOT form a straight line on NPP paper, it is not valid to use K-factor tables.

24 Burst strength
Also, notice that the data includes many replicate values...

25 Burst strength
...and notice that on a basic cumulative plot ( = F(untransformed) vs. X(untransformed) ) the data seem to include 2 different populations: a single population would look like a smooth "S" curve, but this one has a break or corner in it, indicating a dual population. To use Reliability Plotting, we must "censor" these data (they appear as a "shoulder" on a line chart -- see next slide).

26 Frequency Distribution
(continued from previous slide) Mixtures of distributions appear as bi-modal frequency distributions, or as a single mode with a shoulder, like this:
[Chart: a frequency distribution with a "shoulder" -- the data that must be "censored" in order to use Reliability Plotting.]

27 Burst strength
Here is how to convert the ( n = 60 ) data, with its many replicate values, into interval data:
= 2/60 = 3.3 %
= 3/60 = 5.0 %
= 4/60 = 6.7 %
= 5/60 = 8.3 %
= 9/60 = 15.0 %
etc.
For "Interval" plots, you must use % cumulative, not "F". Do NOT plot zero values.

28 Burst strength
If we convert from "exact" to "interval" AND censor (i.e., do not plot) values above 0.8 psi, then we obtain 95% confidence at % Reliability, using Log(X) vs. Z(%cum) (i.e., LogNormal).
Important !! This pairing IS the best straight line, but it does NOT have the highest CC.

29 Statistical Analysis of Gages
Calibration, Metrology, & Measurement Uncertainty

30 Regulatory Requirements...
ISO 9001 & : "The organization shall determine the monitoring and measurement to be undertaken and the monitoring and measuring devices needed to provide evidence of conformity of product to determined requirements."
MDD, Annex II, V, + VI: "Application of the quality system must ensure that the products conform to the provisions of this Directive which apply to them at every stage, from design to final inspection....It shall include in particular an adequate description of...the test equipment used; it must be possible to trace back the calibration of the test equipment adequately."
Question: What does that word "adequately" mean? The combination of calibration records AND the process of choosing calibrated equipment must "provide evidence of conformity". If the wrong instrument is chosen, it provides no "evidence of conformity", even if it is "calibrated".

31 Vocabulary
ACCURACY is defined using the mean of several measurements. Subtracting that mean from the "true value" gives the "Inaccuracy" or "Bias". Divide the Inaccuracy by the "true value", then multiply by 100, to yield the "% Inaccuracy".
Commonly, accuracy is assessed instead by taking only a single measurement; if that measurement is within the tolerance allowed, then the instrument is said to be "within tolerance". Calibration vendors use N = 1, i.e., a single measurement, unless explicitly told not to. That "N = 1" may NOT be a good thing, because accuracy cannot be determined accurately without taking multiple readings !!
DURING THIS PRESENTATION, THINK ABOUT WHETHER OR NOT THIS SECOND METHOD IS ACCEPTABLE IN A MEDICAL DEVICE COMPANY.

32 Vocabulary
PRECISION (also called "repeatability", especially in the "specification" section of a measurement instrument's owner's manual) is assessed by taking several measurements of the same item and calculating their standard deviation ( = the "Imprecision"). Divide the Imprecision by the "true value", then multiply by 100, to yield the "% Imprecision".
Typically, calibration vendors do not check for precision unless you explicitly tell them to. That may NOT be a good thing if product pass/fail decisions are based upon a single measurement (precision tells you how reliable a single measurement is) !!
DURING THIS PRESENTATION, THINK ABOUT WHETHER OR NOT THIS IS ACCEPTABLE IN A MEDICAL DEVICE COMPANY.
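A tiny Python sketch of these two vocabulary items (the readings and "true value" are invented for illustration; the sign convention follows slide 31, i.e., bias = true value minus the mean):

import numpy as np

true_value = 100.0                                       # reference / "true" value
readings = np.array([99.2, 100.4, 99.8, 100.1, 99.6])    # made-up repeated measurements

bias = true_value - np.mean(readings)                    # "Inaccuracy" (Bias)
pct_inaccuracy = 100.0 * bias / true_value
imprecision = np.std(readings, ddof=1)                   # sample standard deviation of the repeats
pct_imprecision = 100.0 * imprecision / true_value
print(pct_inaccuracy, pct_imprecision)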

33 Vocabulary
RESOLUTION is equivalent to the number of digits you can read on an instrument; e.g., an instrument that can display more decimal places of an inch has "higher resolution" than one that can display fewer. Unfortunately, there is no reliable relationship between resolution and accuracy and/or precision !! The instrument could be 5 days old or 5 years old.
However, when there is no other info available about measurement uncertainty, the standard deviation of repeated measurements is estimated as the width of the "smallest readable unit" divided by 3.464 ( = the square root of 12, i.e., the standard deviation of a uniform distribution whose width is one readable unit).
NOTE: some micrometers read only 0 or 5 in the last digit, in which case the "smallest readable unit" is not "1" in the last place but rather "5".

34 Uncertainty
There is always some uncertainty as to the degree with which a sample represents the population from which it was drawn. That type of uncertainty cannot be reduced by anything other than larger sample sizes. In addition to that uncertainty, there is uncertainty caused by the measurement process itself. The science of "Metrology" aims to quantify, control, and reduce the uncertainty caused by the measurement process.

35 Some Tools of Metrology are...
Calibration -- assesses & corrects accuracy & precision
Gage R&R -- assesses precision only
Gage Correlation -- assesses accuracy only, vs. another device/person
Gage Linearity & Bias -- assesses accuracy only, vs. a gold standard
Gage Stability (i.e., instability) Studies -- assesses systematic drift in accuracy over time (not discussed in this seminar)
Uncertainty Budgets -- summarizes the available measurement-uncertainty info, and then suggests how QC specs should be modified

36 Calibration concerns
Before setting a design specification, a company should decide on a desired relationship between the specification tolerance & the measurement equipment's accuracy, e.g. ...
Product tolerance specification = target +/‒ 4 units
Equipment calibrated accuracy = nominal +/‒ 1 unit
This situation ( a ratio of 4 : 1 ) is generally considered acceptable, since it mimics the practice that ISO mandates for Calibration companies.

37 Calibration concerns
SOME MEDICAL DEVICE COMPANIES UNKNOWINGLY HAVE RATIOS OF 1 : 1, FOR EXAMPLE...
Product tolerance specification = 100 +/‒ 4 units
Equipment calibrated accuracy = 100 +/‒ 4 units
EXAMPLE OF A WORST-CASE SCENARIO WITH THAT RATIO: Calibration data show that a NIST-traceable standard of 100 reads 96 ( = meets the calibration requirement, but the equipment is reading 4 units low). The equipment is then used to measure product; if the result is 104, it passes spec, but the "true" value is really 108. Thus, the product should be rejected but instead is passed.

38 Gage R&R
A "Gage R&R" study quantitates the measurement uncertainty that is due to the combination of the instrument & its users. Typically, the output is the width of the interval that includes the middle 99% of the "Normal" distribution of individual measurements; let's call that the "Uncertainty Interval". In an R&R study, we primarily identify the...
Total Variation uncertainty interval (from all causes)
Repeatability uncertainty interval (uncertainty caused by the inability of a measurement instrument [i.e., the gage] to produce the same measurement result when used repeatedly to measure the identical part)
Reproducibility uncertainty interval (uncertainty caused by the inability of different users of the same gage to produce the same result when measuring the identical part)

39 Equipment Control This is an example of a data input table for a simple Gage R&R study (a complicated one involves more than one gage). Typically, analysis of this data requires a computer program capable of the quite-involved Gage R&R calculations. © 2008 by ZTC -- 39 39 39 39

40 Uncertainty
This shows that gages+people (i.e., repeatability+reproducibility) cause variation that consumes / 40 = 32.5 % of the QC Spec Interval. In this case, re-training and/or standardizing personnel practices will help only a little to decrease that %. To decrease variation significantly, we need to buy better gages (gages caused most of this R&R variation, i.e., 32.2 % vs. 4.3 %).

41 Gage R&R using Excel's "Data Analysis" Add-in, on the "DATA" tab: "ANOVA: Two-Factor With Replication"
Reproducibility (99%) = x StdDev( 4.63, 4.42, 4.65 )
Repeatability (99%) = x sqrt[ ( – – 0.94 ) / ( 89 – 9 – 2 ) ]
Gage R&R (99%) = sqrt( Reproducibility^2 + Repeatability^2 )
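Regardless of how the individual components are obtained, the final step is a root-sum-square. A hedged Python sketch, with made-up interval widths chosen to be roughly consistent with the percentages quoted on slide 40 (this is not the author's validated spreadsheet):

import math

repeatability_99 = 12.9     # made-up 99% repeatability interval (equipment variation)
reproducibility_99 = 1.7    # made-up 99% reproducibility interval (appraiser variation)
spec_interval = 40.0        # width of the QC specification interval

gage_rr_99 = math.sqrt(repeatability_99 ** 2 + reproducibility_99 ** 2)
pct_of_spec = 100.0 * gage_rr_99 / spec_interval
print(round(gage_rr_99, 2), round(pct_of_spec, 1))   # about 13.0 and 32.5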

42 Gage Correlation
A "Gage Correlation" study typically is used to compare measurements of identical parts by 2 different companies --- for example, by the Supplier of the part and its Customer. One practical use is to validate that the Supplier gets the "same" answer as the Customer, thereby justifying the Customer's use of Supplier-provided QC data rather than performing QC itself (the part could then justifiably go "dock to stock"). In a Gage Correlation study, we identify the...
Linear Regression relationship between the measurements by the 2 companies
Correlation Coefficient for that linear regression
Offset values that could be used to "correct" any identified differences in measurement between the 2 companies
Such a study could also evaluate R&D vs. Pilot Production, or Pilot Production vs. Manufacturing, within the same company.

43 Gage Correlation
This is an example of a data input table for a simple Gage Correlation study (a complicated one would involve 3 or more gages). In this case, each "Set #" is a unique part to be measured by "Gage # 1" at one company (or department), and "Gage # 2" at the other company (or department).

44 Altho not identical, the measurements from the 2 companies look to be linearly related and to be highly correlated.

45 For Gage #1 to read like Gage #2, multiply each Gage #1 result by the regression slope and then subtract ≈ 4.77 from that result (based on the equation shown on the previous slide).

46 Gage Linearity
A "Gage Linearity" study typically is used to evaluate performance over a wide range of values (e.g., over the entire range of values that the gage is capable of measuring). In a Gage Linearity study, we...
use "gold standards", or any parts for which we believe we know the "true" answers very accurately;
make repeated measurements of the gold standards, using a single on-test gage;
graph the "error" (a.k.a. "bias") for each measurement (i.e., how far off each measurement is from the "true" value);
determine if the error is statistically significantly different from zero (if there is no error, then there is no bias).
If the gage is found to have (statistically) "no" bias thruout its tested range, we say the gage has acceptable "linearity" in that range.
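A minimal sketch of the regression step, assuming Python with statsmodels and invented reference/measurement values (a real study would use many repeats per level, per the next slide):

import numpy as np
import statsmodels.api as sm

reference = np.array([2.0, 4.0, 6.0, 8.0, 10.0])        # made-up "gold standard" values
measured = np.array([2.03, 3.98, 6.04, 7.96, 10.02])    # made-up averages of repeat readings
bias = measured - reference                             # the "error" plotted in a linearity study

fit = sm.OLS(bias, sm.add_constant(reference)).fit()
print(fit.params)    # intercept and slope of the bias-vs-reference line
print(fit.pvalues)   # large p-values suggest the bias is not statistically different from zero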

47 Gage Linearity
This is an example of a data input table for a Gage Linearity study. The "gold standards" could be very accurate gage blocks.

48 Consider the curved lines to be 95% confidence limits on the sample-result averages, in the measurement range of 2 thru 10 (formulas for such limits are found in advanced stat books). Because the horizontal "Bias ( = error ) = 0.000" line is fully contained by the solid curved confidence-limit lines, we conclude that this gage has acceptable linearity in the range of 2 thru 10.

49 Gage Bias
A "Gage Bias" study is, in effect, a one-point Gage Linearity study, in which either a gold-standard ("Reference") calibrator or a gold-standard ("Reference") gage is used. The difference between the on-test gage and the gold-standard gage or calibrator is considered the "bias". The virtue of a Gage Bias study is that it is simple & quick --- it uses one person, one (on-test) gage, and one part measured several times at one sitting. It's analogous to a one-point calibration. See the output of this study on the next slide >>

50 Gage Bias Upper & Lower conf. limits of sample mean, calculated with t-tables per any intro stat book

51 Gage Linearity vs. Bias vs. Calibration
[Diagram: value output by the On-Test gage vs. the True Value ( = calibration standard or reference gage). Annotations: uncertainty is known across a range only if all levels in that range are evaluated or calibrated; otherwise nothing is known of the uncertainty for points inside that range. Zero is a given (a "tare point"), not a calibration point. If only 1 level is evaluated or calibrated, uncertainty is known only at that level; a 1-point calibration is OK if QC, Mfg., R&D, etc. measure only at that point.]

52 Uncertainty Budget
Estimate the standard deviation of uncertainty for each of the uncertainty sources for which you have information, e.g., ...
for a 99% Gage R&R interval: StdDev = GageR&R interval divided by 5.15
for a calibration tolerance (e.g., "Mfg's specs"): StdDev = tolerance interval divided by 4.00
for uncertainty in the calibration calibrator (typical): StdDev = calibration tolerance divided by
other (e.g., gage instability)...??
Square each StdDev, sum them, and take the square root of the sum. Multiply that by the factor for the interval you wish to calculate (e.g., for 99%, factor = 5.15; for 95%, factor = 3.92); the result is called the "Expanded Uncertainty Interval".
Divide the width of the product's spec interval by that interval. There is general agreement in industry, based on ISO & NIST recommendations, that if that ratio is less than 4.00, for a 95% interval, then the measurement equipment or measurement process is NOT suitable for the given product.
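A hedged sketch of that arithmetic in Python (all interval widths are made up, and only two uncertainty sources are included):

import math

gage_rr_99_interval = 13.0      # 99% Gage R&R interval
cal_tolerance_interval = 8.0    # calibration tolerance interval ("Mfg's specs")
spec_interval_width = 80.0      # width of the product's specification interval

sd_sources = [
    gage_rr_99_interval / 5.15,     # convert a 99% interval to a StdDev
    cal_tolerance_interval / 4.00,  # convert a calibration tolerance to a StdDev
]
combined_sd = math.sqrt(sum(sd ** 2 for sd in sd_sources))   # root-sum-square
expanded_95 = 3.92 * combined_sd                             # 95% Expanded Uncertainty Interval
ratio = spec_interval_width / expanded_95
print(round(expanded_95, 2), round(ratio, 2))   # per the slide, a ratio below 4.00 means "not suitable"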

53 Graphical Summary of the Problem of & Solution to Measurement Uncertainty
[Diagram: the "design" specification range, the 95% or 99% Expanded Uncertainty interval (based upon whatever is included in the "Uncertainty Budget"), the narrower "guard-banded" specification range, and the wider "expanded specification".]
The "guard-banded" specification range has the advantage that any single measurement that falls within it is "guaranteed" (with ≥ 95% or 99% probability) to fall within the design specification range. Without "guard banding", the actual range being used to pass/fail measurement results is the "expanded specification".

54 What is acceptable, if the goal is to... "provide evidence of conformity of product to determined requirements" per ISO 9001 & ?
There is no regulation or official guidance document that discusses uncertainty budgets, expanded specifications, and guard-banding (e.g., ISO says only that the "documented procedure should include details of equipment type, unique identification, location, frequency of checks, check method, and acceptance criteria"). Therefore, ISO, CE, & FDA auditors have no firm basis on which to force companies to implement metrology policies. Therefore, it is up to the company whether or not its product is QC'd vs. the "expanded specification interval" (which is always wider than the design-based specification interval).

55 Classic QC Sampling Plans (and their alternatives)

56 Standards & Regulations
ISO 9001 & ISO 13485:2003 §8.1: "[Mfg] shall...implement...analysis...processes needed to demonstrate [product/process] conformity....This shall include determination of applicable...statistical techniques."
FDA's "GMP" (21CFR ) (re: medical devices): "Sampling plans...shall be...based on a valid statistical rationale... Each manufacturer shall...ensure that sampling methods are adequate for their intended use."
FDA's "Medical Device Quality Systems Manual": "...all sampling plans have a built-in risk of accepting a bad lot. This sampling risk is typically determined in quantitative terms by deriving the 'operating characteristic curve' [which]...can be used to determine the risk a sampling plan presents. A manufacturer should be aware of the risks the chosen plan presents....A manufacturer shall be prepared to demonstrate the statistical rationale for any sampling plan used."
US Dept of Defense MIL-STD 1916: "...sampling inspection is...redundant...and...unnecessary." "...consider [using an] alternative acceptance method."

57 Basic Types of Sampling Plans
In an attribute sampling plan, "quality" is measured by the observed % of the sample that meets specification. In a variables sampling plan, "quality" is measured by the estimated % of the population that meets specification (based upon the Sample Mean & either the Sample Range or Std Deviation, & assuming data Normality -- see the statement of the normality requirement in ANSI/ASQC Z , pp. 2-3). Only attribute sampling plans are discussed in this class, because they are currently still the dominant ones used in the medical device industry (see next slide).

58 Information collected by John Zorich:
Virtually 100% of U.S. medical device companies use AQL attribute sampling plans for their IQC inspections ( IQC = Incoming or Receiving Quality Control ); less than 1% use a "variables" sampling plan, and less than 1% use an LQL sampling plan. That conclusion is based upon John Zorich's history of...
full-time quality-system (& statistical) consulting, 1999–2013
working half-time as an auditor for European ISO / Notified-Body registrars TUV and KEMA/DEKRA, 2000–2013
performing more than 500 quality-system audits at more than 200 medical-device companies in the USA, 1999–2013

59 Attribute Sampling Plans
An attribute sampling plan is a written procedure for...
choosing a fraction of an incoming lot (the fraction = the "sample")
deciding on the acceptability of the entire lot based on the observed quality of the sample (the lot "passes" if the number of defects or defective parts is not more than the "C" = "acceptance number" allowed by the plan)
Sampling-plan use involves a RISK of approving a "bad" lot (a risk to the end-user customer, possibly).

60 Attribute Sampling Plans
"AQL" stands for "Acceptable Quality Level" or "Acceptance Quality Limit". The "%AQL" of an AQL sampling plan is the product quality ( = lots having that % defective) which the sampling plan will approve almost all the time (there is no generally accepted numerical definition of %AQL). %AQL = "I am happy with AQL% defective."
"LQL" stands for "Limiting Quality Level" or "Lower Quality Limit". The "%LQL" of an LQL sampling plan is the product quality ( = lots having that % defective) which the sampling plan will reject almost all the time (there is no generally accepted numerical definition of %LQL). %LQL = "I'm not happy with LQL% defective."

61 ≈ 99% of U.S. med device companies that do IQC inspection use one of these two plans:
ANSI/ASQC-Z1.4 = ISO = MilStd-105E: an AQL attribute sampling plan, widely used because of its explicit endorsement by the FDA in its Medical Device Quality Systems Manual: "[Sampling] Plans should be developed by qualified mathematicians or statisticians, or be taken from established standards such as ANSI Z1.4". The plan's stated purpose "is not intended as a procedure for estimating lot quality or for segregating lots"...but rather to "induce a supplier to maintain a process average...[and to control] consumer's risk...."
Squeglia's "Zero Acceptance Number Sampling Plans": an AQL attribute sampling plan, widely used in industry because of its smaller sample sizes & its implicit endorsement by ASQ (it's published by the official ASQ Quality Press). The plan's stated purpose is to "provide essentially equal or greater LQ protection at the 0.10 consumer's risk level".

62 Classic (AQL Attribute) QC Sampling Plans (are they worth the effort?)

63 What people say about why they use traditional AQL sampling plans is...
"FDA / ISO auditors won't ask any challenging questions." That is true for field auditors ( = untrained in statistics); but PMA / CE auditors and their staff statisticians are much more statistically savvy, and have been known to ask you to justify your sampling plans, based on risk analysis, for critical parts (e.g., implant components).
"Such plans provide statistical assurance that... suppliers provide consistently high quality product, we are not accepting low quality product, & our Parts Storeroom has a known quality level." Let's now examine those 3 claims...

64 ANSI Z1.4

65 ANSI Z1.4

66 "Zero Acceptance Number Sampling Plans" ( by N. L. Squeglia, 4th ed. )

67 Attribute Sampling Plans
For lots of a given part #, when inspected using a given sampling plan, the % of lots (not the % of parts) that are accepted by the plan is called the "Pass Rate". The Pass Rate for a sampling plan is "always"...
( #1 ) high for "good" lots ( = lots that have a low % defective)
( #2 ) low for "bad" lots ( = lots that have a high % defective)
( #3 ) intermediate for lots of intermediate quality.

68 Attribute Sampling Plans
The manner in which lot quality and lot size affect the Pass Rate is described by 2 types of "Operating Characteristic" curves ( = "OC" curves). In this presentation, those 2 types of curves are called "% Defective OC Curves" and "Lot Size OC Curves"; examples of each are shown on the upcoming slides...

69 Predicting Pass Rates
This is the typical OC CURVE found in text books, i.e., Lot % Defective vs. Pass Rate: a "% Defective OC curve" for a 4% AQL sampling plan (ANSI Z1.4), with N = 1000, n = 80, c = 7.
OC curves "describe the long run behavior of a sampling plan. They do not tell the user what can be said about a particular lot that has just been accepted or rejected." (D. J. Wheeler, 2006, EMP III, pg. 152)

70 Predicting Pass Rates
"% Defective OC curve" for a 4% AQL, C=0 sampling plan (Squeglia's 4th edition), with N = 1000, n = 15, c = 0.

71 Which "4% AQL" plan should be used?
How can such variations in "4% AQL" pass rates ensure that "suppliers provide consistently high quality product"? In order to focus on consumer risk (per FDA & Squeglia), we need to focus on the LQL point of the OC curve. LQL sampling plans focus on the % defective that will be rejected almost all the time. ( N = 1000, n = 80 or 15, c = 7 or 0 )

72 Attribute Sampling Plans
Probability of Acceptance of a Single Lot from a Sequence of Lots from a Stable Process (MS Excel function):
= BINOMDIST( C, S, F, TRUE )
C = number of defectives allowed in the sample
S = sample size
F = fraction of the lot that is defective
TRUE = tells the program to add up the probabilities for 0, 1, 2, 3, ... thru C
(continued on next slide)
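The same calculation in Python, for anyone not working in Excel (scipy assumed; the n = 80, C = 7 plan matches the 4% AQL example on slide 69):

from scipy.stats import binom

c = 7        # number of defectives allowed in the sample
s = 80       # sample size
f = 0.05     # fraction of the lot that is defective

pass_rate = binom.cdf(c, s, f)   # same as BINOMDIST(C, S, F, TRUE)
print(round(pass_rate, 3))       # probability that such a lot is accepted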

73 (free) "Self-made Sampling Plans.xls"

74 Control of Sampling Plan's Consumer Risks
It is NOT possible to use an AQL% to explain a "valid statistical rationale" for a sampling plan. The only way to achieve a "valid statistical rationale" for classic sampling plans is to...
review the Risk Management documents (e.g., the FMEA) to determine whether "IQC" processes have been identified as being a "mitigation"; if they have, then...
choose a sampling plan whose LQL supports Risk-Management statements such as "In IQC, mitigation will involve using a sampling plan that ensures that component lots that are 1% or more defective are rejected approximately 90% or more of the time."
If the Risk Management docs do not identify IQC inspection as mitigating a product or process risk, then it is reasonable to conclude that IQC does not pose any risk to the end-user (e.g., patient or doctor); in that case, only "business risks" are important, and therefore any sampling plan is "valid".
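For example, here is a hedged Python sketch of how one might size a C = 0 plan to meet the risk-management statement just quoted (scipy assumed; the 1% / 90% numbers come from that statement):

from scipy.stats import binom

lql_fraction = 0.01     # lot quality that must be rejected almost all the time
consumer_risk = 0.10    # maximum allowed probability of accepting such a lot

n = 1
while binom.cdf(0, n, lql_fraction) > consumer_risk:   # P(accept) for a C = 0 plan
    n += 1
print(n)   # roughly 230 for these inputs; allowing C above 0 requires a larger n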

75 “% Defective OC curve” for a 4% AQL sampling plan
Suppose that your Risk Management docs state that "consumer risk" is not acceptable if component lots are more than 5% defective, & claim that the IQC AQL attribute sampling plan shown below ("% Defective OC curve" for a 4% AQL sampling plan; N = 1000, n = 80, c = 7) provides "mitigation" so that such lots are rejected 95% of the time. Is that claim true?
NOT TRUE, because in order to ensure that "we are not accepting low quality product", a 4 or 5% LQL sampling plan should be used, not this 4% AQL one. However, even with LQL plans, we don't know exactly what level of "statistical confidence" we can claim for a given lot being inspected.

76 Do AQL plans control consumer risk consistently?
ASQC-Z1.4, general, level II, single, normal, 4% AQL
[Table: Lot Size, Sample Size, "C", and the Pass Rate if the lot is 5% defective, for three lot sizes.]
This shows a consistent approval rate of about 90%; but "5% defective" is at the AQL top of the OC curve (where "supplier risk" is controlled), & so is irrelevant to control of "consumer risk".

77 Do AQL plans control consumer risk consistently?
ASQC-Z1.4, general, level II, single, normal, 4% AQL
[Table: Lot Size, Sample Size, "C", and the Pass Rate if the lot is 15% defective, for the same three lot sizes.]
This shows an inconsistent approval rate at the LQL bottom of the OC curve (where "consumer risk" is controlled).

78 Do AQL plans control consumer risk consistently?
All three of these "4% AQL" plans have the same high Pass Rate when a lot has a low % defective, but they differ greatly at a high % defective. "% Defective OC Curves" for ASQC-Z1.4, general, level II, single, normal, 4% AQL.

79 ANOTHER KIND OF OC CURVE
Do AQL plans control consumer risk consistently? "Lot Size OC Curves" for ASQC-Z1.4, general, level II, single, normal, 4% AQL. If product is 10% defective, the pass rate will go drastically down when the product goes from R&D lot sizes to commercial-size lots.

80 “Lot Size OC Curves” for v4 ASQC-C=0, Single Sample, 4% AQL
Do AQL plans control consumer risk consistently? "Lot Size OC Curves" for ASQC C=0 (4th edition), Single Sample, 4% AQL. If the product is 10% defective, and you visit the supplier to initiate corrective action, and then you reduce lot sizes because of the problems, the pass rate magically goes up! Conclusion: Z1.4 and C=0 AQL sampling plans do not control consumer risk consistently, unless Lot Size is controlled.

81 Important lesson from 2 previous slides:
When the Receiving-QC Inspection Pass Rate changes dramatically (increasing or decreasing), you should not come to any conclusion about the cause (e.g., "Supplier is doing much better!" or "Supplier is doing much worse!") until you examine the "Lot Size OC Curve" versus the size of the lots that were received before and after the Pass Rate changed. Such an examination may reveal that the dramatic "change" is a false impression: it may be due to a change in the size of the lots received, not to a change in the quality of the lots received!

82 Arbitrariness(?) of Sampling Plans
** = ASQC-Z1.4, general, level II, single, normal, 4% AQL
Lot Size / Sample Size / "C" / Pass Rate if the lot is 5% defective:
20 / 3** / 0** / 85%**
100 / 20** / 2** / 95%**
1,000 / 80** / 7** / 96%**
(the table also listed two alternative plans for a lot size of 1,000)

83 Arbitrariness(?) of Sampling Plans
** = ASQC-Z1.4, general, level II, single, normal, 4% AQL
Lot Size / Sample Size / "C" / Pass Rate if the lot is 5% defective:
1,000 / 80** / 7** / 96%**
(plus two alternative plans for a lot size of 1,000, with similarly high pass rates at 5% defective)
During World War II (when these sampling plans first became common), one possible use for the larger-than-needed sample sizes was discrimination between mediocre lots and excellent lots (both of which "pass" QC).

84 Arbitrariness(?) of Sampling Plans
All 3 plans from the previous slide have the same high Pass Rate when the lot is 5% defective, but differ greatly at a high % defective.
[Chart: Sample Size (for a lot size of 1000) and the "C" for that sample size; the middle line is the ASQC Z1.4 4% AQL plan.]

85 "Zero Acceptance Number Sampling Plans" ( by N. L. Squeglia, 5th ed. )
This table, from the 5th edition, has many sample-size changes (shown circled) compared to the 4th edition. In some cases the sample size has increased dramatically: e.g., if Lot Size = 100 and AQL = 1.5, then the Sample Size is now 19 instead of 12 (a 58% increase), and the pass rate for a 1.5%-defective lot of that size drops from ≈ 83% in the 4th edition to ≈ 75% in the 5th edition.

86 How much defective product is in your approved-parts Storeroom?
The % of defective product that is in your Approved-parts Storeroom is a function of the...
quality of the lots received
sampling plan used
lot size received (as we saw on previous slides)
A relevant term that defines that % is AOQ (Average Outgoing Quality). AOQ is the resulting average % defective in the Approved Storeroom, assuming that LotSize, SampleSize, the "C" value, and the received Lot%Defective all remain constant. If the received Lot%Defective varies from lot to lot, then the potential AOQ varies lot to lot. The worst AOQ possible is called the "Average Outgoing Quality Limit" or AOQL.

87 AOQ can be easily calculated using a “classic” formula found in any sampling-plan textbook, but AOQL is typically available in tables in the back of published Sampling Plans. Squeglia "C=0" (4th ed.) This means that if a 1.0 % AQL Sampling plan is used, and if the parts Supplier consistently sends lots that are about 6 to 7% defective, and if Lot Size is consistently 91 to 150, then Approved Stores will consistently contain about 2.6% defective of that part. Therefore, unless only a specific small range of lot sizes (e.g., ) is allowed to be purchased, AOQL (and possibly storeroom quality) varies lot-to-lot!
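For reference, here is a hedged Python sketch of the "classic" textbook AOQ formula (the lot size, sample size, and lot quality below are made up; the formula assumes rejected lots are 100% sorted and that defectives found are replaced with good parts):

from scipy.stats import binom

lot_size = 120        # N
sample_size = 13      # n
c = 0                 # acceptance number
p_defective = 0.065   # incoming lot quality (fraction defective)

p_accept = binom.cdf(c, sample_size, p_defective)
aoq = p_accept * p_defective * (lot_size - sample_size) / lot_size
print(round(100 * aoq, 2), "% defective, on average, reaching the Approved Storeroom")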

88 % Defective in the Approved Storeroom is affected by IQC & Supplier-Control practices
The classic formula for AOQ assumes that good parts are used to replace all defective parts encountered either in any sample or in a 100% inspection of a rejected lot, before that lot is approved and moved into the Approved Storeroom. Based upon John Zorich's experience auditing more than 200 US medical device manufacturing companies from 1999 to 2014, "no" company follows those "classic" instructions. Instead, virtually all companies do NOT replace defective parts with good parts, but rather return defective parts to the Supplier for credit on future shipments of normal lot sizes.
If N = 100, n = 16, C = 1, & the received Lot%Defective = 10%, then...
AOQ = 4.20% using the "classic" formula
AOQ = 4.92% when good parts do NOT replace defectives.

89 Why do we do so much work for so little information?
When we use classic AQL attribute sampling plans, we settle for knowing almost nothing about the specific lot of product from which the sample came. If our boss were to ask, all we can say is "the lot passed".
We don't know the % defective in the just-passed lot.
We may not know the actual % defective in Stores (we may know only the theoretical worst case = the "AOQL").
If all we do is focus on the AQL, we don't even have a clear definition of a "bad lot" (the chosen "AQL%" is considered a good-enough % defective).
Instead, for each Lot of product received, why don't we calculate what % of that lot is "in-spec"? That is... why not calculate its "reliability" at 95% confidence ("reliability" here means "% in-specification"), and use % reliability specs (instead of % AQL specs)?

90 Using "Reliability Calculations" instead of AQL Sampling Plans
The future is now: Duke Empirical (a well-known contract Design & OEM manufacturer in Santa Cruz, California, with a long list of medical device clients, including billion-dollar corporations) does not use any AQL sampling plans for IQC, unless mandated by the client. Instead of %AQL specifications, Duke uses %Reliability specs (all at 95% Confidence), as described in this seminar. The client is asked to choose an IQC %Reliability spec for each of its parts received by Duke. If the client is not ready to do that, Duke defaults to the %Reliability listed by Risk class in Duke SOPs (e.g., human-implant parts are high risk).

91 Was that problem predictable / avoidable?
What difference does it make if we continue to use AQL plans? Here's a real-life example of "Sampling Plan Blues": using an AQL sampling plan ( n = 10, C = 0 ), the actual (residual) data from receiving-inspection QC of a catheter nose-piece was...
2.82, 3.72, 3.91, 4.70, 4.77, 5.24, 5.71, 6.09, 6.28,
Average = 5.04, Std Deviation = 1.33; the specification is "2.50 or greater".
The lot passed, because all sample nose-pieces were in-spec. However, more than 10% of the finished devices made with that lot of nose-pieces failed final test, causing a week-long shut-down of production while the root cause was investigated. And... the root cause of those failures was... out-of-spec nose-pieces !!
Was that problem predictable / avoidable? The data was "Normal". Using the "Normal" K-table (on the next slide)... Observed K = ( 5.04 – 2.50 ) / 1.33 = 1.92, which corresponds to < 90% reliability at 95% confidence.

92 Juran's QH: 2.355 is the K for 95% confidence of 90% reliability when the sample size is 10 and the population is Normally distributed. Because the observed K ( = 1.92, on the previous slide) is smaller than this value, we cannot claim 90% reliability at 95% confidence (that is, 1.92 is less than 2.355).
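The table value can also be reproduced from the noncentral t distribution. A hedged Python sketch (scipy assumed), using the numbers from the previous slide:

import math
from scipy.stats import nct, norm

data_mean, data_sd, n = 5.04, 1.33, 10
spec_limit = 2.50                        # "2.50 or greater"
confidence, reliability = 0.95, 0.90

observed_k = (data_mean - spec_limit) / data_sd
table_k = nct.ppf(confidence, df=n - 1, nc=norm.ppf(reliability) * math.sqrt(n)) / math.sqrt(n)
print(round(observed_k, 2), round(table_k, 3))   # about 1.91 and 2.355
# observed_k is smaller than table_k, so 90% reliability cannot be claimed at 95% confidence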

93 What this seminar proposes is this:
If we use sampling plans at Incoming QC to assess the quality of the purchased product, then traditional AQL attribute sampling plans are inadequate because...
they do not tell us the % of incoming product that meets product quality specifications; and therefore...
they do not guarantee that our formal risk-management statements / requirements have been met.
What AQL attribute sampling plans do provide is evidence that a given lot of product meets the requirements of the sampling plan, NOT that the given lot meets product quality requirements. In these days of ubiquitous access to computers and computer programs that can easily perform reliability calculations, we should, instead of AQL plans, use %Reliability + %Confidence specifications.

94 What this seminar proposes is that we all start using a "New" kind of QC specification:
Product / Design specification / Old QC Spec / New QC Spec
Sterile-barrier pouch / 1 psi minimum burst pressure / 0.65% AQL / 99% reliable at 95% confidence
Injection-molded part / inches long / 1% / 97% reliable at 95% confidence
Label / text color same as master copy / 4% AQL / 90% reliable at 95% confidence

95 Misleading advice from GHTF (Process Validation Guidance, GHTF/SG3/N99-10:2004, Edition 2)
A 1% AQL sampling plan would NOT support that statement! A lot size of 300 would need an AQL ≈ 0.06%, and a sample size of 189, C = 0 (as determined using StatGraphics-XV).

96 STATISTICAL PROCESS CONTROL ( SPC )

97 BACKGROUND on SPC
"Statistical Process Control" was invented in the 1920s by an engineer working at Bell Labs; his name was Walter Shewhart. His goal was to increase the thru-put & reduce the scrap rate at the nearby telephone manufacturing plant, in order to meet the huge demand for telephones (which were new, hi-tech gadgets in those days).
During June–August 1950, an American named W. Edwards Deming trained hundreds of Japanese engineers, managers, and scholars in SPC, a tool that became the foundation for Japan's success in becoming the world leader in product quality. In gratitude, Japan awards the annual "Deming Prize" to companies and individuals who have made major contributions to the advancement of quality. The awards ceremony was (as of 2011) still shown every year in Japan on national television!

98 "Quality" -- How can it be defined?
Which has higher quality: Rolls Royce or Ford Focus? Which are higher quality: paper clips or binder clips? Does a packet of "Sweet'n Low" have higher quality if it has more or less than the targeted 1.00 gram? What makes for higher quality of a "Sweet'n Low" packet? Lot to lot conformance (on average) to design/QC specs. ( This is sometimes called being “on target”.) Similarity, packet to packet to packet (within a lot). ( This is sometimes called having “minimum variation”.) The modern, practical definition of “quality” is: “on target with minimum variation”.

99 Basic SPC lingo:
Term / Meaning
Xbar / same as a mathematical "Average"
Range / an indicator of how variable a process is, as shown by within-lot variability
Sigma / same as a Standard Deviation
Sigma X / std dev of the raw data
Sigma Xbar / std dev of the sample averages (a "standard error")
Sigma R / std dev of the sample ranges (a "standard error")

100 Every process has some variation:
"Common" causes of variation appear routinely & randomly (e.g., the variation in results of honest dice or coin tossing); to start an investigation about such normal variation is to waste resources, because a common-cause change is a "false alarm" (as Shewhart called it). The AIAG reference manual calls them "the many sources of variation that consistently act on the process." Think of common causes as "background noise". "Special" causes of variation appear unexpectedly and definitely not randomly; you get a lot of "bang for your buck" when you try to identify and eliminate the cause of "special" variation, because there is "real change" & "the cause is findable" (as Shewhart described it). Special causes act inconsistently on the process Think of a special cause as a "signal " of opportunity.

101 What does SPC do ?? Statistical Process Control is poorly named. It really doesn't "control" anything. It should be known as "Statistical Process Monitoring". All SPC does is MONITOR a process for times of unexpected variation, which indicates to you when your company might benefit by spending resources to discover the cause of the unexpected variation (that is, to determine the identity of "special causes"). If SPC charts are used simply to decide when to adjust the process up or down, you sabotage your process!!

102 What does SPC do ?? SPC "Control Charts" help you to identify when a process is "out of control". By definition, a process is "out of control" when it has been affected by a "special cause" of variation and therefore is not predictable. If the special cause can be identified and eliminated, the process will return to "control" (i.e., to become "in-control”) and therefore will probably be less variable (= more predictable) in the future, since if there is no special cause acting on the process, then the only cause of variation is "common cause".

103 IN CONTROL The shape and location of the distribution in the next lot ("Day 5") is predictable, and so the situation is "in control".

104 IN CONTROL
[Chart annotations: "Way out of Spec, low !!"; Lower Spec; Target.]
The shape and location of the distribution in the next lot ("Day 5") is predictable, and so the situation is "in control", even tho half of the product is out of spec!! Product that is "in control" is not necessarily "in spec".

105 OUT OF CONTROL The shape and/or location of the distribution of the next lot ("Day 5") is NOT predictable, and so the situation is "out of control".

106 OUT OF CONTROL
[Chart annotations: Lower Spec; Upper Spec.]
The shape and/or location of the distribution of the next lot ("Day 5") is NOT predictable, and so the situation is "out of control", even tho all the product is in spec. "Out of control" product is not necessarily "out of spec".

107 Basic types of Control Charts
Variables data ( = measurements ) are charted onto "XbarR" or "XbarS" or "XmR" control charts. Count data ( = 1, 2, 3, ...) are charted onto either "P" (or "NP") or "U" (or "C") control charts. Today, we'll discuss only XbarR charts.

108 Variables data entry ( n > 1 )

109 XbarR Control Chart: a Variables Data, Statistical Process Control (SPC) Chart
The upper chart shows the "between-sample variation", i.e., variation from one sample Average ( = Xbar) to the next. The lower chart shows the "within-sample variation", i.e., variation from one sample Range ( = R) to the next (if the Std Deviation is plotted there instead, the result is an XbarS chart).

110 Variables Data, Statistical Process Control (SPC) Chart
UCL (upper control limit) for Averages; LCL (lower control limit) for Averages; UCL (upper control limit) for Ranges; LCL (lower control limit) for Ranges.

111 Variables Data, Statistical Process Control (SPC) Chart
The average of the "current process" is drawn as the "midline". Data representing the "current process" are marked here with boxes; only these are used to calculate the limits & midlines. Notice that we didn't use some lots.

112 Variables Data, Statistical Process Control (SPC) Chart
[Chart annotations: an "out of control" point and series, and points that are NOT out of control -- we'll talk about these later.]

113 "Control Chart" per GHTF (Process Validation Guidance, GHTF/SG3/N99-10:2004, Edition 2)
The GHTF would have been much more helpful had they added the word "sample" just before the words "average or range".

114 "Control Chart" per GHTF ( FDA approved !! ) (Process Validation Guidance, GHTF/SG3/N99-10:2004, Edition 2)
This is very definitely NOT an SPC Control Chart (the chairman of SG3 agreed, in an e-mail to J. Zorich, in 2012).

115 (QC) Specification Limits
What is the difference between (QC) Specification Limits (a.k.a. "spec limits") and (SPC) Control Limits?

116 Spec Limits vs. Control Limits
Specification Limits ( USL & LSL ) are design or QC requirements; if the product is not within the Spec Limits, it is considered to be "bad" or "defective" product.
Control Limits ( UCL & LCL ) are boundaries inside of which you can expect to see almost 100% of the Sample Averages & Ranges of the current process, assuming it is "in control". The control limits are a graphical indication of what your current process can do. In effect, control limits are set at +/– 3 standard errors of the averages & ranges of the samples.

117 Control Limit Calculations
How to calculate the upper and lower control limits for the various types of basic charts is shown on the following slides. However, the data to use in the calculation is to be taken from the lots (batches) that YOU choose. You might choose to use all the data you have, or use just the first 30 or 50 lots, or use lots 73 thru 129. You should choose the lots that are "relevant" to the current process. That is, lots (i.e., data) that represent the current production process. For example....

118 The lots marked with squares were used to calculate the control limits.
In effect, the control limits are set at +/– 3 std errors from the midline (i.e., the distance between the control limits is 6 standard errors wide), calculated indirectly, using tables. Control limits on the "Averages" chart are equivalent to +/– 3 std errors of the mean. Control limits on the "Ranges" chart are equivalent to +/– 3 std errors of the range.

119 XbarR Chart, (n = 2 or more)
For sample averages:
UCL = AvgAvg + ( A2 x AvgRange )
LCL = AvgAvg – ( A2 x AvgRange )
where AvgAvg = the average of all chosen measurements
For sample ranges:
UCL = AvgRange x D4
LCL = AvgRange x D3
where AvgRange = the average of all chosen ranges
"Factors" A2, D3, and D4 come from the table on the next slide...

120 An error in the textbook !! "n" is the size of a single sample, not the sequence # of the sample, nor the # of samples !! [This is a scanned image of the factor table from "Understanding Statistical Process Control" (2nd ed.) by Wheeler & Chambers.]

121 Class exercise (XbarR chart):
Where should you draw the upper and lower control limits for Averages and Ranges, if...
# of data pts per Sample Avg = 9
# of Sample Avgs = 7
Average of all 7 Avgs = 100
Average of all 7 Ranges = 10
Answer (using the factors from the table, for n = 9: A2 = 0.337, D3 = 0.184, D4 = 1.816):
UCLavg = 100 + (10 x 0.337) = 103.37
LCLavg = 100 – (10 x 0.337) = 96.63
UCLrange = 10 x 1.816 = 18.16
LCLrange = 10 x 0.184 = 1.84
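The same arithmetic as a short Python sketch (the A2, D3, D4 values are the standard table factors for a subgroup size of 9):

a2, d3, d4 = 0.337, 0.184, 1.816    # table factors for n = 9

avg_of_avgs = 100.0
avg_range = 10.0

ucl_avg = avg_of_avgs + a2 * avg_range
lcl_avg = avg_of_avgs - a2 * avg_range
ucl_range = d4 * avg_range
lcl_range = d3 * avg_range
print(round(ucl_avg, 2), round(lcl_avg, 2), round(ucl_range, 2), round(lcl_range, 2))
# 103.37, 96.63, 18.16, 1.84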

122 The lots circled in blue were used to calculate the control limits.
Lots circled in blue are shown in the histogram and were used to calculate the control limits. LCL(avg) = 6.5; UCL(avg) = 13.5.

123 The lots circled in blue were used to calculate the control limits.
Lots circled in blue are shown in the histogram and were used to calculate the control limits. LCL(avg) = 7.5; UCL(avg) = 12.5.

124 The lots circled in blue were used to calculate the control limits.
Lots circled in blue are shown in the histogram and were used to calculate the control limits. LCL(avg) = 9.5; UCL(avg) = 10.5.

125 The purpose of SPC is to help processes move from left to right!
Because the variation in both Averages and Ranges has been greatly reduced, the process on the far right is making higher quality product than the process on the far left, regardless of what % of product is made "in spec". The goal of an SPC program is to make product that is more "on target" with "minimum variation".

126 How is "Out of Control" detected?
OUT OF CONTROL = any data point or set of points (on the control chart) that would have little likelihood of occurring by chance alone, assuming the data is normally distributed ("out of control" = "special cause" is present). YOU decide what "little likelihood" means. Over 80 years ago, that meant a probability of 1 in 20, whereas current preference is 1 in 370. Whenever you see "out of control" on a control chart, you should investigate the reason, to try to determine root cause (i.e., to identify and react to "special cause"); otherwise, you are wasting your company's time by making SPC charts.
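The "1 in 370" figure is simply the two-tailed probability that a normally distributed point falls more than 3 standard errors from its own average; a quick Python sketch (the norm_cdf helper is this sketch's own, built on the standard-library error function):

import math

def norm_cdf(z):
    # standard-normal cumulative probability, via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_outside = 2.0 * (1.0 - norm_cdf(3.0))   # probability of falling beyond +/- 3 std errors
print(p_outside)          # ~0.0027
print(1.0 / p_outside)    # ~370, i.e., "1 in 370"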

127 Red-circled points indicate "out of control" situations
[Chart annotations: out-of-control "trend"; out-of-control point; NOT out of control; out-of-control "series"]

128 "Rules" for detecting "out of control" (all taken from SPC textbooks)
Rule (probability of occurring by chance, assuming no "special cause" is acting):
1 point outside either control limit : 1 in 370
9 in sequence on one side of midline : 1 in 256
9 in an ascending or descending trend : 1 in 256 (on avg)
8 in sequence on one side of midline : 1 in 128
10 of 11 on same side of midline : 1 in 102
12 of 14 on same side of midline : 1 in 105
14 of 17 on same side of midline : 1 in 117
16 of 20 on same side of midline : 1 in 135
many, many others !! : 1 in 100 to 400 !!!
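As an illustration of how the first two rules might be automated, here is a small Python sketch; the function name and its exact behavior are this sketch's own choices, not a standard routine from any SPC package.

def out_of_control_flags(points, ucl, lcl, midline, run_len=9):
    # Rule 1: a point outside either control limit.
    # Rule 2: run_len consecutive points on the same side of the midline.
    flags = []
    for i, x in enumerate(points):
        if x > ucl or x < lcl:
            flags.append((i, "outside a control limit"))
        if i + 1 >= run_len:
            window = points[i + 1 - run_len : i + 1]
            if all(p > midline for p in window) or all(p < midline for p in window):
                flags.append((i, f"{run_len} in sequence on one side of midline"))
    return flags

# Example: a run of 9 consecutive points above the midline triggers rule 2 at index 10.
data = [10.1, 9.8, 10.3, 10.2, 10.4, 10.1, 10.3, 10.5, 10.2, 10.4, 10.6]
print(out_of_control_flags(data, ucl=10.9, lcl=9.1, midline=10.0))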

129 Reasons not to have too many rules...
USING THE ROLL OF 4 DICE AS AN EXAMPLE: The chance of not getting a 6 on any of the 4 dice is 5/6 x 5/6 x 5/6 x 5/6 = approximately 50 %; therefore, about 50% of the time, a toss of 4 dice will have a 6 showing on at least 1 of the dice.
USING ALL THE RULES ON THE PREVIOUS SLIDE: 369/370 x 255/256 x 255/256 x 127/128 x 101/102 x 104/105 x 116/117 x 134/135 = approximately 95 %; therefore, about 5 % of the time ( = 1 out of every 20 times), a point will be called "out of control" even though it is the result of random variation ( = "common cause"; that is, NOT "special cause"). That may be too frequent for the boss's taste !!!
USING ONLY (the first) 3 RULES: 369/370 x 255/256 x 255/256 = approximately 99 %, which means only 1%, or about 1 in a hundred times, will a "false alarm" be triggered by chance (is that more acceptable to your boss ??). That % is recommended in the Handbook of Statistical Methods in Manufacturing (R. B. Clements, 1991).
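The same arithmetic as a short Python sketch; like the slide, it treats the rules as if they were statistically independent, which is a simplification (in practice the run-based rules are correlated).

# Dice analogy: chance that a throw of 4 dice shows at least one six.
p_no_six = (5/6) ** 4
print(1 - p_no_six)               # ~0.52, i.e., roughly half the time

def false_alarm_rate(odds_list):
    # odds_list holds the "1 in N" values for each rule in use
    p_no_alarm = 1.0
    for n in odds_list:
        p_no_alarm *= (n - 1) / n
    return 1.0 - p_no_alarm

all_rules   = [370, 256, 256, 128, 102, 105, 117, 135]
first_three = [370, 256, 256]
print(false_alarm_rate(all_rules))    # ~0.05  -> about 1 point in 20 falsely flagged
print(false_alarm_rate(first_three))  # ~0.01  -> about 1 point in 100 falsely flagged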

130 Random vs. Representative Sampling
The data on your SPC chart should faithfully represent the process you're trying to improve. To acquire representative samples, you could either…
Wait till the end of the production run, and then randomly choose a sample from the output of the entire run, OR...
Take one item per arbitrary time period (e.g., one part every hour, or one part after each 100 parts made, or ??) and combine all the collected parts as a "representative" sample.
If your boss makes you take only the first few parts from a day's run, don't argue -- it's better than having no SPC program !!

131 Rational Sub-grouping of Samples
Basic rule for good SPC charts: Plotted points must have NO KNOWN SYSTEMATIC SOURCE OF VARIATION between them, other than sequential production over time. For example: Manufacturing occurs on day, swing, and graveyard shifts. The average of a sample from each shift's production is plotted sequentially on the same SPC chart. IS THIS GOOD OR BAD (see next slide)?

132 Rational Sub-grouping
This is BAD !! [chart: averages from different shifts (e.g., Evening, Night) plotted on the same chart]

133 Rational Sub-grouping
This is GOOD !!

134 Rational Sub-grouping
Basic rule for good SPC charts: Plotted points must have NO KNOWN SYSTEMATIC SOURCE OF VARIATION between them, other than sequential production over time. If you ignore that rule, you may miss chances for valuable investigations of "special cause" incidents, or you may waste time investigating "common cause" effects. Corollary: If your production process has not yet been standardized (equipment, procedures, raw materials, etc.), then it's too early for SPC. CONTROVERSY: This instructor agrees with authors who state that SPC can be initiated even if the process is out-of-control from the start.

135 Sample Size = n n = quantity of product chosen for each sample that is plotted as a single point on a control chart (that is, "n" is the "sample size" within each single point on the chart). Historically, sample size has been a choice based on ease of calculation (but with computers or calculators, this reason is not important). Theoretically, n can be any value (n = 1 or higher). However, if n = 1, then you cannot evaluate "within sample" variation !! Practically, if your raw data is not "normal", you should start with a large sample (e.g., n = 10) in order to take advantage of the Central Limit Theorem ( = avgs of "large" samples tend to be normally distributed). To ensure you have the best chance for improving the process, have n = 7 or more.
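A quick simulation sketch of that Central Limit Theorem effect, using made-up exponentially distributed (badly skewed) data rather than real process data; the skewness helper is this sketch's own.

import random
import statistics

def skewness(xs):
    # simple (population) skewness: 0 for symmetric data, ~2 for exponential data
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

random.seed(1)
raw  = [random.expovariate(1.0) for _ in range(5000)]                     # skewed individual values
avgs = [statistics.mean(random.choices(raw, k=10)) for _ in range(500)]   # averages of samples of n = 10

print(skewness(raw))    # strongly skewed (far from 0)
print(skewness(avgs))   # much closer to 0, i.e., much closer to "normal"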

136 Harold F. Dodge: Worked in the quality assurance department at Bell Laboratories from 1917 to 1958.
Of his early experiences Dodge wrote: "There have been several things of special interest in my work in this field over the years. It all goes back to the beginnings of statistical quality control in Our work in cooperation with shop engineers was influenced heavily by great pressures to save money and to make the quality control methods simple and easy to use. Initially, the basic procedures for variables called for samples of four, with one chart for the average, and another for the standard deviation. Shop reaction was prompt against anything as complicated as computing the standard deviation. After some study we proposed the use of the range, R. On top of that we proposed shop use of samples of five instead of four; it is easier to divide by five than by four. These simplifying steps quickly became the basis for shop practice." That text is from:

137 HOW DOES AN SPC PROGRAM WORK?
1. Identify which important or troublesome steps in the QC, assembly, or manufacturing process are to be in the SPC Program, and set up a separate SPC control chart to record data on each such step.
2. Monitor the control charts, looking for out-of-control points or out-of-control series or trends.
3. Investigate "out of control" situations, to discover their cause. This may require not only technical studies of product, process, or equipment, but also interviews with assemblers and production personnel, and possibly even "brainstorming" sessions or other such techniques involving personnel from all relevant departments --- in other words, this will consume a lot of time !!
4. Devise + implement changes to product, process, documents, equipment, personnel, or environment, as needed to eliminate or reduce the effect of identified causes of variation.

138 HOW DOES AN SPC PROGRAM WORK?
5. Recalculate (or manually re-set) the SPC control limits whenever the variation between lots or within lots has been significantly reduced. This is done so that, assuming the process is in control, about 1 in 370 data-points occurs outside the control limits of the now less variable process !!
6. Go back to step 2 above !!
7. Continue these cycles as long as the reduction in variation is worth the expense ( $$$ ) of the SPC effort ( = time & other resources ).
CONTROVERSY: Some textbooks state that SPC should not be initiated until a process is "in control". The instructor agrees with other authors (e.g., Wheeler and Chambers) who advise using SPC for any process, even if it is currently "out of control", because SPC can then be used to help improve the process (that is, to help get it "in control").

139 Capability Indices

140 Capability Indices This topic typically applies only to "variables" data. Its application to "count" data is not discussed here. Because these indices have no "confidence" statement associated with them, the sample size chosen is irrelevant (except for the fact that the larger the sample size, the closer your result is likely to be to the "true" answer, as is concluded from the "Law of Large Numbers"). Because of the lack of a confidence statement, John Zorich prefers to use "confidence/reliability" statements (as were taught during Day 2 of this workshop) rather than these Capability Indices; but that is a personal choice.

141 CAPABILITY INDICES In the following slides...
n = Sample Size used to calculate each dot on the SPC chart
USL = Upper (QC) Specification Limit
LSL = Lower (QC) Specification Limit
NSL = Nearest Spec Limit (i.e., whichever of the USL or LSL is nearest to the process average, i.e., nearest to the midline of the Xbar chart)
UCL = Upper Control (chart) Limit of Avgs
LCL = Lower Control (chart) Limit of Avgs
"Sigma X" = standard deviation of the raw data (or of "transformed" data)

142 Capability Indices assume that the raw data is "Normally Distributed".
[Diagram of a normal curve marked off in units of "Sigma X"] In a "normal" distribution, virtually all of the raw data (99.73% of it) is in a range that is 6 "Sigma X" long ( = Avg +/– 3 "Sigma X").

143 6 times "Sigma X" = "6Sigma" To calculate a Capability Index, you need to know how large the value "6Sigma" is. 6Sigma can be calculated 2 ways:
INDIRECT METHOD: 6Sigma = ( UCL – LCL ) x Sqrt( n ) (where n = sample size used to calculate each dot on the SPC chart)
DIRECT METHOD: 6Sigma = 6 x Standard Deviation of the raw data (e.g., using Excel's "=stdev" function)

144 = ( USL – LSL ) / (6Sigma [indirect])
Capability Indices Cp = ratio of the width of the specification limits to the SPC-estimated width of the range that encompasses 99.7% of the product population:
Cp = ( USL – LSL ) / (6Sigma [indirect])
The larger Cp is, the better, because large numbers indicate that a large % of the product might lie within the Upper and Lower spec limits ( = USL & LSL), that is, a large % might pass QC. NOTE: If you use "6Sigma [direct]" instead of "6Sigma [indirect]", you're actually calculating Pp, not Cp.

145 Capability Indices Cp This is useful only if the average data value is currently near the specification target. If the average data value is not near the target spec, then this gives a false indication of % in-spec. Cp is used mostly to indicate what % of product might be in-spec, IF the average data value were near the specification target (that is, if the midline of the control chart was identical to the mid-point of the specification range).

146 What % is "in spec"?
Assuming that the data is distributed "normally" and the process is centered on the QC Specs, Cp and Pp indicate the following:
Cp or Pp Value ( = how many "Sigma X" each spec limit is away from the average) : % Product within Specification
0.33 ( = 1 "Sigma X" away) : 67.8 %
0.67 ( = 2 "Sigma X") : 95.6 %
1.00 ( = 3 "Sigma X") : 99.7 %
1.33 ( = 4 "Sigma X") : 99.99 %
1.67 ( = 5 "Sigma X") : 99.99994 %
2.00 ( = 2 x 3 "Sigma X") : 99.9999998 % = "Six Sigma" (slang for "Six Sigma X")
"Capable Process" is often defined as Cp = 1.33 or larger.
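Those percentages follow directly from the normal distribution: for a centered process, the spec limits sit 3 x Cp "Sigma X" either side of the average. A minimal Python sketch (the function name pct_within_spec is this sketch's own):

import math

def pct_within_spec(cp):
    # fraction within spec for a centered, normally distributed process
    z = 3.0 * cp                           # distance to each spec limit, in "Sigma X" units
    return math.erf(z / math.sqrt(2.0))    # equals 2*Phi(z) - 1

for cp in (0.33, 0.67, 1.00, 1.33, 1.67, 2.00):
    print(cp, 100 * pct_within_spec(cp))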

147 = 2 x |( NSL – AvgValue )| / (6Sigma [indirect])
Capability Indices Cpk = 2 times the distance (as a positive number) that the AVG (data) VALUE is from the nearest spec limit, divided by "6 x SigmaX [indirect]":
Cpk = 2 x |( NSL – AvgValue )| / (6Sigma [indirect])
The larger Cpk is, the better, because large numbers indicate that a large % of the product does pass QC. NOTE: If you use "6Sigma [direct]" instead of "6Sigma [indirect]", you're actually calculating Ppk, not Cpk.

148 Capability Indices Cpk is useful whether or not the avg data value is currently near the specification target; i.e., even if the average data value is not near the target spec, Cpk still gives good indications of % in-spec. That % in-spec would be higher IF the average data value were nearer the spec target (Cp gives an indication of that higher %). Express Cpk as a negative number only if the "Avg Value" is outside the spec limits. Most companies claim to be calculating Cp & Cpk, but an examination of their formulas reveals that they are really calculating Pp & Ppk !!!
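A minimal sketch of both calculations in Python; the function names pp_ppk and cp_cpk are this sketch's own, the "indirect" version matches the Cp/Cpk formulas above, and the "direct" version is what these slides call Pp/Ppk. Using the signed distance to the nearest spec limit also reproduces the negative-Cpk convention for an average that sits outside the spec limits.

import statistics

def pp_ppk(raw_data, lsl, usl):
    # "direct" method: 6Sigma from the standard deviation of the raw data
    avg = statistics.mean(raw_data)
    six_sigma = 6 * statistics.stdev(raw_data)
    pp  = (usl - lsl) / six_sigma
    ppk = 2 * min(usl - avg, avg - lsl) / six_sigma   # signed distance to the nearest spec limit
    return pp, ppk

def cp_cpk(avg, ucl_avg, lcl_avg, n, lsl, usl):
    # "indirect" method: 6Sigma estimated from the control limits of the averages
    six_sigma = (ucl_avg - lcl_avg) * n ** 0.5
    cp  = (usl - lsl) / six_sigma
    cpk = 2 * min(usl - avg, avg - lsl) / six_sigma
    return cp, cpk

# Example with made-up numbers: off-center specs make Cpk smaller than Cp.
print(cp_cpk(avg=100, ucl_avg=105, lcl_avg=95, n=9, lsl=80, usl=130))   # (1.67, 1.33)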

149 Classroom exercise Calculate the Cp, based upon this data, using the equations given on the previous slides...
9 = n = Sample Size
130 = USL = Upper (QC) Specification Limit
70 = LSL = Lower (QC) Specification Limit
105 = UCL = Upper Control (chart) Limit of Avgs
95 = LCL = Lower Control (chart) Limit of Avgs
Answer: Cp = (USL – LSL) / (6Sigma [indirect])
(USL – LSL) = 130 – 70 = 60
"6Sigma [indirect]" = ( 105 – 95 ) x sqrt( 9 ) = 10 x 3 = 30
Cp = 60 / 30 = 2.00
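The same classroom answer, verified with a few lines of Python:

n, usl, lsl, ucl, lcl = 9, 130, 70, 105, 95
six_sigma_indirect = (ucl - lcl) * n ** 0.5   # (105 - 95) x sqrt(9) = 30
cp = (usl - lsl) / six_sigma_indirect         # 60 / 30
print(cp)                                     # 2.0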

150 What % is "in spec" For a given set of data, Cpk and Ppk values are always smaller than or equal to Cp & Pp, respectively (they can never be larger). Cpk & Ppk give a picture of the actual current situation; whereas Cp & Pp give a picture of what could be, IF the process were centered on the specification target. Cp and Pp cannot be calculated if there is only a one-sided spec, but Cpk and Ppk can. The % associated with a given Cpk or Ppk value depends on what the specs are. See STUDENT file "CpCpkPpPpk Percent In-spec"
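To see why the % in-spec associated with a given Cpk depends on where both spec limits sit, here is a small Python sketch comparing two processes with the same Cpk ( = 1.0) but different far-side spec limits; the numbers are made up for illustration.

import math

def fraction_out_of_spec(avg, sigma, lsl, usl):
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard-normal CDF
    return phi((lsl - avg) / sigma) + (1.0 - phi((usl - avg) / sigma))

# Both processes: avg = 100, sigma = 1, nearest spec limit 3 sigma away, so Cpk = 1.0;
# only the far-side spec limit differs.
print(fraction_out_of_spec(100, 1, lsl=97, usl=103))   # ~0.0027 (both limits 3 sigma away)
print(fraction_out_of_spec(100, 1, lsl=97, usl=110))   # ~0.0013 (far limit 10 sigma away)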

151 Cp = Cpk

152 Cp Cpk

153 Cp = Cpk

154 If QC spec limits are here...
[Histogram of raw data, with two possible placements of the QC spec limits: the wider placement gives Cpk = 2.00, the tighter placement gives Cpk = 1.00]

155 Histograms of raw data BEFORE and AFTER process improvements
It is obvious that... a Cpk of 2.00 is much better than a Cpk of 1.00 !! [Two histograms against the same QC spec limits: BEFORE process improvements the data is wide and Cpk = 1.00; AFTER process improvements the data is narrower and Cpk = 2.00. SPC helps to get you from the one to the other.]

156 Non-normal Data (not transformed)
Raw data: Cpk = 0.97; ≈ 3 parts per 2000 are predicted to be out-of-spec. Spec limits = 0.07 to 0.15.

157 Non-normal Data, TRANSFORMED
Raw data transformed ( 1 / X ): Cpk = 1.09; ≈ 1 part in 2,000 is actually out-of-spec. Transformed Spec limits = 6.7 to 14.3. Therefore, you predicted 3 times more defective product (3/2000 vs. 1/2000) when you assumed (incorrectly) that the untransformed data was normally distributed.
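A sketch of the mechanics of that 1/X transformation in Python, using made-up skewed measurements rather than the data set shown in the slides; note that 1/X reverses the ordering, so the transformed spec limits swap sides. Because it uses the standard deviation of the data directly, the helper below is really a Ppk-style ("direct") calculation.

import statistics

def ppk_direct(data, lsl, usl):
    # "direct" capability index: distance to the nearest spec limit / (3 x stdev)
    avg, sd = statistics.mean(data), statistics.stdev(data)
    return min(usl - avg, avg - lsl) / (3 * sd)

raw = [0.080, 0.085, 0.090, 0.095, 0.100, 0.105, 0.110, 0.120, 0.130, 0.145]  # made-up values
lsl, usl = 0.07, 0.15

print(ppk_direct(raw, lsl, usl))              # index computed on the raw (skewed) scale

transformed = [1 / x for x in raw]            # 1/X transform of the data
t_lsl, t_usl = 1 / usl, 1 / lsl               # transformed spec limits: ~6.7 and ~14.3
print(ppk_direct(transformed, t_lsl, t_usl))  # index computed on the transformed scale

# Whichever scale is closer to normally distributed gives the more trustworthy
# index and the more trustworthy out-of-spec prediction.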

158 In summary for this course: How to implement what you've learned?
Be patient (no one wants to talk about statistics!!)
Gather data (make observations, calculations, and charts that you design to be convincing to your MANAGEMENT. Try to relate to something they consider important. In a "start-up" company, that might be "time to market" or a successful product launch. In an established firm, that might be scrap-rate or labor-savings, i.e., $$$$.)
Present your ideas at the right time in the right setting (a tense meeting about a product problem might not be the best time to talk about statistics --- it might be better to wait until after the meeting, and then tell your boss in private).


Download ppt "Applied Statistics for Advanced Applications"

Similar presentations


Ads by Google