Comparing Distributions I: DMAIC and Fisher’s Exact. By Peter Woolf, University of Michigan. Michigan Chemical Process Dynamics and Controls.


1 Comparing Distributions I: DMAIC and Fisher’s Exact. By Peter Woolf (pwoolf@umich.edu), University of Michigan. Michigan Chemical Process Dynamics and Controls Open Textbook, version 1.0. Creative Commons.

2 Scenario: You run a small plastic factory described in an earlier lecture. You have already developed the P&ID and control architecture, and parameterized your controllers. The system is running well most of the time, but not always. Generally you get a 30% yield, but if the yield is above 32% or below 28%, the batch can’t be sold. How do you tell if the system is out of control? What do you do if it is out of control? What strategies can you adopt to maintain tighter control?

3 DMAIC: Define, measure, analyze, improve, and control
Define: Goal is consistent yield
Measure: Measure yield
Analyze: Control charts, detective work
Improve and control: Change system and/or policies

4 How do you tell if the system is out of control? 1) Make some measurements

5 How do you tell if the system is out of control? 2) Construct a control chart. The process is statistically out of control because run 9 exceeds the upper control limit (UCL). Now what?
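A control-chart check of this kind can be sketched in a few lines of Python. The yield values below are hypothetical (they are not the data behind the slide's chart), and the limits are taken as the baseline mean plus or minus three sample standard deviations, which is one common convention and an assumption here rather than the rule used for the slide.

```python
from statistics import mean, stdev

def control_limits(baseline, n_sigma=3.0):
    """Return (LCL, center, UCL) estimated from in-control baseline runs."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - n_sigma * spread, center, center + n_sigma * spread

def out_of_control(values, lcl, ucl):
    """Return the 1-based run numbers whose value lies outside [lcl, ucl]."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]

# Hypothetical yields in percent; run 9 is deliberately high.
yields = [30.1, 29.8, 30.3, 29.9, 30.0, 30.2, 29.7, 30.0, 31.9, 30.1]

# Assumption: the first 8 runs represent normal, in-control operation.
lcl, center, ucl = control_limits(yields[:8])
flagged = out_of_control(yields, lcl, ucl)
print(f"LCL={lcl:.2f} center={center:.2f} UCL={ucl:.2f} flagged={flagged}")
```

Estimating the limits from a baseline period, rather than from all runs, keeps a suspect point from widening the very limits it is being tested against.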

6 What if you are out of control? Passive solution: Log it and do nothing. Wait for it to happen again before taking action. –Note the lost opportunity to improve the process, and the possible safety risk.

7 Semi-passive solutions:
Resample to make sure it is not an error. –Odd that this is not done when things are okay.
Adjust the calculated mean up or down to fit the new situation. –Treats the symptom, not the cause –Lost opportunity to learn about the process

8 What if you are out of control? Active solution: Look for a special cause and remove or enhance it. –Not all changes are bad; some may actually improve the process.

9 Look for a special cause
Possible sources of information:
1) Patterns in the data
2) Association with unmeasured events
3) Known physical effects
4) Operators
Field observation: “The feed for run 9 seemed unusually runny. Maybe that is the reason?”

10 Hypothesis: Runny feed causes the product to go out of our desirable range.
1) Gather data 2) Evaluate hypothesis 3) Make a model of the relationship
Data from 25 runs:
              Bad product   Good product
Runny feed         5             1
Normal feed        1            18
(1) Is this significant? (2) What causes the feed to be runny? (3) Can we develop strategies to cope with this?

11 Marginal results (sums on the side that count over one of the states)
              Bad product   Good product   totals
Runny feed         5             1            6
Normal feed        1            18           19
totals             6            19           25
Is this significant? --> What are the odds? There are 2 answers, depending on the question:
(1) What are the odds of choosing 25 random samples with this particular configuration?
(2) What are the odds of choosing 25 samples with these marginals in this configuration or more extreme?

12 What are the odds?
              Bad product   Good product   totals
Runny feed         5             1            6
Normal feed        1            18           19
totals             6            19           25
What are the odds of choosing 25 samples with these marginals in this configuration or more extreme?
Break down the problem: For the 6 bad products, what are the odds of 5 with runny feed and 1 with normal feed?
Restate as an urn problem (remove 6 balls from the urn): with 25 balls, of which 6 are white and 19 are black, what are the odds of drawing 6 balls of which 5 are white and 1 is black? Here the white balls stand for the 6 runny-feed runs, the black balls for the 19 normal-feed runs, and the 6 balls drawn for the 6 bad products.

13 Urn: remove 6 balls. Restate as an urn problem: with 25 balls, of which 6 are white and 19 are black, what are the odds of drawing 6 balls of which 5 are white and 1 is black?
Odds of this draw = (number of ways of choosing 5 out of 6 of the white balls) × (number of ways of choosing 1 out of 19 of the black balls) / (number of ways of choosing 6 out of 25 balls)
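Plugging the urn numbers into that ratio gives the worked arithmetic below. This is a sketch using the standard binomial coefficient, written as \binom{a}{b}; the notation is an assumption about how the slide's lost formula was typeset.

```latex
\[
P(\text{5 white, 1 black})
  = \frac{\binom{6}{5}\,\binom{19}{1}}{\binom{25}{6}}
  = \frac{6 \times 19}{177100}
  = \frac{114}{177100}
  \approx 6.4 \times 10^{-4}
\]
```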

14 Urn: remove 6 balls. Restate as an urn problem: with 25 balls, of which 6 are white and 19 are black, what are the odds of drawing 6 balls of which 5 are white and 1 is black?
Odds of this draw = (6 choose 5) × (19 choose 1) / (25 choose 6)
Hypergeometric distribution: the probability of sampling exactly k special items in a sample of n from an urn containing N items, of which m are special, is
P(k) = (m choose k) × (N−m choose n−k) / (N choose n)
where (a choose b) = a! / (b! (a−b)!) reads “a choose b”.
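The hypergeometric probability can be sketched as a small Python function. The function name and argument order are illustrative choices, not something taken from the slides; `math.comb` is the standard-library binomial coefficient (Python 3.8 or later).

```python
from math import comb

def hypergeom_pmf(k, n, m, N):
    """Probability of exactly k special items in a sample of n,
    drawn without replacement from N items of which m are special."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

# Urn from the slides: N=25 balls, m=6 white, draw n=6, want k=5 white.
p = hypergeom_pmf(5, 6, 6, 25)
print(f"{p:.5f}")  # 0.00064

# Sanity check: the probabilities over every possible k add up to 1.
total = sum(hypergeom_pmf(k, 6, 6, 25) for k in range(0, 7))
```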

15 What are the odds?
              Bad product   Good product   totals
Runny feed         5             1            6
Normal feed        1            18           19
totals             6            19           25
What are the odds of choosing 25 samples with these marginals in this configuration or more extreme?
Analogous arguments can be made for:
– 1 in 19 of the good products having runny feed
– 1 in 6 of the runny feed products being good products
– 1 in 19 of the normal feeds being bad product
The composite probability can be calculated using Fisher’s exact test.

16 Fisher’s exact is the probability of sampling a particular configuration of a 2 by 2 table with constrained marginals.
              Bad product   Good product   totals
Runny feed         a             b           a+b
Normal feed        c             d           c+d
totals            a+c           b+d        a+b+c+d
P_fisher = (a+b)! (c+d)! (a+c)! (b+d)! / ( (a+b+c+d)! a! b! c! d! )
Numerator: # of ways the marginals can be arranged. Denominator: # of ways the total can be arranged, times # of ways each observation can be arranged.
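The factorial form of Fisher’s exact probability translates directly into Python. This is a minimal sketch: the function name is hypothetical, and the cell labels a, b, c, d follow the usual 2 by 2 convention (first row a, b; second row c, d).

```python
from math import factorial

def fisher_table_probability(a, b, c, d):
    """Probability of one 2x2 table [[a, b], [c, d]] given fixed marginals."""
    marginals = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    total = factorial(a + b + c + d)
    cells = factorial(a) * factorial(b) * factorial(c) * factorial(d)
    # Exact integer arithmetic is kept until this single final division.
    return marginals / (total * cells)

# Observed table: runny feed (5 bad, 1 good), normal feed (1 bad, 18 good).
p_observed = fisher_table_probability(5, 1, 1, 18)
print(f"{p_observed:.5f}")  # 0.00064
```

For this table the factorials cancel down to 6 × 19 / 177100, the same value as the urn calculation.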

17 What are the odds of choosing 25 samples with these marginals in this configuration?
              Bad product   Good product   totals
Runny feed         5             1            6
Normal feed        1            18           19
totals             6            19           25
Evaluated in Mathematica, this gives P_fisher = 0.00064. But this is for this configuration alone! Is this one of many bad configurations?

18 What are the odds of choosing 25 samples with these marginals in this configuration?
              Bad product   Good product   totals
Runny feed         5             1            6
Normal feed        1            18           19
totals             6            19           25
A probability estimate at a particular value is different from an estimate at that value or further, that is, at that value or more extreme values. The latter is a one-tailed test.

19 What are the odds of choosing 25 samples with these marginals in this configuration or more extreme?
Observed configuration (P_fisher = 0.00064):
              Bad product   Good product   totals
Runny feed         5             1            6
Normal feed        1            18           19
totals             6            19           25
A more extreme case with the same marginals (P_fisher = 0.0000056):
              Bad product   Good product   totals
Runny feed         6             0            6
Normal feed        0            19           19
totals             6            19           25
P-value = 0.00064 + 0.0000056 ≈ 0.00065
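Summing over the observed table and every more extreme table with the same marginals can be sketched as follows. The function names are hypothetical, and "more extreme" is taken here to mean a larger count in the runny-feed/bad-product cell, which matches the single extra table shown on the slide.

```python
from math import comb

def table_probability(a, row1, col1, n):
    """Hypergeometric probability of count a in the top-left cell, with the
    first row total, first column total, and grand total held fixed."""
    return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)

def one_tailed_p(a, row1, col1, n):
    """Sum the probabilities of the observed table and of every table
    with a larger top-left count (the more extreme direction here)."""
    a_max = min(row1, col1)
    return sum(table_probability(x, row1, col1, n) for x in range(a, a_max + 1))

# Observed: 5 runs with runny feed and bad product;
# 6 runny-feed runs, 6 bad products, 25 runs in total.
p_value = one_tailed_p(5, 6, 6, 25)
print(f"{p_value:.5f}")  # 0.00065
```

If SciPy is available, `scipy.stats.fisher_exact([[5, 1], [1, 18]], alternative="greater")` should report the same one-sided p-value.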

20 P-values
A p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis itself is true.
Null hypothesis: The most common interpretation is a completely random event, sometimes with constraints.
Examples of null hypotheses:
– Runny feed has no impact on product quality
– Points on a control chart are all drawn from the same distribution
– Two shipments of feed are statistically the same
Often p-values are considered significant if they are less than 0.05 or 0.001, but this limit is not guaranteed to be appropriate in all cases.

21 Look for a special cause
1) Data:
              Bad product   Good product   totals
Runny feed         5             1            6
Normal feed        1            18           19
totals             6            19           25
2) Analysis: p-value ≈ 0.00065 < 0.05
3) Conclusion: runny feed significantly impacts product quality
Note: Runny feed is not the whole story, as we sometimes get good product from runny feed and bad product from normal feed.

22 Look for a special cause
3) Conclusion: runny feed is likely to impact product quality
What next?
– Look for root causes: What causes runny feed? Supplier? Temperature? Storage conditions? Lot number? Storage time? (Very process dependent.)
– Develop a method to detect runny feed before it goes into the process

23 Take Home Messages
– After you identify that a system is out of control, take appropriate action
– Associations between variables can be identified using Fisher’s exact test and its associated p-value
– Once the cause of a disturbance is found, find a way to eliminate it

