EMIS 7300 SYSTEMS ANALYSIS METHODS FALL 2005


1 EMIS 7300 SYSTEMS ANALYSIS METHODS FALL 2005
Dr. John Lipp Copyright © Dr. John Lipp

2 Course Outline
Part 1: Rank (Order) and Non-Parametric Statistics. Part 2: Statistical Process Control. Part 3: Reliability. Mid-term Exam. Copyright © 2003-2005 Dr. John Lipp

3 Today's Topics
Empirical Cumulative Distribution Function. Rank Transform. Sign Test. Tukey's Two-Sample Quick Test. Circular Error Probability Confidence Interval.

4 Non-Parametric and Rank Statistics
Non-parametric statistical procedures are designed without the use of the underlying data distribution and its parameters. The only assumption is that the data samples are statistically independent and come from the same distribution. Such procedures are also known as distribution-free. Hypothesis tests and confidence intervals are on the median, quartiles, percentiles, or other quantiles.

5 Empirical Cumulative Distribution Function
The sample CDF or empirical CDF is defined by $\hat{F}(x) = \frac{1}{n}\,\#\{x_i \le x\}$, the fraction of the n samples that are less than or equal to x. This is equivalent to sorting the data, yi = sort(xi), and plotting yi vs. i/n as a stair-step function.
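A minimal Python sketch of the stair-step construction described above. The function name `ecdf` and the sample values are illustrative assumptions, not part of the course material; the sketch takes the sample CDF at x to be the fraction of samples less than or equal to x.

```python
from bisect import bisect_right

def ecdf(data):
    """Return (y, F): the sorted data and a function F(x) giving the sample CDF."""
    y = sorted(data)                      # y_i = sort(x_i)
    n = len(y)

    def F(x):
        # Fraction of samples <= x: the height of the stair-step at x.
        return bisect_right(y, x) / n

    return y, F

# Illustrative data (assumed for this sketch only).
y, F = ecdf([3.1, 1.2, 2.5, 4.8])
steps = [(yi, (i + 1) / len(y)) for i, yi in enumerate(y)]  # points (y_i, i/n)
```

Plotting `steps` as a staircase gives the empirical CDF; `F` evaluates it at any x.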

6 Empirical Cumulative Distribution Function (cont.)
The sample CDF is an unbiased estimator, $E[\hat{F}(x)] = F(x)$. The variance is given by $\mathrm{Var}[\hat{F}(x)] = F(x)\,(1 - F(x))/n$. The calculations of the expected values are actually easy: the count #{xi ≤ x} has a binomial distribution with p = F(x)!

7 Empirical Cumulative Distribution Function (cont.)
Print / make transparency

8 Rank Transform
The rank transform simply replaces the data with the data's ranks from sorting in ascending order. Using the ranks often simplifies calculations: Rank Sample Mean (unaffected by ties). Rank Sample Standard Deviation (affected by ties).
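A small Python sketch of the rank transform may help here. It assumes tied values share the average of the ranks they occupy (midranks) and that the sample standard deviation uses the n - 1 divisor; both are assumptions of this sketch, as are the function names and the tied data set. Under those assumptions the rank mean is always (n + 1)/2, and without ties the rank standard deviation is sqrt(n(n + 1)/12).

```python
def rank_transform(data):
    """Replace each value by its ascending rank; ties share the average rank."""
    order = sorted(range(len(data)), key=lambda i: data[i])
    ranks = [0.0] * len(data)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and data[order[end + 1]] == data[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1          # average of 1-based ranks pos+1..end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks

def mean(v):
    return sum(v) / len(v)

def sample_std(v):
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5

no_ties = rank_transform([13.5, 9.8, 11.4, 12.2])   # ranks 4, 1, 2, 3
ties = rank_transform([5.0, 7.0, 5.0, 9.0])         # ranks 1.5, 3, 1.5, 4
```

Both rank vectors have mean 2.5, while the tied sample has the smaller standard deviation, which is the sense in which ties affect one statistic and not the other.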

9 Sign Test
Consider a hypothesis test on the median of a data set {xi}, H0: median = C. The test is performed by subtracting C from the data {xi} and taking the sign, {si} = {sign(xi – C)}. The number of “+” and “–” values of {si} are counted, denoted r+ and r–, respectively. Example with C = 10:
xi      13.5   9.8  11.4  12.2   7.9   8.6   9.1  10.6  11.3  10.1
xi-10    3.5  -0.2   1.4   2.2  -2.1  -1.4  -0.9   0.6   1.3   0.1
si         +     –     +     +     –     –     –     +     +     +
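The counting step can be sketched in Python using the data on this slide with C = 10. The function name `sign_counts` is an assumption, and so is the handling of values exactly equal to C (recorded as "0" and left out of both counts); the slide does not say how such values are treated.

```python
def sign_counts(data, c):
    """Return (signs, r_plus, r_minus) for the sign test of H0: median = c."""
    signs = []
    for x in data:
        if x > c:
            signs.append("+")
        elif x < c:
            signs.append("-")
        else:
            signs.append("0")   # equal to c; counted in neither tally (assumption)
    return signs, signs.count("+"), signs.count("-")

x = [13.5, 9.8, 11.4, 12.2, 7.9, 8.6, 9.1, 10.6, 11.3, 10.1]
signs, r_plus, r_minus = sign_counts(x, 10)
```

For this data the sketch gives r+ = 6 and r– = 4.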

10 Sign Test (cont.)
What is the distribution of r+? r+ is a discrete random variable. r+ can be thought of as the count of successful “+”s in {si}. Under the null hypothesis this success rate is a constant, p = 0.5. Ergo, r+ has a binomial distribution, $P(r_+ = k) = \binom{n}{k}\,(0.5)^n$ for k = 0, 1, …, n.

11 Sign Test (cont.)
For n large (n >> 10), a Z test can be used with mean $\mu = n/2$ and standard deviation $\sigma = \sqrt{n}/2$. Otherwise, a table built from the binomial PDF is needed. For n = 10:
Two-sided
Acceptance Region   α
r+ = 5              0.754
4 ≤ r+ ≤ 6          0.344
3 ≤ r+ ≤ 7          0.109
2 ≤ r+ ≤ 8          0.022
1 ≤ r+ ≤ 9          0.002
One-sided
Acceptance Region   α
r+ ≤ 3              0.828
r+ ≤ 4              0.623
r+ ≤ 5              0.377
r+ ≤ 6              0.172
r+ ≤ 7              0.055
r+ ≤ 8              0.011
r+ ≤ 9              0.001
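The tabulated significance levels can be checked with a short Python sketch. It assumes n = 10 and r+ binomial with p = 0.5, and takes α as the probability of falling outside the acceptance region; the function names are illustrative. The normal approximation uses mean n/2 and standard deviation sqrt(n)/2, which follow from the binomial distribution with p = 0.5.

```python
from math import comb, sqrt

def two_sided_alpha(n, lo, hi):
    """alpha = 1 - P(lo <= r+ <= hi) when r+ ~ Binomial(n, 1/2)."""
    inside = sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n
    return 1 - inside

def one_sided_alpha(n, hi):
    """alpha = P(r+ > hi) when r+ ~ Binomial(n, 1/2)."""
    return 1 - sum(comb(n, k) for k in range(0, hi + 1)) / 2 ** n

def z_statistic(r_plus, n):
    """Normal approximation: r+ has mean n/2 and standard deviation sqrt(n)/2."""
    return (r_plus - n / 2) / (sqrt(n) / 2)

two_sided = [two_sided_alpha(10, 5 - d, 5 + d) for d in range(5)]
one_sided = [one_sided_alpha(10, h) for h in range(3, 10)]
```

The computed values agree with the table to within rounding in the third decimal place.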

12 Sign Test (cont.)
The sign test can be used to test any quantile x_p, F(x_p) = p. The null hypothesis is H0: x_p = C. The distribution of the test statistic r+ is binomial with success probability 1 – p, $P(r_+ = k) = \binom{n}{k}\,(1-p)^k\,p^{\,n-k}$. Example: test H0: first quartile = 8.5 (q1 = x_0.25 = 8.5).
xi      13.5   9.8  11.4  12.2   7.9   8.6   9.1  10.6  11.3  10.1
xi-8.5   5.0   1.3   2.9   3.7  -0.6   0.1   0.6   2.1   2.8   1.6
si         +     +     +     +     –     +     +     +     +     +
Acceptance Region   α
7 ≤ r+ ≤ 8          0.4682
6 ≤ r+ ≤ 9          0.1344
5 ≤ r+ ≤ 10         0.0197
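A Python sketch of the quantile version, using the first-quartile example on this slide. It assumes that under H0 each sample exceeds the p-quantile with probability 1 - p, so r+ is binomial with success probability 1 - p; the function name is illustrative.

```python
from math import comb

def quantile_sign_alpha(n, p, lo, hi):
    """alpha = 1 - P(lo <= r+ <= hi), with r+ ~ Binomial(n, 1 - p) under H0."""
    q = 1 - p                       # chance that one sample exceeds the p-quantile
    inside = sum(comb(n, k) * q ** k * p ** (n - k) for k in range(lo, hi + 1))
    return 1 - inside

x = [13.5, 9.8, 11.4, 12.2, 7.9, 8.6, 9.1, 10.6, 11.3, 10.1]
r_plus = sum(1 for v in x if v > 8.5)     # number of "+" signs for C = 8.5
alphas = [quantile_sign_alpha(10, 0.25, lo, hi) for lo, hi in [(7, 8), (6, 9), (5, 10)]]
```

With n = 10 and p = 0.25 the three acceptance regions give significance levels that match the tabulated 0.4682, 0.1344, and 0.0197 to within rounding, and the data yields r+ = 9.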

13 Tukey’s Two-Sample Quick Test
Plot two data samples {xi} and {yi} on the same graph, using a different symbol for each point. Count the number of points of {xi} that protrude past {yi} at one end, and the number of points of {yi} that protrude past {xi} at the opposite end. The total is denoted the end-count. If {xi} protrudes at both ends, or vice versa, then the end-count is 0.
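The end-count can also be computed without a plot. The Python sketch below follows the counting rule stated above and assumes there are no tied values between the two samples (ties need extra rules that the slide does not give). The function name and the sample data are hypothetical.

```python
def end_count(x, y):
    """Tukey end-count for two samples, assuming no tied values between them."""
    lo_x, hi_x, lo_y, hi_y = min(x), max(x), min(y), max(y)
    if (lo_x < lo_y and hi_x > hi_y) or (lo_y < lo_x and hi_y > hi_x):
        return 0                     # one sample protrudes at both ends
    if hi_x > hi_y:
        high, low = x, y             # x protrudes at the top, y at the bottom
    else:
        high, low = y, x
    top = sum(1 for v in high if v > max(low))       # points above the other sample
    bottom = sum(1 for v in low if v < min(high))    # points below the other sample
    return top + bottom

# Hypothetical samples for illustration.
xs = [1, 2, 3, 4, 5, 6]
ys = [4.5, 5.5, 7, 8, 9]
count = end_count(xs, ys)
```

Here three points of `ys` lie above every point of `xs` and four points of `xs` lie below every point of `ys`, so the end-count is 7.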

14 Tukey’s Two-Sample Quick Test (cont.)
Use the table below for the significance level α. Confidence level, 1 – α, is the chance a difference in the medians exists between {xi} and {yi} (or their means, if the PDF is symmetric).
Sample Size n   α = 0.05   α = 0.01   α = 0.001
4-8             7          9          13
9-21                       10
22-24                                 14
25+             8

15 Circular Error Probability
Circular Error Probability, or CEP, is specified in many weapon systems' requirements. The CEP is the median radial miss distance. The standard model for the radial miss distance is the Rayleigh distribution.
[Figure: Rayleigh CDF FR(R) with the CEP marked on the R axis, and Rayleigh PDF fR(R) vs. R.]

16 Circular Error Probability (cont.)
The appropriateness of the Rayleigh radial miss distance model tends to decrease as the system complexity increases. Point Estimator: the sample median. Need the distribution of R to analyze! Solution: Use non-parametric methods! Confidence Interval: sort the sample radial miss data so that R1 ≤ R2 ≤ R3 ≤ … ≤ Rk ≤ … ≤ Rn and find the value of k such that P(CEP ≤ Rk) = 1 – α. Finding the appropriate value of k takes a little manipulation.

17 Circular Error Probability (cont.)
First, let m be the index of the largest radial miss that is less than or equal to the population median (= CEP):
1 ≤ m < n: R1 ≤ R2 ≤ R3 ≤ … ≤ Rm ≤ CEP ≤ … ≤ Rn, or
m = 0: CEP ≤ R1 ≤ R2 ≤ R3 ≤ … ≤ Rn, or
m = n: R1 ≤ R2 ≤ R3 ≤ … ≤ Rn ≤ CEP.
The PDF of m, fM[m], is binomial with p = ½! The radial miss distances are assumed to be statistically independent. The probability that a particular radial miss distance is less than the median is a constant ½ (by definition). m is the # of radial miss distances less than the median.

18 Circular Error Probability (cont.)
The desired probability can be rewritten using the total probability rule: $P(\mathrm{CEP} \le R_k) = \sum_{m=0}^{n} P(\mathrm{CEP} \le R_k \mid m)\, f_M[m]$. Evaluate P(CEP ≤ Rk|m):
If m < k: P(CEP ≤ Rk|m) = 1, since R1 ≤ R2 ≤ R3 ≤ … ≤ Rm ≤ CEP ≤ … ≤ Rk ≤ … ≤ Rn.
If m = k: P(CEP ≤ Rk|m) = 1, since R1 ≤ R2 ≤ R3 ≤ … ≤ CEP ≤ Rk ≤ … ≤ Rn.
If m > k: P(CEP ≤ Rk|m) = 0, since R1 ≤ R2 ≤ R3 ≤ … ≤ Rk ≤ … ≤ Rm ≤ CEP ≤ … ≤ Rn.

19 Circular Error Probability (cont.)
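That is, P(CEP ≤ Rk|m) = 1 for m ≤ k and 0 for m > k, and thus $P(\mathrm{CEP} \le R_k) = \sum_{m=0}^{k} f_M[m] = \sum_{m=0}^{k} \binom{n}{m} \left(\tfrac{1}{2}\right)^{n}$. A similar result holds for a two-sided confidence interval.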
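The cumulative binomial probabilities that this result depends on are easy to tabulate. A minimal Python sketch follows; `binom_cdf` is an illustrative helper name, and the sketch only evaluates P(M ≤ j) for M binomial with p = ½, leaving the mapping from j to the sorted-data index k to the derivation on the slides.

```python
from math import comb

def binom_cdf(n, j):
    """P(M <= j) for M ~ Binomial(n, 1/2)."""
    return sum(comb(n, m) for m in range(0, j + 1)) / 2 ** n

cdf_10 = [binom_cdf(10, j) for j in range(0, 11)]   # j = 0, 1, ..., 10
cdf_16 = [binom_cdf(16, j) for j in range(0, 17)]   # j = 0, 1, ..., 16
```

For n = 10 the values run from about 0.001 at j = 0 up to 1.0 at j = 10. For n = 16 the values at j = 10 and j = 11 are roughly 0.895 and 0.962, so a 95% level falls between two tabulated entries.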

20 Circular Error Probability (cont.)
A one-sided test for n = 10. If the desired value of α is not on the table, linear interpolation can be used: $\mathrm{CEP} \le \frac{\alpha_{k-1} - \alpha}{\alpha_{k-1} - \alpha_k}\, R_k + \frac{\alpha - \alpha_k}{\alpha_{k-1} - \alpha_k}\, R_{k-1}$, where αk ≤ α ≤ αk-1.
k:     1 2 3 4 5 6 7 8 9 10
1-αk:  0.001 0.01 0.05 0.17 0.38 0.62 0.83 0.95 0.99 0.999 1.0
αk:    … 0.0
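A sketch of the interpolation step in Python, written as ordinary linear interpolation between the points (αk-1, Rk-1) and (αk, Rk). The function name, parameter names, and the numeric inputs are hypothetical and are not taken from the course data.

```python
def interpolate_bound(alpha, alpha_k, alpha_km1, r_k, r_km1):
    """Linearly interpolate the bound between R_(k-1) and R_k.

    Requires alpha_k <= alpha <= alpha_km1 and alpha_k < alpha_km1.
    """
    if not (alpha_k <= alpha <= alpha_km1) or alpha_k == alpha_km1:
        raise ValueError("need alpha_k <= alpha <= alpha_km1 with alpha_k < alpha_km1")
    w = (alpha_km1 - alpha) / (alpha_km1 - alpha_k)   # weight on R_k
    return w * r_k + (1 - w) * r_km1

# Hypothetical inputs, for illustration only.
bound = interpolate_bound(0.05, 0.04, 0.10, r_k=2.0, r_km1=1.0)
```

At α = αk the result is Rk, at α = αk-1 it is Rk-1, and values in between move linearly from one to the other.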

21 Circular Error Probability (cont.)
[Table: columns i, Raw Data, Sorted Data, k, and αk for the n = 16 example; the values are not preserved in this transcript.]
The data on the left is Rayleigh with a median of √(2 ln(2)). The sample median is … Select α = 0.05; n = 16. Looking at the table, k = 11. Using the interpolation formula, CEP ≤ … R11 + … R10. Final result: CEP ≤ 1.630 with 95% confidence.

22 Homework
Use the rank transform on the time data for the Hot Wheels launcher experiment and repeat the regression analysis for HW S2P4-1: modify your Excel spreadsheet to use the ranks instead of the raw data.

