# ISE 261 PROBABILISTIC SYSTEMS. Chapter One Descriptive Statistics.


Engineering Statistics: Collect Data, Summarize, Draw Conclusions

Data Types: Categorical (Qualitative) > Attribute; Variable (Quantitative)

Population: Defined collection or group of objects

Census: Data is available for all objects in the population

Sample: Subset of the population

Variable: Any characteristic whose value may change from one object to another in the population

Empirical Data: Based on Observation

Data Collection: Basic Principles of Design: Replication, Randomization, Blocking

Descriptive Statistics: Graphical (Visual), Numerical

Graphical: Stem-and-Leaf Displays, Dotplots, Histograms, Pareto Diagram, Scatter Diagrams

Numerical: Mean, Median, Trimmed Means, Standard Deviation, Variance, Range

Stem-and-Leaf Displays Data Format: > Numerical > At Least Two Digits

Information Conveyed: > Identification of a typical value > Extent of spread about typical value > Presence of any gaps in the data > Extent of symmetry in the distribution > Number and location of peaks > Presence of any outlying values Information Not Displayed: > Order of Observations

Construction of Stem-and-Leaf: > Select one or more leading digits for the stem values; the trailing digits become the leaves. > List possible stem values in a vertical column. > Record the leaf for every observation beside the corresponding stem. > Label or indicate the units for stems and leaves somewhere in the display.
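The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `stem_and_leaf` and the sample values are hypothetical, and the sketch assumes non-negative whole numbers with the ones digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return a text stem-and-leaf display for non-negative integers.

    Assumption: the ones digit is the leaf and all leading digits form the stem.
    """
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)

    lines = []
    # List every possible stem, including stems with no leaves, so gaps stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    lines.append("Stem: tens digit, Leaf: ones digit")
    return "\n".join(lines)

# Hypothetical sample data, chosen only for illustration.
data = [31, 40, 45, 49, 52, 53, 67, 67, 70, 84]
print(stem_and_leaf(data))
```

Because the leaves are sorted within each stem, the display shows the shape of the data but, as noted above, not the order in which the observations were taken.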

DOTPLOTS Data Format: Numerical; Distinct or Discrete Values. Information Conveyed: Location, Spread, Extremes, Gaps. Construction: Each observation is a dot; stack dots above the value on a horizontal scale.

Dotplot Example Data Set: Temperatures (°F): 84 49 61 40 83 67 45 66 70 69 80 58 68 60 67 72 73 70 57 63 70 78 52 67 53 67 75 61 70 81 76 79 75 76 58 31
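As a rough sketch of the dotplot construction, the temperature data can be drawn as a text dotplot. The function name `text_dotplot` is hypothetical, and the sketch assumes whole-number observations and that the character after "F" in the data set is a degree sign, not a reading of 0.

```python
from collections import Counter

# Readings from the slide; the "0" after "F" is read here as a degree sign (assumption).
temps = [84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58,
         68, 60, 67, 72, 73, 70, 57, 63, 70, 78, 52, 67,
         53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31]

def text_dotplot(values):
    """Stack one dot per observation above its position on a horizontal scale."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    width = hi - lo + 1
    rows = []
    for level in range(max(counts.values()), 0, -1):  # top row first
        row = "".join("." if counts[x] >= level else " " for x in range(lo, hi + 1))
        rows.append(row.rstrip())
    rows.append("-" * width)                               # horizontal scale
    rows.append(str(lo) + str(hi).rjust(width - len(str(lo))))  # end labels
    return "\n".join(rows)

print(text_dotplot(temps))
```

The tallest stacks mark the most common values, while the isolated dots at the left end show the extremes and gaps.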

Histograms (Pareto) Data Format: Qualitative (Categorical). Frequency: Number of times that a data value occurs in the data set. Relative Frequency: Proportion of the observations that take that value (frequency divided by the total number of observations).

Constructing a Pareto Histogram > Above each value (label), draw a rectangle whose height corresponds to the frequency or relative frequency of that value. > Ordering can be natural or arbitrary (e.g., largest to smallest).

Pareto Histogram Example: During a week’s production, a total of 2,000 printed circuit boards (PCBs) are manufactured. List of non-conformities: Blowholes = 120, Unwetted = 80, Insufficient solder = 440, Pinholes = 56, Shorts = 40, Unsoldered = 64. Improvements, Efforts, Time/Money?
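The example counts can be ordered from largest to smallest and converted to relative frequencies, as a Pareto histogram would display them. This is a sketch only: the variable names are hypothetical, and relative frequency is taken here relative to the total number of non-conformities, not the 2,000 boards, which is an assumption about what the slide intends.

```python
# Non-conformity counts from the slide.
defects = {
    "Blowholes": 120,
    "Unwetted": 80,
    "Insufficient solder": 440,
    "Pinholes": 56,
    "Shorts": 40,
    "Unsoldered": 64,
}

total = sum(defects.values())

# Order categories from largest to smallest frequency.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)

pareto = []  # rows of (name, frequency, relative frequency, cumulative relative frequency)
cumulative = 0
for name, count in ordered:
    cumulative += count
    pareto.append((name, count, count / total, cumulative / total))

for name, count, rel, cum in pareto:
    print(f"{name:<20}{count:>5}{rel:>8.1%}{cum:>8.1%}")
```

In this ordering the first category alone accounts for more than half of all recorded non-conformities, which is the kind of observation the closing question on the slide invites.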

Histograms Data Format: >Numerical >Discrete or Continuous Data displayed by magnitude. Observed frequency is a rectangle. Height corresponds to the frequency in each cell.

Histogram Construction Discrete Data: >Find Frequency of each x value >Find Relative Frequency >Mark possible x values on a horizontal scale >Above each value, draw a rectangle whose height corresponds to the frequency or relative frequency of that value

Histogram Construction Continuous Data: (Equal Widths) > Count the number of observations (n) > Find the largest & smallest values > Find the Range (largest - smallest) > Determine the number and width of the class intervals by the following rules:

Rules > Use from 5 to 20 intervals. Rule of Thumb: # of Intervals = √n > Use class intervals of equal width. Choose values that leave no question of the interval in which a value falls. > Choose the lower limit for the first cell by using a value that is slightly less than the smallest data value. > The class interval (width) can be determined by w = range/number of cells.

Build Histogram Continuous Data: > Tally Data for each Interval > Draw Rectangular Boxes with heights equal to the frequencies of the number of observations.
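The rules and the tally step can be sketched for the 36 temperature readings from the dotplot example. Several choices below are assumptions made for illustration and are not stated on the slides: the width is rounded up to a whole number, the first lower limit is placed 0.5 below the smallest value so that no whole-number reading falls on a boundary, and the data are assumed to be whole numbers.

```python
import math

# Temperature readings from the dotplot example (read as °F, an assumption).
temps = [84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58,
         68, 60, 67, 72, 73, 70, 57, 63, 70, 78, 52, 67,
         53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31]

n = len(temps)
k = min(20, max(5, round(math.sqrt(n))))   # sqrt(n) rule of thumb, kept within 5 to 20
data_range = max(temps) - min(temps)

# w = range / number of cells, rounded up to the next whole number (assumption)
# so that k intervals always cover the largest value.
width = math.floor(data_range / k) + 1
lower = min(temps) - 0.5                   # slightly less than the smallest value
edges = [lower + i * width for i in range(k + 1)]

# Tally the observations in each interval.
counts = [0] * k
for x in temps:
    counts[int((x - lower) // width)] += 1

for left, count in zip(edges, counts):
    print(f"{left:5.1f} to {left + width:5.1f} | {'#' * count} ({count})")
```

Each row of `#` characters plays the role of a rectangle whose height equals the frequency in that interval.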

Histogram Shapes: Unimodal (1 single peak), Bimodal (2 different peaks), Multimodal (more than 2 peaks), Symmetric (mirror image), Positively Skewed (R-stretched), Negatively Skewed (L-stretched), Uniform (straight), Truncated (limited)

Scatter Diagrams Data Format: Continuous; Two Random Variables. Construction: Each Ordered Pair is plotted. Patterns: Positive Correlation, No Correlation, Negative Correlation

MEAN Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$ (sum of the data values divided by n), n = Number of Observations in Sample. Population Mean: $\mu = \frac{\sum x_i}{N}$ (sum of the data values divided by N), N = Number of Objects in Population.

Median Middle value after the observations are ordered from smallest to largest: 50% of the values lie to the left and 50% to the right. Odd number of observations: Middle value of the ordered arrangement. Even number of observations: Average of the two middle values.

MODE The most frequent value that occurs in the data set.
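The three measures of center defined above (mean, median, mode) can be checked on a small data set. The sample values and helper names below are hypothetical; the standard-library `statistics` module is used only as a cross-check.

```python
import statistics

def sample_mean(xs):
    """Sum of the data values divided by the number of observations."""
    return sum(xs) / len(xs)

def median(xs):
    """Middle value of the ordered data, or the average of the two middle values."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:                      # odd number of observations
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2    # even number of observations

# Hypothetical data sets with an odd and an even number of observations.
odd = [7, 3, 9, 3, 5]
even = [7, 3, 9, 3, 5, 11]

print(sample_mean(odd), median(odd), statistics.mode(odd))
print(median(even), statistics.median(even))
```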

Quartiles Divide the ordered data into four equal parts. Interquartile Range = $Q_3 - Q_1$

Trimmed Means Mean obtained after trimming off α% of the observations from each end of the ordered data set.
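A trimmed mean can be sketched as follows. The function name `trimmed_mean`, the sample data, and the choice to round the number of trimmed observations down when the percentage does not give a whole number are all assumptions made for illustration; other conventions (such as interpolation) exist.

```python
def trimmed_mean(xs, percent):
    """Mean after dropping `percent` % of the observations from each end.

    Assumption: the number dropped from each end is rounded down.
    """
    if not 0 <= percent < 50:
        raise ValueError("percent must be in the range [0, 50)")
    s = sorted(xs)
    k = int(len(s) * percent / 100)      # observations dropped from each end
    kept = s[k:len(s) - k]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value.
data = [2, 4, 5, 6, 7, 8, 9, 10, 12, 97]

print(trimmed_mean(data, 0))    # ordinary mean
print(trimmed_mean(data, 10))   # drops the smallest and largest observation
```

Trimming removes the extreme value 97 from the calculation, so the 10% trimmed mean sits much closer to the bulk of the data than the ordinary mean does.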

Range Difference between the largest & smallest values.

Standard Deviation The square root of the average squared deviation from the mean. $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$ Short Cut Method: $s = \sqrt{\frac{\sum x_i^2 - (\sum x_i)^2 / n}{n-1}}$

Variance Square of the Standard Deviation.
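The definition and the shortcut method for the sample standard deviation give the same result, which can be verified numerically. The function names and the sample data are hypothetical; `statistics.stdev` from the standard library serves as an independent check.

```python
import math
import statistics

def sample_sd(xs):
    """Definition: square root of the sum of squared deviations over (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))

def sample_sd_shortcut(xs):
    """Shortcut: uses only the sum of x and the sum of x squared."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Hypothetical data: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s = sample_sd(data)
print(s, sample_sd_shortcut(data), statistics.stdev(data))
print("variance:", s ** 2)   # the variance is the square of the standard deviation
```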

Boxplots Information Conveyed: > Center > Spread > Nature of Symmetry > Identification of Outliers

Build Boxplots On

1. Smallest Value
2. Lower Fourth
3. Median
4. Upper Fourth
5. Largest Value

Fourth Spread = Upper Fourth – Lower Fourth

Construction Of Boxplot

1. Order data from smallest to largest.
2. Separate smallest half from the largest half. (If n is odd, include the median in both halves.)
3. Lower fourth is the median of the smallest half.
4. Upper fourth is the median of the largest half.
5. Fourth spread = Upper fourth – Lower fourth.
6. On a horizontal measurement scale, the left edge of a rectangle is the lower fourth & the right edge is the upper fourth.
7. Place a vertical line inside the rectangle at the location of the median.
8. Draw whiskers out from ends of the rectangle to the smallest and largest data values.
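Steps 1 through 5 of the construction can be sketched as a small function that returns the five values a boxplot is built on. The names `boxplot_summary` and `_median` and the sample data are hypothetical; the drawing steps (6 through 8) are left out.

```python
def _median(sorted_values):
    """Median of an already ordered list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def boxplot_summary(xs):
    """Smallest value, lower fourth, median, upper fourth, largest value, fourth spread."""
    s = sorted(xs)                       # step 1: order the data
    n = len(s)
    half = n // 2
    if n % 2 == 1:                       # step 2: the median goes in both halves when n is odd
        lower_half, upper_half = s[:half + 1], s[half:]
    else:
        lower_half, upper_half = s[:half], s[half:]
    lower_fourth = _median(lower_half)   # step 3
    upper_fourth = _median(upper_half)   # step 4
    return {
        "smallest": s[0],
        "lower_fourth": lower_fourth,
        "median": _median(s),
        "upper_fourth": upper_fourth,
        "largest": s[-1],
        "fourth_spread": upper_fourth - lower_fourth,  # step 5
    }

# Hypothetical data with an odd number of observations (n = 11).
data = [40, 52, 55, 60, 70, 75, 85, 85, 90, 90, 92]
print(boxplot_summary(data))
```

The rectangle of the boxplot would span the lower and upper fourths, with the median line inside it and whiskers reaching to the smallest and largest values.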