# 1 Simple Graphical Methods for Presenting and Analyzing Data. 2 Star/Radar/Spider Plot Figure 1: A typical radar graph with two plots.

1 Simple Graphical Methods for Presenting and Analyzing Data

2 Star/Radar/Spider Plot Figure 1: A typical radar graph with two plots

3 Spider Diagram

4 Purpose: The star plot is a method of displaying multivariate data. Each star represents a single observation, and each spoke of the star represents one variable. Typically, star plots are generated in a multi-plot format with many stars on each page. Star plots are used to examine the relative values for a single data point (e.g., point 3 is large for variables 2 and 4, small for variables 1, 3, 5, and 6) and to locate similar or dissimilar points.
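
The construction of one star can be sketched in a few lines. This is a minimal sketch, not a definitive implementation: the function name `star_vertices`, the min-max scaling of each variable to [0, 1], and the sample numbers are all assumptions made for illustration and do not come from the slides.

```python
import math

def star_vertices(observation, minima, maxima):
    """Return the (x, y) vertices of one star, one spoke per variable."""
    n = len(observation)
    vertices = []
    for k, (value, lo, hi) in enumerate(zip(observation, minima, maxima)):
        # Assumption: scale each variable to [0, 1] across all observations
        # so that spokes measured in different units are comparable.
        radius = (value - lo) / (hi - lo) if hi > lo else 0.0
        angle = 2 * math.pi * k / n  # spokes are evenly spaced around the circle
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices

# Hypothetical data: three observations (rows), four variables (columns).
data = [
    [10.0, 2.0, 5.0, 1.0],
    [20.0, 4.0, 3.0, 1.0],
    [30.0, 3.0, 1.0, 1.0],
]
minima = [min(col) for col in zip(*data)]
maxima = [max(col) for col in zip(*data)]
stars = [star_vertices(row, minima, maxima) for row in data]
```

Joining the vertices of each entry in `stars` with line segments gives one star per observation. A long spoke means the observation is large on that variable relative to the others.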

5 Sample Plot The plot below contains the star plots of 16 cars. The variable list for the sample star plot is:
1. Price
2. Mileage (MPG)
3. 1978 Repair Record (1 = Worst, 5 = Best)
4. 1977 Repair Record (1 = Worst, 5 = Best)
5. Headroom
6. Rear Seat Room
7. Trunk Space
8. Weight
9. Length

6

7 We can look at these plots individually or we can use them to identify clusters of cars with similar features. For example, the star plot of the Cadillac Seville shows that it is one of the most expensive cars, gets below-average (but not among the worst) gas mileage, has an average repair record, and has average-to-above-average roominess and size. We can then compare the Cadillac models (the last three plots) with the AMC models (the first three plots). The AMC models tend to be inexpensive, have below-average gas mileage, and are small in both height and weight and in roominess. The Cadillac models are expensive, have poor gas mileage, and are large in both size and roominess.

8 Questions The star plot can be used to answer the following questions:
- What variables are dominant for a given observation?
- Which observations are most similar, i.e., are there clusters of observations?
- Are there outliers?

9 Weakness in Technique Star plots are helpful for small-to-moderate-sized multivariate data sets. Their primary weakness is that their effectiveness is limited to data sets with fewer than a few hundred points. Beyond that, they tend to be overwhelming.

10 Pie Chart

11 Pie Chart

12 Pivot Chart

13 Histogram http://www.stat.sc.edu/~west/javahtml/Histogram.html

14 Cumulative Histogram

15 Bihistogram

16 Pareto Chart

17 Pareto Chart

18 Pareto Chart

19 Box-and-Whisker Plot (1) Data: 18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100. 68 is the median, 52 is the lower quartile, 87 is the upper quartile, and 35 is the interquartile range (IQR).
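
The numbers on this slide can be reproduced with a short sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data, with the overall median excluded when the count is odd. That convention matches the slide's values, but other quartile conventions exist and can give slightly different results. The function name `quartiles` is illustrative.

```python
from statistics import median

data = [18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100]

def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Assumption: each quartile is the median of one half of the sorted
    data, and the middle value is left out of both halves when the
    number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

q1, med, q3 = quartiles(data)
iqr = q3 - q1
print(q1, med, q3, iqr)  # 52 68 87 35
```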

20 Box-and-Whisker Plot (2)

21 Box-and-Whisker Plot (3) There is a useful variation of the box plot that more specifically identifies outliers. To create this variation:
1. Calculate the median and the lower and upper quartiles.
2. Plot a symbol at the median and draw a box between the lower and upper quartiles.
3. Calculate the interquartile range (the difference between the upper and lower quartiles) and call it IQ.
4. Calculate the following points:
   - L1 = lower quartile - 1.5*IQ
   - L2 = lower quartile - 3.0*IQ
   - U1 = upper quartile + 1.5*IQ
   - U2 = upper quartile + 3.0*IQ
5. The line from the lower quartile to the minimum is now drawn from the lower quartile to the smallest point that is greater than L1. Likewise, the line from the upper quartile to the maximum is now drawn to the largest point smaller than U1.
6. Points between L1 and L2 or between U1 and U2 are drawn as small circles. Points less than L2 or greater than U2 are drawn as large circles.

Questions The box plot can provide answers to the following questions:
- Is a factor significant?
- Does the location differ between subgroups?
- Does the variation differ between subgroups?
- Are there any outliers?

Importance: Check the significance of a factor. The box plot is an important EDA tool for determining if a factor has a significant effect on the response with respect to either location or variation. The box plot is also an effective tool for summarizing large quantities of information.
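
The fence calculation above can be sketched as follows, using the quartiles from slide 19 (52 and 87). The function names `fences`, `classify`, and `whisker_ends` are illustrative, and so are the test values 150 and 200, which are not part of the slide data. How to treat a point that falls exactly on a fence is not stated on the slide, so the handling below is an assumption.

```python
def fences(q1, q3):
    """Inner (L1, U1) and outer (L2, U2) fences from the two quartiles."""
    iq = q3 - q1
    return {
        "L1": q1 - 1.5 * iq,
        "L2": q1 - 3.0 * iq,
        "U1": q3 + 1.5 * iq,
        "U2": q3 + 3.0 * iq,
    }

def classify(value, f):
    """Say how a point is drawn in the outlier variant of the box plot."""
    if value < f["L2"] or value > f["U2"]:
        return "large circle"
    # Assumption: a point exactly on L1 or U1 counts as a small circle,
    # because the whiskers only reach points strictly inside the fences.
    if value <= f["L1"] or value >= f["U1"]:
        return "small circle"
    return "inside"

def whisker_ends(values, f):
    """Smallest point greater than L1 and largest point smaller than U1."""
    inside = [v for v in values if f["L1"] < v < f["U1"]]
    return min(inside), max(inside)

data = [18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100]
f = fences(52, 87)
print(f)                      # L1 = -0.5, L2 = -53.0, U1 = 139.5, U2 = 192.0
print(whisker_ends(data, f))  # (18, 100)
print(classify(150, f), classify(200, f))
```

With these fences every point in the slide 19 data lies inside, so the whiskers run to the minimum and maximum and no circles are drawn.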

22 Box-and-Whisker Plot (4)

23 Box-and-Whisker Plot (5)

24 Scatter Plot

25 Scatter Plot: No Relationship

26 Scatter Plot: Strong Linear (positive correlation) Relationship

27 Scatter Plot: Strong Linear (negative correlation) Relationship

28 Scatter Plot: Exact Linear (positive correlation) Relationship

30 Scatter Plot: Sinusoidal Relationship (damped)

31 Scatter Plot: Variation of Y Does Not Depend on X

32 Scatter Plot: Variation of Y Does Depend on X

33 Scatter Plot: Outlier

34

35 Run Chart (1)

36 Run Chart (2)

37 Run Chart + Control Limits = Control Chart
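
The slides do not say how the control limits are computed. As a rough sketch, the example below assumes limits at the center line plus or minus three sample standard deviations, estimated from a baseline run. Real control charts often estimate sigma differently (for example from moving ranges), so treat the formula, the function names, and the data as illustrative assumptions.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower limit, center line, upper limit).

    Assumption: limits are center +/- k sample standard deviations of
    the baseline run, with k = 3 by default.
    """
    center = mean(baseline)
    sigma = stdev(baseline)
    return center - k * sigma, center, center + k * sigma

def out_of_control(points, lower, upper):
    """Return (index, value) for each point outside the control limits."""
    return [(i, x) for i, x in enumerate(points) if x < lower or x > upper]

# Hypothetical baseline run and new measurements.
baseline = [9, 10, 11, 10, 9, 11, 10]
lower, center, upper = control_limits(baseline)
flagged = out_of_control([10, 14], lower, upper)
```

Plotting the measurements in time order gives the run chart. Drawing `lower`, `center`, and `upper` as horizontal lines on the same axes turns it into a control chart.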

38 Lag Plot (1)

39 Lag Plot (2) Figure: lag plot with axes x_{t-1} and x_t. Annotations: "New Point", "Interpolate these…", "To get the final prediction".
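
A lag plot pairs each value x_t with the value that came before it, x_{t-1}. The sketch below builds those pairs and computes the Pearson correlation of the plotted points as a simple summary of how strongly consecutive values are related. The function names and the two short series are illustrative assumptions. This correlation is close to, but not identical with, the usual lag-1 autocorrelation estimate.

```python
import math

def lag_pairs(series, lag=1):
    """Return the points (x_{t-lag}, x_t) shown on a lag plot (lag >= 1)."""
    return list(zip(series[:-lag], series[lag:]))

def lag_correlation(series, lag=1):
    """Pearson correlation of the lag-plot points."""
    pairs = lag_pairs(series, lag)
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

trend = [1, 2, 3, 4, 5]           # hypothetical steadily rising series
alternating = [1, -1, 1, -1, 1]   # hypothetical series that flips sign each step

print(lag_pairs(trend))  # [(1, 2), (2, 3), (3, 4), (4, 5)]
r_trend = lag_correlation(trend)
r_alt = lag_correlation(alternating)
```

Points that fall along a rising diagonal, as with `trend`, indicate positive autocorrelation. A structureless cloud of points suggests random data.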

40 Lag Plot: Random Data

41 Lag Plot: Moderate Autocorrelation

42 Lag Plot: Strong Autocorrelation and Autoregressive Model

43 Lag Plot: Sinusoidal Models and Outliers