Chapter 1 Describing Data with Graphs. Variables and Data variable A variable is a characteristic that changes over time and/or for different individuals.

1 Chapter 1 Describing Data with Graphs

2 Variables and Data: A variable is a characteristic that changes over time and/or for different individuals or objects under consideration. Examples: body temperature, hair color, time to failure of a computer component.

3 Experimental Unit and Measurement: An experimental unit is the individual or object on which a variable is measured. A single measurement results when a variable is actually measured on an experimental unit. A set of measurements is called data.

4 Example: Hair Color. Variable: hair color. Experimental unit: person. Typical measurements: brown, black, blonde, etc.

5 Example. Variable: time until a light bulb burns out. Experimental unit: light bulb. Typical measurements: 1500 hours, 1535.5 hours, etc.

6 Population and Sample: A population is the set of all measurements of interest to the investigator. Examples: body temperatures of all healthy people in the world; lifetimes of a batch of 1000 light bulbs. It might be too expensive or even impossible to enumerate the entire population.

7 Population and Sample: A sample is a subset of measurements selected from the population of interest.

8 Sampling [figure: a sample drawn from the population]

9 How many variables have you measured? Univariate data: one variable is measured on a single experimental unit. Bivariate data: two variables are measured on a single experimental unit. Multivariate data: more than two variables are measured on a single experimental unit.

10 Types of Variables: qualitative, or quantitative (discrete or continuous).

11 Qualitative Variables: Qualitative variables measure a quality or characteristic on each experimental unit. (Data collected is sometimes called categorical data.) Examples: hair color (black, brown, blonde, ...); make of car (Dodge, Honda, Ford, ...); gender (male, female); state of birth (California, Arizona, ...).

12 Quantitative Variables: Quantitative variables measure a numerical quantity on each experimental unit. A quantitative variable is discrete if it can assume only a finite or countable number of values. It is continuous if it can assume the infinitely many values corresponding to the points on a line interval.

13 Examples: For each orange tree in a grove, the number of oranges is measured: quantitative discrete. For a particular day, the number of cars entering a college campus is measured: quantitative discrete. Time until a light bulb burns out: quantitative continuous.

14 Graphing Qualitative Variables: Use a data distribution to describe what values (measurements) of the variable have been measured and how often each value (measurement) has occurred. "How often" can be measured 3 ways:
Frequency
Relative frequency = Frequency/n
Percent = 100 x Relative frequency

15 Example: A bag of M&Ms contains 25 candies. Raw data: [figure: the 25 candies]. Statistical table:
Color    Tally    Frequency  Relative Frequency  Percent
Red      mmm      3          3/25 = .12          12%
Blue     mmmmmm   6          6/25 = .24          24%
Green    mmmm     4          4/25 = .16          16%
Orange   mmmmm    5          5/25 = .20          20%
Brown    mmm      3          3/25 = .12          12%
Yellow   mmmm     4          4/25 = .16          16%
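The three measures of "how often" can be computed directly from the counts. The following is a minimal Python sketch using the color counts from the table above; the function name `frequency_table` is a hypothetical helper, not something from the slides.

```python
from collections import Counter

# Color counts copied from the slide's statistical table (n = 25).
counts = Counter({"Red": 3, "Blue": 6, "Green": 4,
                  "Orange": 5, "Brown": 3, "Yellow": 4})

def frequency_table(counts):
    """Return rows of (value, frequency, relative frequency, percent)."""
    n = sum(counts.values())
    return [(value, f, f / n, 100 * f / n) for value, f in counts.items()]

table = frequency_table(counts)
for value, f, rel, pct in table:
    print(f"{value:<8}{f:>3}{rel:>7.2f}{pct:>6.0f}%")
```

The relative frequencies always sum to 1 and the percents to 100, which is a quick check on any hand-built table.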

16 Bar Chart Pie Chart

17 Pareto Bar Chart: A Pareto bar chart is a bar chart where the bars are ordered from largest to smallest.
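As an illustration, the Pareto ordering is just a sort by frequency, largest first. This sketch reuses the M&M color counts from slide 15 and draws a rough text bar for each color; `pareto_order` is a hypothetical helper name, and ties are left in their original order.

```python
# Color counts from the M&M example on slide 15.
counts = {"Red": 3, "Blue": 6, "Green": 4,
          "Orange": 5, "Brown": 3, "Yellow": 4}

def pareto_order(counts):
    """Sort (category, frequency) pairs from largest to smallest frequency."""
    # sorted() is stable, so tied categories keep their original order.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

pareto = pareto_order(counts)
for color, f in pareto:
    print(f"{color:<8}{'#' * f} {f}")
```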

18 Graphing Quantitative Variables: A single quantitative variable measured for different population segments or for different categories of classification can be graphed using a pie or bar chart. Example: A Big Mac hamburger costs $4.90 in Switzerland, $2.90 in the U.S. and $1.86 in South Africa.

19 A single quantitative variable measured over time is called a time series. It can be graphed using a line chart or bar chart. Example: Consumer Price Index (Bureau of Labor Statistics):
Sept    Oct     Nov     Dec     Jan     Feb     Mar
178.10  177.60  177.50  177.30  177.60  178.00  178.60

20 Dotplots: For quantitative data, a dotplot plots the measurements as points on a horizontal axis, stacking the points that duplicate existing points. Example: The set 4, 5, 5, 7, 6.
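A dotplot of the example set can be sketched in plain text. The code below is a minimal illustration, assuming small integer data whose values are each one character wide; `dotplot` is a hypothetical helper name.

```python
from collections import Counter

def dotplot(data):
    """Return a text dotplot: one column per integer value, duplicates stacked."""
    counts = Counter(data)
    values = range(min(data), max(data) + 1)
    rows = []
    # Build the plot from the top level of dots down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("*" if counts[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in values))
    return "\n".join(rows)

plot = dotplot([4, 5, 5, 7, 6])
print(plot)
```

The repeated 5 shows up as a second dot stacked above the first.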

21 Stem and Leaf Plots: For quantitative data, use the actual numerical values of each data point.
–Divide each measurement into two parts: the stem and the leaf.
–List the stems in a column, with a vertical line to their right.
–For each measurement, record the leaf portion in the same row as its matching stem.
–Order the leaves from lowest to highest in each stem.

22 Example: The prices ($) of 18 brands of walking shoes:
90 70 70 70 75 70 65 68 60
74 70 95 75 70 68 65 40 65
Stem and leaf plot:
4 | 0
5 |
6 | 5 8 0 8 5 5
7 | 0 0 0 5 0 4 0 5 0
8 |
9 | 0 5
Reorder:
4 | 0
5 |
6 | 0 5 5 5 8 8
7 | 0 0 0 0 0 0 4 5 5
8 |
9 | 0 5
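The four steps from slide 21 can be sketched in Python for the shoe prices. This is an illustrative sketch that assumes two-digit integer data, with the tens digit as the stem and the ones digit as the leaf; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

# The 18 walking-shoe prices from the slide.
prices = [90, 70, 70, 70, 75, 70, 65, 68, 60,
          74, 70, 95, 75, 70, 68, 65, 40, 65]

def stem_and_leaf(data):
    """Return the rows of an ordered stem and leaf plot (stem = tens digit)."""
    leaves = defaultdict(list)
    for x in data:
        stem, leaf = divmod(x, 10)   # split each measurement into stem and leaf
        leaves[stem].append(leaf)
    rows = []
    # List every stem from smallest to largest, including empty ones.
    for stem in range(min(leaves), max(leaves) + 1):
        ordered = sorted(leaves.get(stem, []))
        rows.append(f"{stem} | {' '.join(map(str, ordered))}".rstrip())
    return rows

rows = stem_and_leaf(prices)
print("\n".join(rows))
```

Empty stems (5 and 8 here) are kept so that gaps in the data stay visible.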

23 Interpreting Graphs: Location and Spread. Where is the data centered on the horizontal axis, and how does it spread out from the center?

24 Interpreting Graphs: Shapes. Mound shaped and symmetric (mirror images). Skewed right: a few unusually large measurements. Skewed left: a few unusually small measurements. Bimodal: two local peaks.

25 Interpreting Graphs: Outliers. Are there any strange or unusual measurements that stand out in the data set? [figure panels: Outlier; No Outliers]

26 Example: A quality control process measures the diameter of a gear being made by a machine (cm). The technician records 15 diameters, but inadvertently makes a typing mistake on the second entry.
1.991 1.891 1.991 1.988 1.993
1.989 1.990 1.988
1.988 1.993 1.991 1.989 1.989 1.993 1.990 1.994

27 Interpreting Graphs: Check the horizontal and vertical scales. Examine the location of the data distribution. Examine the shape of the distribution. Look for any unusual outliers.

28 Relative Frequency Histograms: A relative frequency histogram for a quantitative data set is a bar graph in which the height of the bar shows "how often" (measured as a proportion or relative frequency) measurements fall in a particular class or subinterval. Create intervals. Stack and draw bars.

29 Relative Frequency Histograms

30 Example: The ages of 50 tenured faculty at a state university.
34 48 70 63 52 52 35 50 37 43
53 43 52 44 42 31 36 48 43 26
58 62 49 34 48 53 39 45 34 59
34 66 40 59 36 41 35 36 62 34
38 28 43 50 30 43 32 44 58 53
We choose to use 6 intervals. Minimum class width = (70 – 26)/6 = 7.33. Convenient class width = 8. Use 6 classes of length 8, starting at 25.

31 Statistical table:
Age          Tally               Frequency  Relative Frequency  Percent
25 to < 33   |||||               5          5/50 = .10          10%
33 to < 41   ||||| ||||| ||||    14         14/50 = .28         28%
41 to < 49   ||||| ||||| |||     13         13/50 = .26         26%
49 to < 57   ||||| ||||          9          9/50 = .18          18%
57 to < 65   ||||| ||            7          7/50 = .14          14%
65 to < 73   ||                  2          2/50 = .04          4%

32 Relative Frequency Histograms: Divide the range of the data into 5-12 subintervals of equal length. Calculate the minimum width of the subinterval as range / number of subintervals. Round the minimum width up to a convenient value. Use the method of left inclusion, including the left endpoint, but not the right, in your tally.

33 Create a statistical table including the subintervals, their frequencies and relative frequencies. Draw the relative frequency histogram, plotting the subintervals on the horizontal axis and the relative frequencies on the vertical axis.
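The procedure on slides 32 and 33 can be sketched for the faculty ages from slide 30. The code below is a minimal illustration: `class_table` is a hypothetical helper, and rounding the minimum width up to the next whole number is only one possible reading of "a convenient value".

```python
import math

# Ages of the 50 tenured faculty from slide 30.
ages = [34, 48, 70, 63, 52, 52, 35, 50, 37, 43,
        53, 43, 52, 44, 42, 31, 36, 48, 43, 26,
        58, 62, 49, 34, 48, 53, 39, 45, 34, 59,
        34, 66, 40, 59, 36, 41, 35, 36, 62, 34,
        38, 28, 43, 50, 30, 43, 32, 44, 58, 53]

def class_table(data, n_classes, start):
    """Return rows of (lower, upper, frequency, relative frequency)."""
    # Minimum width = range / number of classes, rounded up (assumption:
    # the next whole number counts as "convenient").
    width = math.ceil((max(data) - min(data)) / n_classes)
    freqs = [0] * n_classes
    for x in data:
        # Left inclusion: a value equal to a boundary goes to the class
        # that starts there. Values outside the classes raise IndexError.
        freqs[(x - start) // width] += 1
    n = len(data)
    return [(start + k * width, start + (k + 1) * width, f, f / n)
            for k, f in enumerate(freqs)]

table = class_table(ages, n_classes=6, start=25)
for lower, upper, f, rel in table:
    print(f"{lower} to < {upper}: {f:>2}  {rel:.2f}")
```

With 6 classes starting at 25, this reproduces the frequencies in the statistical table on slide 31.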

34 The height of the bar represents: the proportion of measurements falling in that class or subinterval; the probability that a single measurement, drawn randomly from the set, will belong to that class or subinterval.

35 Describing the Distribution. Shape? Skewed right. Outliers? No. What proportion of the tenured faculty are younger than 41? (14 + 5)/50 = 19/50 = .38. What is the probability that a randomly selected faculty member is 49 or older? (9 + 7 + 2)/50 = 18/50 = .36.
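Both answers come from adding class frequencies and dividing by n. Here is a small sketch of that arithmetic, using the class boundaries and frequencies from the faculty-age table; `proportion` is a hypothetical helper that only handles cut-offs falling on class boundaries.

```python
# (lower boundary, upper boundary, frequency) for the faculty ages.
classes = [(25, 33, 5), (33, 41, 14), (41, 49, 13),
           (49, 57, 9), (57, 65, 7), (65, 73, 2)]

def proportion(classes, lower=float("-inf"), upper=float("inf")):
    """Proportion of measurements in classes lying entirely inside [lower, upper)."""
    n = sum(f for _, _, f in classes)
    inside = sum(f for lo, hi, f in classes if lo >= lower and hi <= upper)
    return inside / n

younger_than_41 = proportion(classes, upper=41)   # (5 + 14)/50
at_least_49 = proportion(classes, lower=49)       # (9 + 7 + 2)/50
print(younger_than_41, at_least_49)
```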

36 Chapter review
I. How Data Are Generated
1. Experimental units, variables, measurements
2. Samples and populations
3. Univariate, bivariate, and multivariate data
II. Types of Variables
1. Qualitative or categorical
2. Quantitative
a. Discrete
b. Continuous

37 III. Graphs for Univariate Data Distributions
1. Qualitative or categorical data
a. Pie charts
b. Bar charts
2. Quantitative data
a. Pie and bar charts
b. Line charts
c. Dot plots
d. Stem and leaf plots
e. Relative frequency histograms

38 3. Describing data distributions
a. Shapes: symmetric, skewed left, skewed right, unimodal, bimodal
b. Proportion of measurements in certain intervals
c. Outliers

39 Example: A manufacturer of jeans has plants in CA, AZ and TX. A randomly selected group of 25 pairs of jeans shows their plants as follows:
CA AZ TX CA TX AZ CA AZ TX CA AZ TX CA AZ CA

40 What is the variable? State. Is it qualitative or quantitative? Qualitative. What is the experimental unit? Pair of jeans.

41 Construct a pie chart. Construct a statistical table:
State  Frequency  Relative Frequency  Sector Angle
CA     9          .36                 129.6
AZ     8          .32                 115.2
TX     8          .32                 115.2
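The sector angle for each state is its relative frequency times 360 degrees. A minimal sketch of that calculation, using the frequencies from the table (`sector_angles` is a hypothetical helper name):

```python
# Plant frequencies from the statistical table (n = 25).
counts = {"CA": 9, "AZ": 8, "TX": 8}

def sector_angles(counts):
    """Map each category to its pie-chart sector angle in degrees."""
    n = sum(counts.values())
    return {state: 360 * f / n for state, f in counts.items()}

angles = sector_angles(counts)
for state, angle in angles.items():
    print(f"{state}: {counts[state] / 25:.2f} -> {angle:.1f} degrees")
```

The angles add up to 360 degrees, which serves as a check on the table.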

42

43 Construct a bar chart to describe the data.
State  Frequency  Relative Frequency  Sector Angle
CA     9          .36                 129.6
AZ     8          .32                 115.2
TX     8          .32                 115.2

44

45 What state produces the most jeans in the group? California. What proportion of the jeans are made in TX? 8/25 = 32%.

46 Example: The ages (in months) at which 50 children were enrolled in a preschool are listed below.
38 40 30 35 39
47 35 34 43 41
32 34 41 30 46
55 39 33 32
42 50 37 39 33
40 48 36 31 36
41 43 48 40
35 40 30 46 37
45 42 41 36 50
45 38 46 36 31

47 Construct a stem and leaf plot to display the data. Use the tens digit as the stem, and the ones digit as the leaf, dividing each stem into two parts.

48 Stem and leaf plot:
3 | 0 4 2 4 0 3 2 2 3 1 0 1
3 | 8 5 9 5 9 7 9 6 6 6 5 7 6 8 6
4 | 0 3 1 1 2 0 1 3 0 0 2 1
4 | 7 6 8 8 6 5 5 6
5 | 0 0
5 | 5

49 Reorder:
3 | 0 0 0 1 1 2 2 2 3 3 4 4
3 | 5 5 5 6 6 6 6 6 7 7 8 8 9 9 9
4 | 0 0 0 0 1 1 1 1 2 2 3 3
4 | 5 5 6 6 6 7 8 8
5 | 0 0
5 | 5

50 What is the shape of the measurements?
3 | 0 0 0 1 1 2 2 2 3 3 4 4
3 | 5 5 5 6 6 6 6 6 7 7 8 8 9 9 9
4 | 0 0 0 0 1 1 1 1 2 2 3 3
4 | 5 5 6 6 6 7 8 8
5 | 0 0
5 | 5
Rotate 90 degrees counterclockwise. Unimodal.

51 Construct a relative frequency histogram. Start the lower boundary of the first class at 30 and use a class width of 5.
Class  Boundary    Frequency  Relative Freq.
1      30 to < 35  12         0.24
2      35 to < 40  15         0.30
3      40 to < 45  12         0.24
4      45 to < 50  8          0.16
5      50 to < 55  2          0.04
6      55 to < 60  1          0.02

52 If one child is selected at random, what is the probability that the child was less than 50 months old? (12+15+12+8)/50 = 0.94. What proportion of the children were 35 months or older, but less than 45 months of age? (15+12)/50 = 0.54.

53 Example: The value of a quantitative variable is measured once a year for a ten-year period.
Year  Measurement    Year  Measurement
1     61.5           6     58.2
2     62.3           7     57.5
3     60.7           8     57.5
4     59.8           9     56.1
5     58.0           10    56.0

54 Create a line chart to describe the variable as it changes over time.

55 Describe the measurements using the line chart. Observing the change in y as x increases, we see that the measurements are generally decreasing over time.
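The downward trend can be checked numerically from the ten yearly values on slide 53. This is a small sketch that computes the year-to-year changes and the net change; the variable names are illustrative only.

```python
# Yearly measurements for years 1 through 10, from slide 53.
measurements = [61.5, 62.3, 60.7, 59.8, 58.0, 58.2, 57.5, 57.5, 56.1, 56.0]

# Change from each year to the next, rounded to one decimal place
# to hide floating-point noise.
changes = [round(b - a, 1) for a, b in zip(measurements, measurements[1:])]

# Years whose value is higher than the year before.
increases = [year for year, c in enumerate(changes, start=2) if c > 0]

net_change = round(measurements[-1] - measurements[0], 1)
print(changes)
print(increases, net_change)
```

Only two years show a small rise over the previous year, while the net change over the whole period is negative, which is consistent with an overall decrease.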

56 Minitab: Status of Students
Status     Fresh  Sophom.  Junior  Senior  Grad
Frequency  5      23       32      35      10

57 Assignment Questions (due at 12:00, Wed, Sept. 5) 1.2, 1.3, 1.5, 1.10, 1.20, 1.23

