Yesterday’s example: these are prices for Internet service packages. Find the mean, median and mode; determine what type of data this is; create a suitable frequency table, stem and leaf plot and graph.

1 Yesterday’s example
These are prices for Internet service packages.
- find the mean, median and mode
- determine what type of data this is
- create a suitable frequency table, stem and leaf plot and graph
13.60 15.60 17.20 16.00 17.50 18.60 18.70 12.20 18.60 15.70
15.30 13.00 16.40 14.30 18.10 18.60 17.60 18.40 19.30 15.60
17.10 18.30 15.20 15.70 17.20 18.10 18.40 12.00 16.40 15.60

2 Answers to yesterday’s problem
- Mean = 494.30/30 ≈ 16.48
- Median = average of the 15th and 16th numbers (in sorted order) = (16.40 + 17.10)/2 = 16.75
- Mode = 15.60 and 18.60, so the data is bimodal
- What type of data? It is numerical, so at least interval data. It has an absolute starting point, so it is ratio data.
- Given this, a histogram is appropriate
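The three measures can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative, and the data is the price list from slide 1.

```python
from statistics import mean, median, multimode

# Internet service package prices from slide 1
prices = [
    13.60, 15.60, 17.20, 16.00, 17.50, 18.60, 18.70, 12.20, 18.60, 15.70,
    15.30, 13.00, 16.40, 14.30, 18.10, 18.60, 17.60, 18.40, 19.30, 15.60,
    17.10, 18.30, 15.20, 15.70, 17.20, 18.10, 18.40, 12.00, 16.40, 15.60,
]

average = mean(prices)      # sum of the values divided by how many there are
middle = median(prices)     # with 30 values: average of the 15th and 16th sorted values
modes = multimode(prices)   # every value tied for the highest count

print(f"mean   = {average:.2f}")
print(f"median = {middle:.2f}")
print("modes  =", modes)
```

`multimode` is used rather than `mode` because it returns every value tied for most frequent, which is what makes a bimodal data set visible.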

3 Frequency Table
Class Interval    Frequency
12.00 – 12.99     2
13.00 – 13.99     2
14.00 – 14.99     1
15.00 – 15.99     7
16.00 – 16.99     3
17.00 – 17.99     5
18.00 – 18.99     9
19.00 – 19.99     1
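The frequency table can be rebuilt by counting how many prices fall into each one-dollar class. The sketch below assumes classes that start on whole dollars, as in the table; the text bars are only a rough stand-in for a histogram, with each bar's length equal to the class frequency.

```python
from collections import Counter
from math import floor

# Internet service package prices from slide 1
prices = [
    13.60, 15.60, 17.20, 16.00, 17.50, 18.60, 18.70, 12.20, 18.60, 15.70,
    15.30, 13.00, 16.40, 14.30, 18.10, 18.60, 17.60, 18.40, 19.30, 15.60,
    17.10, 18.30, 15.20, 15.70, 17.20, 18.10, 18.40, 12.00, 16.40, 15.60,
]

# Each class is one dollar wide, so the whole-dollar part selects the class.
counts = Counter(floor(p) for p in prices)

for lower in range(min(counts), max(counts) + 1):
    bar = "#" * counts[lower]  # bar length equals the class frequency
    print(f"{lower}.00 - {lower}.99  {counts[lower]:>2}  {bar}")
```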

4 Stem and Leaf Plot
Stem  Leaf
12    20 00
13    60 00
14    30
15    60 70 30 60 20 70 60
16    00 40 40
17    20 50 60 10 20
18    60 70 60 10 60 40 30 10 40
19    30
Key: 12 | 20 represents 12.20
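A stem and leaf plot like this one can be generated by splitting each price into its dollar part (the stem) and its cents part (the leaf). This is a minimal sketch; leaves are kept in the order the prices appear in the data, not sorted.

```python
from collections import defaultdict

# Internet service package prices from slide 1
prices = [
    13.60, 15.60, 17.20, 16.00, 17.50, 18.60, 18.70, 12.20, 18.60, 15.70,
    15.30, 13.00, 16.40, 14.30, 18.10, 18.60, 17.60, 18.40, 19.30, 15.60,
    17.10, 18.30, 15.20, 15.70, 17.20, 18.10, 18.40, 12.00, 16.40, 15.60,
]

stems = defaultdict(list)
for price in prices:
    cents = round(price * 100)         # work in whole cents to avoid float error
    stem, leaf = divmod(cents, 100)    # dollars are the stem, cents are the leaf
    stems[stem].append(leaf)

for stem in sorted(stems):
    leaves = " ".join(f"{leaf:02d}" for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

Sorting each list of leaves before printing would give an ordered stem and leaf plot, which makes the median easier to read off.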

5 Histogram
- How many class intervals?
- What does the height of each bar mean?
- What does the histogram tell us about the data?

6 CopyCat Assignment - Checkoff
- Log on and open your file

7 Trends in Data
Chapter 1.3 – Visualizing Trends
Mathematics of Data Management (Nelson)
MDM 4U

8 Variables
- In mathematics, a variable is a symbol denoting a quantity or symbolic representation. In mathematics, a variable often represents an unknown quantity.
- In statistics, variables refer to measurable attributes, as these typically vary over time or between individuals.
- Variables can be discrete (taking values from a finite or countable set), continuous (having a continuous distribution function) or neither.
- Temperature is a continuous variable, while the number of legs of an animal is a discrete variable.
- Variables are often contrasted with constants, which are known and unchanging. (Wikipedia, 2004, 2008)

9 The Two Types of Variables
Independent Variable
- a variable whose values are arbitrarily chosen
- placed on the horizontal axis
- time is always independent (why?)
Dependent Variable
- a variable whose values depend on the independent variable
- placed on the vertical axis

10 Scatter Plots
- a graphical method of showing the joint distribution of two variables, where each axis represents a variable and each point on the graph indicates a pair of values
- may show a trend
- a trend indicates a correlation that may be strong or weak, positive or negative, linear or non-linear

11 What is a trend?
- a pattern of average behavior that occurs over time
- a general “direction” that something tends toward
- e.g., there has been a trend towards increasing costs in Canada
- two variables are needed to exhibit a trend

12 An Example of a trend
- U.S. population from 1780 to 1960
- what is the trend?
- is the trend linear?

13 Line of Best Fit
- the line of best fit is the line that best represents the trend in the data and is used for making predictions
- it can be drawn by hand, but there are also methods for calculating it mathematically (the median-median and least squares methods are examples that we will study)
- it gives no indication of the strength of the trend (use the r or r² value for that)

14 An example of the line of best fit
- this is temperature data from New York over time, with a median-median line added
- what type of trend are we looking at?
- see p. 35 for the method for creating a median-median line

15 Creating a Median-Median Line
- Divide the points (ordered by x-value) into 3 symmetric groups
  - If there is 1 extra point, include it in the middle group
  - If there are 2 extra points, put one in each end group
- Calculate the median x- and y-coordinates for each group and plot the median point (x, y)
- If the median points are on a straight line, connect them
- Otherwise, line up the two outer median points, move 1/3 of the way toward the middle median point and draw the line of best fit
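The procedure can be sketched in code. This is one possible reading of the steps, not necessarily the textbook's exact method: the line through the two outer median points is shifted one third of the vertical gap toward the middle median point. The function names and the sample points are hypothetical, and the sketch assumes the outer groups have different median x-values.

```python
from statistics import median

def split_into_three(points):
    """Sort by x and split into left, middle and right groups."""
    pts = sorted(points)
    if len(pts) < 3:
        raise ValueError("need at least 3 points")
    base, extra = divmod(len(pts), 3)
    # 1 extra point goes to the middle group; 2 extra points go one to each end.
    outer = base + (1 if extra == 2 else 0)
    return pts[:outer], pts[outer:len(pts) - outer], pts[len(pts) - outer:]

def median_point(group):
    """Median x-coordinate and median y-coordinate of a group."""
    return median(x for x, _ in group), median(y for _, y in group)

def median_median_line(points):
    """Return (slope, intercept) of the median-median line."""
    left, middle, right = split_into_three(points)
    (x1, y1), (x2, y2), (x3, y3) = (median_point(g) for g in (left, middle, right))
    slope = (y3 - y1) / (x3 - x1)              # slope through the outer median points
    outer_intercept = y1 - slope * x1
    gap = y2 - (slope * x2 + outer_intercept)  # how far the middle point sits off that line
    return slope, outer_intercept + gap / 3    # move 1/3 of the way toward it

# Hypothetical sample data (10 points, so the groups have 3, 4 and 3 points)
points = [(1, 2.1), (2, 2.9), (3, 4.2), (4, 4.8), (5, 6.1),
          (6, 7.0), (7, 7.7), (8, 9.2), (9, 9.8), (10, 11.1)]
slope, intercept = median_median_line(points)
print(f"y = {slope:.3f}x + {intercept:.3f}")
```

When the three median points already lie on one line, the gap is zero and the result is simply the line through them.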

16 Median-Median Line (10 points)

17 Median-Median Line (14 points)

18 Exercises
- try page 37 #2, 3, 6, 8

19 Trends in Data Using Technology
Chapter 1.4 – Trends in Technology
Mathematics of Data Management (Nelson)
MDM 4U

20 Categories of Correlation
- the correlation shown in a scatter plot can be positive or negative, strong or weak
- try looking at the examples on this website to help you understand (see Correlation Picture and Regression Line): ...allery/CorrelationPicture.html

21 Regression
- a process of fitting a line or curve to a set of data
- if a line is used, it is linear regression
- if a curve is used, it may be quadratic regression, cubic regression, etc.
- why do we do this?
- what can we do with the resulting function?
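One thing the resulting function can be used for is prediction. The sketch below fits a least squares line in plain Python and uses it to estimate a new value. The data (hours studied against test score) is hypothetical and chosen only for illustration, as are the function names.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)

# Hypothetical data: hours studied (independent) vs. test score (dependent)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

slope, intercept = least_squares_line(hours, scores)

def predict(x):
    """Estimate the dependent variable from the fitted line."""
    return slope * x + intercept

print(f"score = {slope:.1f} * hours + {intercept:.1f}")
print(f"predicted score after 6 hours: {predict(6):.1f}")
```

Predicting at 6 hours is an extrapolation beyond the data, so it is only as trustworthy as the assumption that the linear trend continues.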

22 Correlation Coefficient
- the correlation coefficient r is an indicator of the strength and direction of a linear relationship
  - r = 0: no linear relationship
  - r = 1: perfect positive correlation
  - r = -1: perfect negative correlation
- r² is the coefficient of determination
  - if r² = 0.42, that means that 42% of the variation in y is explained by the linear relationship with x
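The values of r and r² can be computed directly from the data. This is a minimal sketch of the usual Pearson formula in plain Python; the three small data sets are hypothetical and were picked to show a perfect positive, a perfect negative and a moderate positive correlation.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                       # hypothetical x-values
r_up = correlation(xs, [2, 4, 6, 8, 10])   # points exactly on a rising line
r_down = correlation(xs, [10, 8, 6, 4, 2]) # points exactly on a falling line
r_mid = correlation(xs, [2, 4, 5, 4, 5])   # rising trend with scatter

for label, r in [("rising line", r_up), ("falling line", r_down), ("scattered", r_mid)]:
    print(f"{label}: r = {r:.3f}, r^2 = {r * r:.3f}")
```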

23 Residuals
- a residual is the vertical distance between a point and the line of best fit
- if the model you are considering is a good fit, the residuals should be small and have no noticeable pattern
- why?
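Residuals are easy to compute once a line is chosen. In this sketch both the points and the line y = 6x + 46 are hypothetical; the line is assumed to be the line of best fit for these points. Each residual is signed: positive when the point is above the line, negative when it is below.

```python
def residuals(points, slope, intercept):
    """Signed vertical distance from each point to the line y = slope*x + intercept."""
    return [y - (slope * x + intercept) for x, y in points]

# Hypothetical points and a hypothetical line of best fit y = 6x + 46
points = [(1, 52), (2, 60), (3, 61), (4, 70), (5, 77)]
res = residuals(points, slope=6, intercept=46)

for (x, y), r in zip(points, res):
    print(f"x = {x}: observed {y}, predicted {6 * x + 46}, residual {r:+d}")
```

Here the residuals are small and switch sign without any obvious order. A long run of residuals with the same sign, or a curved shape in the residual plot, would suggest that a straight line is the wrong model for the data.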

24 Creating a Median-Median Line Using Technology
- Click on the wiki
- Right-click the file armspan_v_height_4_med-med.ftm and save it to your M:\ or USB drive
- Open the file
- Create a scatter plot for each set of data
- Right-click and select
  - Median-Median Line
  - Least Squares Line (to see r² value)
  - Make Residual Plot

25 Exercises
- Page 51 #1-6, 7 bcd, 8

26 References
Wikipedia (2004). Online Encyclopedia. Retrieved September 1, 2004 from

27 The Power of Data
Chapter 1.5 – The Media
Mathematics of Data Management (Nelson)
MDM 4U
There are 3 kinds of lies: lies, damn lies and statistics.

28 ‘4 out of 5 dentists surveyed would recommend sugarless gum to their patients who chew gum.’
- In small groups, discuss how this statistical statement could be misleading

29 Trident conclusions
- How many dentists did they ask? 5?
- 4 out of 5 is convincing but reasonable
  - 5 out of 5 is preposterous
  - 3 out of 5 is good but not great
- Recommend Trident over what?
  - Chewing sugared gum?
  - Doing nothing?
- Is Trident the “best” sugarless gum?
- What competitors and variables were considered?
- What did the 5th dentist recommend?

30 “More people stay with Bell Mobility than any other provider.”
In small groups, discuss:
  1) What variables would be recorded in this study?
  2) How could the data be used to arrive at this conclusion falsely?

31 1) What variables would be recorded in this study?
- Number of Bell Mobility subscribers
- Number of renewed contracts
- Definition of renewed contract? Renewal during OR upon completion of contract?
- Contract Length
- Contract Type (business, home, bundle)

32 2) How could the data be used to arrive at this conclusion falsely?
- Does not specify how many more customers stay with Bell
  - e.g. Percentage of customers renewing their plan: Bell: 30%, Rogers: 29%, Telus: 25%, Fido: 28%
- Did they compare percentages or totals?
- What does it mean to “stay with Bell”? Honour entire contract? Renew contract at the end of a term?
- Are early terminations factored in? If so, does Bell have a higher cost for early terminations?
- Competitors’ renewal rates may have decreased due to family plans
- Does the data include Private / Corporate plans?

33 How does the media use (misuse) data?
- To inform the public about world events in an objective manner
- It sometimes gives misleading or false impressions to sway the public or to increase ratings
- It is important to:
  - Study statistics to understand how information is represented or misrepresented
  - Correctly interpret tables/charts presented by the media

34 Exercises
- p. 60 #1-6
- Final Project – Manipulating Data
