1.1 example these are prices for Internet service packages find the mean, median and mode determine what type of data this is create a suitable frequency.

1.1 Example
These are prices for Internet service packages.
- Find the mean, median and mode
- Determine what type of data this is
- Create a suitable frequency table, stem and leaf plot and graph

Answers to yesterday’s problem
- Mean = (sum of the 30 prices)/30
- Median = average of the 15th and 16th numbers
- Mode: two values occur most often, so the data are bimodal
- What type of data? Numerical, so at least interval data. It has an absolute starting point, so it is ratio data
- Given this, a histogram is appropriate
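The three measures above can be checked with Python's standard library. This is a minimal sketch: the `prices` list is hypothetical illustration data, not the Internet package prices from the slide.

```python
from statistics import mean, median, multimode

# Hypothetical monthly prices in dollars (NOT the data from the slide).
prices = [20, 25, 25, 30, 35, 40, 40, 45, 50, 60]

print("mean:", mean(prices))         # sum of the values divided by the count
print("median:", median(prices))     # average of the two middle values (even count)
print("modes:", multimode(prices))   # every value tied for most frequent
```

With an even number of values, `median` averages the two middle values, which matches the "15th and 16th numbers" step for 30 prices. `multimode` returns more than one value when the data are bimodal.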

Frequency Table (columns: Class Interval, Frequency)
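A frequency table like this can be built by counting how many values fall in each class interval. The sketch below assumes equal-width intervals of the form [lower, lower + width); the `prices` list and the width of 10 are hypothetical choices, not the values from the slide.

```python
from collections import Counter

# Hypothetical monthly prices in dollars (NOT the data from the slide).
prices = [20, 25, 25, 30, 35, 40, 40, 45, 50, 60]


def frequency_table(values, width):
    """Return (lower, upper, frequency) rows for intervals [lower, upper)."""
    counts = Counter((v // width) * width for v in values)
    # Only intervals that contain at least one value are listed.
    return [(lower, lower + width, counts[lower]) for lower in sorted(counts)]


table = frequency_table(prices, 10)
for lower, upper, freq in table:
    print(f"{lower} to under {upper}: {freq}")
```

The heights of the histogram bars are exactly the frequencies in the third column.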

Stem and Leaf Plot (columns: Stem, Leaf)
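A stem and leaf plot splits each value into a stem (here the tens) and a leaf (the ones digit). The sketch below assumes non-negative whole-number data; the `prices` list is hypothetical, not the data from the slide.

```python
from collections import defaultdict

# Hypothetical monthly prices in dollars (NOT the data from the slide).
prices = [20, 25, 25, 30, 35, 40, 40, 45, 50, 60]


def stem_and_leaf(values):
    """Group sorted values by tens digit (stem), keeping ones digits (leaves)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)


plot = stem_and_leaf(prices)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```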

Histogram
- How many class intervals?
- What does the height of each bar mean?
- What does the histogram tell us about the data?

Trends in Data Chapter 1.3 – Visualizing Trends Mathematics of Data Management (Nelson) MDM 4U

Variables
- In mathematics, a variable is a symbol denoting a quantity or symbolic representation. It often represents an unknown quantity.
- In statistics, variables refer to measurable attributes, as these typically vary over time or between individuals.
- Variables can be discrete (taking values from a finite or countable set), continuous (having a continuous distribution function) or neither. Temperature is a continuous variable, while the number of siblings is a discrete variable.
- Variables are often contrasted with constants, which are known and unchanging. (Wikipedia, 2004, 2008)

The Two Types of Variables
- Independent Variable
  - a variable whose values are arbitrarily chosen
  - placed on the horizontal axis
  - time is always independent (why?)
- Dependent Variable
  - a variable whose values depend on the independent variable
  - placed on the vertical axis
- A graph of arm span vs. height means arm span is the dependent variable

Scatter Plots
- a graphical method of showing the joint distribution of two variables, where each axis represents a variable and each point on the graph indicates a pair of values
- may show a trend
- a trend indicates a correlation that may be strong or weak, positive or negative, linear or non-linear

What is a trend?
- a pattern of average behavior that occurs over time
- a general “direction” that something tends toward
- e.g., there has been a trend towards increasing costs in Canada
- need two variables to exhibit a trend

An Example of a Trend
U.S. population from 1780 to 1960
- What is the trend?
- Is the trend linear?

Line of Best Fit
- the line of best fit is a line which best represents the trend in the data and is used for making predictions
- it can be drawn by hand, but there are also methods for calculating it mathematically (the median-median and least squares methods are examples that we will study)
- it gives no indication of the strength of the trend (use the r or r² value; see 1.4)

An example of the line of best fit
- this is temperature data from New York over time, with a median-median line added
- what type of trend are we looking at?
- see p. 35 for the method for creating a median-median line

Creating a Median-Median Line
- Divide the points, ordered by x-value, into 3 symmetric groups
  - If there is 1 extra point, include it in the middle group
  - If there are 2 extra points, include one in each end group
- Calculate the median x- and y-coordinates for each group and plot the median point (x, y)
- If the median points are on a straight line, connect them
- Otherwise, line up the two outer points, move 1/3 of the way to the middle point and draw a line of best fit
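The steps above can be sketched in Python. This is one common way to carry them out, not necessarily the exact method on p. 35: the slope comes from the two outer median points, and the intercept is shifted one third of the way toward the middle median point. The function name and the sample points are hypothetical.

```python
from statistics import median


def median_median_line(points):
    """Return (slope, intercept) of the median-median line for (x, y) pairs.

    Assumes at least 3 points and distinct x-medians in the outer groups.
    """
    pts = sorted(points)                   # order the points by x-value
    n = len(pts)
    k, extra = divmod(n, 3)
    end = k + (1 if extra == 2 else 0)     # 2 extra points: one in each end group
    groups = [pts[:end], pts[end:n - end], pts[n - end:]]  # 1 extra: middle group

    # Median point (median x, median y) of each group.
    (xl, yl), (xm, ym), (xr, yr) = [
        (median(x for x, _ in g), median(y for _, y in g)) for g in groups
    ]

    slope = (yr - yl) / (xr - xl)          # line through the two outer points
    b_outer = yl - slope * xl              # intercept of that line
    b_middle = ym - slope * xm             # parallel line through the middle point
    intercept = b_outer + (b_middle - b_outer) / 3   # move 1/3 toward the middle
    return slope, intercept


# Hypothetical points that lie exactly on y = 2x + 1.
sample = [(x, 2 * x + 1) for x in range(1, 11)]
slope, intercept = median_median_line(sample)
print(slope, intercept)
```

When the three median points already lie on a straight line, the two intercepts agree and the 1/3 shift changes nothing, which is the "connect them" case.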

Median-Median Line (10 points)

Median-Median Line (14 points)

Creating a Median-Median Line Using Technology
- Click on the wiki
- Right-click the file armspan_v_height_4_ med- med.ftm and save to your M:\ or USB drive
- Open the file
- Create a scatter plot for each set of data
- Right-click and select Median-Median Line

MSIP / Homework Complete p. 37 #2, 3, 6, 8

Trends in Data Using Technology Chapter 1.4 – Trends in Technology Mathematics of Data Management (Nelson) MDM 4U

Categories of Correlation
- the correlation shown in a scatter plot can be positive or negative, strong or weak
- there can also be no correlation between two variables
- look at the Correlation Picture and Regression Line examples on this website to help you understand:

Regression
- a process of fitting a line or curve to a set of data
- if a line is used, it is linear regression
- if a curve is used, it may be quadratic regression, cubic regression, etc.
- why do we do this?
- what can we do with the resulting function?

Correlation Coefficient
- the correlation coefficient, r, is an indicator of the strength and direction of a linear relationship
  - r = 0: no linear relationship
  - r = 1: perfect positive correlation
  - r = -1: perfect negative correlation
- r² is the coefficient of determination
  - if r² = 0.42, then 42% of the variation in y is explained by the linear relationship with x
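As a rough illustration, r can be computed directly from its definition (the Pearson formula). The function name and the data below are hypothetical; in class the value would normally come from technology.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: y rises by exactly 2 for each step in x.
xs = [1, 2, 3, 4, 5]
r_up = correlation(xs, [2, 4, 6, 8, 10])     # perfect positive correlation
r_down = correlation(xs, [10, 8, 6, 4, 2])   # perfect negative correlation
print(r_up, r_down, r_up ** 2)               # r squared = coefficient of determination
```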

Residuals
- a residual is the vertical distance between a point and the line of best fit
- if the model you are considering is a good fit, the residuals should be small and have no noticeable pattern
- the least-squares line minimizes the sum of the squares of the residuals
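The sketch below fits a least-squares line and lists the residuals, taking each residual as the signed vertical difference (observed y minus predicted y). The function names and the four data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys, slope, intercept):
    """Signed vertical differences: observed y minus predicted y."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def sum_squared(values):
    return sum(v ** 2 for v in values)


# Hypothetical data points.
xs, ys = [1, 2, 3, 4], [2, 4, 5, 8]
slope, intercept = least_squares_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
print(slope, intercept, res, sum_squared(res))
```

Nudging the slope or intercept away from the fitted values makes `sum_squared` larger, which is what "minimizes the sum of the squares of the residuals" means.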

MSIP / Homework Complete p. 51 #1-6, 7 bcd, 8

Linear Regression Weight vs. Height (NHL) w = 7.23h – 325

Using the equation
- How much does a player who is 203 cm tall weigh?
  - 203 cm ÷ 2.54 ≈ 79.92”
  - w = 7.23(79.92) – 325 ≈ 252.8 lbs
- How tall is a player who weighs 180 lbs?
  - w = 7.23h – 325
  - h = (w + 325) ÷ 7.23
  - So h = (180 + 325) ÷ 7.23 = 69.85” or 177.4 cm
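Both questions can be checked by evaluating the regression equation w = 7.23h – 325 directly, with h in inches and w in pounds. The function names are hypothetical; the coefficients and the 2.54 cm per inch conversion come from the slides.

```python
CM_PER_INCH = 2.54


def predicted_weight_lbs(height_in):
    """Weight in pounds predicted from height in inches: w = 7.23h - 325."""
    return 7.23 * height_in - 325


def predicted_height_in(weight_lbs):
    """Equation solved for h: h = (w + 325) / 7.23."""
    return (weight_lbs + 325) / 7.23


height_in = 203 / CM_PER_INCH                # 203 cm converted to inches
weight = predicted_weight_lbs(height_in)
print(round(height_in, 2), round(weight, 1))

height = predicted_height_in(180)            # a 180 lb player
print(round(height, 2), round(height * CM_PER_INCH, 1))
```

The line was fitted to NHL players, so predictions for heights or weights well outside that range should be treated with caution.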

References Wikipedia (2004). Online Encyclopedia. Retrieved September 1, 2004 from

The Power of Data Chapter 1.5 – The Media Mathematics of Data Management (Nelson) MDM 4U There are 3 kinds of lies: lies, damn lies and statistics.

Example 1 – Changing the scale on the axis Why is the following graph misleading?

Example 1 – Scale from 0 Consider that this is a bar graph – could it still be misleading?

Include every category!

Example 2 – Using a Small Sample
What is the smallest possible sample size for the following statements?
- “4 out of 5 dentists recommend Trident sugarless gum to their patients who chew gum.”
- “In the past, we found errors in 4 out of 5 of the returns people brought in for a Second Look review.” (H&R Block)
- “Did you know that 1 in 4 women can misread a traditional pregnancy test result?” (Clearblue Easy Digital Pregnancy Test)
- “Using Pedigree® DentaStix® daily can reduce the build up of tartar by up to 80%.”
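One way to reason about the question: if a claim reports an exact proportion of the people or items sampled, the smallest sample that could produce it is the denominator of that fraction in lowest terms. The sketch below rests on that assumption, and the function name is hypothetical.

```python
from fractions import Fraction


def smallest_sample_size(numerator, denominator):
    """Denominator of the claimed proportion in lowest terms.

    Assumes the claim is an exact proportion of those sampled.
    """
    return Fraction(numerator, denominator).denominator


print(smallest_sample_size(4, 5))      # "4 out of 5"
print(smallest_sample_size(1, 4))      # "1 in 4"
print(smallest_sample_size(85, 100))   # an exact 85%
```

A rounded percentage or an "up to" figure does not pin down the sample size this way, which is part of why such statements can mislead.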

Details on the Trident Survey
- How many dentists did they ask?
  - Actual number:
  - 4 out of 5 is convincing but reasonable
  - 5 out of 5 is preposterous
  - 3 out of 5 is good but not great
  - Actual statistic: 85%
- Recommend Trident over what?
  - There were 2 other options: chewing sugared gum, or not chewing gum

Misleading Statements(?)
How could these statements be misleading?
- “More people stay with Bell Mobility than any other provider.”
- “Every minute of every hour of every business day, someone comes back to Bell.”
  - 60 x 60 x 7 x 5 = 126 000

2) How could the data be used to arrive at this conclusion falsely?
- Does not specify how many more customers stay with Bell
  - e.g. percentage of customers renewing their plan: Bell: 30%, Rogers: 29%, Telus: 25%, Fido: 28%
- Did they compare percentages or totals?
- What does it mean to “stay with Bell”?
  - Honour the entire contract?
  - Renew the contract at the end of a term?
- Are early terminations factored in? If so, does Bell have a higher cost for early terminations?
- Competitors’ renewal rates may have decreased due to family plans / bundling
- Does the data include private / corporate plans?

How does the media use (misuse) data?
- To inform the public about world events in an objective manner
- It sometimes gives misleading or false impressions to sway the public or to increase ratings
It is important to:
- Study statistics to understand how information is represented or misrepresented
- Correctly interpret tables/charts presented by the media

MSIP / Homework
- Read pp. 57 – 60, Ex. 1-2
- p. 60 #1-6
- Final Project Example – Manipulating Data (on wiki)