Chapter 4: Describing Numerical Data, Homework #3


Chapter 4, Problem 54: Cars. A column in this data file gives the engine displacement in liters of 509 vehicles sold in the United States. These vehicles are 2012 models, are not hybrids, have automatic transmissions, and lack turbochargers. Another column in this data file gives the rated combined fuel economy (in miles per gallon) for these 509 vehicles.

Chapter 4, Problem 54(a). Produce a histogram of these data. Describe and interpret the histogram. The histogram extends from 10 to 37.5 mpg with bins of width 2.5. The histogram peaks in the bin [...]. The histogram is right-skewed.
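The binning described above (a range of 10 to 37.5 split into bins of width 2.5) can be sketched in plain Python. This is only an illustration: the `mpg` list is a small hypothetical sample, not the 509 vehicles from the textbook data file, and `bin_counts` is a helper name invented here.

```python
def bin_counts(values, start=10.0, width=2.5, stop=37.5):
    """Count values in left-closed bins [start, start + width), ... up to stop."""
    n_bins = int(round((stop - start) / width))
    counts = [0] * n_bins
    for v in values:
        if start <= v < stop:
            counts[int((v - start) // width)] += 1
        elif v == stop:
            counts[-1] += 1  # a value on the right edge goes in the last bin
    return counts

# Hypothetical mileages for illustration only (NOT the textbook data).
mpg = [14, 16, 17, 18, 18, 19, 19, 20, 20, 21, 22, 23, 25, 28, 31, 37]

counts = bin_counts(mpg)
for i, c in enumerate(counts):
    left = 10.0 + 2.5 * i
    # A crude text histogram: one star per observation in the bin.
    print(f"{left:5.1f}-{left + 2.5:5.1f} | {'*' * c}")
```

A long, thin run of stars trailing off toward the high-mileage bins is what a right-skewed shape looks like in such a display.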

Chapter 4, Problem 54(b). Compare the histogram to the boxplot. What does the histogram tell you that the boxplot does not, and vice versa? The boxplot shows the median and the IQR and flags an outlier (some software also marks the mean on it). The histogram shows more about the shape of the distribution and where the observations actually fall.
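The quantities a boxplot summarizes can be computed directly. The sketch below assumes the common 1.5 × IQR rule for flagging outliers, which the slide does not state explicitly; the data are again a hypothetical sample and `box_summary` is an illustrative name. Different software uses slightly different quartile conventions, so results may differ a little from a given package.

```python
import statistics

def box_summary(values):
    """Quartiles, IQR, and points beyond the 1.5 * IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical mileages for illustration only (NOT the textbook data).
mpg = [14, 16, 17, 18, 18, 19, 19, 20, 20, 21, 22, 23, 25, 28, 31, 37]

summary = box_summary(mpg)
print(summary)
```

Note what is missing from this summary: nothing in it says where within the box the observations pile up, which is the information the histogram adds.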

Chapter 4, Problem 54(c). Find the mean and standard deviation of the rated mileages. How are these related to the histogram, if at all? The mean = [...] and the standard deviation = 4.81. The mean is the balance point of the histogram, and the SD measures the typical deviation of the observations from the mean.

Chapter 4, Problem 54(d). Find the coefficient of variation and briefly interpret its value. CV = 24.03%. A much higher CV (100% or more) would denote large variation relative to the mean. These data are not very spread out.
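As a sketch of the arithmetic, the coefficient of variation is the standard deviation divided by the mean, usually quoted as a percentage. The code below assumes that definition and uses the sample standard deviation. The last lines back out the mean that the reported SD (4.81) and CV (24.03) would imply under that assumption; this is an inference for illustration, not a value taken from the slides.

```python
import statistics

def coefficient_of_variation(values):
    """Sample SD divided by the mean, expressed as a percentage."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

# Tiny check on made-up numbers: mean 20, sample SD 10, so CV is 50%.
example_cv = coefficient_of_variation([10, 20, 30])

# Assumption: CV (%) = 100 * SD / mean, so mean = SD / (CV / 100).
sd, cv_percent = 4.81, 24.03
implied_mean = sd / (cv_percent / 100)
print(round(implied_mean, 2))
```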

Chapter 4, Problem 54(e). Identify any unusual values (outliers). Do you think that these are coding errors? There is one outlier at 37 mpg, which is the Scion iQ. This probably isn't a blunder or a rogue value but a truly interesting outlier.

Chapter 4, Problem 54(f). Government standards call for cars to get 27.5 MPG. What percentage of these vehicles meet this goal? (Are all of these vehicles cars?) We have created a variable that is 1 when the mileage is at least 27.5 and 0 otherwise. Then we simply need to sum the variable and divide by the sample size. This gives approximately 8%.
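The indicator-variable approach described above looks like this in Python. The mileages are a hypothetical sample, so the resulting share is not the 8% reported for the real data; `meets_goal` and `share` are names chosen for this sketch.

```python
# Hypothetical mileages for illustration only (NOT the textbook data).
mpg = [14, 16, 17, 18, 18, 19, 19, 20, 20, 21, 22, 23, 25, 28, 31, 37]

# Indicator variable: 1 when the mileage is at least 27.5, 0 otherwise.
meets_goal = [1 if v >= 27.5 else 0 for v in mpg]

# Summing the indicator counts the qualifying vehicles;
# dividing by the sample size turns the count into a proportion.
share = sum(meets_goal) / len(meets_goal)
print(f"{share:.1%} of vehicles meet the 27.5 mpg goal")
```

The mean of a 0/1 variable is always the proportion of ones, which is why this shortcut works.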

Chapter 4, Problem 57: Information Industry. This data table includes several characteristics of 428 companies classified as being in the information industry in [...]. One column gives the total revenue of the company, in millions of dollars.

Chapter 4, Problem 57(a). Find the median, mean, and standard deviation of the total revenue of these companies. What units do these summary statistics share? Mean = [...], median = [...], SD = [...]. All three are in the same units as the revenue itself, millions of dollars.

Chapter 4, Problem 57(b). Describe the shape of the histogram and boxplot. What does the White Space Rule have to say about the histogram? It is all white space. The data are highly concentrated at the lower end, and there are some outliers that are extremely high. These outliers stretch the scale and conceal much of the data.

Chapter 4, Problem 57(c). Do the data have any extreme outliers? Identify the company if there's an extreme outlier. AT&T, Verizon, and Microsoft are all extreme outliers.

Chapter 4, Problem 57(d). What do these graphs of the distribution of net sales tell you about this industry? Is this industry dominated by a few companies, or is there a level playing field with many comparable rivals? The industry is dominated by a few very large companies at the top, with many much smaller companies competing at the bottom.

Chapter 4, Problem 59: Tech Stocks. These data give the monthly returns on stocks in three technology companies: Dell, IBM, and Microsoft. For each month from January 1990 through the end of 2005 (192 months), the data give the return earned by owning a share of stock in each company. The return is the percentage change in the price, divided by 100.
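The definition of a return given above (percentage change in price, divided by 100) is the same as the fractional change from one price to the next. A minimal sketch, using made-up prices and ignoring dividends, which the problem statement does not mention:

```python
def simple_returns(prices):
    """Fractional change between consecutive prices: (p1 - p0) / p0."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

# Hypothetical month-end prices for illustration only.
prices = [20.0, 22.0, 21.0]

returns = simple_returns(prices)
for r in returns:
    # r * 100 is the percentage change; r itself is the return as defined above.
    print(f"return = {r:+.4f}  ({r * 100:+.2f}%)")
```

Three prices yield two returns, so 192 monthly returns would require 193 month-end prices.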

Chapter 4, Problem 59(a). Describe and contrast histograms of the three companies. Be sure to use a common scale for the data axes of the histograms to make the comparison easier and more reliable.

Chapter 4, Problem 59(a). The histograms, boxplots, and violin plots show that Dell has the highest median and IQR. Microsoft has the second-highest median and IQR. IBM has the lowest median and IQR. Microsoft has the most outliers.

Chapter 4, Problem 59(b). Find the mean, SD, and coefficient of variation for each set of returns. Are means and SDs useful summaries of variables such as these? Means and standard deviations are regularly used to characterize the expected returns and risks of equity market data. Because this type of data often deviates from the assumptions of a normal distribution, we should exercise care when interpreting them.

Chapter 4, Problem 59(c). What does comparison of the coefficients of variation tell you about these three stocks? The CVs tell us that Dell varies least, then Microsoft, and IBM varies the most. In this case, however, because the means are so close to zero, the CVs are not good indicators of risk or scale. Coefficients of variation are valuable only when the means are not close to zero.
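A small numerical sketch shows why the CV breaks down when the mean is near zero. The two return series below are invented for illustration: the second is the first shifted down by a constant, so the spread is identical and only the mean changes.

```python
import statistics

def cv(values):
    """Coefficient of variation as a ratio: sample SD divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical monthly returns with a mean of about 0.01.
steady = [0.011, 0.009, 0.012, 0.008]

# Same series shifted so the mean is about 0.0001; the SD is unchanged.
shifted = [r - 0.0099 for r in steady]

print("SD :", statistics.stdev(steady), statistics.stdev(shifted))
print("CV :", cv(steady), cv(shifted))
```

The shifted series is exactly as volatile as the original, yet its CV is roughly a hundred times larger, purely because the denominator has shrunk toward zero.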

Chapter 4, Problem 59(d). Investors prefer stocks that grow steadily. In that case, what values are ideal for the mean and SD of the returns? For the coefficient of variation? Investors who want steady growth would prefer a positive mean with a small SD, and so a small CV, which denotes less variability relative to the mean. Investors would also like to see positively skewed returns, which lean toward growth. In this case, because the means are so close to zero, the CVs are not good indicators of risk or scale.

Chapter 4, Problem 59(e). It is common to find that stocks that have a high average return also tend to be more volatile, with larger swings in price. Is that true for these three stocks? Yes. The stocks with the highest means and medians also have the highest SDs.