Math 116 Chapter 12.

Topics: Graphical Display – histogram. Using numbers – measures of center and spread. Sampling and Law of Large Numbers

Definitions: An observation is a single number. It could be a measurement, a monetary amount, etc. It can also be considered one particular outcome of a random trial. Raw data is a collection of observations. Frequency is the number of observations in each bin.

Graphical Display of Data: There are two types of data. Quantitative, e.g. closing prices, ratios. Categorical, e.g. gender, political affiliation. There are different ways to display data visually. Histograms are commonly used to display quantitative data. Pie charts are one way to display categorical data.

Histogram of the percentages of weekly ratios of Disney Stocks

More definitions Relative frequency is the percentage of observations in each bar. A frequency distribution is a chart which shows the bins and frequencies.

What to look for in a histogram: the overall pattern or shape of the distribution; major peaks; rough symmetry or clear skewness (skewed to the right or left); the center and spread of the data; any striking deviations from the pattern.

Another way of describing our data is the use of numbers: Measures of center or central tendency. - mean, median, mode Measures of spread or variation. - range, variance, standard deviation

How to find mean or average: “average” or mean is the sum of all observations divided by the number of observations

Another measure of center or central tendency: The median is the middle value of the data set. How to find the median (M): arrange the data in ascending or descending order. If the number of observations n is odd, M is the value in position (n+1)/2. If n is even, M is the average of the two middle values, those in positions n/2 and n/2+1.

Examples: For 40, 75, 80, 80, 96, 100: mean = 78.5, median = 80. For 40, 75, 80, 96, 100: mean = 78.2, median = 80.
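The two examples above can be checked with a short Python sketch. The helper `median_by_position` is a hypothetical name introduced here to mirror the position rule from the slides; the standard library's `statistics.median` is used as a cross-check.

```python
from statistics import mean, median

even_scores = [40, 75, 80, 80, 96, 100]  # even number of observations
odd_scores = [40, 75, 80, 96, 100]       # odd number of observations


def median_by_position(values):
    """Median via the position rule: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the value in position (n+1)/2 (1-based).
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the values in positions n/2 and n/2 + 1 (1-based).
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


mean_even = mean(even_scores)
mean_odd = mean(odd_scores)
median_even = median_by_position(even_scores)
median_odd = median_by_position(odd_scores)

print(mean_even, median_even)
print(mean_odd, median_odd)
```

The position rule and the library function should agree on any list of numbers, so either can be used once the rule is understood.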

Mean versus Median: The mean is the more popular way to measure center, but it is more sensitive to extreme values than the median. E.g. which of the two measures better reflects the typical price of a home? If the distribution is symmetric, the mean and median are the same. If the distribution is skewed, the mean lies farther out in the long tail than the median.

Example: In 1993, the mean and median salaries paid to major league baseball players were $490,000 and $1,160,000. Which one is the mean? Median? Explain.

Example: A measure of center is not enough. Final exam scores in math class section 1: 80, 80, 80, 80, 80. Final exam scores in math class section 2: 30, 80, 90, 100, 100. Note: the mean is 80 in both sections, but the datasets are different. (Measuring center is not enough to describe the data; we also need measures of spread.)

Measures of Spread: The range is the difference between the largest and the smallest observation. E.g. consider the three datasets below. A: 195, 200, 205, 215, 219, 225, 226, 235. B: 195, 210, 213, 214, 216, 218, 219, 235. C: 208, 209, 210, 210, 211, 211, 213, 248. The range of each dataset is 40, but the datasets are different from each other.

Range: strongly influenced by extreme values, and it takes into account only two observations in the whole dataset. Standard deviation s (the most common measure of spread): measures how far the observations are from the mean; it is the square root of the variance.

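Formula for standard deviation: $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$, where $x_1, \dots, x_n$ are the observations, $\bar{x}$ is their mean, and $n$ is the number of observations. The variance is $s^2$.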

Let us find the variance and standard deviation by hand just this once; use Excel after that. E.g. math test scores: 30, 80, 90, 100, 100
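The hand calculation for these five test scores can be written out as follows. This is a worked sketch that divides by n − 1, as the deck does elsewhere; the symbols $\bar{x}$, $s^2$, and $s$ (mean, variance, standard deviation) are notation chosen here.

```latex
\begin{align*}
\bar{x} &= \frac{30 + 80 + 90 + 100 + 100}{5} = \frac{400}{5} = 80 \\[4pt]
s^2 &= \frac{(30-80)^2 + (80-80)^2 + (90-80)^2 + (100-80)^2 + (100-80)^2}{5-1} \\
    &= \frac{2500 + 0 + 100 + 400 + 400}{4} = \frac{3400}{4} = 850 \\[4pt]
s &= \sqrt{850} \approx 29.15
\end{align*}
```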

Back to the sample datasets. A: 195, 200, 205, 215, 219, 225, 226, 235. B: 195, 210, 213, 214, 216, 218, 219, 235. C: 208, 209, 210, 210, 211, 211, 213, 248. Range for all three sets: 40. Mean for all three sets: 215. SD for set A = 13.95. SD for set B = 11.06. SD for set C = 13.42.
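The figures for the three datasets can be reproduced with Python's standard library instead of Excel. This is a minimal sketch; the dictionary name `summary` is introduced here for illustration, and `statistics.stdev` computes the sample standard deviation (dividing by n − 1).

```python
from statistics import mean, stdev

datasets = {
    "A": [195, 200, 205, 215, 219, 225, 226, 235],
    "B": [195, 210, 213, 214, 216, 218, 219, 235],
    "C": [208, 209, 210, 210, 211, 211, 213, 248],
}

summary = {}
for name, values in datasets.items():
    summary[name] = {
        "range": max(values) - min(values),  # largest minus smallest
        "mean": mean(values),
        "sd": stdev(values),                 # sample SD, divides by n - 1
    }
    print(name, summary[name]["range"], summary[name]["mean"],
          round(summary[name]["sd"], 2))
```

All three sets share the same range and mean, so only the standard deviation tells them apart.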

Interpretations: Variance is the average of the squared deviations of the observations from the mean. Standard deviation is the square root of the variance (so that it has the same units as the observations). Hence, it is a single value that measures the dispersion of the data about the mean. A larger standard deviation indicates a more spread-out set of data points. We divide by n-1 rather than n when taking the average (to be more conservative with our estimate).

Open the Excel file data.xls. In the second column, generate new data = old data + a constant. In the third column, generate new data = old data multiplied by a constant. Find the mean, variance and standard deviation for each column. What do you notice?

Adding a number to each observation: If a number b is added (or subtracted): The mean increases (or decreases) by b. The variance does not change. The standard deviation does not change.

Multiplying each observation by a number: If each observation is multiplied by a number a: The mean is multiplied by a. The variance is multiplied by a^2. The standard deviation is multiplied by |a| (which is just a when a is positive).
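The Excel exercise can also be tried in Python. The sketch below uses the math test scores from earlier as the "old data"; the constants `b = 5` and `a = 2` are arbitrary values chosen here for illustration.

```python
from statistics import mean, stdev, variance

data = [30, 80, 90, 100, 100]  # math test scores from the earlier example
b = 5                          # hypothetical constant to add
a = 2                          # hypothetical constant to multiply by

shifted = [x + b for x in data]  # new data = old data + constant
scaled = [a * x for x in data]   # new data = old data * constant

for label, column in [("original", data), ("shifted", shifted), ("scaled", scaled)]:
    print(label, mean(column), variance(column), round(stdev(column), 4))
```

Adding b moves the mean but leaves both measures of spread alone, while multiplying by a stretches the mean and standard deviation by a and the variance by a squared.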

Sampling: Some definitions: Population: entire group of individuals or objects that we want information about. Sample: part of the population that we actually analyze in order to gather information.

Parameter: a number that describes a population. E.g. population mean, population standard deviation, etc. Statistic: a number that describes a sample. E.g. sample mean, sample standard deviation, etc.

Reasons for sampling: It is often impossible to take measurements of the whole population. Samples are quicker, easier, and cheaper. If done properly, a sample is enough to give us the information we need about the population.

Random Sampling: Simple random sampling: everyone in the population has an equal chance of being selected for the sample. Ways to draw a simple random sample: draw names from a hat, balls from a basket, etc.; use computer software to generate random numbers; use a table of random digits. Stratified random sampling: e.g. for the Seattle population, the strata could be economic status, race, gender, marital status, etc. Systematic random sampling: e.g. every 10th observation is chosen.

Law of Large Numbers: With random sampling and a large sample, we can use the statistic of a sample to estimate the parameter of a population.
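A small simulation can illustrate this statement. The sketch below is an illustration under stated assumptions, not part of the course materials: the population (the integers 1 to 1000), the seed, and the sample sizes are all hypothetical choices, and sampling is done with replacement using `random.Random.choices`.

```python
import random
from statistics import mean

rng = random.Random(116)           # fixed seed so the run is repeatable

population = list(range(1, 1001))  # hypothetical population: 1, 2, ..., 1000
population_mean = mean(population)  # the parameter we want to estimate

sample_sizes = [10, 100, 1_000, 100_000]
errors = {}
for n in sample_sizes:
    # Every member is equally likely on each draw (sampling with replacement).
    sample = rng.choices(population, k=n)
    sample_mean = mean(sample)      # the statistic
    errors[n] = abs(sample_mean - population_mean)
    print(n, sample_mean, errors[n])
```

The gap between the sample mean and the population mean tends to shrink as the sample grows, although any single small sample can land close to the parameter by chance.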

Volatility: A measurement of how much the value of a stock fluctuates. A common way of measuring the volatility of a stock is to find the annualized standard deviation of the ratios of its closing prices (weekly, in our project). There are other types of volatility, but the one above is what we are going to use in our project.

To annualize the standard deviation: For monthly ratios, multiply the standard deviation by square root of 12. For weekly ratios (which we use), multiply by square root of 52 (52 weeks in a year). For daily ratios, multiply by square root of 252 (252 business days in a year).
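The annualizing rule can be sketched in Python as follows. The function name `annualized_volatility`, the lookup table `PERIODS_PER_YEAR`, and the five weekly ratios are hypothetical, made up here for illustration rather than taken from the project data.

```python
import math
from statistics import stdev

# Number of periods in a year for each kind of ratio, as listed above.
PERIODS_PER_YEAR = {"monthly": 12, "weekly": 52, "daily": 252}


def annualized_volatility(ratios, frequency="weekly"):
    """Sample standard deviation of the ratios times sqrt(periods per year)."""
    return stdev(ratios) * math.sqrt(PERIODS_PER_YEAR[frequency])


# Hypothetical weekly closing-price ratios, for illustration only.
weekly_ratios = [1.012, 0.995, 1.003, 0.988, 1.007]

volatility = annualized_volatility(weekly_ratios, "weekly")
print(volatility)
```

Passing the frequency as a parameter keeps the same function usable if the project later switches to monthly or daily ratios.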

Focus on the Project: Suppose our mean weekly ratio is 1.001894. Let’s call it Rm, for the “mean of the ratios”. From chapter 11, our computed weekly risk-free ratio is approximately 1.0007695. Let’s call it Rrf, for the risk-free ratio. Note that Rm is too large: it exceeds Rrf.

Focus on the Project: This means that, on average, each of our weekly ratios is too large. Specifically, the ratios exceed the risk-free ratio by (Rm-Rrf) on average. In the example above, 1.001894 - 1.0007695 = 0.0011245.

To adjust our weekly ratios so that their mean equals the weekly risk-free ratio: We can do this by reducing each ratio by (Rm-Rrf). We call this normalizing each ratio.

Hence: the normalized ratio = the weekly ratio - the ratio excess, where the ratio excess is (Rm-Rrf).

By normalizing our ratios: Our new mean will match the weekly risk-free rate. In our example above, New Mean = Old mean – (Rm-Rrf) = 1.001894 – 0.0011245 = 1.0007695
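The normalizing step can be sketched in Python as below. The function name `normalize_ratios` and the five weekly ratios are hypothetical; only the risk-free ratio 1.0007695 and the slide's arithmetic come from the example above.

```python
from statistics import mean, stdev


def normalize_ratios(weekly_ratios, risk_free_ratio):
    """Subtract the ratio excess (Rm - Rrf) from every weekly ratio."""
    excess = mean(weekly_ratios) - risk_free_ratio  # Rm - Rrf
    return [ratio - excess for ratio in weekly_ratios]


R_RF = 1.0007695  # weekly risk-free ratio from the example

# Hypothetical weekly ratios, for illustration only.
weekly_ratios = [1.012, 0.995, 1.003, 0.988, 1.007]
normalized = normalize_ratios(weekly_ratios, R_RF)

print(mean(normalized))  # should match R_RF up to rounding error
print(stdev(weekly_ratios), stdev(normalized))

# The arithmetic from the slides: new mean = old mean - (Rm - Rrf).
new_mean_from_slides = 1.001894 - 0.0011245
print(new_mean_from_slides)
```

Because normalizing subtracts the same number from every ratio, the standard deviation, and therefore the volatility, is left unchanged.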