1 A REVIEW OF QUME 232  The Statistical Analysis of Economic (and related) Data.

2 Brief Overview of the Course

3 This course is about using data to measure causal effects.

4 In this course you will:

5 Types of Data – Cross Sectional  Cross-sectional data is a random sample  Each observation is a new individual, firm, etc. with information at a point in time  If the data is not a random sample, we have a sample-selection problem

6 Types of Data – Time Series  Time series data has a separate observation for each time period – e.g. stock prices  Since not a random sample, different problems to consider  Trends and seasonality will be important

7 Types of Data – Panel  Can pool random cross sections and treat similar to a normal cross section. Will just need to account for time differences.  Can follow the same random individual observations over time – known as panel data or longitudinal data

8 Summations  The Σ symbol is a shorthand notation for discussing sums of numbers.  It works just like the + sign you learned about in elementary school.  Copyright © 2006 Pearson Addison-Wesley. All rights reserved.

9 Algebra of Summations

10 Summations: A Useful Trick

11 Double Summations  The “Secret” to Double Summations: keep a close eye on the subscripts.
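
The summation rules can be checked numerically. The following is a minimal Python sketch, assuming made-up lists x and y and a constant k (none of these come from the course). It checks that a constant factors out of a sum, that a sum of sums splits apart, and that a double summation is a sum over every pair of subscripts.

```python
# Hypothetical numbers, chosen only for illustration.
x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
k = 5

# Sum over i of k * x_i equals k times the sum of x_i.
lhs_constant = sum(k * xi for xi in x)
rhs_constant = k * sum(x)

# Sum over i of (x_i + y_i) equals the sum of x_i plus the sum of y_i.
lhs_split = sum(xi + yi for xi, yi in zip(x, y))
rhs_split = sum(x) + sum(y)

# Double summation over i and j of x_i * y_j: one term per (i, j) pair.
double_sum = sum(xi * yj for xi in x for yj in y)
# x_i does not depend on j, so the double sum factors into a product of sums.
product_of_sums = sum(x) * sum(y)

print(lhs_constant, rhs_constant)
print(lhs_split, rhs_split)
print(double_sum, product_of_sums)
```

Watching which subscript each term depends on is what justifies pulling x_i outside the inner sum.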

12 Descriptive Statistics  How can we summarize a collection of numbers?  Mean: the arithmetic average. The mean is highly sensitive to a few large values (outliers).  Median: the midpoint of the data. Half the observed numbers lie above the median and half lie below it. The median is not sensitive to outliers.

13 Descriptive Statistics (cont.)  Mode: the most frequently occurring value.  Variance: the mean squared deviation of the numbers from their mean. The variance is a measure of the “spread” of the data.  Standard deviation: the square root of the variance. The standard deviation provides a measure of a typical deviation from the mean.
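
A short Python sketch of these summary measures, using the standard library's statistics module on a made-up list (the data, including the outlier 100, are an assumption for illustration). It follows the slide's definition of variance as the mean squared deviation, which is the population form (divide by n).

```python
import statistics

# Hypothetical data; 100 is a deliberate outlier.
data = [2, 3, 3, 5, 7, 100]

mean = statistics.mean(data)            # arithmetic average, pulled up by the outlier
median = statistics.median(data)        # midpoint of the sorted data, barely affected
mode = statistics.mode(data)            # most frequently occurring value
variance = statistics.pvariance(data)   # mean squared deviation from the mean
std_dev = statistics.pstdev(data)       # square root of the variance

print(mean, median, mode)
print(variance, std_dev)
```

When the data are a sample and the goal is to estimate a population variance, statistics.variance and statistics.stdev divide by n - 1 instead.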

14 Descriptive Statistics (cont.)  Covariance: the covariance of two sets of numbers, X and Y, measures how much the two sets tend to “move together.” If Cov(X,Y) > 0, then when X is above its mean, we would expect Y to be above its mean as well.

15 Descriptive Statistics (cont.)  Correlation Coefficient: the correlation coefficient between X and Y “norms” the covariance by the standard deviations of X and Y. You can think of this adjustment as a unit correction. The correlation coefficient will always fall between -1 and 1.
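
The “unit correction” can be seen in a small Python sketch. The helper functions and the numbers below are illustrative assumptions, using the population (divide by n) form of each statistic. Rescaling Y changes the covariance but leaves the correlation coefficient unchanged.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Mean product of deviations from the two means (population form)."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)

def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(x, x))
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

# Hypothetical data: the same Y measured in two different units.
x = [1, 2, 3, 4, 5]
y_cm = [2, 4, 5, 4, 5]
y_mm = [10 * v for v in y_cm]

print(covariance(x, y_cm), covariance(x, y_mm))    # covariance depends on units
print(correlation(x, y_cm), correlation(x, y_mm))  # correlation does not
```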

16 A Quick Example

17 A Quick Example (cont.)

18 A Quick Example (cont.)

19 Populations and Samples  Two uses for statistics: (1) describe a set of numbers; (2) draw inferences from a set of numbers we observe to a larger population.  The population is the underlying structure which we wish to study. Surveyors might want to relate 6000 randomly selected voters to all the voters in the United States. Macroeconomists might want to relate data about unemployment and inflation from 1958–2004 to the underlying process linking unemployment and inflation, to predict future realizations.

20 Populations and Samples (cont.)  We cannot observe the entire population.  Instead, we observe a sample drawn from the population of interest.  In the Monte Carlo demonstration from last time, an individual dataset was the sample and the Data Generating Process described the population.

21 Populations and Samples (cont.)  The descriptive statistics we use to describe data can also describe populations.  What is the mean income in the United States?  What is the variance of mortality rates across countries?  What is the covariance between gender and income?

22 Populations and Samples (cont.)  In a sample, we know exactly the mean, variance, covariance, etc. We can calculate the sample statistics directly.  We must infer the statistics for the underlying population.  Means in populations are also called expectations.

23 Populations and Samples (cont.)  If the true mean income in the United States is μ, then we expect a simple random sample to have sample mean μ.  In practice, any given sample will also include some “sampling noise.” We will observe not μ, but μ + ε.  If we have drawn our sample correctly, then on average the sampling error over many samples will be 0.  We write this as E(ε) = 0.
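
A small Monte Carlo sketch of sampling noise averaging out. The population here is assumed to be normal with a made-up mean and standard deviation, and the sample size and number of repetitions are arbitrary choices. Each sample mean misses the population mean, but the average miss over many samples is close to 0.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population parameters and simulation sizes.
POP_MEAN, POP_SD = 50.0, 10.0
SAMPLE_SIZE, REPETITIONS = 100, 2000

errors = []
for _ in range(REPETITIONS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    sample_mean = sum(sample) / SAMPLE_SIZE
    errors.append(sample_mean - POP_MEAN)  # sampling error of this one sample

average_error = sum(errors) / REPETITIONS

print("largest single-sample error:", max(abs(e) for e in errors))
print("average error over all samples:", average_error)
```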

24 Expectations  Expectations are means over all possible samples (think “super” Monte Carlo).  Means are sums.  Therefore, expectations follow the same algebraic rules as sums.  See the Statistics Appendix for a formal definition of Expectations.

25 Algebra of Expectations  k is a constant.  E(k) = k  E(kY) = kE(Y)  E(k + Y) = k + E(Y)  E(Y + X) = E(Y) + E(X)  E(Σ Y_i) = Σ E(Y_i), where each Y_i is a random variable.
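
These rules can be verified exactly on a small discrete example. The sketch below assumes a made-up joint distribution of (X, Y) and a made-up constant k, and uses exact fractions so the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution: (x, y) -> probability. Probabilities sum to 1.
joint = {
    (0, 1): Fraction(1, 4),
    (0, 3): Fraction(1, 4),
    (2, 1): Fraction(1, 8),
    (2, 3): Fraction(3, 8),
}

def expect(g):
    """E[g(X, Y)] as a probability-weighted sum over all outcomes."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

k = 7  # an arbitrary constant
e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)

print(expect(lambda x, y: k) == k)                # E(k) = k
print(expect(lambda x, y: k * y) == k * e_y)      # E(kY) = kE(Y)
print(expect(lambda x, y: k + y) == k + e_y)      # E(k + Y) = k + E(Y)
print(expect(lambda x, y: x + y) == e_x + e_y)    # E(Y + X) = E(Y) + E(X)
```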

26 Law of Iterated Expectations  The expected value of the expected value of Y conditional on X is the expected value of Y: E[E(Y | X)] = E(Y).  If we take expectations separately for each subpopulation (each value of X), and then take the expectation of this expectation, we get back the expectation for the whole population.
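
A minimal Python sketch of the law of iterated expectations, assuming a made-up joint distribution in which X labels two subpopulations. It computes the mean of Y within each subpopulation, weights those means by the subpopulation shares, and compares the result with the overall mean of Y.

```python
from fractions import Fraction

# Hypothetical joint distribution: (x, y) -> probability.
joint = {
    (0, 1): Fraction(1, 4),
    (0, 3): Fraction(1, 4),
    (2, 1): Fraction(1, 8),
    (2, 3): Fraction(3, 8),
}

# Marginal distribution of X: the share of each subpopulation.
p_x = {}
for (x, _), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p

# Conditional mean of Y within each subpopulation, E(Y | X = x).
conditional_mean = {
    x: sum(p * y for (xx, y), p in joint.items() if xx == x) / p_x[x]
    for x in p_x
}

# Outer expectation: weight each conditional mean by P(X = x).
iterated = sum(p_x[x] * conditional_mean[x] for x in p_x)

# Direct expectation of Y over the whole population.
e_y = sum(p * y for (_, y), p in joint.items())

print(conditional_mean)
print(iterated, e_y)
```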

27 Variances  Population variances are also expectations: Var(Y) = E[(Y − E(Y))²].

28 Algebra of Variances  One value of independent observations is that Cov(Y_i, Y_j) = 0 for i ≠ j, killing all the cross-terms in the variance of the sum.
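
The vanishing cross-term can be checked on a small example. The sketch below assumes two made-up independent discrete variables X and Y and uses the standard identity Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), which is not spelled out in the transcript. Independence makes the covariance 0, so the variance of the sum is the sum of the variances.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginal distributions; the joint is their product (independence).
p_x = {0: Fraction(1, 2), 4: Fraction(1, 2)}
p_y = {1: Fraction(1, 3), 2: Fraction(1, 3), 6: Fraction(1, 3)}
joint = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}

def expect(g):
    """E[g(X, Y)] as a probability-weighted sum."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def cov(g, h):
    """Cov(g, h) = E[(g - E g)(h - E h)]; cov(g, g) is the variance of g."""
    mean_g, mean_h = expect(g), expect(h)
    return expect(lambda x, y: (g(x, y) - mean_g) * (h(x, y) - mean_h))

def get_x(x, y):
    return x

def get_y(x, y):
    return y

def get_sum(x, y):
    return x + y

var_x = cov(get_x, get_x)
var_y = cov(get_y, get_y)
cov_xy = cov(get_x, get_y)
var_sum = cov(get_sum, get_sum)

print(var_x, var_y, cov_xy, var_sum)
```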

29 Review of Probability and Statistics

30 The California Test Score Data Set

31 Initial look at the data: (You should already know how to interpret this table)  This table doesn’t tell us anything about the relationship between test scores and the STR.

32 Do districts with smaller classes have higher test scores? Scatterplot of test score v. student-teacher ratio What does this figure show?

33 We need to get some numerical evidence on whether districts with low STRs have higher test scores – but how?

34 Initial data analysis: Compare districts with “small” (STR < 20) and “large” (STR ≥ 20) class sizes: 1. Estimation of Δ = difference between group means 2. Test the hypothesis that Δ = 0 3. Construct a confidence interval for Δ
[Table: Class size (Small, Large) | Average score (Ȳ) | Standard deviation (s_Y) | n]

35 1. Estimation

36 2. Hypothesis testing

37 Compute the difference-of-means t-statistic:

38 3. Confidence interval
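
The three steps (estimation, hypothesis test, confidence interval) can be sketched in Python. The slide formulas are not preserved in this transcript, so the sketch assumes the usual large-sample difference-of-means formulas: the standard error is the square root of s²/n summed over the two groups, and the 95% interval uses the normal critical value 1.96. The scores below are invented, and the groups are far too small for the large-sample approximation; they only show the arithmetic.

```python
import math
import statistics

def difference_of_means(small, large):
    """Return (estimate, standard error, t-statistic, 95% confidence interval)."""
    estimate = statistics.mean(small) - statistics.mean(large)
    # Assumed formula: SE = sqrt(s_small^2 / n_small + s_large^2 / n_large).
    std_error = math.sqrt(
        statistics.variance(small) / len(small)
        + statistics.variance(large) / len(large)
    )
    t_stat = estimate / std_error  # tests the hypothesis that the difference is 0
    interval = (estimate - 1.96 * std_error, estimate + 1.96 * std_error)
    return estimate, std_error, t_stat, interval

# Hypothetical district test scores, not the California data.
small_classes = [660.0, 655.0, 670.0, 648.0, 662.0, 659.0]
large_classes = [650.0, 645.0, 655.0, 640.0, 652.0, 646.0]

estimate, std_error, t_stat, interval = difference_of_means(
    small_classes, large_classes
)
print(estimate, std_error, t_stat, interval)
```

If |t| exceeds 1.96, the hypothesis of no difference is rejected at the 5% level, which is the same as 0 falling outside the 95% interval.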

39 What comes next…

40 Review of Statistical Theory

41 (a) Population, random variable, and distribution

42 Population distribution of Y

43 (b) Moments of a population distribution: mean, variance, standard deviation, covariance, correlation

44 Moments, ctd.

45

46 The covariance between Test Score and STR is negative: so is the correlation…

47 The correlation coefficient is defined in terms of the covariance:

48 The correlation coefficient measures linear association

49 Sampling  Statistical Inference  Problems with Sampling: selection bias, survivor bias, non-response bias, response bias

50 Distribution of Y_1, …, Y_n under simple random sampling

51

52 Things we want to know about the sampling distribution:

53 Mean and variance of sampling distribution of Ȳ, ctd.

54 The sampling distribution of Ȳ when n is large

55 The Law of Large Numbers:

56 The Central Limit Theorem (CLT):
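
The slide bodies for these two results are not preserved in the transcript, so the following Python simulation is only a sketch of their standard statements. It assumes a made-up Bernoulli population with success probability 0.3. The first part shows the sample mean landing close to the population mean when n is large (Law of Large Numbers). The second part shows that the standardized sample mean falls within ±1.96 roughly 95% of the time, as a standard normal variable would (Central Limit Theorem).

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

P = 0.3  # hypothetical Bernoulli probability: mean P, variance P * (1 - P)

def sample_mean(n):
    """Mean of n independent Bernoulli(P) draws."""
    return sum(1 if random.random() < P else 0 for _ in range(n)) / n

# Law of Large Numbers: distance of the sample mean from P for growing n.
lln_errors = {n: abs(sample_mean(n) - P) for n in (10, 1000, 100000)}

# Central Limit Theorem: standardize many sample means and count how many
# fall inside the central 95% region of the standard normal distribution.
n, repetitions = 400, 2000
sd_of_mean = math.sqrt(P * (1 - P) / n)
z_values = [(sample_mean(n) - P) / sd_of_mean for _ in range(repetitions)]
share_within = sum(abs(z) <= 1.96 for z in z_values) / repetitions

print(lln_errors)
print(share_within)
```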

57

58

59 Calculating the p-value, ctd.

60 Calculating the p-value with σ_Y known:

61 Estimator of the variance of Y:

62 Computing the p-value with σ_Y² estimated:
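
A minimal Python sketch of this calculation, assuming the usual large-sample recipe (the slide's own formula is not preserved in the transcript): estimate the standard deviation of Y with the sample standard deviation, form the t-statistic, and read the two-sided p-value from the standard normal distribution. The function names, the sample, and the null value are illustrative assumptions.

```python
import math
import statistics

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def two_sided_p_value(sample, null_mean):
    """Return (t-statistic, p-value) for the hypothesis E(Y) = null_mean."""
    n = len(sample)
    y_bar = statistics.mean(sample)
    s_y = statistics.stdev(sample)              # uses the n - 1 divisor
    t_stat = (y_bar - null_mean) / (s_y / math.sqrt(n))
    p_value = 2 * normal_cdf(-abs(t_stat))      # large-sample normal approximation
    return t_stat, p_value

# Hypothetical sample and null hypothesis.
sample = [652.0, 649.5, 661.0, 655.5, 647.0, 658.0, 650.0, 654.0]
t_stat, p_value = two_sided_p_value(sample, null_mean=650.0)
print(t_stat, p_value)
```

With a sample this small the normal approximation is rough; it is used here only to keep the arithmetic visible.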

63 What is the link between the p-value and the significance level?

64 At this point, you might be wondering,...

65 Comments on this recipe and the Student t-distribution

66 Comments on Student t distribution, ctd.

67 Comments on Student t distribution, ctd.

68 Comments on Student t distribution, ctd.

69 The Student-t distribution – summary

70

71 Confidence intervals, ctd.

72 Summary:

73 Let’s go back to the original policy question: