Advanced Quantitative Techniques

Advanced Quantitative Techniques Lab 2: Normality, Graphing Distributions, Confidence Intervals

Normal distribution

What are the characteristics of a Normal distribution? Unimodal; bell shaped; symmetric; mean = mode = median; skewness = 0; kurtosis = 3; follows the 68–95–99.7 rule.

If a population has a Normal distribution: about 68.3% of the data lie within 1 standard deviation of the mean; 95.4% lie within 2 standard deviations of the mean; 99.7% lie within 3 standard deviations of the mean.
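The percentages in the rule can be checked numerically. The following is a minimal Python sketch (not part of the lab's Stata workflow); the helper name `prob_within` is made up for illustration, and it relies on the standard identity P(|Z| ≤ k) = erf(k/√2) for a standard normal Z.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints the share of the distribution within k standard deviations.
    print(f"within {k} SD: {prob_within(k):.4f}")
```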

More about the Normal distribution: The probability of any event is the area under the density curve over that event. The total area under the curve = 1 (collectively exhaustive). Normal distributions are an idealized description of data. The curve never touches the x-axis, because its tails extend indefinitely in both directions, but the total area under it is still exactly 1.

Is population normally distributed?
use calls_311.dta
histogram POP2010, width(600) frequency normal

Is population normally distributed?
sum POP2010, detail

Variance vs. Standard Deviation
Variance (σ²): the average of squared differences from the mean.
Standard deviation (σ): the square root of the variance.
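A small worked example of the two definitions, as a Python sketch with made-up data values. It divides by n, matching the "average of squared differences" wording; statistical software usually reports the sample version, which divides by n − 1 instead.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration

mean = statistics.fmean(data)

# Variance: average of squared differences from the mean (population form, divide by n).
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance, in the same units as the data.
sd = math.sqrt(variance)

# Sample variance divides by n - 1, so it is slightly larger.
sample_variance = statistics.variance(data)

print(mean, variance, sd, sample_variance)
```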

Skewness is a measure of symmetry. Where is the tail?
Tail on the right: Mean > Median; STATA: Skewness > 0
Symmetric: Mean = Median; STATA: Skewness = 0
Tail on the left: Mean < Median; STATA: Skewness < 0

Skewness

Kurtosis: Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. More peaked than normal: Kurtosis > 3. Normal: Kurtosis = 3. Flatter than normal: Kurtosis < 3.
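Skewness and kurtosis can both be computed from the central moments of the data. The Python sketch below uses the plain moment-based definitions (third moment over variance to the power 1.5, fourth moment over variance squared), under which a normal distribution has skewness 0 and kurtosis 3. The function name `shape_stats` and both data lists are made up for illustration.

```python
def shape_stats(data):
    """Return (skewness, kurtosis) from the central moments of the data."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance (divide by n)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]  # one large value pulls the mean above the median

print(shape_stats(symmetric))     # skewness is 0 for symmetric data
print(shape_stats(right_tailed))  # skewness is positive when the tail is on the right
```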

Example of Normal distribution
use Lab_2_Data.16.dta
histogram bwt, width(400) frequency normal

Example of Normal distribution
sum bwt, detail

Sampling Population – a group that includes all the cases (individuals, objects, or groups) in which the researcher is interested. Sample – a relatively small subset from a population.

Sampling
Random sample
Stratified sample: divide the population into groups and draw a random sample from each group.
Cluster sample: group the population into small clusters, draw a simple random sample of clusters, and sample everything in the chosen clusters.
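The three sampling designs above can be sketched in a few lines of Python. The function names and the toy population are hypothetical; the point is only to show where the random draw happens in each design.

```python
import random

def simple_random_sample(population, k, rng):
    """Draw k units, each equally likely to be chosen."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_group, rng):
    """Draw a random sample of k_per_group units from every group."""
    return {name: rng.sample(members, k_per_group) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then keep every unit in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(0)  # fixed seed so the draw is repeatable
groups = {"A": list(range(0, 10)), "B": list(range(10, 20)), "C": list(range(20, 30))}
everyone = [unit for members in groups.values() for unit in members]

print(simple_random_sample(everyone, 5, rng))
print(stratified_sample(groups, 2, rng))
print(cluster_sample(groups, 1, rng))
```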

Sampling Parameter – A measure used to describe a population distribution. Statistic – A measure used to describe a sample distribution. Estimation – A process whereby we select a random sample from a population and use a sample statistic to estimate a population parameter.

Inference

Inferential Statistics: We generally don’t know anything about the population distribution. We have a sample of data from the population. We assume that the average/mean is the most appropriate description of the population (no more median, because we assume a normal distribution). The sample is to be random and representative (“large enough”).

Inferential Statistics: What can we infer about the population based on a sample? From now on, we’re estimating the population mean (μ) with the sample mean (x̄). We are no longer talking about individual behavior; we’re talking about average behavior.

Distribution of Means: Take a random sample over, and over, and over again (random means each data point has an equal chance of being chosen). You get many sample means. Plot the sampling distribution of these means: you get a distribution of averages (not raw data points!).

Distribution of Means: Sampling Distribution of Means: the frequency distribution (histogram) of the sample means, not of the data themselves. It is the distribution of all possible sample means. **This is not the distribution of x** If we draw random samples that are large enough, the distribution of the sample averages (not the population data!) is approximately a bell curve (normal distribution). This is the case regardless of what the population distribution looks like.
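A short simulation illustrates the idea. This Python sketch draws repeated samples from an exponential population (an assumed, strongly right-skewed choice made only for illustration) and summarizes the resulting sample means; the sample size of 50 and the 2,000 repetitions are arbitrary.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population (population mean 1, right-skewed)."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))

sample_size = 50
means = [sample_mean(sample_size) for _ in range(2000)]

center = statistics.fmean(means)   # should sit near the population mean
spread = statistics.stdev(means)   # should sit near 1 / sqrt(sample_size)
share_within_1sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f}")
print(f"share within 1 SD:    {share_within_1sd:.1%}")
```

Even though the raw data are skewed, the share of sample means within one standard deviation of their center should land close to the 68% expected for a bell curve.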

Confidence Intervals: The goal of calculating confidence intervals is to determine how sure we are that the true population mean, μ, is approximated by the sample mean, x̄.

Confidence Intervals Confidence Level – The likelihood, expressed as a percentage or a probability, that a specified interval will contain the population parameter. – 95% confidence level – there is a .95 probability that a specified interval DOES contain the population mean. – 99% confidence level – there is 1 chance out of 100 that the interval DOES NOT contain the population mean.

STATA: ci Command
Open Stata and calls_311.dta
. ci means calls_per_thousand, level(90)
The output reports: the confidence level (90%, set by level(90)), sample size, sample mean, standard error, lower bound of the CI, and upper bound of the CI.
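The same pieces (sample mean, standard error, lower and upper bound) can be assembled by hand. The Python sketch below is an approximation, not a reproduction of Stata's output: it uses the normal critical value from the standard library, whereas interval commands for a mean typically use the t distribution, so the two agree closely only for large samples. The function name `mean_ci` and the data are made up.

```python
import math
from statistics import NormalDist, fmean, stdev

def mean_ci(data, level=0.95):
    """Approximate confidence interval for the mean: mean ± z * (s / sqrt(n))."""
    n = len(data)
    mean = fmean(data)
    std_error = stdev(data) / math.sqrt(n)     # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return mean - z * std_error, mean + z * std_error

calls = [10, 12, 9, 11, 13, 10, 12, 11]  # made-up values for illustration

print(mean_ci(calls, level=0.90))
print(mean_ci(calls, level=0.95))  # higher confidence gives a wider interval
```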

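Build a 95% CI for 311 calls per thousand people. The default confidence level for the ci command in Stata is 95%. Precise vs. confident: a narrower interval is more precise, while a higher confidence level gives a wider interval.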

Build a CI for Bronx calls per 1,000 people that leaves a 10% chance of overestimation error (10% in each tail, so an 80% interval).
ci means calls_per_thousand if county=="005", level(80)
Build a CI for Manhattan calls per 1,000 people that leaves a 20% chance that the population mean is not captured by the interval.
ci means calls_per_thousand if county=="061", level(80)
Are they significantly different?
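Both exercises lead to level(80), and the arithmetic can be made explicit. In this Python sketch the helper names and the two intervals are hypothetical (they are not results from calls_311.dta). Comparing intervals for overlap is only a rough check: non-overlapping intervals suggest a difference, but overlapping intervals do not rule one out.

```python
def level_from_one_sided_error(tail):
    """Two-sided confidence level when each tail carries the given error chance."""
    return 1 - 2 * tail

def level_from_total_error(alpha):
    """Confidence level when the total chance of missing the mean is alpha."""
    return 1 - alpha

def intervals_overlap(a, b):
    """True if intervals a = (low, high) and b = (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

print(level_from_one_sided_error(0.10))  # 10% chance of overestimating
print(level_from_total_error(0.20))      # 20% chance of missing the mean

# Hypothetical intervals, for illustration only.
bronx = (30.0, 38.0)
manhattan = (41.0, 52.0)
print(intervals_overlap(bronx, manhattan))
```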

Confidence intervals in a Normal distribution