Published by Claud Todd. Modified over 9 years ago.
1
DATA Exploration: Statistics (One Variable)
1. Basic Excel/MATLAB functions for data exploration
2. Measures of central tendency, distributions
   1. Mean
   2. Median
   3. Mode
3. Measures of spread
   1. Range
   2. Variance
4. Simple sampling
5. Example of sampling using Excel
2
1. Working with Data in Excel: Arithmetic
3
1. Summary Statistics in Excel (One Variable)
Use “Insert”, then “Function”, then “All” or “Statistical” to find an alphabetical list of functions.
4
1. Summary Statistics in Excel (Average)
5
1. Summary Statistics in Excel (Median)
6
1. Summary Statistics in Excel (Standard Deviation)
7
1. Summary Statistics in Excel (RAND & RANDBETWEEN)
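For readers working outside Excel, the two random-number functions named on this slide have close analogues in Python's standard library. The sketch below is an illustration only: the helper names `excel_rand` and `excel_randbetween` are hypothetical, and the fixed seed is an assumption made so the run is repeatable.

```python
import random


def excel_rand(rng):
    """Analogue of Excel's RAND(): a float r with 0 <= r < 1."""
    return rng.random()


def excel_randbetween(rng, bottom, top):
    """Analogue of Excel's RANDBETWEEN(bottom, top): an integer, both ends inclusive."""
    return rng.randint(bottom, top)


rng = random.Random(0)  # fixed seed: an assumption, used only for repeatability
r = excel_rand(rng)
k = excel_randbetween(rng, 1, 900)
```

Both `random.random` and `random.randint` follow the same range conventions as the Excel functions, so no rescaling is needed.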
8
1. Summary Statistics in Excel (Sort)
9
1. Summary Statistics in MATLAB

Function   Description
max        Maximum value
mean       Average or mean value
median     Median value
min        Smallest value
mode       Most frequent value
std        Standard deviation
var        Variance, which measures the spread or dispersion of the values
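The same one-variable summary can be sketched in Python with the standard `statistics` module. This is an illustration rather than part of the original slides: the data values are made up, and `stdev` and `variance` are the sample versions (dividing by n - 1), which matches the default behaviour of MATLAB's `std` and `var`.

```python
import statistics

# Hypothetical data set, chosen only to make the arithmetic easy to check.
data = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "max": max(data),
    "min": min(data),
    "range": max(data) - min(data),
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    "std": statistics.stdev(data),     # sample standard deviation (n - 1)
    "var": statistics.variance(data),  # sample variance (n - 1)
}
```

The range is included because the outline lists it as a measure of spread, even though the MATLAB table has no dedicated entry for it.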
10
2. Distributions
Continuous Probability Distributions
- Uniform Probability Distribution
- Normal Probability Distribution
- Exponential Probability Distribution
[Figure: density curves f(x) plotted against x for the uniform, normal, and exponential distributions]
11
Uniform Probability Distribution
- A random variable is uniformly distributed whenever the probability is proportional to the interval’s length.
- The uniform probability density function is:
  $f(x) = \dfrac{1}{b - a}$ for $a < x < b$, and $f(x) = 0$ elsewhere
  where:
  a = smallest value the variable can assume
  b = largest value the variable can assume
12
Uniform Probability Distribution
- Expected value of x: $E(x) = (a + b)/2$
- Variance of x: $\mathrm{Var}(x) = (b - a)^2 / 12$
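The uniform density, expected value, and variance can be written as three small functions. This is a minimal sketch with hypothetical function names, evaluated for an assumed interval from a = 2 to b = 6.

```python
def uniform_pdf(x, a, b):
    """Density 1/(b - a) inside the interval a < x < b, and 0 elsewhere."""
    return 1.0 / (b - a) if a < x < b else 0.0


def uniform_mean(a, b):
    """Expected value E(x) = (a + b) / 2."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Variance Var(x) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12


a, b = 2, 6  # assumed endpoints, for illustration only
density_inside = uniform_pdf(3, a, b)
density_outside = uniform_pdf(7, a, b)
expected_value = uniform_mean(a, b)
variance = uniform_variance(a, b)
```

For this interval the density is constant at 1/4 between the endpoints, and the expected value falls at the midpoint.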
13
Normal Probability Distribution
- Characteristics
  The highest point on the normal curve is at the mean, which is also the median and mode.
14
Normal Probability Distribution
- Characteristics
  The mean can be any numerical value: negative, zero, or positive.
  [Figure: normal curves centered at x = -10, 0, and 20]
15
3. Normal Probability Distribution
- Characteristics
  The standard deviation determines the width of the curve: larger values result in wider, flatter curves.
  [Figure: two normal curves, one with σ = 15 and one with σ = 25]
16
Standard Normal Probability Distribution
- Converting to the Standard Normal Distribution
  $z = \dfrac{x - \mu}{\sigma}$
  We can think of z as a measure of the number of standard deviations x is from μ.
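The conversion to the standard normal scale is a one-line calculation. In the sketch below the function name `z_score` and the numbers (x = 130, mean 100, standard deviation 15) are assumptions chosen for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


# Assumed values: x = 130 drawn from a normal population with mean 100, std. dev. 15.
z = z_score(130, 100, 15)
```

Here x lies two standard deviations above the mean, so z is 2.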
17
3. Normal Probability Distribution
- Characteristics
  68.26% of values lie within μ ± 1σ
  95.44% of values lie within μ ± 2σ
  99.72% of values lie within μ ± 3σ
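The share of a normal distribution lying within k standard deviations of the mean can be checked numerically with the error function, since P(|Z| < k) = erf(k / √2). This is a sketch with a hypothetical helper name. The computed values differ from the slide's percentages in the second decimal place, presumably because the slide's figures come from a rounded printed table.

```python
from math import erf, sqrt


def within_k_sigma(k):
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return erf(k / sqrt(2))


coverage = {k: within_k_sigma(k) for k in (1, 2, 3)}
```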
18
4. Sampling and Sampling Distributions
- Simple Random Sampling
- Point Estimation
- Introduction to Sampling Distributions
- Sampling Distribution of x̄
- Sampling Distribution of p̄
- Other Sampling Methods
19
4. Simple Random Sampling
- A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
- Finite populations are often defined by lists such as:
  Organization membership roster
  Credit card account numbers
  Inventory product numbers
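To make the definition concrete, the sketch below enumerates every possible sample of size n = 2 from a hypothetical population of N = 5 labelled elements. Under simple random sampling each of these samples is equally likely, so each has probability 1 divided by the number of possible samples.

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]  # hypothetical population, N = 5
n = 2                                    # sample size

samples = list(combinations(population, n))   # every possible sample of size n
num_samples = comb(len(population), n)        # N choose n
prob_each = 1 / num_samples                   # equal probability for each sample
```

With N = 5 and n = 2 there are 10 possible samples, so each is selected with probability 0.1.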
20
4. Point Estimation
In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
We refer to x̄ as the point estimator of the population mean μ.
s is the point estimator of the population standard deviation σ.
p̄ is the point estimator of the population proportion p.
21
Sampling Distribution of x̄
Process of Statistical Inference
1. Population with mean μ = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample mean x̄.
4. The value of x̄ is used to make inferences about the value of μ.
22
4. Simple Random Sampling
The population parameters of interest concern the applicants' SAT scores and the percentage of applicants planning to live in dorms. Now suppose that the necessary data on the current year’s applicants were not yet entered in the college’s database. Furthermore, the Director of Admissions must obtain estimates of the population parameters of interest for a meeting taking place in a few hours. She decides a sample of 30 applicants will be used. The applicants were numbered, from 1 to 900, as their applications arrived.
23
4. Simple Random Sampling: Taking a Sample of 30 Applicants
Step 1: Assign a random number to each of the 900 applicants.
        Excel’s RAND function generates random numbers between 0 and 1.
Step 2: Select the 30 applicants corresponding to the 30 smallest random numbers.
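The two steps can be sketched in Python as a stand-in for the Excel worksheet. The variable names and the fixed seed are assumptions made for illustration; `random.random` plays the role of Excel's RAND.

```python
import random

N_APPLICANTS = 900
SAMPLE_SIZE = 30

rng = random.Random(42)  # fixed seed: an assumption, used only for repeatability

# Step 1: assign a random number in [0, 1) to each applicant number 1..900.
random_numbers = {applicant: rng.random() for applicant in range(1, N_APPLICANTS + 1)}

# Step 2: order applicants by their random number and keep the 30 smallest.
ranked = sorted(random_numbers, key=random_numbers.get)
sample = ranked[:SAMPLE_SIZE]
```

Because every applicant receives an independent random number, every group of 30 applicants is equally likely to end up with the 30 smallest values, which is what makes this a simple random sample.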
24
4. Using Excel to Select a Simple Random Sample
- Excel Formula Worksheet [worksheet image not reproduced]
25
4. Using Excel to Select a Simple Random Sample
- Excel Value Worksheet [worksheet image not reproduced]
26
4. Using Excel to Select a Simple Random Sample
- Put Random Numbers in Ascending Order
  Step 1: Select cells A2:A901
  Step 2: Select the Data menu
  Step 3: Choose the Sort option
  Step 4: When the Sort dialog box appears:
          Choose Random Numbers in the Sort by text box
          Choose Ascending
          Click OK
27
Using Excel to Select a Simple Random Sample
- Excel Value Worksheet (Sorted) [worksheet image not reproduced]
28
Point Estimation
- x̄ as Point Estimator of μ
- s as Point Estimator of σ
- p̄ as Point Estimator of p
Note: Different random numbers would have identified a different sample, which would have resulted in different point estimates.
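The three point estimates can be computed from a sample as follows. The slide's own worked calculations did not survive extraction, so the five records below are hypothetical and serve only to show the arithmetic: the sample mean, the sample standard deviation (dividing by n - 1), and the sample proportion.

```python
from statistics import mean, stdev

# Hypothetical sample of five applicants: (SAT score, wants campus housing).
sample = [(1010, True), (940, False), (1100, True), (970, True), (900, False)]

scores = [score for score, _ in sample]

x_bar = mean(scores)  # point estimate of the population mean
s = stdev(scores)     # point estimate of the population standard deviation (n - 1)
p_bar = sum(wants for _, wants in sample) / len(sample)  # point estimate of the proportion
```

Rerunning the calculation on a different sample would give different values, which is the point of the note on this slide.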
29
Summary of Point Estimates Obtained from a Simple Random Sample

Population Parameter                              Parameter Value   Point Estimator                               Point Estimate
μ = Population mean SAT score                     990               x̄ = Sample mean SAT score                     997
σ = Population std. deviation for SAT score       80                s = Sample std. deviation for SAT score       75.2
p = Population proportion wanting campus housing  .72               p̄ = Sample proportion wanting campus housing  .68
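As a quick check on how close the sample came, the difference between each point estimate and the corresponding parameter value in the summary can be computed directly. This sketch uses only the six numbers reported on the slide; the dictionary layout and names are assumptions.

```python
# (parameter value, point estimate) pairs taken from the summary of point estimates.
table = {
    "mean SAT score": (990, 997),
    "std. deviation of SAT score": (80, 75.2),
    "proportion wanting campus housing": (0.72, 0.68),
}

# Difference = point estimate minus parameter value, rounded to 2 decimals.
differences = {
    name: round(estimate - value, 2) for name, (value, estimate) in table.items()
}
```

The sample overstates the mean SAT score by 7 points and understates both the standard deviation and the housing proportion.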
30
Other Sampling Methods
- Stratified Random Sampling
- Cluster Sampling
- Systematic Sampling
- Convenience Sampling
- Judgment Sampling