Introductory Statistics

Presentation transcript:

Introductory Statistics

Learning Objectives
After the session the students should be able to:
- Distinguish between different data types
- Evaluate the central tendency of realistic business data
- Evaluate the dispersion of data
- Evaluate test statistics
- Use a test statistic to formulate a business decision using regression analysis

Types of data
- Discrete data: a variable restricted to a fixed set of values
- Continuous data: a variable measured on a continuous scale
These data may be collected (ungrouped) and then grouped together in a particular form so that they can be easily inspected. But how would we collect data?

Sampling Techniques
- Simple random sampling
- Stratified sampling
- Cluster sampling
- Quota sampling
- Systematic sampling
- Mechanical sampling
- Convenience sampling

Frequency distributions
The following are the ages of a sample of managers. How could we represent these data effectively?

Scattering the data
- Scatter diagrams
- Bar diagrams

The histogram
We could group the data into convenient class intervals and plot these to produce a histogram. What measures of the central tendency do we have?

Measures of the central tendency
- Mode: the value at the peak of the distribution, i.e. the most frequently occurring value (for grouped data this can be evaluated using a standard formula)
- Median: the central value of a set of data or a distribution; it can be evaluated using the cumulative distribution function (CDF)
- Arithmetic mean: the central value on an additive (arithmetic) scale, i.e. the sum of the values divided by their number
- Geometric mean: the central value on a multiplicative (geometric) scale, i.e. the n-th root of the product of the n values
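
The four measures above can be sketched with Python's standard `statistics` module. The ages below are hypothetical values invented for illustration; they are not the lecture's data set.

```python
import statistics

# Hypothetical ages, invented for illustration (not the lecture's data).
ages = [23, 31, 35, 35, 38, 41, 44, 52]

mode = statistics.mode(ages)                      # most frequently occurring value
median = statistics.median(ages)                  # middle value of the sorted data
arithmetic_mean = statistics.mean(ages)           # sum of values / number of values
geometric_mean = statistics.geometric_mean(ages)  # n-th root of the product of values

print(mode, median, arithmetic_mean, round(geometric_mean, 2))
```

For positive data the geometric mean never exceeds the arithmetic mean, which is a quick sanity check on any spreadsheet result.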

The mode
For our data this occurs within the modal range. The construction shown can be employed to home in on the exact value, or the formula $\text{Mode} = L + \frac{l}{l + u}\, c$ can be used, where L = lower boundary of the modal class, l = lower frequency difference, u = upper frequency difference and c = the class boundary width.

The mode
Here, for our data, L = 29.5, l = 5, u = 1 and the class boundary width c = 10.
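
As a minimal sketch, the values above can be substituted into the usual grouped-data interpolation for the mode, Mode = L + l / (l + u) × c. That this is the formula on the slide is an assumption based on the symbol definitions; the function name `grouped_mode` is hypothetical.

```python
def grouped_mode(L, l, u, c):
    """Interpolated mode for grouped data (assumed standard formula).

    L: lower boundary of the modal class
    l: frequency difference between the modal class and the class below it
    u: frequency difference between the modal class and the class above it
    c: class boundary width
    """
    return L + l / (l + u) * c

mode_estimate = grouped_mode(L=29.5, l=5, u=1, c=10)
print(round(mode_estimate, 2))  # 37.83
```

Because l is much larger than u here, the estimate sits towards the upper end of the modal class.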

The Median
For our data we could evaluate this quantity in two ways:
- Approximately, by plotting the cumulative frequency diagram
- Via logical inference

Measures of Dispersion
- The range: largest value minus smallest value
- Variance: mean square variation from the mean
- Standard deviation: square root of the variance
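
The three measures of dispersion can be sketched as follows. The data are hypothetical, and the population form of the variance (dividing by n) is used because it matches the "mean square variation" wording; the sample form, which divides by n - 1, is shown for comparison.

```python
import statistics

# Hypothetical values, invented for illustration (not the lecture's data).
values = [23, 31, 35, 35, 38, 41, 44, 52]

data_range = max(values) - min(values)           # largest minus smallest value
variance = statistics.pvariance(values)          # mean squared deviation from the mean (divides by n)
std_dev = statistics.pstdev(values)              # square root of the variance
sample_variance = statistics.variance(values)    # sample version (divides by n - 1)

print(data_range, round(variance, 2), round(std_dev, 2), round(sample_variance, 2))
```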

Use of Computer packages
Example: Given the following data, use a spreadsheet to produce a grouped histogram using 9 bins, and also produce a cumulative frequency diagram (CFD). Hence or otherwise evaluate:
a) Three measures of the central tendency, and
b) Three measures of the dispersion

Decision Processes
This is all very well and good; however, how does this allow us to make research and managerial decisions? To answer this we need to consider the pattern of the data, thus:

The Normal distribution
Many sets of data adhere to the normal distribution, the most important distribution of them all. It is pretty much this property that allows us to obtain (research) management decisions. The normal distribution is usually written N(μ, σ²), with μ the population mean and σ² the variance.

Properties of N(μ, σ²)
For any normal curve with mean μ and standard deviation σ:
- 68 percent of the observations fall within one standard deviation of the mean.
- 95 percent of observations fall within 2 standard deviations of the mean.
- 99.7 percent of observations fall within 3 standard deviations of the mean.
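
These percentages can be checked numerically. The sketch below builds the standard normal cumulative probability from `math.erf`; the helper names `normal_cdf` and `prob_within` are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_within(k):
    """Probability that a normal observation lies within k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The result does not depend on μ or σ, because distances are measured in units of the standard deviation.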

The Z-Score
This is a formula that allows us to evaluate the probability of an event if we know that a particular population is normally distributed: $Z = \frac{X - \mu}{\sigma}$.
Example: If a population is N(48,12), find the probability that some value of X<20.
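
A minimal sketch of this example, assuming the usual z-score Z = (X - μ) / σ. Whether the 12 in N(48,12) is meant as the standard deviation or as the variance is not clear from the slide, so both readings are computed; the helper names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_below(x, mu, sigma):
    """P(X < x) for X normally distributed with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return normal_cdf(z)

# Reading 1 (assumption): 12 is the standard deviation, giving z of about -2.33.
p_if_sd = prob_below(20, mu=48, sigma=12)

# Reading 2 (assumption): 12 is the variance, so sigma = sqrt(12) and z is about -8.08.
p_if_var = prob_below(20, mu=48, sigma=math.sqrt(12))

print(p_if_sd, p_if_var)
```

Under the first reading the probability is roughly 1 percent; under the second it is effectively zero.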

Solution Protocol
1. Establish hypothesis
2. Evaluate the Z-score
3. Sketch the distribution
4. Evaluate probability p

Spreadsheet Solution Protocol
1. Establish hypothesis
2. Use normal distribution function
3. Perform check, i.e. use Z-function

Exercise
Example: Using a z score. If a population is N(111, ), find the probability that some value of 100 < X < 150.

Exercise Using a z score and given that the population is N(37, ), find the probability that some value of X>150.

Samples
If we are using a sample of n values then, as a consequence of the central limit theorem, the z score will change, thus: $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

Example
The mean expenditure per customer at a tire store is £60 and the sd is £6. It is known that the nominal number of customers per day is 40. A new product costs £64; what is the probability of selling such a product per customer?

Try one
In a store, the average number of shoppers is 448, with an sd of 21. What is the probability that 49 shopping hours have a mean between 441 and 446?
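
One way to check an answer to this exercise is sketched below. It assumes the sample-mean z score Z = (x̄ - μ) / (σ / √n) and treats the 49 shopping hours as a sample of n = 49; the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_sample_mean_between(low, high, mu, sigma, n):
    """P(low < sample mean < high) for samples of size n."""
    standard_error = sigma / math.sqrt(n)   # sd of the sample mean
    z_low = (low - mu) / standard_error
    z_high = (high - mu) / standard_error
    return normal_cdf(z_high) - normal_cdf(z_low)

p = prob_sample_mean_between(441, 446, mu=448, sigma=21, n=49)
print(round(p, 3))
```

Here the standard error is 21 / 7 = 3, so the two limits correspond to z scores of about -2.33 and -0.67.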

Regression & Correlation analysis
- A scatter diagram can be used to show the relationship between two variables
- Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
- Correlation is only concerned with the strength of the relationship
- No causal effect is implied with correlation
Scatter diagrams were presented in the last sessions, as was correlation.
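
As an illustration of measuring the strength of a linear association, the sketch below computes the product moment (Pearson) correlation coefficient by hand. The data are hypothetical and the function name `pearson_r` is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration.
x = [1, 2, 3, 4, 5]
y_strong = [2.1, 3.9, 6.2, 7.8, 10.1]   # close to a straight line
y_weak = [5, 1, 4, 2, 6]                # little linear pattern

print(round(pearson_r(x, y_strong), 3), round(pearson_r(x, y_weak), 3))
```

Values near +1 or -1 indicate a strong linear relationship and values near 0 a weak one; neither says anything about cause.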

Introduction to Regression Analysis
- Regression analysis is used to:
  - Predict the value of a dependent variable based on the value of at least one independent variable
  - Explain the impact of changes in an independent variable on the dependent variable
- Dependent variable: the variable we wish to predict or explain
- Independent variable: the variable used to explain the dependent variable

Simple Linear Regression Model
- Only one independent variable, X
- Relationship between X and Y is described by a linear function
- Changes in Y are assumed to be caused by changes in X

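Types of Relationships
[Figure: scatter plots of Y against X showing linear relationships and curvilinear relationships]

Types of relationships cont…
[Figure: scatter plots of Y against X showing strong relationships and weak relationships]

Types of Relationships
[Figure: scatter plots of Y against X showing no relationship]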

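The regression model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where $Y_i$ is the dependent variable, $\beta_0$ the population Y intercept, $\beta_1$ the population slope coefficient, $X_i$ the independent variable and $\varepsilon_i$ the random error term. $\beta_0 + \beta_1 X_i$ is the linear component and $\varepsilon_i$ is the random error component.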

The regression model
[Figure: regression line of Y against X with slope β1 and intercept β0, showing, for a given Xi, the observed value of Y, the predicted value of Y and the random error εi between them]

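The Least Squares approach
b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared differences between Y and its predicted value $\hat{Y}$: $\min \sum (Y_i - \hat{Y}_i)^2 = \min \sum \left(Y_i - (b_0 + b_1 X_i)\right)^2$. The proof of the resulting formulae requires the calculus.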

Regression Formulae Thus the formulae can be summarized as: Where:
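
A sketch of the standard least-squares result, assuming the formulae intended here are the usual closed forms for the gradient $b_1$ and the constant $b_0$. Setting both partial derivatives of the sum of squares to zero gives two equations, which are then solved for $b_0$ and $b_1$; $n$ is the number of observations and $\bar{X}$, $\bar{Y}$ are the sample means.

```latex
\begin{align*}
S(b_0, b_1) &= \sum_{i=1}^{n} \left(Y_i - b_0 - b_1 X_i\right)^2 \\
\frac{\partial S}{\partial b_0} &= -2 \sum \left(Y_i - b_0 - b_1 X_i\right) = 0
  \;\Rightarrow\; \sum Y_i = n\, b_0 + b_1 \sum X_i \\
\frac{\partial S}{\partial b_1} &= -2 \sum X_i \left(Y_i - b_0 - b_1 X_i\right) = 0
  \;\Rightarrow\; \sum X_i Y_i = b_0 \sum X_i + b_1 \sum X_i^2 \\
b_1 &= \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2},
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
\end{align*}
```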

Regression Example
An estate agent wishes to find the relationship between house price and size. It is suspected that a linear relationship exists between the house price (the dependent variable Y) and the house size in square metres (the independent variable X). Using linear regression, find the relationship and predict the price of a house measuring 200 m². The following data have been collected by the estate agent.

Regression data
House price in £k (Y) against area in m² (X)

Regression Solution
It is usual to set up a table of results, using an appropriate Excel spreadsheet.

Regression Solution Cont…
Now we simply apply the formulae as follows: first the regression coefficient, i.e. the gradient.

Regression Solution Cont…
Then we evaluate the regression constant. There are various computer methods available which do these calculations for you; these are detailed in the handout.
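
The tabulate-then-apply procedure can also be sketched in Python instead of a spreadsheet. This is not one of the handout's methods; it assumes the standard least-squares formulae for the gradient and constant, and the areas and prices below are hypothetical values, not the estate agent's data.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # column of X*Y products
    sum_x2 = sum(x * x for x in xs)               # column of X squared
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # gradient
    b0 = sum_y / n - b1 * sum_x / n                                # constant
    return b0, b1

# Hypothetical data, invented for illustration.
areas = [80, 100, 120, 150, 180, 220]       # X: area in square metres
prices = [150, 175, 210, 250, 295, 350]     # Y: price in thousands of pounds

b0, b1 = fit_line(areas, prices)
predicted_200 = b0 + b1 * 200               # predicted price of a 200 square metre house
print(round(b0, 2), round(b1, 4), round(predicted_200, 1))
```

The sums computed inside `fit_line` correspond to the column totals of the results table, so each intermediate value can be compared against the spreadsheet.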

Regression computer solution
There are three methods to evaluate the regression coefficient and constant using an Excel spreadsheet, these being:
- Graphical
- Calculation
- Functions

Regression computer solution Cont…
This is an example of the graphical method, which is required for a pass grade in the forthcoming assignment! If you want higher grades, however, you will have to check these answers using the other two methods shown in the handout.

Summary
Have we met our learning objectives? Specifically, are you able to:
- Distinguish between different data types
- Evaluate the central tendency of realistic business data
- Evaluate the dispersion of data
- Evaluate test statistics
- Use a test statistic to formulate a business decision using regression analysis