# English math statistics data THE SCIENTIFIC METHOD knowledge.

english math statistics data THE SCIENTIFIC METHOD knowledge

ENGLISH TO MATH. HYPOTHESIS IN ENGLISH: Revenues are related to the economy. HYPOTHESIS IN MATH: Revenues (R) are related to income (Y), interest rates (I), prices (P), and time (T): R = a + b * Y + c * I + d * P + e * T. Assumptions on coefficients: e.g. b > 0.

CRITICAL ASSUMPTIONS: REPRODUCIBILITY; CORRECT SPECIFICATION; ALL INFLUENCES THAT ARE NOT INCLUDED HAVE NO EFFECT; ALL INFLUENCES THAT ARE INCLUDED HAVE A PRECISE, RIGID EFFECT; CETERIS PARIBUS.

ADVERTISING AND CHANGE IN MARKET SHARE. [Figure: change in market share (%) plotted against ad spending (\$ mil), with the estimated regression line.]

MATH TO STATISTICS. NULL HYPOTHESES: State the opposite of what you wish to prove and find a counterexample. CRITICAL VALUES: You reject the null hypothesis when the test statistic clears the hurdle (the critical value).

CRITICAL ASSUMPTIONS: CORRECT STATISTICAL METHOD CHOSEN (e.g. regression); STATIONARITY (no trend effects); LEAST SUM OF SQUARED ERROR IS THE APPROPRIATE CRITERION; RANDOMNESS OF OUTSIDE INFLUENCES (no autocorrelation or heteroscedasticity); STATISTICAL DISCRIMINATION POSSIBLE (no multicollinearity).

FITTING THE REGRESSION LINE. [Figure: market share plotted against advertising spending, with the fitted line; a marks the intercept and b the slope.] M.Share = a + b * (Advt. Spending) = 0.858 + 0.2246 * (Advt. Spending)
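As a minimal sketch of how such a line can be fitted, the closed-form least-squares estimates for one independent variable are computed below. The ad spending values are hypothetical (the slide's underlying data is not in the transcript), and the market shares are generated from the slide's fitted line, so the fit should simply recover a = 0.858 and b = 0.2246.

```python
def fit_line(x, y):
    """Return (a, b) minimizing the sum of squared errors for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means
    a = mean_y - b * mean_x
    return a, b

# Hypothetical ad spending figures ($ mil), not taken from the slides
ad_spending = [1.0, 2.0, 3.0, 4.0, 5.0]
# Shares generated from the slide's line, so there is no error term here
market_share = [0.858 + 0.2246 * s for s in ad_spending]

a, b = fit_line(ad_spending, market_share)
print(round(a, 4), round(b, 4))
```

With real data the points would scatter around the line, and a and b would be the values that make the sum of squared vertical distances as small as possible.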

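ADVERTISING AND MARKET SHARE: CIGARETTES. [Figure: market share (%) plotted against ad spending (\$ mil), showing the mean and the regression line. For one observation, the TOTAL error (distance from the mean) is split into the explained error (mean to regression line) and the UNexplained error (regression line to observation).]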

R-SQUARED = (EXPLAINED SUM OF SQUARED ERROR) / (TOTAL SUM OF SQUARED ERROR). FOR EXAMPLE: An R-squared value of 0.90 means that ninety percent of the variation in your dependent variable is explained by the independent variables.
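The ratio can be sketched in a few lines. The observations and fitted values below are hypothetical; the fitted values come from the least-squares line y = 2.2 + 0.6 * x over x = 1..5, and the sketch assumes a least-squares fit with an intercept, which is what lets the explained part be taken as total minus unexplained.

```python
def r_squared(y, y_hat):
    """Share of the variation in y explained by the fitted values y_hat."""
    mean_y = sum(y) / len(y)
    # Total error: each observation's distance from the mean
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    # Unexplained error: each observation's distance from the regression line
    unexplained_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    # For a least-squares fit with an intercept, explained = total - unexplained
    explained_ss = total_ss - unexplained_ss
    return explained_ss / total_ss

# Hypothetical data, not from the slides
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # fitted values from y = 2.2 + 0.6 * x

print(round(r_squared(y, y_hat), 3))
```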

F-STATISTIC = (EXPLAINED MEAN SQUARE ERROR) / (UNEXPLAINED MEAN SQUARE ERROR). Null Hypothesis: The dependent variable is not explained by a combination of all of the independent variables together. Go to the F-tables (0.05 level) to find the critical value for rejecting the null hypothesis.
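A small sketch of the calculation, under the usual convention that each sum of squares is divided by its degrees of freedom (the number of independent variables for the explained part, observations minus variables minus one for the unexplained part). The sums of squares are hypothetical, and the critical value is the tabulated 0.05 value for 1 and 3 degrees of freedom.

```python
def f_statistic(explained_ss, unexplained_ss, n_obs, n_vars):
    """Explained mean square divided by unexplained mean square."""
    explained_ms = explained_ss / n_vars
    unexplained_ms = unexplained_ss / (n_obs - n_vars - 1)
    return explained_ms / unexplained_ms

# Hypothetical example: 5 observations, 1 independent variable
f_value = f_statistic(explained_ss=3.6, unexplained_ss=2.4, n_obs=5, n_vars=1)

# Value read from an F-table at the 0.05 level for (1, 3) degrees of freedom
F_CRITICAL = 10.13
reject_null = f_value > F_CRITICAL
print(round(f_value, 2), reject_null)
```

Here the statistic does not clear the hurdle, so the null hypothesis that the independent variables explain nothing is not rejected.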

t-STATISTIC. The t-statistic tests the significance of each variable separately: the null hypothesis is rejected for a variable when its t-statistic exceeds the critical value. Null Hypothesis: The dependent variable is not related to the independent variable. Go to the t-tables (0.05 level) to find the critical value for rejecting the null hypothesis in a two-tail test. For a one-tail test at the same level, go to the 0.10 column.
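For a single independent variable, the t-statistic on the slope is the estimated coefficient divided by its standard error. The sketch below assumes the simple one-variable case and uses hypothetical data; the critical values are the tabulated ones for 3 degrees of freedom (5 observations minus 2 estimated coefficients).

```python
import math

def slope_t_statistic(x, y):
    """t-statistic for the slope b in the least-squares fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Unexplained sum of squares around the fitted line
    unexplained_ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope, with n - 2 degrees of freedom
    std_error_b = math.sqrt(unexplained_ss / (n - 2) / sxx)
    return b / std_error_b

# Hypothetical data, not from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
t_value = slope_t_statistic(x, y)

T_CRITICAL_TWO_TAIL = 3.182  # t-table, 0.05 column, 3 degrees of freedom
T_CRITICAL_ONE_TAIL = 2.353  # t-table, 0.10 column, 3 degrees of freedom
print(round(t_value, 3),
      abs(t_value) > T_CRITICAL_TWO_TAIL,
      t_value > T_CRITICAL_ONE_TAIL)
```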

HETEROSCEDASTICITY. [Figure: observations scattered around a regression line, with LARGE ERROR at one end and SMALL ERROR elsewhere.]

HETEROSCEDASTIC PATTERNS OF ERROR. [Figure: three error patterns: scattered at one end, scattered in the middle, scattered at both ends.]
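One rough way to look for the "scattered at one end" pattern is to compare the size of the errors in the lower and upper halves of the independent variable. The sketch below is only a crude screen in the spirit of split-sample tests, not a formal test, and it would miss the "middle" and "both ends" patterns; the function name and the data are hypothetical.

```python
def error_spread_ratio(x, residuals):
    """Mean squared residual in the upper half of x divided by the lower half."""
    # Order the residuals by the independent variable
    ordered = [e for _, e in sorted(zip(x, residuals))]
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]  # the middle point is dropped when the count is odd
    lower_ms = sum(e ** 2 for e in lower) / half
    upper_ms = sum(e ** 2 for e in upper) / half
    return upper_ms / lower_ms

# Hypothetical residuals: small errors at low x, large errors at high x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
residuals = [0.1, -0.1, 0.1, -1.0, 1.0, -1.0]

print(round(error_spread_ratio(x, residuals), 1))
```

A ratio far from 1 suggests the error variance is not constant across the range of x.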

AUTOCORRELATION. POSITIVE AUTOCORRELATION (e.g. curvilinear pattern or other nonlinear pattern). NEGATIVE AUTOCORRELATION (e.g. alternation above and below the regression line). [Figure: one error pattern for each case.]

DURBIN-WATSON TEST FOR AUTOCORRELATION

| Durbin-Watson statistic | Conclusion |
|---|---|
| 0 to 0.72 | Reject the null hypothesis that there is no POSITIVE autocorrelation |
| 0.72 to 1.74 | Uncertain region for POSITIVE autocorrelation |
| 1.74 to 2.00 | No POSITIVE autocorrelation |
| 2.00 to 2.26 | No NEGATIVE autocorrelation |
| 2.26 to 3.28 | Uncertain region for NEGATIVE autocorrelation |
| 3.28 to 4.00 | Reject the null hypothesis that there is no NEGATIVE autocorrelation |
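The Durbin-Watson statistic itself is the sum of squared changes between successive residuals divided by the sum of squared residuals. The sketch below computes it and classifies the result using the slide's cut-offs (0.72, 1.74, 2.26, 3.28); those bounds belong to the slide's example and would normally be looked up for the sample size and number of variables at hand. The residual series are hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    changes = sum((cur - prev) ** 2
                  for prev, cur in zip(residuals, residuals[1:]))
    total = sum(e ** 2 for e in residuals)
    return changes / total

def interpret(dw, lower=0.72, upper=1.74):
    """Classify a statistic using lower and upper bounds (defaults from the slide)."""
    if dw < lower:
        return "positive autocorrelation"
    if dw < upper:
        return "uncertain (positive)"
    if dw <= 4 - upper:
        return "no autocorrelation"
    if dw <= 4 - lower:
        return "uncertain (negative)"
    return "negative autocorrelation"

# Hypothetical residuals ordered in time
runs = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]   # long runs on one side of the line
alternating = [1.0, -1.0, 1.0, -1.0]       # flips sign every period

for series in (runs, alternating):
    dw = durbin_watson(series)
    print(round(dw, 2), interpret(dw))
```

Values near 2 indicate no autocorrelation, values near 0 indicate positive autocorrelation, and values near 4 indicate negative autocorrelation.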

STATISTICS TO DATA. How is data defined and collected? Is the data consistently collected across all units? How should the data be transformed for your particular use?

DATA COLLECTION. TIME SERIES: measures variation of a unit or variable over several time periods. CROSS SECTION: measures variation during a given time period over several different units. POOLED CROSS SECTION-TIME SERIES: measures variation of different units over different time periods.

TIME SERIES TRANSFORMATIONS: SAMPLE SIZE; AGGREGATION OF TIME (YEAR? DAY?); AGGREGATION OF UNIT (FIRM, MARKET, INDUSTRY); SPECIAL EVENT (DUMMY VARIABLE); MATH TRANSFORMATIONS.

LOGARITHMS; INVERSE; PERCENTAGE CHANGES; INFLATION, SEASONALITY.
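A few of these transformations can be sketched on a short series. The function names, the revenue figures, and the price index are all hypothetical; the deflation step is one common way of handling inflation (dividing by a price index), assumed here rather than taken from the slides, and seasonality is not covered.

```python
import math

def log_transform(series):
    """Natural logarithm of each value (values must be positive)."""
    return [math.log(v) for v in series]

def inverse(series):
    """Reciprocal of each value (values must be nonzero)."""
    return [1.0 / v for v in series]

def pct_change(series):
    """Percentage change from each period to the next."""
    return [100.0 * (cur - prev) / prev
            for prev, cur in zip(series, series[1:])]

def deflate(nominal, price_index, base=100.0):
    """Express nominal values in base-period prices."""
    return [v * base / p for v, p in zip(nominal, price_index)]

# Hypothetical annual revenues and price index
revenues = [100.0, 110.0, 121.0]
prices = [100.0, 110.0, 121.0]

print(pct_change(revenues))
print(deflate(revenues, prices))
print([round(v, 3) for v in log_transform(revenues)])
```

In this made-up series the revenue growth only matches inflation, so the deflated series is flat.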

STATISTICAL PROCEDURES REQUIRED FOR DIFFERENT KINDS OF PROBLEM SOLVING. ARE THERE MANY EQUATIONS? If yes, simultaneous equation estimation procedures should be used. If no: DO THEY INVOLVE LINEAR FUNCTIONS? If no, use NONlinear regression or other NONlinear estimation techniques. If yes: Is there more than one independent variable? If yes, apply Multiple Linear Regression. If no, use Simple Linear Regression.
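The decision flow can be written as a small function. This is a sketch of one reading of the flowchart (many equations first, then linearity, then the number of independent variables); the function and parameter names are hypothetical.

```python
def choose_procedure(many_equations, linear, n_independent_vars):
    """Pick an estimation procedure by asking the flowchart's three questions."""
    if many_equations:
        return "simultaneous equation estimation"
    if not linear:
        return "nonlinear regression or other nonlinear estimation"
    if n_independent_vars > 1:
        return "multiple linear regression"
    return "simple linear regression"

# One equation, linear, four independent variables (income, interest rates,
# prices, time), as in the revenue hypothesis
print(choose_procedure(many_equations=False, linear=True, n_independent_vars=4))
```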