
Baseball Statistics: Just for Fun!

2/16 Issues, Theory, and Data
Hypothesis: Do home run hitters record more strikeouts and four balls (walks), and fewer steals?
Data collection: Korea Baseball Organization (KBO) and US Major League Baseball home pages.
Model: y1 = #strikeouts, y2 = #steals, y3 = #4Bs (four balls), x = #HRs. Regress each y on a constant and x.
Hypothesis testing: Test the statistical significance of the regression slopes using t-tests.
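The "regress y on a constant and x, then t-test the slope" step can be sketched as follows. This is a minimal illustration in plain Python, not the software used for the slides; the function name `slope_t_test` and the six data points are invented for the example and are not KBO or MLB figures.

```python
import math


def slope_t_test(x, y):
    """Fit y = a + b*x by least squares and return (a, b, t-value of b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope


# Hypothetical season totals for six hitters (illustrative numbers only).
hrs = [5, 10, 15, 20, 25, 30]
strikeouts = [40, 52, 58, 75, 80, 95]

intercept, slope, t_value = slope_t_test(hrs, strikeouts)
print(f"intercept={intercept:.2f} slope={slope:.2f} t={t_value:.2f}")
```

The same function would be applied once per dependent variable (strikeouts, steals, four balls), each time with #HRs as x.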

3/16 2. Data Collection
Sources: KBO; US Major League Baseball

4/16 3. Model I
$(\#\text{strikeouts}) = \alpha_1 + \beta_1 (\#\text{HRs}) + \varepsilon$

5/16 3. Model II
$(\#\text{steals made}) = \alpha_2 + \beta_2 (\#\text{HRs}) + \varepsilon$
$(\#\text{steals attempted}) = \alpha_3 + \beta_3 (\#\text{HRs}) + \varepsilon$

6/16 3. Model III
$(\#\text{four balls}) = \alpha_4 + \beta_4 (\#\text{HRs}) + \varepsilon$

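7/16 4. Hypothesis Testing: t-test on $\beta$
Strikeouts: $\beta_1 = ??$ Estimate: $\beta_1 = 0.84$, t-value = 2.89. Significant.
Four balls: $\beta_4 = ??$ Estimate: $\beta_4 = 0.51$, t-value = 2.50. Significant.
Steals made and attempted: $\beta_2, \beta_3 = ??$ Estimates: $\beta_2 =$ (value missing), $\beta_3 =$ (value missing); t-values: (value missing) and -1.14. Insignificant.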
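As a reminder of how the reported t-values are read, here is a short worked check. The notation ($\hat{\beta}$ for a slope estimate) and the 5% two-sided critical value of about 1.96 are assumptions: the slides do not state the significance level or the sample size, and 1.96 is the large-sample value.

```latex
t = \frac{\hat{\beta}}{\operatorname{se}(\hat{\beta})},
\qquad \text{reject } H_0\colon \beta = 0 \ \text{ if } \ |t| > 1.96 .

|2.89| > 1.96 \ \text{(strikeouts: significant)}, \quad
|2.50| > 1.96 \ \text{(four balls: significant)}, \quad
|-1.14| < 1.96 \ \text{(steals: insignificant)}.
```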

8/16 4. Hypothesis Testing
(1) HR hitters get more strikeouts!
(2) HR hitters do not steal bases well because of their big bodies? Insignificant.
(3) HR hitters draw more four balls!

Wait a minute! To prevent “spurious correlation” between #HRs and #strike-outs, #steals, and #4Balls, we need to control for the number of appearances in the batter's box. Right!

10/16 Multiple Regression: control for “#at bats”
Without controlling for #at bats, a hitter with more appearances would record a higher number in each category than others, generating “spurious correlation” between any pair of variables among #HRs, #strike-outs, #steals, and #four balls.
Two ways to control for the number of appearances in the batter's box:
1. Use a subsample of hitters who appeared more than a minimum number of times.
2. Use “#at bats” as a control variable in multiple regression.
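The spurious-correlation argument can be illustrated with a small simulation. This is a sketch under stated assumptions, not the slides' data or method: the per-at-bat rates (0.04 HRs and 0.20 strikeouts), the noise levels, the seed, and the helper `ols` are all invented. By construction, strikeouts depend only on at bats, never on home runs.

```python
import random


def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Build the normal equations (X'X) b = X'y.
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [A[r][j] - f * A[c][j] for j in range(k)]
                b[r] -= f * b[c]
    return [b[i] / A[i][i] for i in range(k)]


random.seed(0)
# Hypothetical hitters with very different playing time (assumed range).
at_bats = [random.randint(50, 600) for _ in range(200)]
# Both counts grow with at bats; strikeouts do NOT depend on home runs.
hrs = [0.04 * ab + random.gauss(0, 3) for ab in at_bats]
strikeouts = [0.20 * ab + random.gauss(0, 10) for ab in at_bats]

simple = ols([hrs], strikeouts)               # [intercept, slope on #HRs]
controlled = ols([hrs, at_bats], strikeouts)  # [intercept, #HRs, #at bats]

print("without control: slope on #HRs =", round(simple[1], 2))
print("with control:    slope on #HRs =", round(controlled[1], 2),
      " slope on #at bats =", round(controlled[2], 2))
```

In this setup the simple regression reports a large positive slope on #HRs even though home runs have no effect on strikeouts, while adding #at bats as a regressor moves that slope close to zero.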

11/16 Model I (extended)
$(\#\text{strikeouts}) = \alpha_1 + \beta_1 (\#\text{HRs}) + \beta_2 (\#\text{at bats}) + \varepsilon$

12/16 Results: Model I (extended), t-values in parentheses
Sample       Control for #at bats   beta_1 (#HRs)    beta_2 (#at bats)
Sub-sample   no                     0.84 (2.89)
Sub-sample   yes                    0.89 (2.88)      (value missing) (-0.49)
Entire       no                     2.40 (11.64)
Entire       yes                    0.63 (3.11)      0.14 (12.53)

13/16 Interpretation
When using a sub-sample that is already rather homogeneous in number of at bats, it makes little difference whether you control for #at bats: the #HRs slope is 0.89 (2.88) with the control and 0.84 (2.89) without it. However, when using the entire sample, which comprises hitters who differ vastly in number of at bats, controlling for #at bats does matter: the slope falls from 2.40 (11.64) to 0.63 (3.11) once #at bats is included, and #at bats itself has a slope of 0.14 (12.53). In the entire sample, you would get distorted results if you did not control for #at bats.

14/16 Model II (extended)
$(\#\text{4Balls}) = \alpha_1 + \beta_1 (\#\text{HRs}) + \beta_2 (\#\text{at bats}) + \varepsilon$

15/16 Results: Model II (extended), t-values in parentheses
Sample       Control for #at bats   beta_1 (#HRs)    beta_2 (#at bats)
Sub-sample   no                     0.51 (2.50)
Sub-sample   yes                    0.34 (1.71)      0.12 (2.77)
Entire       no                     1.32 (11.01)
Entire       yes                    0.33 (2.73)      0.07 (11.51)

The End. Was it fun?