Presentation on theme: "S TATISTICS IN PROFESSIONAL SOCCER A comparison of styles among the top professional soccer teams in the world."— Presentation transcript:
S TATISTICS IN PROFESSIONAL SOCCER A comparison of styles among the top professional soccer teams in the world
INTRODUCING THE DATA Statistical revolution in other sports Why not in soccer? Statistics kept during soccer games Individual vs. Team Statistics Goals for and goals against for winners of EPL, La Liga, MLS, and Serie A for past 6 seasons 24 data points Nonparametric vs. Parametric methods
T HE DATA CountryYearProportion GFProportion GAGFPGGAPG Spain200666/3840/ Italy200680/3834/ USA200652/3238/ England200683/3827/ Spain200784/3836/ Italy200769/3826/ USA200756/3034/ England200780/3822/ Spain /3835/ Italy200870/3832/ USA200850/3036/ England200868/3824/ Spain200998/3824/ Italy200975/3834/ USA200941/3031/ England /3832/ Spain201095/3821/ Italy201065/3824/ USA201044/3026/ England201078/3837/ Spain /3832/ Italy201168/3820/ USA201148/3428/ England201193/3829/
E XPLANATION OF S TYLES Italy Catenaccio Spain Tiki-Taka USA Not developed their own style England International mix of styles
T ESTING FOR GENERAL DIFFERENCES BETWEEN LEAGUES (K RUSKAL -W ALLIS ) Model Y ij = Ɵ +τ j + ij, i=1..6, j=1..4 H o : τ 1 =τ 2 =τ 3 =τ 4 H a : Not all τ j ’s are equal. H = DF = 3 P = Conclusion
P ARAMETRIC E QUIVALENT : O NE -W AY A NOVA Model Y ij =m+a i +e ij i=1..6,j=1..4 H 0 : µ 1 = µ 2 = µ 3 = µ 4 H a : Not all µ are equal. P-value F=9.75, df=3,20
T ESTING FOR N ORMALITY
A NOTHER W AY OF T ESTING V ARIANCES Descriptive Statistics: GFPG Variable Country StDev GFPG England Italy Spain USA
J ONCKHEERE -T ERPSTRA T EST F OR O RDERED A LTERNATIVES H o : τ 1 =τ 2 =τ 3 =τ 4 H a : τ 4 ≤ τ 3 ≤ τ 2 ≤ τ 1 Why did I choose this alternative? J = 189, p-value = 1.549e-05 There is significant evidence of differences in goals for per game across countries Which countries differ?
M ULTIPLE C OMPARISONS Decide τ u ≠τ v if abs(W * uv )≥3.633, which is based on q α, a large sample approximation For Spain vs. USA, I got the following answer: (56-39)/4.415=3.90 England vs. USA: (56-39)/4.415=3.90 These were the only two significant differences
T UKEY ’ S M ETHOD Grouping Information Using Tukey Method Country N Mean Grouping Spain A England A B Italy B C USA C
C OMPARING GOALS FOR AND G OALS A GAINST : H OW DO THE T OP T EAMS W IN ?
N ONPARAMETRIC C ORRELATION T ECHNIQUES Kendall's rank correlation tau data: Goalsf and Goalsa z = , p-value = Sample estimates: tau
A CCOUNTING FOR R ECENT T RENDS (G ENERAL L INEAR M ODEL ) Model Y i =β 0 + β 1 X i1 + β 2 X i2 +E i Estimate Std. Error t value Pr(>|t|) (Intercept) Countries e-05 *** Year No predictive linear relationship between Year and GFPG