INT Take, INT Give, Fumble Take, Fumble Give, Penalty YDS, Pass YD/ATT, Rush YD/ATT, Completion %, Rush YPG, Pass Yards PG, 3rd Down Conversions, 4th Down Conversions
Pearson Correlation: A technique that determines the strength of the linear relationship between two variables. +1 indicates they are perfectly related in a positive linear sense. Example: as caloric intake increases, weight increases. -1 indicates they are perfectly related in a negative linear sense. Example: car price goes down as age goes up. Zero indicates there is no correlation. Example: the number of Red Sox fans named Steve and the number of games the Red Sox win this year.
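The coefficient described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the presentation: the function name `pearson_r` and the sample numbers are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two variables around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative numbers (NOT data from the presentation)
car_age = [1, 2, 3, 4, 5]
car_price = [19000, 18000, 17000, 16000, 15000]  # falls exactly $1,000 per year
same_direction = [10, 20, 30, 40, 50]            # rises in lockstep with age

negative = pearson_r(car_age, car_price)       # perfectly negative linear relationship
positive = pearson_r(car_age, same_direction)  # perfectly positive linear relationship
```

Real data will rarely land exactly on -1 or +1; values near zero suggest little or no linear relationship.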
Correlation to Wins (on a scale of -1 to +1)
INT Take: 0.21
INT Give: -0.43
Fumble Take: -0.18
Fumble Give: -0.24
Penalty YDS: 0.03
Pass YD/ATT: 0.56
Rush YD/ATT: 0.15
Completion %: 0.49
Rush Yards PG: 0.48
Pass Yards PG: 0.30
3rd Down Conversions: 0.55
4th Down Conversions: 0.04
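One way to pick the strongest metrics from a list like this is to rank them by the absolute value of the coefficient, since a strong negative correlation is as useful as a strong positive one. The sketch below assumes each coefficient on the slide belongs to the metric label that follows it; the function name `strongest_metrics` is hypothetical.

```python
# Coefficients as read from the slide, assuming each value pairs with the label after it.
# (The slide prints "Rush YT/ATT"; it is taken here to mean rushing yards per attempt.)
correlation_to_wins = {
    "INT Take": 0.21,
    "INT Give": -0.43,
    "Fumble Take": -0.18,
    "Fumble Give": -0.24,
    "Penalty YDS": 0.03,
    "Pass YD/ATT": 0.56,
    "Rush YD/ATT": 0.15,
    "Completion %": 0.49,
    "Rush Yards PG": 0.48,
    "Pass Yards PG": 0.30,
    "3rd Down Conversions": 0.55,
    "4th Down Conversions": 0.04,
}

def strongest_metrics(correlations, top_n):
    """Return the top_n metric names ranked by absolute correlation."""
    ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
    return ranked[:top_n]

top_four = strongest_metrics(correlation_to_wins, 4)
```

Ranking alone does not check whether two strong metrics overlap (for example, completion percentage and passing yards per attempt), so treat the ranking as a starting point rather than a final selection.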
Merchandise Trends Bucket: Months On Hand (COGS / Ending Inventory); Inventory Turn; Markdowns; On-hand / Not Sold. Many of these can be evaluated as “whole store” or, more granularly, “by merchandise category”.
Human Bucket: Customer Survey Scores (total and by question); Turnover; Tenure (Manager / Non-manager); Training/Certification Compliance Rates; Workers Comp Rates; Payroll to plan (LP and sales separately); Engagement Scores.
LP Statistics: LP Staff; Internals; Externals; LP Productivity. Technology: EAS Activations; POS Exceptions; Voids; Dummy SKU usage; No receipt refunds; Line item voids. Compliance / Process: Store Self Inspection Score; LP Audit Score.
Regression Overview: Define Regression Analysis; Everyday life examples; Making the transition to Loss Prevention; Running a regression to predict shrink; Questions.
Regression: A method used to identify and measure the relationship between two or more variables. In regression there is always one “dependent” variable and one or more “independent” variables. The benefit of using regression is that you can make reasonable estimates about expected results.
Regression in Everyday Life
USED CAR ADS
Year  Make   Model   List Price  Miles
2007  Honda  Accord  $20,599     18,998
2007  Honda  Accord  $18,499     18,205
2007  Honda  Accord  $17,499     15,155  (lowest mileage)
2007  Honda  Accord  $17,499     34,802  (highest mileage)
Notice that the lowest-mileage and highest-mileage vehicles are both listed at $17,499.
It appears that at least one of the prices is too high. How can we determine what the correct price should be? We can pull sample data and run a regression analysis in Excel!
Pulling Sample Data: The order of the data is important. For Excel's Regression tool, always put the dependent variable (price) in the column to the left of the independent variables. The independent variables (age and miles) should be placed in the adjacent columns to its right.
Start by selecting Tools on the top menu, then select Data Analysis… (If Data Analysis does not appear in the menu, the Analysis ToolPak add-in needs to be enabled first.)
The Data Analysis dialogue box will open. Scroll down in the dialogue box and select Regression.
The Regression dialogue box will open. In the Input Y Range box, define the range of the dependent variable, including the title. Price is in column B. Next, in the Input X Range box, select the range for the independent variables. Age and mileage are in columns C and D. Finally, check the Labels box.
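For readers who want to see what the Regression tool is doing, the same kind of fit (ordinary least squares) can be sketched in plain Python. The listings below are synthetic, not the author's sample: the prices are generated from the coefficients quoted in this presentation ($20,100 base, -$964.67 per year, -$28.77 per 1,000 miles), so the fit should simply recover those numbers. The helper names `solve_linear_system` and `fit_ols` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides together
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("independent variables are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back-substitution
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x

def fit_ols(y, columns):
    """Ordinary least squares; returns [intercept, coef_1, coef_2, ...]."""
    rows = [[1.0] + list(values) for values in zip(*columns)]  # leading 1.0 = intercept
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic listings (NOT the author's sample): age in years, mileage in thousands
age = [1, 2, 2, 3, 4, 5]
miles_k = [12.0, 18.2, 34.8, 41.0, 55.5, 60.0]
# Prices generated from the coefficients quoted in the presentation
price = [20100 - 964.67 * a - 28.77 * m for a, m in zip(age, miles_k)]

intercept, per_year, per_1k_miles = fit_ols(price, [age, miles_k])
```

Because the synthetic prices sit exactly on the regression plane, the fit is perfect here; real listings would scatter around it, and the Multiple R would fall below 1.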
Here we can see that the “Multiple R” is .859. As with the Pearson Correlation coefficient, the closer this number is to 1, the more accurate the estimates made from the regression will be. The area that we want to focus on is the coefficients section of the output.
The Honda Accord Coefficients: Start with the baseline price ($20,100). For each year of age, subtract $964.67. For each 1K miles, subtract $28.77. So how much should we expect to pay for a 2007 Honda Accord with no more than 20,000 miles on it?
A 2007 Honda Accord is 2 years old, and we assume it has 20,000 miles on it. Per the regression, we start with $20,100 as the base price. For each year of age we subtract $964.67; our vehicle is two years old, which equates to -$1,929 (2 * -$964.67). Finally, for each 1,000 miles we subtract $28.77; for the 20K miles we estimate on our vehicle, this equates to -$575 (20 * -$28.77). Therefore a 2007 Honda Accord with 20,000 miles should cost us about $20,100 - $1,929 - $575, or $17,596.
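The arithmetic above can be written as a small function. This is a sketch using the three coefficients quoted in the presentation; the function and constant names are made up for illustration.

```python
# Coefficients quoted in the presentation's Honda Accord regression
BASE_PRICE = 20100.00
PER_YEAR = -964.67       # change in price for each year of age
PER_1K_MILES = -28.77    # change in price for each 1,000 miles

def predicted_price(age_years, miles):
    """Estimated price: base price plus the age term plus the mileage term."""
    return BASE_PRICE + PER_YEAR * age_years + PER_1K_MILES * (miles / 1000)

estimate = predicted_price(age_years=2, miles=20_000)
```

Without intermediate rounding the estimate comes to about $17,595, a dollar below the $17,596 figure, which rounds each term to whole dollars before summing.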
What is it really worth?
USED CAR ADS
Year  Make   Model   List Price  Miles   FMV
2007  Honda  Accord  $20,599     18,998  $17,624
2007  Honda  Accord  $18,499     18,205  $17,647
2007  Honda  Accord  $17,499     15,155  $17,735
2007  Honda  Accord  $17,499     34,802  $17,170
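As a check, the fair market values (FMV) can be recomputed from the rounded coefficients quoted earlier and compared with each list price. This is a sketch under the assumption that every car is 2 years old; the names are hypothetical, and because the quoted coefficients are rounded, the results can differ from the slide by about a dollar.

```python
BASE_PRICE, PER_YEAR, PER_1K_MILES = 20100.00, -964.67, -28.77

# (list price, miles) for the four 2007 Accords in the ads
ads = [(20599, 18998), (18499, 18205), (17499, 15155), (17499, 34802)]

def fair_market_value(age_years, miles):
    """Regression estimate of what the car should cost."""
    return BASE_PRICE + PER_YEAR * age_years + PER_1K_MILES * miles / 1000

report = []
for list_price, miles in ads:
    fmv = fair_market_value(2, miles)
    # Positive difference: listed above the estimate; negative: listed below it
    report.append((list_price, miles, round(fmv), round(list_price - fmv)))

for list_price, miles, fmv, diff in report:
    print(f"list ${list_price:,}  miles {miles:,}  FMV ${fmv:,}  over/under {diff:+,}")
```

On these numbers only the lowest-mileage car is listed below its estimate; the other three are listed above it.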
How many games should a team win? Rushing YPG, INT Give, Pass YD/ATT, 3rd Down Conversions, Defensive Sacks
Predicting Wins Output: Looking at the Multiple R, we can see that the value is .870, which is close to 1 and indicates that these five metrics combined have a strong correlation to victories. Again, we only want to focus on the coefficients section of the output.
Predicting Wins Regression: Pittsburgh won the Super Bowl. According to the regression, how many games should Pittsburgh have won based on the following information? How many should the Lions have won?
Build Your Own Shrink Predictor Report: Update your information in these 5 columns.
Shrink Predictor Tool: From the Summary Output, copy the highlighted cells and paste them into the Coefficient Updater tab in the Shrink Predictor Workbook.
Here we will take the information from the Summary Output and paste it into the yellow boxes on the Coefficient Updater tab. Once you have pasted the information into the shaded boxes, click on the Shrink Predictor Tool tab.
When you click on the Shrink Predictor Tool tab you will see that the Predicted Shrink column is filled out. This means that the equation to predict shrink is now functional.
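Conceptually, the Predicted Shrink column applies the regression equation: the intercept plus each coefficient multiplied by the store's value for that metric. The sketch below illustrates that calculation outside the workbook. Every number and metric name in it is a hypothetical placeholder, not a value from the Shrink Predictor Tool.

```python
def predict_shrink(intercept, coefficients, metrics):
    """Regression equation: intercept + sum(coefficient * metric value)."""
    missing = set(coefficients) - set(metrics)
    if missing:
        raise KeyError(f"store is missing metrics: {sorted(missing)}")
    return intercept + sum(coefficients[name] * metrics[name] for name in coefficients)

# Hypothetical values standing in for the Summary Output (NOT from the workbook)
intercept = 0.50
coefficients = {
    "turnover_pct": 0.01,
    "no_receipt_refund_pct": 0.20,
    "markdown_pct": 0.05,
    "lp_audit_score": -0.004,  # higher audit scores pull predicted shrink down
}

# Hypothetical current-month metrics for one store
store_metrics = {
    "turnover_pct": 60.0,
    "no_receipt_refund_pct": 2.0,
    "markdown_pct": 8.0,
    "lp_audit_score": 90.0,
}

predicted = predict_shrink(intercept, coefficients, store_metrics)
```

Keeping the coefficients in one place mirrors the Coefficient Updater tab: when the regression is rerun on new historical data, only those values change.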
Equation based on Historical Data With Current Data - After
Shrink Predictor Tool: The true value of a Shrink Predictor is to identify stores whose predicted shrink is materially higher than their past results, or to identify outliers in the current model. It is not meant to identify the highest-shrink stores. The sample data we provided contains 192 stores. Of these, 42 stores (21% of the sample) had variances greater than the standard deviation of .60; the remaining 79% fell within one standard deviation. In most cases, the more metrics you use, the lower the standard deviation and, consequently, the more accurate your prediction will be.
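The outlier check described above can be sketched as follows. It assumes that "variance" means actual shrink minus predicted shrink for each store and that the threshold is the population standard deviation of those variances; both are interpretations, and the store rows are hypothetical rather than taken from the 192-store sample.

```python
import statistics

# Hypothetical (store, actual shrink %, predicted shrink %) rows, NOT the sample workbook
stores = [
    ("Store 101", 1.8, 1.7),
    ("Store 102", 2.9, 1.6),
    ("Store 103", 1.2, 1.3),
    ("Store 104", 0.9, 1.9),
    ("Store 105", 2.0, 2.1),
    ("Store 106", 1.5, 1.4),
]

# Variance here = actual minus predicted (an assumed definition)
variances = {name: actual - predicted for name, actual, predicted in stores}

# One standard deviation of the variances is the cut-off for an outlier
threshold = statistics.pstdev(list(variances.values()))

outliers = sorted(name for name, v in variances.items() if abs(v) > threshold)
share_within = 1 - len(outliers) / len(stores)
```

Stores flagged this way are the ones worth a closer look, whether they came in well above or well below the model's prediction.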
Recap of Steps
1. Collect Historical Data (monthly trickles, not annual). Test with old data (2007 or 2008). Run Pearson Correlation to select the strongest metrics.
2. Run Regression Analysis on the 4 strongest 2007 or 2008 metrics using the tool we provided. Establish the Intercept, Multiple R and Coefficients.
3. Apply the Regression Results (Intercept, Multiple R and Coefficients) to current 2009 monthly metrics to predict future shrink using the tool provided.
Roll-out: The selling point is “Predict versus React”. One version of the truth, one focus.
Additional Considerations: Segment stores if diversity is pronounced. Consider: box size; box format; volume; risk rating; brand; technology (CCTV / EAS); major event (remodel, crisis event). Use the same methodology to create Risk Ratings on static data (crime index, census info, 3YR shrink, etc.).
Questions
Michael Sanders, Shrinkage Control Analyst, J.C. Penney Company, Inc.
firstname.lastname@example.org
To download the Shrink Predictor Tool, visit: http://www.rila.org/protection/resources/Documents/SHRINKPREDICTORTOOL.xls
To view the help guide for the Shrink Predictor Tool, visit: http://www.rila.org/protection/resources/Documents/SHRINKPREDICTORTOOLHELPGUIDE.pdf
Additional resources: Google “Regression Analysis” or “Pearson Correlation”; search Excel Help for “Regression Analysis” or “Pearson Correlation”; www.visualstatistics.net