Statistical Arbitrage. Ying Chen, Leonardo Bachega, Yandong Guo, Xing Liu. March 2010.


1 Statistical Arbitrage
  Ying Chen, Leonardo Bachega, Yandong Guo, Xing Liu
  March 2010

2 Outline
  - Overview of the project
  - Implementation issues
  - Data adjustment mistakes
  - Stock classification
  - Future work

3 Framework (diagram labels)
  - Raw historical data from WRDS
  - Data pre-processing (Python scripts)
  - Adjusted stock price series + indices
  - PCA eigenportfolios
  - ETFs for industry sectors
  - Market model, 252-day returns
  - Market model, 60-day returns
  - Residual process model: residuals as increments of an AR process
  - Compute s-scores
  - Current stock prices
  - Signal trade orders
  - Back-testing simulations (MATLAB scripts)
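The slide names the steps (market model, residuals as increments of an AR process, s-scores) without giving formulas. The sketch below is one common way to compute an s-score and is an assumption, not the presenters' code: it regresses the stock's returns on the factor returns, accumulates the residuals into a process X, fits an AR(1) model to X, and standardizes the last value by the equilibrium mean and standard deviation. The function name `s_score` and the example data are hypothetical.

```python
import numpy as np

def s_score(stock_returns, factor_returns):
    """Return the s-score of one stock, or None if the AR(1) fit is not mean-reverting.

    stock_returns:  shape (T,)   daily returns of the stock (e.g. T = 60)
    factor_returns: shape (T, m) daily returns of the m factors
    """
    stock_returns = np.asarray(stock_returns, dtype=float)
    factor_returns = np.asarray(factor_returns, dtype=float)
    T = len(stock_returns)

    # Market model: regress stock returns on factor returns, with an intercept.
    A = np.column_stack([np.ones(T), factor_returns])
    beta, *_ = np.linalg.lstsq(A, stock_returns, rcond=None)
    residuals = stock_returns - A @ beta

    # Residuals are treated as increments of the process X.
    X = np.cumsum(residuals)

    # AR(1) fit: X[k+1] = a + b * X[k] + noise.
    B = np.column_stack([np.ones(T - 1), X[:-1]])
    (a, b), *_ = np.linalg.lstsq(B, X[1:], rcond=None)
    if not 0.0 < b < 1.0:
        return None  # no mean reversion detected in this window

    noise = X[1:] - (a + b * X[:-1])
    mean_level = a / (1.0 - b)
    sigma_eq = np.sqrt(np.var(noise) / (1.0 - b * b))
    if sigma_eq == 0.0:
        return None
    return float((X[-1] - mean_level) / sigma_eq)

# Hypothetical example: 60 days, 2 factors, synthetic data.
rng = np.random.default_rng(0)
example_factors = rng.normal(0.0, 0.01, size=(60, 2))
example_stock = example_factors @ np.array([1.0, 0.5]) + rng.normal(0.0, 0.005, size=60)
example_s = s_score(example_stock, example_factors)
```

A large positive score would suggest the residual is above its estimated equilibrium, and a large negative score the opposite; the thresholds used to open and close trades are not stated on the slide.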

4 Implementation Issues
  - Stock delists tomorrow (criterion: detected from tomorrow’s shares outstanding)
      - In the portfolio: close the position
      - Not in the portfolio: do not trade it, but still include it in the PCA calculation
  - Today’s price == 0 in the middle of the series
      - Exclude it from the PCA calculation and from trading
      - In the portfolio: keep it

5 Implementation Issues (Cont’d)
  - Market cap < 1B
      - Already in the portfolio: keep it and consider it for trading
      - Otherwise: exclude it from the PCA calculation and from trading
  - Stocks picked to calculate the eigenportfolios
      - Today’s price != 0
      - Previous 252 days have nonzero prices
      - Market cap > 1B, or already in the portfolio
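The rules on slides 4 and 5 can be sketched as a small per-stock, per-day classifier. This is a hedged reading of the bullets, not the presenters' code: the names `StockDay`, `classify` and `MIN_CAP` are hypothetical, and the handling of a delisting stock that is already in the portfolio (whether it still enters the PCA) is an assumption, since the slides do not say.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_CAP = 1e9  # the 1B market-cap threshold from the slide

@dataclass
class StockDay:
    price_today: float
    prices_prev_252: List[float]  # the previous 252 daily prices
    market_cap: float
    in_portfolio: bool
    delists_tomorrow: bool        # e.g. inferred from tomorrow's shares outstanding

def classify(s: StockDay) -> Tuple[bool, str]:
    """Return (use_in_pca, action) where action is 'close', 'trade', 'hold' or 'no_trade'."""
    has_history = len(s.prices_prev_252) == 252 and all(p != 0 for p in s.prices_prev_252)
    use_in_pca = (
        s.price_today != 0
        and has_history
        and (s.market_cap > MIN_CAP or s.in_portfolio)
    )
    if s.delists_tomorrow:
        # Close an open position; otherwise skip trading but keep the PCA flag.
        action = "close" if s.in_portfolio else "no_trade"
    elif use_in_pca:
        action = "trade"
    elif s.in_portfolio:
        action = "hold"  # e.g. today's price is 0: keep the position, do nothing
    else:
        action = "no_trade"
    return use_in_pca, action

history = [10.0] * 252
large_cap = StockDay(12.0, history, 5e9, False, False)
zero_price_held = StockDay(0.0, history, 5e9, True, False)
small_cap_new = StockDay(12.0, history, 5e8, False, False)
small_cap_held = StockDay(12.0, history, 5e8, True, False)
delisting_new = StockDay(12.0, history, 5e9, False, True)
delisting_held = StockDay(12.0, history, 5e9, True, True)
```

Keeping the PCA flag separate from the trading action mirrors the slide, where a stock can be excluded from trading yet still contribute to the eigenportfolios.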

6

7 Data Adjustment Mistakes
  Dividend adjustment

  DATE      PRC    SHROUT  DIVAMT  Adjusted Price  Yahoo Adjusted
  20081009  10.08  57428   .       0.98851         4.07
  20081010  8.77   145887  .       0.33855         3.54
  20081013  5.44   145887  5.23    0.21            5.44
  20081014  5.45   145887  .       5.4501          5.45
  20081015  5.14   145887  .       5.1401          5.14
  20081016  5.34   145887  .       5.3401          5.34
  20081017  5.33   145887  .       5.3301          5.33
  20081020  6.14   145887  .       6.1401          6.14
  20081021  5.96   145887  .       5.9601          5.96
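As a point of comparison for the table above, the following sketch applies a multiplicative back-adjustment: every price before an ex-dividend date is scaled by (1 - dividend / previous close). That this is the rule behind the "Yahoo Adjusted" column is an assumption, although it does reproduce the 4.07 and 3.54 figures when the 5.23 dividend is taken to go ex on 20081013. The function name `back_adjust_for_dividends` is hypothetical.

```python
def back_adjust_for_dividends(closes, dividends):
    """Back-adjust raw closes for cash dividends.

    closes:    raw closing prices in date order
    dividends: same length; cash dividend on its ex-date, 0.0 otherwise
    """
    adjusted = [0.0] * len(closes)
    factor = 1.0
    # Walk backwards so each dividend scales all earlier prices.
    for i in range(len(closes) - 1, -1, -1):
        adjusted[i] = closes[i] * factor
        if dividends[i] > 0 and i > 0:
            factor *= 1.0 - dividends[i] / closes[i - 1]
    return adjusted

# First four rows of the slide's table (PRC and DIVAMT columns).
raw_closes = [10.08, 8.77, 5.44, 5.45]
cash_dividends = [0.0, 0.0, 5.23, 0.0]
adjusted_closes = back_adjust_for_dividends(raw_closes, cash_dividends)
```

Prices on and after the ex-date are left unchanged, which matches the rows from 20081013 onward in the "Yahoo Adjusted" column.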

8 Data Adjustment Plan
  - Dividend adjustment
  - Split detection and adjustment using CFACPR and CFACSHR

  Columns: DATE PRC VOL SHROUT DIVAMT FACPR FACSHR CFACPR CFACSHR
  20090807  0.262606623346...0.05
  20090810  -7.117611670-0.95 11
  20090811  5.97519371167...11
  20090812  6.349934061167...11
  20090813  4.78261231167...11
  20090814  4.2999274861167...11
  20090817  4.0516581167...11
  20090818  4.360421167...11
  20090819  4.06100151167...11
  20090820  3.797288051167...11
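A minimal sketch of the split adjustment with the cumulative factors, following the usual CRSP convention: divide the price by CFACPR and multiply the shares outstanding by CFACSHR. The function name `crsp_adjust` is hypothetical. The example reads the 20090807 row as PRC 0.26, SHROUT 23346 and both cumulative factors 0.05, which is an assumption about how the flattened row splits into columns.

```python
def crsp_adjust(prc, shrout, cfacpr, cfacshr):
    """Return (adjusted_price, adjusted_shares) for one CRSP daily record.

    A negative PRC in CRSP marks a bid/ask average rather than a trade price,
    so the absolute value is used. Returns None when CFACPR is 0 (no factor).
    """
    if cfacpr == 0:
        return None
    adjusted_price = abs(prc) / cfacpr
    adjusted_shares = shrout * cfacshr
    return adjusted_price, adjusted_shares

# Day before the apparent 1-for-20 reverse split (assumed column split, see above).
before_split = crsp_adjust(0.26, 23346, 0.05, 0.05)
# First day after it, where both cumulative factors are 1.
after_split = crsp_adjust(-7.11, 1167, 1, 1)
```

Under that reading the adjusted share count before the split (about 1167) lines up with the reported SHROUT after it, which is the consistency check a split detector could rely on.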

9 Stock Classification
  - Using GICS (Global Industry Classification Standard) codes in CRSP
  - 10 sectors, 24 industry groups, 67 industries and 147 sub-industries
  - 8-digit code XXXXXXXX, read left to right: sector, industry group, industry, sub-industry
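The 8-digit code is hierarchical: the first two digits identify the sector, the first four the industry group, the first six the industry, and all eight the sub-industry. A short sketch of that split follows; the function name `split_gics` and the sample code value are illustrative assumptions.

```python
def split_gics(code):
    """Split an 8-digit GICS code into its four hierarchical levels."""
    code = str(code)
    if len(code) != 8 or not code.isdigit():
        raise ValueError("expected an 8-digit GICS code, got %r" % code)
    return {
        "sector": code[:2],
        "industry_group": code[:4],
        "industry": code[:6],
        "sub_industry": code,
    }

# Illustrative code value only.
levels = split_gics("45102010")
```

Grouping stocks by the `sector` or `industry_group` key is one way to map each stock to a sector ETF, at whichever granularity the back-test needs.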

10 Stock Classification (Cont’d)

11 PCA eigenportfolio weights normalization
  Basic principle: find the most important eigenvectors (15 in the paper) and normalize them by the standard deviation of each stock’s returns.

12 PCA algorithm by the author
  - Suppose X is an n × p matrix containing n samples and p features.
  - Original algorithm: calculate the eigen-decomposition of the correlation matrix.
  - The matrix Q consists of the eigenvectors of the correlation matrix.
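The eigen-decomposition step can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the presenters' MATLAB code: the function name `correlation_pca` is hypothetical, X is taken as n × p with one column per stock, and the sample standard deviation uses n - 1 in the denominator.

```python
import numpy as np

def correlation_pca(X, m):
    """Top-m eigenpairs of the correlation matrix of X (n samples x p features).

    Returns (eigenvalues, Q, sigma): eigenvalues in descending order,
    Q of shape (p, m) with eigenvectors as columns, sigma of shape (p,).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sigma = X.std(axis=0, ddof=1)
    Y = (X - X.mean(axis=0)) / sigma      # standardized returns
    C = (Y.T @ Y) / (n - 1)               # p x p correlation matrix
    eigenvalues, eigenvectors = np.linalg.eigh(C)  # ascending order, C is symmetric
    order = np.argsort(eigenvalues)[::-1][:m]
    return eigenvalues[order], eigenvectors[:, order], sigma

# Synthetic example: 300 days, 6 stocks.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(300, 6))
vals_demo, Q_demo, sigma_demo = correlation_pca(X_demo, 6)
```

`numpy.linalg.eigh` is used because the correlation matrix is symmetric; it returns real eigenvalues and orthonormal eigenvectors, so only the ordering needs to be reversed.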

13 PCA discussion
  Question: Should the eigenvectors be divided by sigma, the sample standard deviation?
  Answer: No. (This differs from the paper.)

14 PCA discussion
  The meaning of the “risk factor” F:
    - F should represent the overall performance of the market.
    - The behavior of F should act as the “market return”.
  What can PCA do?
    - PCA is mathematically defined as an orthogonal linear transformation that maps the data to a new coordinate system such that the greatest variance of any projection of the data lies on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.
    - PCA is theoretically the optimal transform for the given data in the least-squares sense.

15 PCA discussion: derivation
  Notation: F = EX
    - F: m × n matrix, represents the eigenportfolios
    - E: m × p matrix, the first m most important eigenvectors
    - X: p × n matrix, contains the stock returns
    - m: 15 in the paper
    - n: the number of days (samples)
    - p: the number of stocks
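With the notation above, the two weighting choices under discussion differ only in whether each stock's column of E is divided by that stock's standard deviation before forming F = EX. A small sketch of both variants follows; the function name `eigenportfolio_returns` and the example arrays are hypothetical.

```python
import numpy as np

def eigenportfolio_returns(E, X, sigma=None):
    """Compute F = E X, optionally dividing the weights by per-stock sigma.

    E:     (m, p) eigenvectors as rows
    X:     (p, n) stock returns, one row per stock
    sigma: (p,) per-stock standard deviations, or None to use E unchanged
    """
    E = np.asarray(E, dtype=float)
    X = np.asarray(X, dtype=float)
    if sigma is not None:
        # Broadcasting divides column j of E (stock j's weights) by sigma[j].
        E = E / np.asarray(sigma, dtype=float)
    return E @ X

rng = np.random.default_rng(2)
E_demo = rng.normal(size=(2, 3))   # m = 2 eigenportfolios, p = 3 stocks
X_ret = rng.normal(size=(3, 5))    # n = 5 days
F_plain = eigenportfolio_returns(E_demo, X_ret)
F_unit = eigenportfolio_returns(E_demo, X_ret, sigma=np.ones(3))
F_half = eigenportfolio_returns(E_demo, X_ret, sigma=np.full(3, 2.0))
```

When all stocks share the same sigma, the division only rescales F; the two variants give differently shaped factors only when volatilities differ across stocks.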

16 PCA discussion: derivation
  - Consider the i-th row of F, the i-th eigenportfolio.
  - Its variance should be maximized under a constraint on the weight vector.
  - The maximizer is an eigenvector. That is to say, the weighting factors should be the eigenvectors themselves rather than the eigenvectors divided by the standard deviations.
  - (The experimental result is the same without the division.)
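The equations on this slide did not survive in the transcript. The following is a hedged reconstruction of the standard argument, not the presenters' own formulas. The symbols e_i (the i-th row of E written as a column vector), Σ (the p × p second-moment matrix of the rows of X used in the PCA) and λ (a Lagrange multiplier) are assumptions introduced here, as is the unit-norm form of the constraint.

```latex
\begin{aligned}
F_i &= e_i^{\top} X, \qquad e_i \in \mathbb{R}^{p} \\
\operatorname{Var}(F_i) &= e_i^{\top} \Sigma\, e_i \\
\max_{e_i}\; e_i^{\top} \Sigma\, e_i \quad &\text{subject to} \quad e_i^{\top} e_i = 1 \\
\mathcal{L}(e_i, \lambda) &= e_i^{\top} \Sigma\, e_i - \lambda \left( e_i^{\top} e_i - 1 \right) \\
\frac{\partial \mathcal{L}}{\partial e_i} = 2 \Sigma e_i - 2 \lambda e_i = 0 \;&\Longrightarrow\; \Sigma e_i = \lambda e_i, \qquad \operatorname{Var}(F_i) = \lambda
\end{aligned}
```

Under these assumptions the stationary points are the eigenvectors of Σ, and the variance of each eigenportfolio equals its eigenvalue, so the largest eigenvalue gives the first factor.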

17 Experiment result
  Top 50 eigenvalues of the correlation matrix of market returns, computed on May 1, 2007, estimated using a 1-year window and a universe of 1590 stocks.

18 Value of the first eigenvector

19 Future work
  - Data adjustment
  - Experiment on ETFs
  - Compare ETFs with PCA
  - Take into account:
      - Transaction fees, interest, dividends
      - Volume

20 THANK YOU

