# Tutorial Financial Econometrics/Statistics


2005 SAMSI program on Financial Mathematics, Statistics, and Econometrics

Goal

At the index level

Part I: Modeling ... in which we see what basic properties of stock prices/indices we want to capture

Contents
- Returns and their (static) properties
- Pricing models
- Time series properties of returns

Why returns? Prices are generally found to be non-stationary
- Makes life difficult (or simpler...): traditional statistics prefers stationary data
- Returns are found to be stationary

Which returns? Two types of returns can be defined
- Discrete compounding
- Continuous compounding

Discrete compounding If you make 10% on half of your money and 5% on the other half, you have in total 7.5% Discrete compounding is additive over portfolio formation

Continuous compounding
If you made 3% during the first half year and 2% during the second part of the year, you made (exactly) 5% in total Continuous compounding is additive over time
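The two additivity properties can be checked numerically. A minimal sketch in Python, with hypothetical prices and the same percentages as the examples above:

```python
import numpy as np

# Hypothetical prices for two assets over one period.
p0 = np.array([100.0, 100.0])
p1 = np.array([110.0, 105.0])          # +10% and +5%

simple = p1 / p0 - 1                   # discrete compounding
log_ret = np.log(p1 / p0)              # continuous compounding

# Discrete returns are additive over portfolio formation:
w = np.array([0.5, 0.5])               # half of the money in each asset
port_return = w @ simple               # exactly 7.5%

# Continuous returns are additive over time:
half_years = np.array([0.03, 0.02])    # 3% and 2%, continuously compounded
year_return = half_years.sum()         # exactly 5%
```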

Empirical properties of returns
|             | Mean  | St.dev. | Annualized volatility | Skewness | Kurtosis | Min    | Max   |
|-------------|-------|---------|-----------------------|----------|----------|--------|-------|
| IBM         | -0.0% | 2.46%   | 39.03%                | -23.51   |          | -138%  | 12.4% |
| IBM (corr.) | 0.0%  | 1.64%   | 26.02%                | -0.28    | 15.56    | -26.1% |       |
| S&P         |       | 0.95%   | 15.01%                | -1.4     | 39.86    | -22.9% | 8.7%  |

Data period: July December 2004; daily frequency
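The statistics reported above can be computed from a series of daily returns as follows. A sketch with simulated Gaussian data; the annualization factor of 252 trading days is an assumption (it is consistent with the table: 2.46% × √252 ≈ 39.03%):

```python
import numpy as np

def summary(r, trading_days=252):
    """Mean, st.dev., annualized volatility, skewness, kurtosis,
    min, and max of a vector of daily returns."""
    r = np.asarray(r, dtype=float)
    m, s = r.mean(), r.std(ddof=1)
    z = (r - m) / s                             # standardized returns
    return {
        "mean": m,
        "st.dev.": s,
        "ann.vol.": s * np.sqrt(trading_days),  # sqrt-of-time scaling
        "skewness": np.mean(z**3),
        "kurtosis": np.mean(z**4),              # equals 3 for normal data
        "min": r.min(),
        "max": r.max(),
    }

rng = np.random.default_rng(0)
stats = summary(0.01 * rng.standard_normal(100_000))
```

For normal data the kurtosis is close to 3; the large values in the table are a first sign of fat tails.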

Stylized facts
- Expected returns difficult to assess: what's the 'equity premium'?
- Index volatility < individual stock volatility
- Negative skewness: crash risk
- Large kurtosis: fat tails (thus EVT analysis?)

Pricing models Finance considers the final value of an asset to be 'known' as a random variable. In such a setting, finding the price of an asset is equivalent to finding its expected return.

Pricing models 2 As a result, pricing models model expected returns ... ... in terms of known quantities or a few ‘almost known’ quantities

Capital Asset Pricing Model
One of the best known pricing models. The theorem/model states that expected excess returns are proportional to market risk: E[R_i] - r_f = β_i (E[R_m] - r_f), with β_i = Cov(R_i, R_m) / Var(R_m)

Black-Scholes Also Black-Scholes is a pricing model: an (exact) contemporaneous relation between asset prices/returns

Time series properties of returns
- Traditionally a model-fitting exercise without much finance: mostly univariate time series and, thus, less scope for the 'traditional' cross-sectional pricing models
- Lately more finance theory is integrated
- Focuses on the dynamics/dependence in returns

Random walk hypothesis
Standard paradigm in the literature: prices follow a random walk, i.e., returns are i.i.d. Normality often imposed as well; compare the Black-Scholes assumptions

Box-Jenkins analysis

Linear time series analysis
Box-Jenkins analysis generally identifies a white noise. This has long been taken as support for the random walk hypothesis. Recent developments: some autocorrelation effects in 'momentum', some (linear) predictability; largely an academic discussion

Higher moments and risk

Risk predictability There is strong evidence for autocorrelation in squared returns (it also holds for other powers): 'volatility clustering'. While the direction of change is difficult to predict, the (absolute) size of change is: risk is predictable.

The ARCH model First model to capture this effect: r_t = σ_t ε_t with σ_t² = ω + α r_{t-1}² and i.i.d. innovations ε_t
No mean effects, for simplicity (the ARCH-in-mean extension lets the conditional variance enter the mean equation)

ARCH properties
- Uncorrelated (martingale difference) returns
- Correlated squared returns, with a limited set of possible patterns
- Symmetric return distribution if innovations are symmetric
- Fat-tailed return distribution, even if innovations are not
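These properties can be illustrated by simulation. A sketch of an ARCH(1) process with Gaussian innovations (the parameter values are arbitrary choices):

```python
import numpy as np

def simulate_arch1(omega, alpha, n, rng):
    """Simulate r_t = sigma_t * eps_t with sigma_t^2 = omega + alpha * r_{t-1}^2."""
    r = np.zeros(n)
    eps = rng.standard_normal(n)
    prev = 0.0
    for t in range(n):
        r[t] = np.sqrt(omega + alpha * prev**2) * eps[t]
        prev = r[t]
    return r

def acf1(x):
    """Sample first-order autocorrelation."""
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x @ x)

r = simulate_arch1(omega=1.0, alpha=0.5, n=200_000, rng=np.random.default_rng(1))
rho_returns = acf1(r)       # close to zero: returns are uncorrelated
rho_squares = acf1(r**2)    # clearly positive: volatility clustering
kurt = np.mean((r / r.std())**4)  # above 3: fat tails despite normal innovations
```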

The GARCH model Generalized ARCH: σ_t² = ω + α r_{t-1}² + β σ_{t-1}². Beware of time indices ...

GARCH model Parsimonious way to describe various correlation patterns for squared returns
- Higher-order extension trivial
- Math-stat analysis not that trivial; see the inference section later

Stochastic volatility models
Use a latent volatility process: e.g., r_t = σ_t ε_t where log σ_t² follows its own (autoregressive) dynamics, driven by separate shocks

Stochastic volatility models
- SV models also lead to volatility clustering
- Leverage: negative innovation correlation means that volatility increases and price decreases go together, i.e., negative return/volatility correlation
- (One) structural story: default risk
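A discrete-time sketch of such an SV model; the parameter values and the contemporaneous-correlation form of leverage are modelling assumptions, not taken from the slides:

```python
import numpy as np

def simulate_sv(n, mu=-1.0, phi=0.95, tau=0.3, rho=-0.5, seed=0):
    """Stochastic volatility sketch: h_t = log sigma_t^2 follows an AR(1);
    rho < 0 ties volatility increases to negative return shocks (leverage)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 2))
    eta = z[:, 0]                                    # volatility shocks
    eps = rho * eta + np.sqrt(1 - rho**2) * z[:, 1]  # corr(eps, eta) = rho
    h = np.empty(n)
    h[0] = mu
    for t in range(1, n):
        h[t] = mu + phi * (h[t - 1] - mu) + tau * eta[t]
    return np.exp(h / 2) * eps                       # returns

r = simulate_sv(100_000)
s = r**2 - np.mean(r**2)
rho_squares = (s[1:] @ s[:-1]) / (s @ s)  # positive: volatility clustering
```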

Continuous time modeling
Mathematical finance uses continuous time, mainly for 'simplicity' (compare asymptotic statistics as approximation theory). Empirical finance (at least originally) focused on discrete time models

Consistency The volatility clustering and other empirical evidence is consistent with appropriate continuous time models, e.g., a simple continuous-time stochastic volatility model

Approximation theory There is a large literature that deals with the approximation of continuous time stochastic volatility models with discrete time models. Important applications: inference, simulation, pricing

Other asset classes So far we only discussed stocks (and stock indices)
- Stock derivatives can be studied using derivative pricing models
- Financial econometrics also deals with many other asset classes: term structure (including credit risk), commodities, mutual funds, energy markets, ...

Term structure modeling
Model a complete curve at a single point in time. There exist models in discrete/continuous time, descriptive/pricing, for standard interest rates/derivatives, ...

Part 2: Inference

Contents
- Parametric inference for ARCH-type models
- Rank based inference

Analogy principle The classical approach to estimation is based on the analogy principle:
- if you want to estimate an expectation, take an average
- if you want to estimate a probability, take a frequency
- ...

Moment estimation (GMM)
Consider an ARCH-type model. We suppose that the innovations ε_t(θ) can be calculated on the basis of the observations if the parameter θ is known. Moment condition: E[m(ε_t(θ₀))] = 0 for suitable moment functions m

Moment estimation - 2 The estimator is now taken to solve the empirical (sample) version of the moment condition
- In case of 'underidentification': use instruments
- In case of 'overidentification': minimize a distance-to-zero
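As an illustration, a just-identified moment estimator for a Gaussian ARCH(1) model: it matches the unconditional variance E[r²] = ω/(1−α) and the first autocorrelation of squared returns, which equals α when fourth moments exist. This is a sketch of the analogy principle, not the slides' estimator:

```python
import numpy as np

def simulate_arch1(omega, alpha, n, rng):
    """Simulate r_t = sigma_t * eps_t, sigma_t^2 = omega + alpha * r_{t-1}^2."""
    r = np.zeros(n)
    eps = rng.standard_normal(n)
    prev = 0.0
    for t in range(n):
        r[t] = np.sqrt(omega + alpha * prev**2) * eps[t]
        prev = r[t]
    return r

def arch1_moment_estimator(r):
    """Solve the sample analogues of E[r^2] = omega / (1 - alpha)
    and corr(r_t^2, r_{t-1}^2) = alpha."""
    s = r**2 - np.mean(r**2)
    alpha_hat = (s[1:] @ s[:-1]) / (s @ s)    # sample acf of squares at lag 1
    omega_hat = np.mean(r**2) * (1 - alpha_hat)
    return omega_hat, alpha_hat

r = simulate_arch1(omega=1.0, alpha=0.2, n=200_000, rng=np.random.default_rng(2))
omega_hat, alpha_hat = arch1_moment_estimator(r)
```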

Likelihood estimation
In case the density of the innovations is known, say it is f, one can write down the density/likelihood of the observed returns. Estimator: maximize this likelihood

Doing the math ... Maximizing the log-likelihood boils down to solving the first-order (score) conditions Σ_t s_t(θ) = 0,
with s_t(θ) the derivative with respect to θ of the log-likelihood contribution of observation t
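A sketch of Gaussian likelihood estimation for the ARCH(1) model, maximizing the conditional log-likelihood numerically; the optimizer, starting values, and true parameters are arbitrary choices:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, r):
    """Gaussian negative log-likelihood of ARCH(1), conditioning on r_0."""
    omega, alpha = params
    if omega <= 0 or alpha < 0:
        return np.inf
    sigma2 = omega + alpha * r[:-1]**2          # conditional variances
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + r[1:]**2 / sigma2)

# Simulate from the true model (omega = 1, alpha = 0.5), then fit.
rng = np.random.default_rng(3)
n, prev = 50_000, 0.0
r = np.zeros(n)
for t in range(n):
    r[t] = np.sqrt(1.0 + 0.5 * prev**2) * rng.standard_normal()
    prev = r[t]

fit = minimize(neg_loglik, x0=[0.5, 0.2], args=(r,), method="Nelder-Mead")
omega_hat, alpha_hat = fit.x
```

The same code run on non-Gaussian data is exactly the Gaussian QMLE discussed below.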

Efficiency consideration
Which of the above estimators is 'better'? Analysis using the Hájek–Le Cam theory of asymptotic statistics: approximate a complicated statistical experiment with very simple ones; something which works well in the approximating experiment will also do well in the original one

Quasi MLE In order for maximum likelihood to work, one needs the density of the innovations. If this is not known, one can guess a density (e.g., the normal). This is known as
- ML under non-standard conditions (Huber)
- quasi maximum likelihood
- pseudo maximum likelihood

Will it work? For ARCH-type models, postulating the Gaussian density can be shown to lead to consistent estimates There is a large theory on when this works or not We say “for ARCH-type models the Gaussian distribution has the QMLE property”

The QMLE pitfall One often sees people referring to Gaussian MLE
Then, they remark that we know financial innovations are fat-tailed ... ... and they switch to t-distributions The t-distribution does not possess the QMLE property (but, see later)

How to deal with SV-models?
The SV models look the same. But now the volatility is a latent process and hence not observed. Likelihood estimation still works "in principle", but the unobserved variances have to be integrated out

Inference for continuous time models
Continuous time inference can, in theory, be based on continuous record observations or on discretely sampled observations. Essentially all known approaches are based on approximating discrete time models

... in which we discuss the main ideas of rank based inference

The statistical model Consider a model where 'somewhere' there
exist i.i.d. random errors ε_1, ..., ε_n. The observations are X_1, ..., X_n. The parameter of interest is some θ. We denote the density of the errors by f

Formal model We have an outcome space (ℝⁿ, say), with n the number of observations and k the dimension of the parameter θ. Take standard Borel sigma-fields. Model for sample size n: a family of probability measures indexed by (θ, f). Asymptotics refer to n → ∞

Example: Linear regression
Linear regression model (with observations (x_t, y_t)): y_t = x_t'θ + ε_t. Innovation density f and cdf F

Example ARCH(1) Consider the standard ARCH(1) model X_t = (ω + α X_{t-1}²)^{1/2} ε_t.
Innovation density f and cdf F

Maintained hypothesis
For given θ and sample size n, the innovations ε_t(θ) can be calculated from the observations. For cross-sectional models one may even often write ε_t(θ) as an explicit function of the single observation. Latent variable (e.g., SV) models ...

Innovation ranks The ranks R_1, ..., R_n are the ranks of the innovations ε_1, ..., ε_n.
We also write R_t(θ) for the ranks of the innovations ε_t(θ) based on a value θ for the parameter of interest. Ranks of the observations themselves are generally not very useful

Basic properties The distribution of the ranks does not depend on the innovation density f nor on θ: under the true parameter it is uniform over the permutations of {1, ..., n}. This is (fortunately) not true for ranks computed at other parameter values, at least 'essentially'

Invariance Suppose we generate the innovations as the transformation ε_t = F⁻¹(U_t) with U_1, ..., U_n i.i.d. standard uniform. Now the ranks are even invariant with respect to F: they coincide with the ranks of the U_t

Reconstruction For large sample size n we have F⁻¹(R_t/(n+1)) ≈ ε_t and, thus, the innovations can (approximately) be reconstructed from their ranks and a reference cdf F
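A numerical sketch of this reconstruction, taking the standard normal as both the true and the reference distribution (that choice is an assumption for the illustration):

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(4)
eps = rng.standard_normal(5_000)        # i.i.d. innovations

n = len(eps)
ranks = eps.argsort().argsort() + 1     # ranks R_1, ..., R_n in {1, ..., n}

# Reconstruction: eps_hat_t = F^{-1}(R_t / (n + 1)), F the reference cdf.
inv_cdf = NormalDist().inv_cdf
eps_hat = np.array([inv_cdf(rk / (n + 1)) for rk in ranks])

corr = np.corrcoef(eps, eps_hat)[0, 1]  # close to 1 for large n
```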

Rank based statistics The idea is to apply whatever procedure you have that uses the innovations to the innovations reconstructed from the ranks. This makes the procedure robust to distributional changes. Efficiency loss due to the '≈'?

Rank based autocorrelations
Time-series properties can be studied using rank based autocorrelations. These can be interpreted as 'standard' autocorrelations; they are rank based for a given reference density, and distribution free
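A Spearman-type sketch: compute an ordinary autocorrelation from the ranks of the series, which makes it distribution free. (Using the ranks directly corresponds to a uniform reference density; the slides allow a general reference density via F⁻¹.)

```python
import numpy as np

def rank_autocorrelation(x, lag=1):
    """Ordinary lag-`lag` autocorrelation computed from the ranks of x."""
    r = np.argsort(np.argsort(x)).astype(float)
    a, b = r[lag:], r[:-lag]
    a, b = a - a.mean(), b - b.mean()
    return (a @ b) / np.sqrt((a @ a) * (b @ b))

rng = np.random.default_rng(5)
iid = rng.standard_normal(20_000)
ar1 = np.zeros(20_000)
for t in range(1, 20_000):
    ar1[t] = 0.5 * ar1[t - 1] + rng.standard_normal()

rho_iid = rank_autocorrelation(iid)   # near zero for i.i.d. data
rho_ar1 = rank_autocorrelation(ar1)   # clearly positive for an AR(1)
```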

Robustness An important property of rank based statistics is the distributional invariance. As a result, a rank based estimator is consistent for any reference density: all densities satisfy the QMLE property when using rank based inference

Limiting distribution
The limiting distribution of the rank based estimator depends on both the chosen reference density and the actual underlying density. The optimal choice for the reference density is the actual density. How 'efficient' is this estimator? Semiparametrically efficient

Remark All procedures are distribution free with respect to the innovation density They are, clearly, not distribution free with respect to the parameter of interest

Signs and ranks

Why ranks? So far, we have been considering ‘completely’ unrestricted sets of innovation densities For this class of densities ranks are ‘maximal invariant’ This is crucial for proving semiparametric efficiency

Alternatives Alternative specifications may impose
- zero-median innovations
- symmetric innovations
- zero-mean innovations
This is generally a bad idea ...

Zero-median innovations
The maximal invariant now becomes the ranks and the signs of the innovations. The ideas remain the same, but with a more precise reconstruction: split the sample of innovations into its positive and negative parts and treat those separately

But ranks are still ... Yes, the ranks are still invariant
... and the previous results go through. But the efficiency bound has now changed: rank based procedures are no longer semiparametrically efficient ... but sign-and-rank based procedures are

Symmetric innovations
In the symmetric case, the signed ranks become maximal invariant: the signs of the innovations and the ranks of their absolute values. The reconstruction now becomes still more precise (and efficient)
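Computing the signed ranks is straightforward; a sketch (the Gaussian sample is just for illustration):

```python
import numpy as np

def signed_ranks(eps):
    """Maximal invariant under symmetric innovation densities:
    the signs of the innovations and the ranks of their absolute values."""
    signs = np.where(eps >= 0, 1, -1)
    abs_ranks = np.abs(eps).argsort().argsort() + 1  # ranks of |eps| in {1,...,n}
    return signs, abs_ranks

rng = np.random.default_rng(6)
eps = rng.standard_normal(10_000)
signs, abs_ranks = signed_ranks(eps)
```

Under symmetry, the signs are i.i.d. fair coin flips independent of the absolute-value ranks, which is what makes the finer reconstruction possible.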

Semiparametric efficiency

General result Using the maximal invariant to reconstitute the central sequence leads to semiparametrically efficient inference in the model for which this maximal invariant is derived. In general: use the maximal invariant that matches the restrictions actually imposed

Proof The proof is non-trivial, but some intuition can be given using tangent spaces