Machine Learning Week 4.


Basic Probability Envision an experiment for which the result is unknown. The collection of all possible outcomes is called the sample space. A set of outcomes, or subset of the sample space, is called an event. A probability space is a three-tuple (Ω, F, Pr) where Ω is a sample space, F is a collection of events from the sample space and Pr is a probability law that assigns a number to each event in F. For any events A and B, Pr must satisfy: Pr(Ω) = 1; Pr(A) ≥ 0; Pr(A^c) = 1 - Pr(A); Pr(A ∪ B) = Pr(A) + Pr(B) if A ∩ B = ∅. If A and B are events in F with Pr(B) ≠ 0, the conditional probability of A given B is Pr(A | B) = Pr(A ∩ B) / Pr(B)

Random Variables A random variable is “a number that you don’t know… yet” Discrete vs. Continuous Cumulative distribution function Density function Probability distribution (mass) function Joint distributions Conditional distributions Functions of random variables Moments of random variables Transforms and generating functions

Conditioning Frequently, the conditional distribution of Y given X is easier to find than the distribution of Y alone. If so, evaluate probabilities about Y using the conditional distribution together with the marginal distribution of X: Pr(Y ∈ A) = Σ_x Pr(Y ∈ A | X = x) Pr(X = x). Example: Draw 2 balls, one after another without replacement, from a jar containing four balls numbered 1, 2, 3 and 4. X = number on the first ball, Y = number on the second ball, Z = XY. What is Pr(Z > 5)? Key: Z may be easier to evaluate once X is known
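The ball-drawing example can be checked by brute-force enumeration; this is a sketch I am adding (variable names are mine, not from the slides), conditioning on the first ball X:

```python
from itertools import permutations

# Draw two balls without replacement from {1, 2, 3, 4}:
# X = number on the first ball, Y = number on the second, Z = X * Y.
outcomes = list(permutations([1, 2, 3, 4], 2))  # 12 equally likely ordered pairs

# Once X = x is known, Z > 5 reduces to the simpler event x * Y > 5.
pr_z_gt_5 = sum(1 for x, y in outcomes if x * y > 5) / len(outcomes)
print(pr_z_gt_5)  # 6 favourable pairs out of 12, i.e. 0.5
```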

Moments of Random Variables Expectation = "average" Variance = "volatility" Standard deviation = square root of the variance Coefficient of variation = standard deviation divided by the mean
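In Python's standard library these four summaries are one-liners; a small illustrative sketch (the data values are made up for the example):

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative sample

mean = statistics.mean(data)      # expectation ("average")
var = statistics.pvariance(data)  # population variance ("volatility")
sd = math.sqrt(var)               # standard deviation
cv = sd / mean                    # coefficient of variation: sd relative to the mean

print(mean, var, sd, cv)  # 5.0 4.0 2.0 0.4
```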

Linear Functions of Random Variables For constants a and b, E(aX + bY) = a E(X) + b E(Y). Covariance: Cov(X, Y) = E[(X - EX)(Y - EY)]. Correlation: ρ = Cov(X, Y) / (σ_X σ_Y). If X and Y are independent then Cov(X, Y) = 0 and Var(aX + bY) = a² Var(X) + b² Var(Y)

Bernoulli Distribution "Single coin flip" p = Pr(success); N = 1 if success, 0 otherwise. Pr(N = 1) = p, Pr(N = 0) = 1 - p; E(N) = p, Var(N) = p(1 - p) Chapter 0

Binomial Distribution "n independent coin flips" p = Pr(success); N = # of successes. Pr(N = k) = C(n, k) p^k (1 - p)^(n-k) for k = 0, 1, …, n; E(N) = np, Var(N) = np(1 - p)

Geometric Distribution "independent coin flips" p = Pr(success); N = # of flips until (including) first success. Pr(N = k) = (1 - p)^(k-1) p for k = 1, 2, …; E(N) = 1/p

Poisson Distribution "Occurrence of rare events" λ = average rate of occurrence per period; N = # of events in an arbitrary period. Pr(N = k) = e^(-λ) λ^k / k! for k = 0, 1, 2, …; E(N) = Var(N) = λ
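The Poisson pmf is easy to evaluate directly from the formula; a short sketch (the function name is mine):

```python
import math

def poisson_pmf(k, lam):
    """Pr(N = k) when rare events occur at average rate lam per period."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# With an average of lam = 2 events per period:
print(round(poisson_pmf(0, 2.0), 4))  # Pr(no events) = e^(-2) ≈ 0.1353

# The probabilities over k = 0, 1, 2, ... sum to 1.
print(round(sum(poisson_pmf(k, 2.0) for k in range(60)), 6))  # 1.0
```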

Uniform Distribution X is equally likely to fall anywhere within the interval (a, b): density f(x) = 1/(b - a) for a < x < b; E(X) = (a + b)/2, Var(X) = (b - a)²/12

Exponential Distribution X is nonnegative and it is most likely to fall near 0: density f(x) = λ e^(-λx) for x ≥ 0; E(X) = 1/λ. Also memoryless: Pr(X > s + t | X > s) = Pr(X > t); more on this later…

Normal Distribution X follows a "bell-shaped" density function f(x) = (1/(σ√(2π))) e^(-(x-μ)²/(2σ²)). From the central limit theorem, the distribution of the sum of independent and identically distributed random variables approaches a normal distribution as the number of summed random variables goes to infinity
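The central limit theorem is easy to see by simulation; a sketch (the sample sizes are arbitrary choices of mine): summing just 12 Uniform(0, 1) draws already gives a roughly bell-shaped distribution with mean n/2 and variance n/12.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

n, reps = 12, 20000
# Each entry is a sum of n iid Uniform(0, 1) random variables.
sums = [sum(random.random() for _ in range(n)) for _ in range(reps)]

print(round(statistics.mean(sums), 2))       # close to n / 2 = 6
print(round(statistics.pvariance(sums), 2))  # close to n / 12 = 1
```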

Stochastic Processes A stochastic process is a random variable that changes over time, or a sequence of numbers that you don't know yet. Examples: the Poisson process, continuous-time Markov chains

Time Series Autoregressive (AR) and Moving Average (MA) models

Time Series A time series is a sequence of numerical data in which each item is associated with a particular instant in time. Indeed, with current computer technology we have daily series on interest rates, the hourly "telerate" interest rate index, and stock prices by the minute (or even second).

An analysis of a single sequence of data is called univariate time-series analysis. An analysis of several sets of data for the same sequence of time periods is called multivariate time-series analysis or, more simply, multiple time-series analysis

Stochastic processes Time series are an example of a stochastic or random process. A stochastic process is 'a statistical phenomenon that evolves in time according to probabilistic laws'. Mathematically, a stochastic process is an indexed collection of random variables {X_t : t ∈ T}

Stochastic processes We are concerned only with processes indexed by time: either discrete-time processes such as {X_t : t = 0, 1, 2, …} or continuous-time processes such as {X(t) : t ≥ 0}

Continuous vs. Discrete We usually base our inference on a single observation, or realization, of the process over some period of time, say [0, T] (a continuous interval of time) or at a sequence of time points {0, 1, 2, …, T}

Specification of a process A simpler approach is to specify only the moments; this is sufficient if all the joint distributions are normal. The mean and variance functions are given by μ(t) = E[X_t] and σ²(t) = Var(X_t)

Autocovariance Because the random variables comprising the process are not independent, we must also specify their covariance: γ(t1, t2) = Cov(X_{t1}, X_{t2}) = E[(X_{t1} - μ(t1))(X_{t2} - μ(t2))]

Autocorrelation It is useful to standardize the autocovariance function (acvf). Consider the stationary case only, so the acvf depends only on the lag k: γ(k). The autocorrelation function (acf) is ρ(k) = γ(k) / γ(0)

Stationarity Inference is easiest when a process is stationary, meaning its distribution does not change over time. This is strict stationarity. A process is weakly stationary if its mean and autocovariance functions do not change over time

Weak stationarity The autocovariance depends only on the time difference or lag between the two time points involved: γ(t, t + k) = γ(k) for all t

White noise This is a purely random process: a sequence of independent and identically distributed random variables. It has constant mean and variance. Also γ(k) = Cov(Z_t, Z_{t+k}) = 0 for k ≠ 0

Several Models for Time Series (1) a purely random process, (2) a random walk, (3) a moving-average (MA) process, (4) an autoregressive (AR) process, (5) an autoregressive moving-average (ARMA) process, and (6) an autoregressive integrated moving-average (ARIMA) process.

Purely Random Process Autocovariance function: γ(0) = σ_Z² and γ(k) = 0 for k ≠ 0. Autocorrelation function: ρ(0) = 1 and ρ(k) = 0 for k ≠ 0

Random Walk X_t = X_{t-1} + Z_t, where {Z_t} is purely random. Starting from X_0 = 0, X_t = Z_1 + Z_2 + … + Z_t, so E(X_t) = 0 but Var(X_t) = t σ_Z² grows with t: the process is not stationary
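A random walk X_t = X_{t-1} + Z_t takes only a few lines to simulate; a sketch I am adding, with normal innovations as one concrete choice for the purely random {Z_t}:

```python
import random

random.seed(1)  # reproducible

def random_walk(t_max):
    """Simulate X_t = X_{t-1} + Z_t with Z_t iid N(0, 1) and X_0 = 0."""
    x, path = 0.0, [0.0]
    for _ in range(t_max):
        x += random.gauss(0.0, 1.0)  # add the innovation Z_t
        path.append(x)
    return path

path = random_walk(100)
print(len(path), path[0])  # 101 0.0
```

Repeating the simulation many times shows the spread of X_t widening as t grows, which is the non-stationarity in action.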

Moving average processes Start with {Z_t} being white noise or purely random, with mean zero and s.d. σ_Z. {X_t} is a moving average process of order q (written MA(q)) if for some constants β_0, β_1, …, β_q we have X_t = β_0 Z_t + β_1 Z_{t-1} + … + β_q Z_{t-q}. Usually β_0 = 1

Moving average processes The mean and variance are given by E(X_t) = 0 and Var(X_t) = σ_Z² (β_0² + β_1² + … + β_q²). The process is weakly stationary because the mean is constant and the covariance does not depend on t

Moving average processes If the Z_t's are normal then so is the process, and it is then strictly stationary. The autocorrelation is ρ(k) = Σ_{i=0}^{q-k} β_i β_{i+k} / Σ_{i=0}^{q} β_i² for k = 0, 1, …, q, and ρ(k) = 0 for k > q

Moving average processes Note the autocorrelation cuts off at lag q. For the MA(1) process with β_0 = 1: ρ(0) = 1, ρ(1) = β_1 / (1 + β_1²), and ρ(k) = 0 for k ≥ 2
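The lag-q cutoff shows up clearly in simulation; a sketch for MA(1) with β = 0.6 (the sample size and the acf helper are my own choices):

```python
import random

random.seed(2)  # reproducible

beta, n = 0.6, 50000
z = [random.gauss(0.0, 1.0) for _ in range(n + 1)]
# MA(1): X_t = Z_t + beta * Z_{t-1}
x = [z[t] + beta * z[t - 1] for t in range(1, n + 1)]

def sample_acf(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    m = sum(series) / len(series)
    c0 = sum((v - m) ** 2 for v in series)
    ck = sum((series[t] - m) * (series[t + lag] - m)
             for t in range(len(series) - lag))
    return ck / c0

print(round(sample_acf(x, 1), 2))  # near beta / (1 + beta**2) ≈ 0.44
print(round(sample_acf(x, 2), 2))  # near 0: the acf cuts off after lag 1
```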

Moving average processes In order to ensure there is a unique MA process for a given acf, we impose the condition of invertibility. This ensures that when the process is written in series form, the series converges. For the MA(1) process X_t = Z_t + β Z_{t-1}, the condition is |β| < 1

Moving average processes For general processes introduce the backward shift operator B, defined by B X_t = X_{t-1}. Then the MA(q) process is given by X_t = (β_0 + β_1 B + … + β_q B^q) Z_t = θ(B) Z_t

Moving average processes The general condition for invertibility is that all the roots of the equation θ(B) = 0 (with B treated as a complex variable) lie outside the unit circle, i.e. have modulus greater than one

Autoregressive processes Assume {Z_t} is purely random with mean zero and s.d. σ_Z. Then the autoregressive process of order p, or AR(p) process, is X_t = α_1 X_{t-1} + α_2 X_{t-2} + … + α_p X_{t-p} + Z_t

Autoregressive processes The first-order autoregression is X_t = α X_{t-1} + Z_t. Provided |α| < 1 it may be written as an infinite-order MA process. Using the backshift operator we have (1 - αB) X_t = Z_t

Autoregressive processes From the previous equation we have X_t = (1 - αB)^(-1) Z_t = (1 + αB + α²B² + …) Z_t = Z_t + α Z_{t-1} + α² Z_{t-2} + …

Autoregressive processes Then E(X_t) = 0, and if |α| < 1, Var(X_t) = σ_Z² / (1 - α²) and the autocorrelation is ρ(k) = α^k for k = 0, 1, 2, …
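The geometric decay ρ(k) = α^k is easy to verify by simulation; a sketch with α = 0.7 (the acf helper and constants are my own choices):

```python
import random

random.seed(3)  # reproducible

alpha, n = 0.7, 50000
# AR(1): X_t = alpha * X_{t-1} + Z_t with Z_t iid N(0, 1)
x, xt = [], 0.0
for _ in range(n):
    xt = alpha * xt + random.gauss(0.0, 1.0)
    x.append(xt)

def sample_acf(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    m = sum(series) / len(series)
    c0 = sum((v - m) ** 2 for v in series)
    ck = sum((series[t] - m) * (series[t + lag] - m)
             for t in range(len(series) - lag))
    return ck / c0

for k in (1, 2, 3):
    print(k, round(sample_acf(x, k), 2))  # near alpha**k: 0.7, 0.49, 0.34
```

Unlike the MA(q) acf, which cuts off at lag q, the AR acf decays gradually and never reaches exactly zero.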

Autoregressive processes The AR(p) process can be written as (1 - α_1 B - α_2 B² - … - α_p B^p) X_t = Z_t, or φ(B) X_t = Z_t

Autoregressive processes This is X_t = φ(B)^(-1) Z_t = (1 + β_1 B + β_2 B² + …) Z_t for some constants β_1, β_2, …. This gives X_t as an infinite MA process, so it has mean zero

Autoregressive processes Conditions are needed to ensure that the various series converge, and hence that the variance exists and the autocovariance can be defined. Essentially these are requirements that the β_i become small quickly enough for large i

Autoregressive processes The β_i may be hard to find explicitly, however. The alternative is to work with the α_i. The acf is expressible in terms of the roots π_i, i = 1, 2, …, p of the auxiliary equation π^p - α_1 π^(p-1) - … - α_p = 0

Autoregressive processes Then a necessary and sufficient condition for stationarity is that |π_i| < 1 for every i. An equivalent way of expressing this is that the roots of the equation φ(B) = 1 - α_1 B - … - α_p B^p = 0 must lie outside the unit circle

ARMA processes Combine AR and MA processes. An ARMA process of order (p, q) is given by X_t = α_1 X_{t-1} + … + α_p X_{t-p} + Z_t + β_1 Z_{t-1} + … + β_q Z_{t-q}

ARMA processes Alternative expressions are possible using the backshift operator: φ(B) X_t = θ(B) Z_t

ARMA processes An ARMA process can be written in pure MA or pure AR forms, the operators being possibly of infinite order Usually the mixed form requires fewer parameters

ARIMA processes General autoregressive integrated moving average processes are called ARIMA processes. When differenced, say d times, the process is an ARMA process. Call the differenced process W_t. Then W_t = ∇^d X_t = (1 - B)^d X_t is an ARMA process

ARIMA processes Alternatively, specify the process as φ(B)(1 - B)^d X_t = θ(B) Z_t. This is an ARIMA process of order (p, d, q)
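Differencing can be illustrated with the simplest case, ARIMA(0,1,0): a random walk built from white noise, where one application of (1 - B) recovers the stationary innovations. A sketch I am adding, with normal innovations as one concrete choice:

```python
import random

random.seed(4)  # reproducible

n = 5000
z = [random.gauss(0.0, 1.0) for _ in range(n)]  # white noise innovations

# Integrate once: X_t = Z_1 + Z_2 + ... + Z_t (a random walk, non-stationary).
x, level = [], 0.0
for zt in z:
    level += zt
    x.append(level)

# Difference once: W_t = X_t - X_{t-1}, i.e. apply (1 - B).
w = [x[t] - x[t - 1] for t in range(1, n)]

# The differenced series matches the stationary innovations.
print(all(abs(w[t - 1] - z[t]) < 1e-12 for t in range(1, n)))  # True
```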

ARIMA processes The model for X_t is non-stationary because the AR operator on the left-hand side has d roots on the unit circle. d is often 1. A random walk is ARIMA(0,1,0). Seasonal terms can also be included (see later)

Non-zero mean We have assumed that the mean is zero in the ARIMA models. There are two alternatives: mean-correct all the W_t terms in the model, or incorporate a constant term in the model