A Generalized Model for Financial Time Series Representation and Prediction Author: Depei Bao Presenter: Liao Shu Acknowledgement: Some figures in this.

Presentation transcript:

A Generalized Model for Financial Time Series Representation and Prediction Author: Depei Bao Presenter: Liao Shu Acknowledgement: Some figures in this presentation are obtained from the paper

Outline of the Presentation
- Introduction
- Critical Point Model (CPM) for Financial Time Series Representation
- Motivation and importance of using critical points
- The generalized CPM to represent financial time series
- Probabilistic model based on CPM for prediction
- Experimental Results
- Conclusion

Introduction
Flow Chart of the General Financial Time Series Prediction Method:
Input Data -> Feature Extraction -> Probabilistic Model -> Optimization -> Forecast Value

Introduction
Main Idea of the Proposed Method: Stock movements are affected by two types of factors
- Gradual strength changes between the buying side and the selling side (Useful Information)
- Random factors such as emergent affairs or daily operation variations (Noise)
Motivation and Goal of the Proposed Method:
- Using the original raw price data to do prediction can be problematic
- Remove the noise information and preserve the useful information to do the prediction

Critical Point Model (CPM) for Financial Time Series Representation
Motivation (Why critical point model?)
- A fluctuating financial time series consists of a sequence of local maximal/minimal points. Some of them reflect trend reversals.

Motivation and Importance of using Critical Points: Pattern Information
- Based on the critical points, the input financial time series can be represented in a pattern-wise manner to reflect its trends over different periods

CPM Based Representation
- A financial time series is comprised of a sequence of critical points (local minimal/maximal)
- We only consider the critical points
- Only some of the critical points are preserved (those considered to be noise are removed)

Definition of Noise
Noise is defined based on two criteria:
- Amount of oscillation between two critical points
- Duration between two critical points
A small oscillation and a short duration will be regarded as noise.

Simple CPM Example
- Define a minimal time interval T (duration) and a minimal vibration percentage P (oscillation). Remove the critical points (X(i),Y(i)) and (X(i+1),Y(i+1)) if:
[removal condition shown as an equation on the slide; not preserved in this transcript]
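The exact removal condition is not preserved above, so the following is only a minimal sketch under a stated assumption: a successive pair of critical points is treated as noise when both the time gap is below T and the relative price change is below P, which matches the "small oscillation and short duration" definition of noise. The function name `simple_cpm` and the use of a relative (percentage) change are illustrative choices, not taken from the paper.

```python
def simple_cpm(points, min_duration, min_oscillation):
    """Filter (x, y) critical points with an ASSUMED simple-CPM rule.

    Assumption (not from the slides): a successive pair is noise when
    BOTH the time gap is below min_duration (T) and the relative price
    change is below min_oscillation (P). Prices are assumed positive.
    """
    kept = list(points)
    i = 0
    while i + 1 < len(kept):
        (x0, y0), (x1, y1) = kept[i], kept[i + 1]
        short = (x1 - x0) < min_duration
        small = abs(y1 - y0) / abs(y0) < min_oscillation
        if short and small:
            del kept[i:i + 2]      # drop both points of the noisy pair
            i = max(i - 1, 0)      # re-check the pair that is now adjacent
        else:
            i += 1
    return kept


points = [(0, 100.0), (10, 120.0), (11, 119.5), (12, 120.5), (25, 90.0)]
filtered = simple_cpm(points, min_duration=3, min_oscillation=0.05)
```

Because the rule only compares neighbouring points, it can behave poorly on longer structures, which is the drawback the following slides discuss.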

Drawbacks of the Simple CPM
- The Simple CPM is too coarse: the critical points are assessed only in a local range (without looking ahead)
- Example of an exception: in this example, it is assumed that AB, AD and BE do not satisfy the removal criteria of the simple CPM, but BC, CD and DE do

Drawbacks of the Simple CPM
- Another exception case: in this example, BC is assumed to satisfy the removal criteria of the simple CPM

Drawbacks of the Simple CPM
- Root of the drawback of the simple CPM: only the distance between two successive critical points is tested to evaluate a vibration
- The generalized CPM (GCPM) is proposed in this paper to overcome these shortcomings

The Generalized CPM
- The time series is processed sequentially in units of three points (two minimal points and one maximal point)
- Important Reminder: in GCPM, the three points in a unit need not be successive critical points

The Generalized CPM
Main issues of GCPM:
- How to choose the next three-point unit to be processed
- How to choose preserved critical points

Outline of the GCPM

Initialization of GCPM
- All the local maximal/minimal points in a raw time series are extracted to form the initial critical point series:
[definition of the series C shown as an equation on the slide; not preserved in this transcript]
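The initialization step can be sketched as follows. This is a hedged illustration, not the paper's code: the function name `initial_critical_points` and the tuple layout are hypothetical, plateaus (equal neighbouring prices) are ignored for simplicity, and the endpoints are included because the selection criteria later in the presentation preserve the first and last data points.

```python
def initial_critical_points(prices):
    """Return [(index, price, kind)] for the endpoints and strict local extrema.

    Simplification (an assumption): flat runs of equal prices are not
    treated as extrema.
    """
    n = len(prices)
    if n == 0:
        return []
    points = [(0, prices[0], "end")]
    for i in range(1, n - 1):
        prev, cur, nxt = prices[i - 1], prices[i], prices[i + 1]
        if cur > prev and cur > nxt:
            points.append((i, cur, "max"))   # strict local maximum
        elif cur < prev and cur < nxt:
            points.append((i, cur, "min"))   # strict local minimum
    if n > 1:
        points.append((n - 1, prices[-1], "end"))
    return points


series_c = initial_critical_points([1, 3, 2, 5, 4])
```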

Data Representation of GCPM
- After constructing the initial critical point series C, the critical point selection criteria are applied to filter out the critical points corresponding to noise. The original time series is then approximated by linearly interpolating the points between a maximal point and a minimal point
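The interpolation step can be illustrated with a short sketch. It assumes the preserved critical points are given as `(index, price)` pairs sorted by strictly increasing index, with the first at index 0 and the last at index `length - 1`; the function name `approximate_series` is hypothetical.

```python
def approximate_series(critical_points, length):
    """Rebuild a series of `length` values by linear interpolation.

    critical_points: (index, price) pairs, sorted by strictly increasing
    index, covering index 0 through length - 1 (an assumption of this sketch).
    """
    approx = [0.0] * length
    for (x0, y0), (x1, y1) in zip(critical_points, critical_points[1:]):
        for x in range(x0, x1 + 1):
            t = (x - x0) / (x1 - x0)          # position within the segment, 0..1
            approx[x] = y0 + t * (y1 - y0)    # point on the straight line
    return approx


approx = approximate_series([(0, 10.0), (2, 20.0), (4, 10.0)], 5)
```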

The Critical Point Selection Criteria of GCPM
- The first and the last data points in the original time series are preserved as the first and last points in C
- Local maximal and local minimal points in the approximated series must appear alternately

The Critical Point Selection Criteria of GCPM
- Selection is also based on the oscillation threshold P and the duration threshold T
- Considering P first, there are four cases:
  1. Both the rise and the decline oscillations exceed P
  2. The rise exceeds P, but the decline is below P
  3. The decline exceeds P, but the rise is below P
  4. Neither the rise nor the decline exceeds P
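The four oscillation cases can be written as a small classifier over one three-point unit (minimum, maximum, minimum). How the rise and decline are measured is not stated in the transcript; the sketch below assumes relative changes with respect to the starting price of each leg, and the name `oscillation_case` is hypothetical.

```python
def oscillation_case(y_first, y_mid, y_last, p):
    """Classify a (min, max, min) unit into cases 1-4 by the threshold p.

    Assumption: rise and decline are relative changes measured from the
    starting price of each leg; prices are positive.
    """
    rise = (y_mid - y_first) / y_first      # first minimum up to the maximum
    decline = (y_mid - y_last) / y_mid      # maximum down to the second minimum
    rise_ok, decline_ok = rise > p, decline > p
    if rise_ok and decline_ok:
        return 1
    if rise_ok:
        return 2
    if decline_ok:
        return 3
    return 4


cases = [
    oscillation_case(100, 120, 100, 0.1),
    oscillation_case(100, 120, 115, 0.1),
    oscillation_case(100, 105, 90, 0.1),
    oscillation_case(100, 103, 101, 0.1),
]
```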

Four Cases with Regard to Oscillation

Second Layer Checking with Duration T
- If the oscillation is below P but the duration is above T, the segment still holds valuable trend information
- Case 2 and Case 3 units that pass the duration T check are treated as Case 1
- For Case 4, if either side passes the duration T check, the midpoint is removed and the next test unit begins with the current third point

Process for Case 1
- The first two points, i and i+1, are preserved, and the next unit is i+2, i+3, i+4

Process for Case 2
Two sub-cases:
- If Y(i+3) >= Y(i+1), the next unit will be i, i+3, i+4
- Otherwise, the next unit will be i, i+1, i+4

Process for Case 3
Two sub-cases:
- Y(i+3) >= Y(i+1)
- Y(i+3) < Y(i+1)
The next unit will always be i+2, i+3, i+4 because Y(i+2) <= Y(i)

Process for Case 4
Two sub-cases:
- Y(i) <= Y(i+2): next unit will be i, i+3, i+4
- Y(i) > Y(i+2): next unit will be i+2, i+3, i+4
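The next-unit rules for the four cases can be collected into one function. This sketch covers only the choice of the next three-point unit as stated on the slides; which points are preserved or discarded in each case is shown in figures that are not part of the transcript, so that bookkeeping is left out. The name `next_unit` is hypothetical, and `y` is assumed to hold the prices of the critical point series.

```python
def next_unit(case, y, i):
    """Return the index triple of the next test unit.

    case: oscillation case (1-4) of the current unit (i, i+1, i+2).
    y:    prices of the critical points, indexed like the series C.
    """
    if case in (1, 3):
        return (i + 2, i + 3, i + 4)
    if case == 2:
        if y[i + 3] >= y[i + 1]:
            return (i, i + 3, i + 4)
        return (i, i + 1, i + 4)
    if case == 4:
        if y[i] <= y[i + 2]:
            return (i, i + 3, i + 4)
        return (i + 2, i + 3, i + 4)
    raise ValueError("case must be 1, 2, 3 or 4")


unit_a = next_unit(2, [10, 20, 19, 25, 12], 0)   # Y(i+3) >= Y(i+1)
unit_b = next_unit(2, [10, 20, 15, 18, 12], 0)   # Y(i+3) <  Y(i+1)
unit_c = next_unit(4, [10, 11, 9, 12, 8], 0)     # Y(i)   >  Y(i+2)
```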

Price Pattern Matching in GCPM
Two types of patterns:
- The point-wise patterns
- The trend pattern

Price Pattern Matching in GCPM
- An example of finding a constraint H & S pattern

Price Pattern Matching in GCPM
- Numerical formulation of the constraint H & S pattern

Probabilistic model based on GCPM for prediction
After the data smoothing and GCPM process, five common technical analysis systems comprising 30 technical indicators are used to represent each turning point:
- Price pattern system
- Trendline system
- Moving average system
- RSI oscillator system
- Stochastic SlowK-SlowD oscillator system
The turning points and their technical indicators are used as training examples to learn the parameters of a probabilistic model based on the Markov Network

Probabilistic model based on GCPM for prediction
The Markov Network
- Y = {true, false} represents whether a critical point is a real turning point
- X = {X1, X2, …, Xn} is a vector with Xi = {true, false}, where Xi represents the i-th technical indicator and TRUE denotes the occurrence of its signal at the current critical point

Probabilistic model based on GCPM for prediction
The Markov Network can be converted to:
[equation shown on the slide; not preserved in this transcript]
- For each indicator, if the corresponding rule Xi -> Y (~Xi V Y) is true, then fi(xi,y) = 1, otherwise fi(xi,y) = 0. The to-be-estimated parameter wi corresponds to each rule.
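The converted form itself is missing from the transcript. A common way to write a Markov network with weighted binary rule features is the log-linear form P(y | x) = exp(sum_i wi * fi(xi, y)) / Z(x); the sketch below assumes that form, which may differ from the paper's exact expression. The feature function follows the rule definition above; `prob_turning_point` is a hypothetical name.

```python
import math


def feature(x_i, y):
    """f_i(x_i, y) = 1 if the rule X_i -> Y (i.e. not X_i or Y) holds, else 0."""
    return 1.0 if (not x_i) or y else 0.0


def prob_turning_point(x, w):
    """P(Y = true | x) under an ASSUMED log-linear model with weights w."""
    score_true = sum(w_i * feature(x_i, True) for x_i, w_i in zip(x, w))
    score_false = sum(w_i * feature(x_i, False) for x_i, w_i in zip(x, w))
    z = math.exp(score_true) + math.exp(score_false)   # normalizer over y
    return math.exp(score_true) / z


p_signal = prob_turning_point([True, False], [1.0, 2.0])
p_no_signal = prob_turning_point([False, False], [1.0, 2.0])
```

Under this assumed form, rules whose indicator did not fire hold for both values of Y and cancel out, so only the weights of indicators that fired move the probability away from 0.5.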

Optimization of Parameters
- The parameters wi of the probabilistic model are learned by optimizing the conditional log-likelihood (CLL):
[equation shown on the slide; not preserved in this transcript]
- n is the number of training samples.
- After obtaining the optimal parameters, inference is performed using Gibbs sampling (a Markov Chain Monte Carlo algorithm)
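As a rough illustration of the objective, the sketch below evaluates a conditional log-likelihood in its usual form, the sum over the n training samples of log P(y_j | x_j). Both this form and the log-linear model used for P(y | x) are assumptions, since the slide's formulas are not preserved; the names `log_prob` and `conditional_log_likelihood` are hypothetical.

```python
import math


def log_prob(y, x, w):
    """log P(y | x) for an ASSUMED log-linear model over rule features."""
    def score(label):
        # A rule X_i -> Y contributes w_i whenever (not x_i) or label holds.
        return sum(w_i for x_i, w_i in zip(x, w) if (not x_i) or label)

    s_true, s_false = score(True), score(False)
    log_z = math.log(math.exp(s_true) + math.exp(s_false))
    return (s_true if y else s_false) - log_z


def conditional_log_likelihood(samples, w):
    """Sum of log P(y_j | x_j) over (x_j, y_j) training samples."""
    return sum(log_prob(y, x, w) for x, y in samples)


samples = [([True], True), ([True], False)]
cll_zero = conditional_log_likelihood(samples, [0.0])
cll_one = conditional_log_likelihood(samples, [1.0])
```

A learner would search for the weights that maximize this quantity, for example by gradient ascent; the transcript does not say which optimizer the paper uses.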

Experimental Results
- The normalized error (NE) is adopted as the metric for the approximation accuracy of GCPM
- NE for approximating the prices of IBM

Experimental Results
- Graphical comparison between the simple CPM and the proposed GCPM to model the IBM price series

Experimental Results
Test on Stock Trading:
- A simple trading rule: if the current reversal is from an uptrend to a downtrend with a probability, estimated by the proposed model, above a certain threshold, then sell, and vice versa. The initial fund is $1000
- Trading log of ALCOA INC for 4 years
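The trading rule can be sketched as a tiny simulation. Everything beyond the rule itself is an assumption made for illustration: the 0.5 threshold, all-in position sizing, fractional shares, no transaction costs, and the signal format of `(price, reversal, probability)` with `"top"` meaning an uptrend-to-downtrend reversal and `"bottom"` the opposite. The name `simulate_trades` is hypothetical and the example data are made up, not results from the paper.

```python
def simulate_trades(signals, threshold=0.5, initial_fund=1000.0):
    """Apply the sell-at-top / buy-at-bottom rule and return the final value.

    signals: list of (price, reversal, probability) tuples, where reversal
    is "top" (uptrend -> downtrend) or "bottom" (downtrend -> uptrend).
    Assumptions: all-in trades, fractional shares, no transaction costs.
    """
    cash, shares = initial_fund, 0.0
    for price, reversal, prob in signals:
        if prob <= threshold:
            continue                                # signal not confident enough
        if reversal == "top" and shares > 0:
            cash, shares = shares * price, 0.0      # sell everything
        elif reversal == "bottom" and cash > 0:
            shares, cash = cash / price, 0.0        # buy with all cash
    last_price = signals[-1][0] if signals else 0.0
    return cash + shares * last_price               # mark open position to market


final_value = simulate_trades([(10.0, "bottom", 0.9), (20.0, "top", 0.8)])
```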

Experimental Results
- The system is tested on CBOT Soybeans futures prices from 1/5/1970 to 12/21/2006

Experimental Results
- The system is also evaluated on simulated trades of 454 stocks of the S&P 500. Stocks are then randomly picked and their profits examined over three periods

Conclusion
- This paper proposed a new financial time series representation method for prediction based on the generalized critical point model (GCPM)
- The GCPM based representation is general and robust
- Experimental results demonstrated that even in a period where a stock has a significant downtrend, the proposed method can still make profits.

End of My Presentation
Thank you!