Stock Price Prediction Using Reinforcement Learning
이재원 (Jae Won Lee)
Introduction
- Analytical methods
  - Technical analysis
  - Fundamental analysis
  - EMH (efficient market hypothesis)
  - Traditional time series forecasting
  - Chaos theory
- Computer techniques
  - Neural networks
  - Fuzzy logic / expert systems
Adopt Reinforcement Learning
- "Economic history is a never-ending series of episodes based on falsehoods and lies, not truths. It represents the path to big money. The object is to recognize the trend whose premise is false, ride that trend, and step off before it is discredited." - George Soros
- The proposed method: adopt reinforcement learning
  - Suitable for representing delayed rewards as well as immediate rewards
Reinforcement Learning
- Agent-environment interaction: at each time step t, the agent observes state s_t, takes action a_t, receives reward r_{t+1} from the environment, and moves to the next state s_{t+1}
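The interaction cycle above can be sketched as a loop; the environment here is a toy stub with an arbitrary parity-based reward, not the stock-market model from the slides, and the names (`ToyEnv`, `run_episode`) are illustrative.

```python
# Minimal sketch of the agent-environment interaction cycle: the agent
# picks a_t from s_t, the environment returns (s_{t+1}, r_{t+1}).
class ToyEnv:
    def __init__(self):
        self.state = 0

    def step(self, action):
        # Reward +1 when the action matches the state's parity (arbitrary rule).
        reward = 1.0 if action == self.state % 2 else 0.0
        self.state += 1
        return self.state, reward

def run_episode(env, policy, steps=5):
    state, total = 0, 0.0
    for t in range(steps):
        action = policy(state)            # agent chooses a_t from s_t
        state, reward = env.step(action)  # environment yields s_{t+1}, r_{t+1}
        total += reward
    return total

total = run_episode(ToyEnv(), policy=lambda s: s % 2)
print(total)  # 5.0 -- the parity policy earns reward 1 at every step
```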
Value Function and Generalized Policy Iteration
- Value function V
- Policy evaluation: V → V^π
- Policy improvement: π → greedy(V)
- The two processes alternate until the value function and the policy are mutually consistent
TD Algorithms
- Learn from raw experience, without a model of the environment
- Bootstrap: update estimates based in part on existing estimates
- Suitable for continuous tasks
- TD(0): the simplest TD algorithm
  V(s_t) ← V(s_t) + α[r_{t+1} + γV(s_{t+1}) − V(s_t)]
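A tabular TD(0) sketch on a toy 5-state chain (illustrative values, not the paper's stock data): the agent walks from state 0 to the terminal state 4, receiving reward 1 on the final transition, and each update bootstraps on the existing estimate of the next state.

```python
# Tabular TD(0) on a 5-state chain: reward 1 on entering the terminal
# state 4, reward 0 elsewhere. True values are gamma^k for k steps to go.
ALPHA, GAMMA, N_STATES = 0.1, 0.9, 5
V = [0.0] * N_STATES  # V[4] is terminal and stays 0

for episode in range(500):
    s = 0
    while s < N_STATES - 1:
        s_next = s + 1
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        # TD(0) update: bootstrap in part on the existing estimate V[s_next]
        V[s] += ALPHA * (r + GAMMA * V[s_next] - V[s])
        s = s_next

print([round(v, 2) for v in V[:N_STATES - 1]])  # [0.73, 0.81, 0.9, 1.0]
```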
Stock Price Changes in the TD View
- State vector
  - Raw daily data (open price, close price, ...)
  - Technical indicators: disparities, moving averages, stochastic oscillator, etc.
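The indicator-style state features named above can be computed as follows; the formulas follow common technical-analysis definitions (disparity as close relative to its n-day moving average, stochastic %K over an n-day high/low range), which may differ in detail from the paper's exact feature set.

```python
# Illustrative computation of technical-indicator state features.
def moving_average(closes, n):
    return sum(closes[-n:]) / n

def disparity(closes, n):
    # Close price relative to its n-day moving average, in percent.
    return closes[-1] / moving_average(closes, n) * 100

def stochastic_k(highs, lows, closes, n):
    # Position of the close within the n-day high/low range, in percent.
    hi, lo = max(highs[-n:]), min(lows[-n:])
    return (closes[-1] - lo) / (hi - lo) * 100

closes = [10, 11, 12, 11, 13]
highs  = [11, 12, 13, 12, 14]
lows   = [ 9, 10, 11, 10, 12]
print(moving_average(closes, 5))                       # 11.4
print(round(disparity(closes, 5), 2))                  # 114.04
print(round(stochastic_k(highs, lows, closes, 5), 1))  # 80.0
```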
Reward
- Relative rate of change in the close price
- The values of states can be calculated from the rewards using a discount factor γ (0 < γ < 1)
- e.g., the value of stock A at time step 0 is greater than that of stock B
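The A-versus-B example can be made concrete: with discounting, a stock whose price gain comes early is valued higher at time step 0 than one whose equal gain comes late. The price series below are invented for illustration; only the reward definition (relative change in close price) and the discounted sum come from the slides.

```python
# Reward r_{t+1} = (close_{t+1} - close_t) / close_t; value at step 0 is
# the discounted sum of rewards with factor GAMMA in (0, 1).
GAMMA = 0.9

def rewards_from_closes(closes):
    return [(closes[t + 1] - closes[t]) / closes[t]
            for t in range(len(closes) - 1)]

def discounted_value(rewards, gamma=GAMMA):
    return sum(gamma ** t * r for t, r in enumerate(rewards))

stock_a = [100, 110, 110, 110]  # rises immediately
stock_b = [100, 100, 100, 110]  # rises only at the end
va = discounted_value(rewards_from_closes(stock_a))  # 0.1
vb = discounted_value(rewards_from_closes(stock_b))  # 0.9^2 * 0.1 = 0.081
print(va > vb)  # True: A's early reward is discounted less
```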
Function Approximation by Neural Network
- Parameter vector θ: the vector of connection weights of the net
- Weights adjusted by gradient descent on the TD error:
  θ ← θ + α[r_{t+1} + γV(s_{t+1}) − V(s_t)] ∇_θ V(s_t)
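A minimal sketch of this idea, assuming a standard semi-gradient TD(0) update with a one-hidden-layer tanh network as the value-function approximator; the network size, toy inputs, and the terminal-transition demo are my illustrative choices, not the paper's configuration.

```python
import numpy as np

# Semi-gradient TD(0) with a small neural net: the parameter vector is
# the net's connection weights (W1, W2), updated along the gradient of
# V(s; theta) scaled by the TD error.
rng = np.random.default_rng(0)
N_IN, N_HID = 3, 8
W1 = rng.normal(0, 0.5, (N_HID, N_IN))
W2 = rng.normal(0, 0.5, N_HID)
ALPHA, GAMMA = 0.05, 0.9

def forward(x):
    h = np.tanh(W1 @ x)
    return h, float(W2 @ h)

def td_update(x, r, x_next, terminal=False):
    global W1, W2
    h, v = forward(x)
    v_next = 0.0 if terminal else forward(x_next)[1]
    delta = r + GAMMA * v_next - v          # TD error
    grad_h = W2 * (1 - h ** 2)              # backprop through tanh (old W2)
    W2 = W2 + ALPHA * delta * h             # gradient of V w.r.t. W2 is h
    W1 += ALPHA * delta * np.outer(grad_h, x)
    return delta

# Repeat one terminal transition (V(s_{t+1}) = 0) to show the TD error
# shrinking as the weights adapt.
x = np.array([1.0, 0.5, -0.5])
errs = [abs(td_update(x, r=1.0, x_next=x, terminal=True)) for _ in range(200)]
print(errs[-1] < errs[0])  # True: the TD error decreases under gradient descent
```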
Experimental Results
Future Works
- Predictability: rule-based approach / other learning models
- Policy optimization
  - Optimal profit ratio
  - Optimal stop loss (risk management)
  - Optimal holding period
- Asset allocation
- Other investment opportunities
  - Foreign exchange
  - Futures / options