Students: Lior Kupfer, Pavel Lifshits. Supervisor: Andrey Bernstein. Advisor: Prof. Nahum Shimkin. Technion – Israel Institute of Technology, Faculty of Electrical Engineering.


Students: Lior Kupfer, Pavel Lifshits
Supervisor: Andrey Bernstein
Advisor: Prof. Nahum Shimkin
Technion – Israel Institute of Technology, Faculty of Electrical Engineering, Control and Robotics Laboratory

Outline
• Introduction
• Notations
• The System
• The Learning Algorithm
• Project Goals
• Results
  ◦ Artificial Time Series (the AR case)
  ◦ Real Foreign Exchange / Stock Data
• Conclusions
• Future Work

L. Kupfer & P. Lifshits: "An automated trading system based on Recurrent Reinforcement Learning", Technion – Israel Institute of Technology, Faculty of Electrical Engineering, Control and Robotics Laboratory.

Introduction
• Using machine learning methods for trading
  ◦ A relatively new approach to financial trading
  ◦ Learning algorithms are used to predict the rise and fall of asset prices before they occur
  ◦ An optimal trader would buy an asset before the price rises, and sell the asset before its value declines

Trading Technique
• An asset trader was implemented using recurrent reinforcement learning (RRL), as suggested by Moody and Saffell (2001)
• RRL is a gradient ascent algorithm which attempts to maximize a utility function known as the Sharpe ratio
• A parameter vector completely defines the actions of the trader
• By choosing an optimal parameter vector for the trader, we attempt to take advantage of asset price changes

• Transaction costs include:
  ◦ Commissions
  ◦ Bid/ask spreads
  ◦ Price slippage
  ◦ Market impact
• Our constraints:
  ◦ We cannot make arbitrarily frequent trades
  ◦ We cannot make large changes in portfolio composition
• Model assumptions:
  ◦ Fixed position size
  ◦ Single security

Notations
• Fixed quantities of security
  ◦ The price series
  ◦ The corresponding price changes
• Our position in each time step
• The system return in each time step
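The equation images on this slide did not survive the transcript. Since the slide follows the setup of Moody and Saffell [1], the standard notation from that paper is reproduced here:

```latex
% Notation of Moody & Saffell (2001):
% z_t -- price series, r_t -- price changes,
% F_t -- position, mu -- fixed quantity of security, delta -- cost rate
\begin{align*}
  &\text{price series: } z_1, z_2, \dots, z_T, \qquad
   r_t = z_t - z_{t-1} \\
  &F_t \in \{-1, 0, 1\} \quad \text{(short, neutral, long)} \\
  &R_t = \mu \left( F_{t-1}\, r_t - \delta\, \lvert F_t - F_{t-1} \rvert \right)
   \quad \text{(system return at step } t\text{)}
\end{align*}
```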

• Additive profit accumulated over T time periods
• Performance criterion
  ◦ The marginal increase in the performance criterion at each time step
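The formulas on this slide were likewise images; the standard definitions from [1] for the quantities named here are:

```latex
\begin{align*}
  P_T &= \sum_{t=1}^{T} R_t
      && \text{(additive profit accumulated over } T \text{ periods)} \\
  U_T &= S_T = \frac{\operatorname{Average}(R_t)}{\operatorname{StdDev}(R_t)}
      && \text{(Sharpe ratio as the performance criterion)}
\end{align*}
```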

The System
• Parameter vector (which we attempt to learn)
• Information available at time t (in our case, the price changes)
• Stochastic extension (noise), whose level can be varied to control exploration vs. exploitation
• Our system is a single-layer recurrent neural network
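As a concrete illustration of such a single-layer recurrent trader, here is a minimal sketch in the spirit of [1]. The function and variable names are our own, and the exact input features used by the project are not specified on this slide; assumed here: the input stacks a bias term, the last m price changes, and the previous position.

```python
import numpy as np

def trader_positions(r, theta, noise_level=0.0, rng=None):
    """Run a single-layer recurrent trader over a price-change series.

    Illustrative sketch of the Moody & Saffell (2001) parameterization:
    the position F_t is the sign of tanh(theta . x_t), where x_t holds
    a bias, the last m price changes, and the previous position F_{t-1}.
    `noise_level` adds exploration noise before quantization.
    """
    rng = rng or np.random.default_rng(0)
    m = len(theta) - 2          # number of autoregressive inputs
    F = np.zeros(len(r))        # positions in {-1, 0, +1}
    for t in range(m, len(r)):
        x = np.concatenate(([1.0], r[t - m:t][::-1], [F[t - 1]]))
        a = np.tanh(theta @ x) + noise_level * rng.standard_normal()
        F[t] = np.sign(a)
    return F
```

With zero noise the trader is deterministic; raising `noise_level` trades exploitation for exploration, matching the "stochastic extension" bullet above.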


The Learning Algorithm
• We use reinforcement learning (RL) to adjust the parameters of the system to maximize our performance criterion of choice
• RL is an alternative to both supervised and unsupervised learning
• RL framework:
  ◦ Agent and environment
  ◦ Reward
  ◦ Expected return
  ◦ Policy learning
• RL modus operandi:
  ◦ The agent perceives the state of the environment s_t and chooses an action a_t. It subsequently observes the new state of the environment s_{t+1} and receives a reward r_t.
  ◦ Aim: learn a policy π (a mapping from states to actions) which optimizes the expected return
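The agent-environment loop described above can be sketched as follows. The toy environment is purely illustrative (it is not the project's trading simulator): the state is the last price change, the action is a position in {-1, 0, +1}, and the reward is position times the next price change.

```python
import numpy as np

class OneStepTradingEnv:
    """Toy trading environment for illustration only."""

    def __init__(self, price_changes):
        self.r = np.asarray(price_changes, dtype=float)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.r[self.t]

    def step(self, action):
        # Advance time; reward the held position by the realized change.
        self.t += 1
        reward = action * self.r[self.t]
        return self.r[self.t], reward

def run_episode(env, policy, steps):
    """Generic RL loop: perceive s_t, act a_t, observe s_{t+1} and r_t."""
    s = env.reset()
    total = 0.0
    for _ in range(steps):
        a = policy(s)
        s, reward = env.step(a)
        total += reward
    return total
```

For example, a momentum policy `lambda s: 1 if s > 0 else -1` run for 3 steps on the series `[1.0, -1.0, 2.0, 1.0]` accumulates a return of -2.0, since this series reverses direction at each of the first two steps.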

RL Approaches
• Direct RL
  ◦ The policy is represented directly. The reward function (immediate feedback) is used to adjust the policy on the fly.
  ◦ e.g., policy search
• Value-function RL
  ◦ Values are assigned to each state (or state-action pair). Values correspond to estimates of future expected returns, in other words, to the long-term desirability of states. These values help guide the agent towards the optimal policy.
  ◦ e.g., TD-learning, Q-learning
• Actor-Critic
  ◦ The model is split into two parts: the critic, which maintains the state-value estimate V, and the actor, which is responsible for choosing the appropriate action at each state.

• In RRL we learn the policy by gradient ascent on the performance function
• The performance function can be:
  ◦ Profit
  ◦ Sharpe ratio
  ◦ Sterling ratio
  ◦ Double deviation
• Moody suggests an additive and differentiable approximation of the Sharpe ratio: the differential Sharpe ratio
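The differential Sharpe ratio itself appeared on the slide as an equation image. For completeness, here is its definition from [1], built from exponential-moving-average estimates A_t and B_t of the first and second moments of the returns, with adaptation rate η:

```latex
\begin{align*}
  A_t &= A_{t-1} + \eta \left( R_t - A_{t-1} \right), \qquad
  B_t  = B_{t-1} + \eta \left( R_t^2 - B_{t-1} \right) \\
  D_t &= \frac{B_{t-1}\, \Delta A_t - \tfrac{1}{2} A_{t-1}\, \Delta B_t}
              {\left( B_{t-1} - A_{t-1}^2 \right)^{3/2}},
  \qquad \Delta A_t = R_t - A_{t-1},\; \Delta B_t = R_t^2 - B_{t-1}
\end{align*}
```

D_t depends only on quantities available at time t, which is what makes the criterion additive and suitable for online gradient ascent.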

• We now develop the gradient of the performance criterion with respect to the parameters
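The derivation on this slide was lost with the equation images. The gradient being developed, per [1], chains through both the current and the previous position; the second line is the recurrent term that gives RRL its name:

```latex
\begin{align*}
  \frac{dD_t}{d\theta} &=
    \frac{dD_t}{dR_t}
    \left(
      \frac{\partial R_t}{\partial F_t} \frac{dF_t}{d\theta}
      + \frac{\partial R_t}{\partial F_{t-1}} \frac{dF_{t-1}}{d\theta}
    \right) \\
  \frac{dF_t}{d\theta} &=
    \frac{\partial F_t}{\partial \theta}
    + \frac{\partial F_t}{\partial F_{t-1}} \frac{dF_{t-1}}{d\theta}
\end{align*}
```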

Project Goals
• Investigate reinforcement learning by policy gradient
• Implement an automated trading system which learns its trading strategy by the recurrent reinforcement learning algorithm
• Analyze the system's results and structure
• Suggest and examine improvement methods

Results

Data set | Goals
Artificial time series | Show the system can learn; analyze the effect of parameters; validate various model approximations
Real foreign exchange EUR/USD data | Show the system can learn a profitable strategy on real data; search for possible improvements

• The challenges we face:
  ◦ Choosing the model parameters
  ◦ Whether and how to normalize the learned weights
  ◦ How to normalize the input
• The averages change over time (non-stationary); we assume that this change is slower than how far back we look
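One common way to normalize a non-stationary input series, consistent with the assumption above that the averages drift slowly, is a trailing z-score. This is an illustrative sketch, not necessarily the project's exact scheme:

```python
import numpy as np

def rolling_zscore(r, window):
    """Normalize a series by trailing mean/std estimates.

    Each value is scaled by statistics of the previous `window` points
    only, so the normalization adapts as the series drifts and never
    looks into the future.
    """
    r = np.asarray(r, dtype=float)
    out = np.zeros_like(r)
    for t in range(window, len(r)):
        hist = r[t - window:t]
        std = hist.std()
        out[t] = (r[t] - hist.mean()) / (std if std > 0 else 1.0)
    return out
```

The window length encodes "how far back we look"; it must be short enough to track the drift yet long enough for stable estimates.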

Artificial Time Series
• r_t, the return series, is generated by an AR(p) process
• We analyze the effect of:
  ◦ Transaction costs
  ◦ Quantization levels
  ◦ Number of autoregressive inputs
  on:
  ◦ The Sharpe ratio
  ◦ Trading frequency
  ◦ Profits
• The effect of initial conditions
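A minimal AR(p) generator for such artificial return series; the coefficients actually used in the project are not given on this slide, so `phi` here is a free choice:

```python
import numpy as np

def ar_series(phi, T, sigma=1.0, seed=0):
    """Generate an AR(p) series r_t = sum_i phi_i * r_{t-i} + noise.

    `phi` holds the AR coefficients (newest lag first), `sigma` the
    standard deviation of the Gaussian innovations. The series starts
    from zero initial conditions.
    """
    rng = np.random.default_rng(seed)
    p = len(phi)
    r = np.zeros(T + p)
    for t in range(p, T + p):
        r[t] = np.dot(phi, r[t - p:t][::-1]) + sigma * rng.standard_normal()
    return r[p:]
```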


Trader type | % Long positions | % Neutral positions | % Short positions
2-positions trader | 51.2% | 0% | 48.8%
3-positions trader | 40% | 25.2% | 34.8%
3-positions conservative trader | 31.2% | 48.8% | 20%

Real Forex Data
• The price series is the US Dollar vs. Euro exchange rate from 21/05/2007 until 15/01/2010, at 15-minute data points
• We compare our trader with:
  ◦ A random strategy with uniformly distributed positions
  ◦ A buy-and-hold strategy of the Euro against the US Dollar
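The two benchmark strategies can be sketched as follows (fixed unit position, no transaction costs; purely illustrative):

```python
import numpy as np

def baseline_profits(r, seed=0):
    """Cumulative profit of the two benchmarks on a price-change series.

    - Buy and hold: always long, so profit is the running sum of changes.
    - Random: a position drawn uniformly from {-1, 0, +1} at each step.
    """
    r = np.asarray(r, dtype=float)
    rng = np.random.default_rng(seed)
    buy_hold = np.cumsum(r)
    random_pos = rng.integers(-1, 2, size=len(r))  # uniform over {-1,0,1}
    random_strat = np.cumsum(random_pos * r)
    return buy_hold, random_strat
```

The random strategy has zero expected profit on any series, which is what makes it a useful floor for judging the learned trader.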

• No commissions

• With commissions (0.1%)

Conclusions
• RRL performs better than the random strategy
• Positive Sharpe ratios were achieved in most cases
• RRL seems to struggle during volatile periods
• The large variance is a major cause for concern
• The system cannot unravel complex relationships in the data
• Changes in market conditions waste all of the system's learning during the training phase (but most learning systems suffer from this)

• When trading real data, transaction costs are a killer
• Normalizing the input series can be a real challenge
  ◦ The input series are non-stationary
  ◦ We assume the averages change slowly relative to the number of AR inputs to the system
• The weights are normalized heuristically
  ◦ The threshold method leads to the best results on both artificial and real data
• Redundancy arises when the input series are ARMA processes
• Long training sessions under constant market conditions lead to overfitting

Future Work
• Wrapping the system with a risk management layer (e.g., stop-loss, a retraining trigger, shutting the system down under anomalous behavior)
• Dynamic adjustment of external parameters (such as the learning rate)
• Working with more than one security
• Working with variable-size positions
• Working in coordination with another expert system (based on other algorithms)

Acknowledgments
• We would like to thank our project supervisor Andrey Bernstein for his guidance, and Prof. Nahum Shimkin for advising us, for allowing us to pursue a research project of our interest, and for sharing his experience with us.
• Additionally, we would like to thank Prof. Ron Meir and Prof. Neri Merhav for the time they spent consulting us.
• Special warm thanks to Gabriel Molina from Stanford University and Tikesh Ramtohul from the University of Basel for their priceless help.


References
[1] J. Moody, M. Saffell, "Learning to Trade via Direct Reinforcement", IEEE Transactions on Neural Networks, Vol. 12, No. 4, July 2001.
[2] C. Gold, "FX Trading via Recurrent Reinforcement Learning", CIFE, Hong Kong, 2003.
[3] M. A. H. Dempster, V. Leemans, "An Automated FX Trading System Using Adaptive Reinforcement Learning", Expert Systems with Applications 30, 2006.