Introduction to Reinforcement Learning Hiren Adesara Prof: Dr. Gittens
Sources for this presentation
Lecture videos of
– Mr. Satinder Singh, University of Michigan.
– Douglas Aberdeen, Australian National University.
From the book: Reinforcement Learning: An Introduction, by Sutton and Barto ( ebook/the-book.html)
Another View of RL
Observation-Action-Response: o_1, a_1, r_1, o_2, a_2, r_2, o_3, a_3, r_3
– Agent chooses actions so as to maximize expected cumulative reward over time.
– Observations can be vectors or other structures.
– Actions are multi-dimensional.
– Rewards are scalar (known or unknown).
– Agents have partial knowledge about the environment.
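The observation-action-reward loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CoinFlipEnv` environment, its `reset`/`step` methods, and the two constant policies are hypothetical and do not come from the lecture material.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a biased coin.

    The observation is a dummy value (0); the reward is 1.0 for a
    correct guess and 0.0 otherwise.
    """

    def __init__(self, p_heads=0.8, seed=0):
        self.p_heads = p_heads
        self.rng = random.Random(seed)

    def reset(self):
        return 0  # first observation o_1

    def step(self, action):
        outcome = 1 if self.rng.random() < self.p_heads else 0
        reward = 1.0 if action == outcome else 0.0
        return 0, reward  # next observation, scalar reward


def run_episode(env, policy, n_steps):
    """Run the o, a, r loop and return the cumulative reward."""
    obs = env.reset()
    total = 0.0
    for _ in range(n_steps):
        action = policy(obs)            # a_t chosen from o_t
        obs, reward = env.step(action)  # environment returns o_{t+1}, r_t
        total += reward
    return total


# Two fixed policies: always guess heads (1) or always guess tails (0).
heads_total = run_episode(CoinFlipEnv(seed=0), lambda obs: 1, 1000)
tails_total = run_episode(CoinFlipEnv(seed=0), lambda obs: 0, 1000)
```

With the coin biased toward heads, the always-heads policy collects more cumulative reward. A learning agent would have to discover this from the rewards alone, since it is not told the bias.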
Demo
RL and Machine Learning
Supervised learning
– Learning approaches to regression and classification.
– Learning from examples, learning from a teacher.
Unsupervised learning
– Learning approaches to dimensionality reduction, density estimation, and recoding data based on some principle.
Reinforcement learning
– Learning approaches to sequential decision making.
– Learning from a critic, learning from delayed reward.
Key ideas of RL
– Markov Decision Process (MDP).
– Temporal differences (updating a guess on the basis of another guess).
– Function approximation.
Markov Decision Process
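As a rough illustration, a finite MDP can be written down as states, actions, transition probabilities, rewards, and a discount factor. The sketch below assumes a made-up two-state problem (the state names, actions, numbers, and `GAMMA` are all hypothetical) and solves it with value iteration, a standard dynamic-programming method described in Sutton and Barto; it is not taken from the slides.

```python
# Hypothetical two-state MDP.
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    "low": {
        "wait": [(1.0, "low", 0.0)],
        "charge": [(1.0, "high", -1.0)],
    },
    "high": {
        "wait": [(1.0, "high", 0.0)],
        "work": [(0.7, "high", 2.0), (0.3, "low", 2.0)],
    },
}
GAMMA = 0.9  # discount factor (assumed value)


def q_value(V, s, a):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-8):
    """Repeat the Bellman optimality backup until values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: max(q_value(V, s, a) for a in P[s]) for s in P}
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


V_star = value_iteration()
# Greedy policy with respect to the converged values.
policy = {s: max(P[s], key=lambda a: q_value(V_star, s, a)) for s in P}
```

The Markov property shows up in the data structure itself: the next state and reward depend only on the current state and action, not on the earlier history.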
Temporal Differences
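A minimal sketch of the tabular TD(0) update, in which the value estimate of a state is moved toward the reward plus the discounted estimate of the next state. The three-state chain, the step size `alpha`, and the discount `gamma` are assumptions chosen for illustration, not values from the lecture.

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9, terminal=False):
    """One TD(0) step: move V[s] toward the target r + gamma * V[s_next]."""
    target = r if terminal else r + gamma * V[s_next]
    V[s] += alpha * (target - V[s])  # alpha times the TD error
    return V


# Hypothetical deterministic chain: 0 -> 1 -> 2 -> end,
# with reward 1.0 on the final transition and 0.0 elsewhere.
V = [0.0, 0.0, 0.0]
for _ in range(2000):  # episodes
    for s in range(3):
        last = (s == 2)
        reward = 1.0 if last else 0.0
        s_next = None if last else s + 1
        td0_update(V, s, reward, s_next, terminal=last)
```

On this chain the estimates should approach the true discounted values 0.81, 0.9, and 1.0 for states 0, 1, and 2. Each update uses only the current guess for the next state, without waiting for the episode to finish.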
Questions?