# Predictive State Representations Hui Li July 7, 2006.


Outline

- What are the advantages of predictive state representations?
- What is a predictive state representation (PSR)?
- How to learn a PSR model
- Conclusions

What are the advantages of PSRs?

- PSRs are expressed entirely in terms of observable quantities.
- Learning a PSR avoids the local minima and saddle points that arise when learning a POMDP model.
- PSRs attain generality and compactness at least equal to those of POMDPs.

What are predictive state representations (1/9)

Two notions in PSRs:

- History (h): the sequence of action-observation (ao) pairs that the agent has already experienced, beginning at the first time step.
- Test (t): a sequence of ao pairs that begins immediately after a history.

What are predictive state representations (2/9)

[Figure: a timeline of action-observation pairs. The history is h = a_1 o_1 a_2 o_2 a_3 o_3 … a_j o_j; the test t = a_1 o_1 a_2 o_2 … a_k o_k follows it.]

Prediction of a test: p(t|h)
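The formula for the prediction did not survive in the transcript. The sketch below writes out the usual definition from Littman, Sutton & Singh (2002); the superscript notation for the test's actions and observations is an assumption introduced here to keep them apart from the history's.

```latex
% h = a_1 o_1 \dots a_j o_j is the history, t = a^1 o^1 \dots a^k o^k is the test.
% The prediction is the probability of seeing the test's observations,
% given the history and given that the test's actions are taken.
p(t \mid h) = \Pr\left( o_{j+1} = o^1, \dots, o_{j+k} = o^k \;\middle|\; h,\ a_{j+1} = a^1, \dots, a_{j+k} = a^k \right)
```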

What are predictive state representations (3/9)

System-dynamics matrix D: one row per history h_i and one column per test t_j, with entries p(t_j|h_i).

What are predictive state representations (4/9)

- Order of all possible tests in D
- Properties of the predictions in each row h_i of D
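The row properties themselves are missing from the transcript. As a hedged reconstruction, the block below lists the properties usually stated for the system-dynamics matrix (following Singh, James & Rudary, 2004); the superscripted symbols for a test's actions and observations are notation assumed here. Tests are commonly arranged by increasing length and lexicographically within each length, which is taken to be the ordering the slide refers to.

```latex
% For every row (history) h_i of D:
\begin{align}
  & 0 \le p(t \mid h_i) \le 1 \quad \text{for every test } t, \\
  % fixing the actions, the predictions over all observation sequences sum to one
  & \sum_{o^1, \dots, o^k} p(a^1 o^1 \cdots a^k o^k \mid h_i) = 1
      \quad \text{for every } k \text{ and every action sequence } a^1 \cdots a^k, \\
  % marginalising the last observation of an extended test recovers the shorter test
  & p(t \mid h_i) = \sum_{o} p(t\,a\,o \mid h_i)
      \quad \text{for every test } t \text{ and action } a.
\end{align}
```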

What are predictive state representations (5/9)

Relation between PSR and POMDP

- The belief state is updated according to Bayes' rule.
- Constructing D from a POMDP
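The two points above can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's code: the two-state POMDP, its numbers, and every name (`T`, `O`, `b0`, `step_matrix`, `belief_after`, `predict`, `sequences`) are hypothetical. It updates the belief with Bayes' rule along a history, computes p(t|h) by pushing the belief through the test, and fills a finite block of D.

```python
import itertools
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP; all numbers are made up.
# T[a][s, s2] = Pr(s2 | s, a) and O[a][s2, o] = Pr(o | s2, a).
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.3, 0.7]]])
O = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.6, 0.4], [0.4, 0.6]]])
b0 = np.array([0.5, 0.5])  # initial belief over states
PAIRS = list(itertools.product(range(2), range(2)))  # all (action, observation) pairs


def step_matrix(a, o):
    """Matrix with entries Pr(s2, o | s, a)."""
    return T[a] * O[a][:, o]


def belief_after(history):
    """Bayes-rule belief update applied along a history of (a, o) pairs."""
    b = b0
    for a, o in history:
        b = b @ step_matrix(a, o)
        b = b / b.sum()  # normalise by Pr(o | b, a)
    return b


def predict(test, history):
    """p(t | h): probability of the test's observations given its actions."""
    v = belief_after(history)
    for a, o in test:
        v = v @ step_matrix(a, o)
    return v.sum()


def sequences(min_len, max_len):
    """All (a, o) sequences with length between min_len and max_len."""
    return [seq for k in range(min_len, max_len + 1)
            for seq in itertools.product(PAIRS, repeat=k)]


histories = sequences(0, 2)  # includes the empty history
tests = sequences(1, 2)
D = np.array([[predict(t, h) for t in tests] for h in histories])
print(D.shape, np.linalg.matrix_rank(D, tol=1e-9))
```

Each row of this block is the belief for that history multiplied by a fixed vector per test, so its rank cannot exceed the number of POMDP states.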

What are predictive state representations (6/9)

What are predictive state representations (7/9)

Since the rank of D ≤ k, there exist at most k linearly independent columns or rows in D.

- Core tests Q_T: the tests corresponding to the k linearly independent columns are called core tests.
- Core histories Q_h: the histories corresponding to the k linearly independent rows are called core histories.

What are predictive state representations (8/9)

What are predictive state representations (9/9)

Linear PSR model

- D(Q) is a linear sufficient statistic of the histories, since every column of D is a linear combination of the columns of D(Q).
- Definition of the PSR
- State update
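The definition and state-update formulas are missing from the transcript. The block below is a hedged reconstruction in the standard linear-PSR form (Littman, Sutton & Singh, 2002), not a copy of the slide. Here p(Q|h) denotes the vector of core-test predictions after history h, and the weight vectors m_t are the model parameters; the exact symbols are assumptions.

```latex
% Linear PSR: every prediction is a linear function of the core-test predictions.
p(t \mid h) = p(Q \mid h)^{\top} m_t

% State update after taking action a and observing o, for each core test q_i:
p(q_i \mid h a o)
  = \frac{p(a o q_i \mid h)}{p(a o \mid h)}
  = \frac{p(Q \mid h)^{\top} m_{a o q_i}}{p(Q \mid h)^{\top} m_{a o}}
```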

How to learn PSR model (1/6)

Two subproblems in learning a PSR model:

- Discovery: find the core tests Q_T, whose predictions constitute the state (a sufficient statistic).
- Learning: learn the parameters m_aot that define the system dynamics.

How to learn PSR model (2/6)

The tests and histories corresponding to a set of linearly independent columns and rows of any submatrix of D are subsets of the core tests and core histories, respectively. Discovery can therefore work on a finite, small matrix instead of the infinite matrix D.

How to learn PSR model (3/6)

Analytical Discovery and Learning Algorithm (ADL)

1. Assumption: the exact D is available.
2. Analytical discovery algorithm (AD)
3. Analytical learning algorithm (AL)

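How to learn PSR model (4/6)

1. Analytical discovery algorithm (AD)

- Start with all tests up to length 1 (T_1) and all histories up to length 1 (H_1).
- Find the linearly independent columns and rows of the corresponding submatrix.
- Extend them by one step to obtain the next sets of tests and histories.
- Repeat until convergence.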
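The loop above can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm as published: a made-up two-state POMDP stands in for "the exact D", independence is checked greedily with `np.linalg.matrix_rank`, tests are extended by prepending an ao pair, histories by appending one, and the loop stops when the rank stops growing. All names (`p`, `independent_columns`, `analytical_discovery`) are hypothetical.

```python
import numpy as np

# Hypothetical POMDP used only as an oracle for exact entries of D.
# T[a][s, s2] = Pr(s2 | s, a) and O[a][s2, o] = Pr(o | s2, a); numbers are made up.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.3, 0.7]]])
O = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.6, 0.4], [0.4, 0.6]]])
b0 = np.array([0.5, 0.5])
PAIRS = [(a, o) for a in range(2) for o in range(2)]


def p(test, history):
    """Exact p(t | h) computed from the POMDP oracle."""
    b = b0
    for a, o in history:
        b = b @ (T[a] * O[a][:, o])
        b = b / b.sum()
    for a, o in test:
        b = b @ (T[a] * O[a][:, o])
    return b.sum()


def independent_columns(M, tol=1e-9):
    """Greedily pick indices of linearly independent columns of M."""
    chosen = []
    for j in range(M.shape[1]):
        if np.linalg.matrix_rank(M[:, chosen + [j]], tol=tol) > len(chosen):
            chosen.append(j)
    return chosen


def analytical_discovery(max_iters=10):
    tests = [(ao,) for ao in PAIRS]          # all tests of length 1
    hists = [()] + [(ao,) for ao in PAIRS]   # all histories up to length 1
    rank, QT, QH = 0, [], []
    for _ in range(max_iters):
        D = np.array([[p(t, h) for t in tests] for h in hists])
        QT = [tests[j] for j in independent_columns(D)]
        QH = [hists[i] for i in independent_columns(D.T)]
        if len(QT) == rank:  # rank did not grow: treat as converged
            break
        rank = len(QT)
        # One-step extensions: prepend ao to each test, append ao to each history.
        tests = QT + [(ao,) + t for t in QT for ao in PAIRS]
        hists = QH + [h + (ao,) for h in QH for ao in PAIRS]
    return QT, QH


core_tests, core_histories = analytical_discovery()
print(core_tests, core_histories)
```

The number of core tests found can never exceed the number of states of the oracle POMDP, which gives a quick sanity check on the result.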

How to learn PSR model (5/6)

2. Analytical learning algorithm (AL)

Since … Then …
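The two equations on this slide did not survive. A plausible reading, following James & Singh (2004), is that since D(Q_h, t) = D(Q_h, Q_T) m_t for every test t, the parameters follow as m_t = D(Q_h, Q_T)^{-1} D(Q_h, t). The sketch below applies that reading to a made-up two-state POMDP used as an exact oracle; the core tests and histories are assumed to come from a discovery step, and all names (`learn_m`, `psr_update`, `m_ao`, `M_aoq`) are hypothetical.

```python
import numpy as np

# Hypothetical POMDP oracle; T[a][s, s2] = Pr(s2 | s, a), O[a][s2, o] = Pr(o | s2, a).
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.3, 0.7]]])
O = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.6, 0.4], [0.4, 0.6]]])
b0 = np.array([0.5, 0.5])
PAIRS = [(a, o) for a in range(2) for o in range(2)]


def p(test, history):
    """Exact p(t | h) computed from the POMDP oracle."""
    b = b0
    for a, o in history:
        b = b @ (T[a] * O[a][:, o])
        b = b / b.sum()
    for a, o in test:
        b = b @ (T[a] * O[a][:, o])
    return b.sum()


# Core tests and core histories, assumed to be the output of discovery.
QT = [((0, 0),), ((0, 1),)]
QH = [(), ((0, 0),)]

# D(QH, QT): rows are core histories, columns are core tests.
DQ = np.array([[p(q, h) for q in QT] for h in QH])


def learn_m(test):
    """Solve D(QH, QT) m_t = D(QH, t) for the weight vector m_t."""
    rhs = np.array([p(test, h) for h in QH])
    return np.linalg.solve(DQ, rhs)


# Parameters needed for the state update: m_ao and m_aoq for each core test q.
m_ao = {ao: learn_m((ao,)) for ao in PAIRS}
M_aoq = {ao: np.column_stack([learn_m((ao,) + q) for q in QT]) for ao in PAIRS}


def psr_update(state, ao):
    """p(Q | h ao) = p(Q | h)^T M_aoq / p(Q | h)^T m_ao."""
    return (state @ M_aoq[ao]) / (state @ m_ao[ao])


state = DQ[0]  # core-test predictions for the empty history
walk = [(1, 0), (0, 1), (1, 1)]
for ao in walk:
    state = psr_update(state, ao)
print(state)
```

Because the oracle is exact, the PSR state after the walk should match the core-test predictions computed directly from the POMDP for the same history.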

How to learn PSR model (6/6)

Estimate the system-dynamics matrix D
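The slide gives no estimator, so the following is only one simple possibility, in the spirit of the reset-based setting of James & Singh (2004): each trajectory is assumed to start right after a reset, and p(t|h) is estimated as the fraction of times the test's observations occurred among the times its actions were executed immediately after h. The function name and the toy trajectories are hypothetical.

```python
def estimate_prediction(trajectories, history, test):
    """Estimate p(t | h) from trajectories that each start after a reset.

    Each trajectory, history, and test is a sequence of (action, observation) pairs.
    """
    j, k = len(history), len(test)
    test_actions = [a for a, _ in test]
    attempts = 0
    successes = 0
    for traj in trajectories:
        # The trajectory must begin with exactly this history.
        if len(traj) < j + k or list(traj[:j]) != list(history):
            continue
        window = list(traj[j:j + k])
        # Count it only if the test's actions were the ones executed next.
        if [a for a, _ in window] != test_actions:
            continue
        attempts += 1
        if window == list(test):
            successes += 1
    return successes / attempts if attempts else float("nan")


# Toy data (made up): five short trajectories of (action, observation) pairs.
trajectories = [
    [(0, 1), (1, 0)],
    [(0, 1), (1, 1)],
    [(0, 1), (1, 0)],
    [(0, 0), (1, 0)],
    [(0, 1), (0, 0)],
]
estimate = estimate_prediction(trajectories, [(0, 1)], [(1, 0)])
print(estimate)
```

Filling one entry of the estimated D per (history, test) pair in this way gives a finite matrix to which the discovery and learning steps can be applied, with rank decisions then depending on the estimation error.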

Conclusions

- A new class of dynamical-system models, predictive state representations (PSRs), is introduced; it is grounded in actions and observations.
- An algorithm, analytical discovery and learning (ADL), is introduced to learn the PSR model.

References

1. James, M. R., & Singh, S. (2004). Learning and discovery of predictive state representations in dynamical systems with reset. Proceedings of the 21st International Conference on Machine Learning (ICML) (pp. 719–726).
2. Littman, M., Sutton, R. S., & Singh, S. (2002). Predictive representations of state. Advances in Neural Information Processing Systems 14 (NIPS) (pp. 1555–1561). MIT Press.
3. McCracken, P., & Bowling, M. (2006). Online learning of predictive state representations. Advances in Neural Information Processing Systems 18 (NIPS). MIT Press. To appear.
4. Singh, S., James, M. R., & Rudary, M. R. (2004). Predictive state representations: A new theory for modeling dynamical systems. Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI) (pp. 512–519).
5. Singh, S., Littman, M., Jong, N., Pardoe, D., & Stone, P. (2003). Learning predictive state representations. Proceedings of the Twentieth International Conference on Machine Learning (ICML) (pp. 712–719).
6. Wiewiora, E. (2005). Learning predictive representations from a history. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 969–976).
7. Wolfe, B., James, M. R., & Singh, S. (2005). Learning predictive state representations in dynamical systems without reset. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 985–992).
8. Bowling, M., McCracken, P., James, M., Neufeld, J., & Wilkinson, D. (2006). Learning predictive state representations using non-blind policies. Proceedings of the 23rd International Conference on Machine Learning (ICML).