CHAPTER 4: PROBABILITY THEORY AND SEARCH FOR GAMES
Representing Knowledge
Uncertainty
Probabilities
Probabilistic approach:
- With no other information, A60 will get me there on time with probability P(A60) = 0.6
- Probabilities change with new evidence:
  - P(A60 | 5 am) =
  - P(A60 | 9 am) =
  - P(A60 | accident report, 5 am) =
  - P(A60 | accident report) = 0.1
- I.e., observing evidence causes beliefs to be updated
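The belief update above can be sketched in code by conditioning on a small joint distribution. This is a minimal illustration, not the slide's model: the joint-table entries below are hypothetical values chosen only so that the two figures the slide does give, P(A60) = 0.6 and P(A60 | accident report) = 0.1, come out right.

```python
# Hypothetical joint distribution P(on_time, accident); the individual
# entries are made up, only the resulting marginals match the slide.
joint = {
    ("on_time", "accident"): 0.015,
    ("on_time", "no_accident"): 0.585,
    ("late", "accident"): 0.135,
    ("late", "no_accident"): 0.265,
}

def prob(on_time=None, accident=None):
    """Marginal or joint probability: sum all matching joint entries."""
    return sum(p for (o, a), p in joint.items()
               if (on_time is None or o == on_time)
               and (accident is None or a == accident))

p_a60 = prob(on_time="on_time")                 # prior belief P(A60)
p_a60_given_acc = (prob("on_time", "accident")  # conditioning:
                   / prob(accident="accident")) # P(A60 | accident report)
print(round(p_a60, 2), round(p_a60_given_acc, 2))  # 0.6 0.1
```

Observing the accident report shrinks the probability mass to the "accident" column and renormalizes, which is exactly the belief update the slide describes.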
Probabilistic Models
What Are Probabilities?
Objectivist / frequentist answer:
- Averages over repeated experiments
- E.g., estimating P(rain) from historical observation
- Assertion about future experiments (in the limit)
- New evidence changes the reference class
- Makes one think of inherently random events, like rolling dice
Subjectivist / Bayesian answer:
- Degrees of belief about unobserved variables
- E.g., an agent's belief that it's raining, given the temperature
- Often estimate probabilities from past experience
- New evidence updates beliefs
Unobserved variables still have fixed assignments (we just don't know what they are)
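The frequentist reading above can be made concrete with a two-line sketch: estimate P(rain) as the relative frequency of rainy days in an observation log. The log below is invented data for illustration only.

```python
# Hypothetical observation log of past days.
observations = ["rain", "dry", "dry", "rain", "dry",
                "dry", "dry", "rain", "dry", "dry"]

# Frequentist estimate: count of rainy days over total days observed.
p_rain = observations.count("rain") / len(observations)
print(p_rain)  # 0.3
```

New evidence changes the reference class: conditioning on, say, the month would mean counting only over the matching subset of days before dividing.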
Distributions on Random Variables
Examples
Marginalization
Conditional Probabilities
Inference by Enumeration
The Chain Rule I
Lewis Carroll's Pillow Problem
Independence
Example: Independence N fair, independent coins:
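The coin example can be sketched directly: independence means the joint probability of any outcome sequence factors into a product of per-coin marginals, so each of the 2^N sequences of N fair coins has probability (1/2)^N. A minimal illustration:

```python
from itertools import product

N = 3          # number of fair, independent coins
p_each = 0.5   # P(heads) = P(tails) for a fair coin

# Enumerate all 2^N outcome sequences.
outcomes = list(product("HT", repeat=N))

# Independence: joint probability = product of the marginals = (1/2)**N.
joint = {o: p_each ** N for o in outcomes}

print(len(outcomes), joint[("H", "H", "H")])  # 8 0.125
```

The 2^N probabilities sum to 1, so specifying N marginals (N numbers) is enough to pin down the whole joint table, rather than 2^N - 1 free parameters in the general case.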
Conditional Independence
The Chain Rule II
The Chain Rule III
Expectations
Estimation
Game Playing: State of the Art
- Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions. Exact solution imminent.
- Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue examined 200 million positions per second and used a very sophisticated evaluation function plus undisclosed methods for extending some lines of search up to 40 ply.
- Othello: human champions refuse to compete against computers, which are too good.
- Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
Game Playing
Axes:
- Deterministic or stochastic
- One, two, or more players
- Perfect information (can you see the state?)
We want algorithms for calculating a strategy (policy) that recommends a move in each state.
Deterministic Single-Player?
Approximating Node Value
Stochastic Single-Player
Deterministic Two-Player
Tic-tac-toe Game Tree
Minimax Example
Minimax Search
Stochastic Two-Player
Evaluation Functions
Function Approximation