Lecture 3: Environs and Algorithms

CSCI 4310 Lecture 3: Environs and Algorithms

Task Environments: Fully Observable vs. Partially Observable
Fully observable: sensors detect all aspects of the environment relevant to choosing an action.
Partially observable: some unknowns.

Task Environments: Deterministic vs. Stochastic (Non-Deterministic)
Deterministic: the next state of the environment is completely determined by the current state and the agent's action.
Stochastic (non-deterministic): an identical world state and agent action may result in a different next state each time.
Strategic: deterministic except for the actions of other agents.

Task Environments: Episodic vs. Sequential
Episodic: the agent perceives and acts in a distinct episode, and the next episode does not depend on previous ones. Example: a part-picking robot.
Sequential: the current decision could affect all future decisions. More difficult, because the agent must look ahead.

Task Environments: Dynamic vs. Static
Dynamic: the environment can change while the algorithm is deciding on a course of action. A changing environment can lead to algorithm inaction.
Static: the environment does not change.

Task Environments: Discrete vs. Continuous
Discrete: a finite number of states. Example: the chess environment.
Continuous: a range of continuous values. Example: driving sensors.
The distinction can be applied to the environment state, the sensors, and the actions.
Article in relation to games.

Task Environments: Single Agent vs. Multi-agent
Single agent: Roomba.
Multi-agent, competitive: zero-sum games such as chess.
Multi-agent, cooperative: co-op game play, where working together maximizes individual performance measures.

Task Environments
If you have a partially observable, stochastic, sequential, dynamic, continuous, multi-agent task: just surrender.
However, this is the real world.

Broad Algorithmic Categories

Alternatives to Optimization
Heuristics: finding “good” methods to apply to problems.
Algorithm design.
Humans are remarkably adept at this. Example: TSP.
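
To make the idea of a "good" method concrete, here is a minimal sketch of one common TSP heuristic, nearest neighbor. The slide does not name a specific heuristic, so the choice of nearest neighbor and the names `nearest_neighbor_tour` and `tour_length` are assumptions for illustration. The result is usually reasonable but is not guaranteed to be the shortest tour.

```python
import math


def nearest_neighbor_tour(points):
    """Build a tour by always moving to the closest unvisited city.

    This is a heuristic: fast and often decent, but not guaranteed optimal.
    `points` is a list of (x, y) tuples; the tour starts at index 0.
    """
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def tour_length(points, tour):
    """Total length of the closed tour (returns to the starting city)."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


cities = [(0, 0), (3, 0), (1, 0), (2, 0)]
tour = nearest_neighbor_tour(cities)
print(tour, tour_length(cities, tour))
```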

Greedy Algorithms
Used for optimization problems.
Make the most promising decision at any given time.
Never reconsider or reverse your decision.
Does this always yield the optimal solution?
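
One way to explore the closing question is coin change, sketched below. The problem choice and the function name `greedy_change` are assumptions, not part of the lecture. With coins 1, 3, and 4, the greedy rule pays 6 with three coins, although two coins (3 + 3) would do, so the answer is no in general.

```python
def greedy_change(amount, coins):
    """Repeatedly take the largest coin that still fits; never undo a choice.

    Returns the list of coins used, or None if greedy gets stuck.
    """
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used if amount == 0 else None


print(greedy_change(63, [1, 5, 10, 25]))  # greedy happens to be optimal here
print(greedy_change(6, [1, 3, 4]))        # greedy uses 3 coins; 3 + 3 uses 2
```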

Greedy Algorithms
What problems does this work well on?
Example: finding directions in a GPS mapping service.
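
Route finding is a case where a greedy rule can be provably safe. The slide does not say which algorithm a mapping service uses; the sketch below assumes Dijkstra's algorithm, a well-known greedy shortest-path method that requires non-negative edge weights. The graph `roads` and its distances are made up for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from `source` in a graph with non-negative weights.

    `graph` maps each node to a dict of {neighbor: edge_weight}.
    Greedy step: always settle the closest node not yet settled.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical road network; weights are travel costs.
roads = {"A": {"B": 4, "C": 1}, "B": {"D": 1}, "C": {"B": 2, "D": 7}, "D": {}}
print(dijkstra(roads, "A"))
```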

Divide and Conquer Algorithms
Top down: start with the entire problem.
Decompose the problem into a number of smaller instances of the same problem.
Merge the solutions to obtain the solution to the original instance.
Have a ‘base case’.

Divide and Conquer Algorithms
Ex: Recursion
Ex: Merge Sort
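
A minimal merge sort sketch shows the three ingredients named on the previous slide: a base case, decomposition into smaller instances, and a merge of the sub-solutions. The function name `merge_sort` is illustrative only.

```python
def merge_sort(items):
    """Return a new sorted list using top-down divide and conquer."""
    # Base case: zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)

    # Decompose into two smaller instances of the same problem.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))
```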

Dynamic Programming Algorithms
Bottom up: start by obtaining solutions to the smallest sub-instances.
Combine these solutions to get solutions to larger instances.
Benefit: use tables to store the results calculated so far.

Dynamic Programming Algorithms
Avoids the duplication that can hurt the performance of divide and conquer strategies, where you will often re-compute the same solution.
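
The duplication is easy to see with Fibonacci numbers, used here as an assumed example (the lecture does not name one). The top-down version below counts its own calls, while the bottom-up version fills a table once; both function names are hypothetical.

```python
def fib_naive(n, counter):
    """Top-down recursion; counter[0] records how many calls were made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


def fib_table(n):
    """Bottom-up: solve the smallest sub-instances first and store them."""
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]


calls = [0]
print(fib_naive(10, calls), "computed with", calls[0], "recursive calls")
print(fib_table(10), "computed with one pass over a table")
```

The recursive version recomputes the same sub-instances many times, whereas the table version computes each value once.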

Backtracking Algorithms
“Just start solving and hope for the best.”
Usually involves a depth-first search.
Analogous to how humans would find their way around a research park:
Just start walking and hope I find the lab.
Drop a crumb at turns.
If you dead-end, return to the most recently dropped crumb and try a different direction.

Backtracking Algorithms
Ex: 8 Queens problem
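
A minimal backtracking sketch for the queens problem follows. It places one queen per row, depth first, and undoes the most recent placement when it hits a dead end, which plays the role of returning to the last crumb. The function name `solve_queens` and the row-by-row representation are assumptions for illustration.

```python
def solve_queens(n=8):
    """Return one solution as a list where index = row and value = column,
    or None if no placement of n queens exists."""
    columns = []  # columns[r] is the column of the queen already placed in row r

    def safe(row, col):
        # A new queen conflicts if it shares a column or a diagonal.
        for r, c in enumerate(columns):
            if c == col or abs(c - col) == row - r:
                return False
        return True

    def place(row):
        if row == n:
            return True  # every row has a queen
        for col in range(n):
            if safe(row, col):
                columns.append(col)      # drop a crumb
                if place(row + 1):
                    return True
                columns.pop()            # dead end: back up and try another column
        return False

    return columns if place(0) else None


print(solve_queens(8))
```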

Branch-and-Bound Algorithms
Similar to backtracking.
Do not keep trying a path that you already know is worse than the best answer found so far.
Pruning: the extra bound checks can slow things down.

Branch-and-Bound Algorithms
Ex: many game tree searches.
Estimate the upper and lower bounds of all choices in a certain tree branch.
If maximizing the function value: if the upper bound of branch A is below the lower bound of branch B (A_upper < B_lower), prune branch A.
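
The pruning rule can be sketched on a 0/1 knapsack problem, chosen here as an assumed example because its bounds are easy to compute (the slide itself refers to game trees). The best value found so far serves as the lower bound, and an optimistic fractional fill serves as a branch's upper bound. The names `knapsack_branch_and_bound`, `upper_bound`, and `explore` are hypothetical.

```python
def knapsack_branch_and_bound(items, capacity):
    """Maximize total value of chosen items without exceeding capacity.

    `items` is a list of (value, weight) pairs with positive weights.
    """
    # Sort by value per unit weight so the fractional bound is valid.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0  # lower bound: best complete answer found so far

    def upper_bound(index, value, room):
        # Optimistic estimate: fill greedily, allowing a fraction of one item.
        for v, w in items[index:]:
            if w <= room:
                value += v
                room -= w
            else:
                return value + v * room / w
        return value

    def explore(index, value, room):
        nonlocal best
        best = max(best, value)
        if index == len(items):
            return
        if upper_bound(index, value, room) <= best:
            return  # prune: this branch cannot beat the best known answer
        v, w = items[index]
        if w <= room:
            explore(index + 1, value + v, room - w)  # branch: take the item
        explore(index + 1, value, room)              # branch: skip the item

    explore(0, 0, capacity)
    return best


print(knapsack_branch_and_bound([(60, 10), (100, 20), (120, 30)], 50))
```

Computing `upper_bound` at every node is the overhead of pruning; it pays off only when enough branches are cut.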
