
Towards a Testbed for Evaluating Learning Systems in Gaming Simulators (21 March 2004, Symposium on Reasoning and Learning)


1 Towards a Testbed for Evaluating Learning Systems in Gaming Simulators
21 March 2004, Symposium on Reasoning and Learning
TIELT
Outline:
1. Objective
2. Specification
3. Intended functionality
4. Example of use
5. Status and Goals
David W. Aha
Intelligent Decision Aids Group
Naval Research Laboratory, Code 5515, Washington, DC
home.earthlink.net/~dwaha
aha@aic.nrl.navy.mil
(working with Matt Molineaux)

2 Objective & Expected Benefits
Objective: Support the machine learning community by providing an API for a set of gaming engines, the ability to select a wide variety of learning and performance tasks, and an editor for specifying and conducting an evaluation methodology.
Benefits:
1. Reduce costs (time, $) to create these integrations
   - Costs are often prohibitive, encouraging isolated studies
2. Encourage research on learning in cognitive systems:
   - Embedded (e.g., process aware)
   - Rapid (i.e., few trials)
   - Knowledge-intensive
   - Enduring
3. Support analysis of alternative learning systems for a given task

3 DARPA's Cognitive Systems Thrust
IPTO: Information Processing Technology Office
Director: Ron Brachman
Assistant Director: Barbara Yoon
History: IPTO has many impressive contributions (e.g., time-sharing, interactive computing, Internet)
Long-term goal: human-computer symbiosis
Current goal: Develop cognitive systems that know what they're doing. Such a system:
- can reason, using substantial amounts of appropriately represented knowledge
- can learn from its experience to perform better tomorrow than it did today
- can explain itself and be told what to do
- can be aware of its own capabilities and reflect on its own behavior
- can respond robustly to surprise

4 IPTO's View of a Cognitive Agent
[Diagram: a Cognitive Agent within an External Environment. Labels: Sensors; Effectors; Perception; Action; Reactive Processes; Deliberative Processes; Reflective Processes; Prediction, planning; Other reasoning; Communication (language, gesture, image); STM; LTM (KB); Concepts; Sentences; Affect; Attention; Learning]

5 Learning Foci in Cognitive Systems (Langley & Laird, 2002)
Many opportunities for learning

6 Typical Current Practice
Few of Today's Cognitive Systems Support Realistic Learning Capabilities
Comparatively few machine learning (ML) efforts on cognitive systems:
- Isolated studies of learning algorithms
- Single-step, non-interactive tasks
- Knowledge-poor learning contexts
[Diagram labels: Cognitive System, ML System, Database]

7 Wanted: A new Interface (thanks to W. Cohen)
[Diagram: ML Systems connected to a Database through an Interface (e.g., UCI Repository), alongside ML Systems connected to Cognitive Systems through an Interface (e.g., TIELT?)]
Curmudgeon's Viewpoint:
- This might encourage research on more challenging problems
- But don't count on it

8 Your Potential Uses of TIELT (Hastily Considered…and Reaching)
1. Randy Jones: Smart way to update rule conditions
   Use: Updating game model's tasks
2. Doug Pearson: Changing conditions on operators
   Use: Controlling game agents
3. Prasad Tadepalli: Learning hierarchies in the game model
   Use: Active development of a game model's task hierarchy
4. Jim Blythe: Knowledge acquisition
   Use: Acquiring game model constraints
5. Gheorghe Tecuci: Mixed-initiative learning for knowledge acquisition
   Use: Active learning of task models, etc.
6. Karen Myers: Incorporating guidance from humans for agent control
   Use: Learning agent controls (assuming players can provide direct feedback)
7. Barney Pell: Learning to play any of a category of games given their rules
   Use: Hmm…agent control, if collaborating with a game model-updating system
8. Afzal Upal: Updating plan quality
   Use: Induce task-specific control rules
9. Susan Epstein: Learning to solve (large) CSPs
   Use: Reasoning with game model's constraints
10. Frank Ritter: Recognition tasks (e.g., for strategies?)
    Use: Learning opponent strategies
11. Dan Roth: Using multiple classifiers to solve problems
    Use: Set of (coordinated) learning systems for problem solving
12. Ken Forbus: Analogical reasoning and companion cognitive systems
    Use: Qualitative representation for game model, predicting human/agent intentions
13. Daniel Borrajo: Learning control knowledge for planning
    Use: Incremental learning for agent control tasks
14. Niels Taatgen: Learning for real-time tasks
    Use: Agent control, several RTS applications

9 Outline (cont)
1. Objective
2. Specification of a testbed
   - Select a category of cognitive systems
   - Category-specific challenges
3. Intended functionality
4. Example of use
5. Status of project
6. Goals for future work

10 Interface Comparison
Characteristics compared:
1. Performance API (input, output)
2. Learning API (input, output)
3. Integration
4. Performance task
5. Learning task
6. Domain knowledge, reasoning
7. Evaluation methodology
ML System with Database (e.g., for a supervised learning system), entries in slide order: -; Common data format; Matches input format; -; Data input module; Classification; Set wts., create tree, etc.; -; Acc., ROC curves, etc.
ML System with Cognitive System (e.g., for a cognitive system involving planning), entries in slide order: Effects; State; Decision; Message passing; Achieve a goal(s); Create a plan; Significant; Temporal, qualitative, …; Plan execution measures

11 What type of Cognitive System?
Candidate: Interactive Gaming Simulators
Desiderata:
1. Available implementations
   - Inexpensive to acquire and run
2. Pushes ML research boundaries
   - Challenging embedded learning tasks
3. Significant interest/excitement
   - Military, industry, academia, funding

12 Gaming Genres (Laird & van Lent, 2001)
Genre | Example | Description | Sub-Genres | AI Roles
Action | Quake, Unreal | Control a character | 1st vs. 3rd person, solo vs. team play | Control enemies
Role-Playing | Diablo | Be a character | Solo vs. (massively) multi-player | Control enemies, partners, and supporting characters
Adventure | King's Quest, Blade Runner | Player solves puzzles, interacting w/ others | Linear vs. dynamic scripting | Control supporting characters
Strategy | Age of Empires, Warcraft, Civilization | God's eye view, controls many units (e.g., tactical warfare) | | Control all units and strategic enemies
God | SimCity, The Sims | Control a simulated world & its units | | Control unit goals and goal-achievement strategies
Individual Sports | Many (e.g., driving games) | Individual competition | 1st vs. 3rd person | Control enemy
Team Sports | Madden NFL Football | Act as coach and a key player | | Control units and strategic enemy (i.e., other coach), commentator
Unfortunately,…reaction time and aiming skill are the most important factors in success in a first-person shooter game. Deep reasoning about tactics and strategy don't end up playing a big role as might be expected. (van Lent et al., 2004)

13 Real-Time Strategy (RTS) Games (Buro & Furtak, 2003)
Fundamental AI research problems:
1. Adversarial real-time planning
   - Motivates need for abstractions of world state
2. Decision making under uncertainty
   - e.g., opponent intentions
3. Opponent modeling, learning
   - One of the biggest shortcomings of current RTS game AI systems is their inability to learn quickly…Current ML approaches…are inadequate in this area.
4. Spatial and temporal reasoning
5. Resource management
6. Collaboration
7. Pathfinding

14 Military: Learning in Simulators for Computer Generated Forces (CGF)
Purpose: Training (present) & planning (future)
Simulators: JWARS, OneSAF, Full Spectrum Command, etc.
Target: Control strategic opponent or own units
Evidence of commitment: Some Claims
- Learning is an essential ability of intelligent systems (NRC, 1998)
- To realize the full benefit of a human behavior model within an intelligent simulator,…the model should incorporate learning (Hunter et al., CCGBR00)
- Successful employment of human behavior models…requires that [they] possess the ability to integrate learning (Banks & Stytz, CCGBR00)
Status:
- No CGF simulator has been deployed with learning (D. Reece, 2003)
- Problems: Performance (costly training), overtraining, behavioral accuracy (e.g., learned behaviors may become unpredictable), constraint violations (learned behaviors do not follow doctrine), difficult to isolate the utility of learning (Petty, CGFBR01)

15 Industry: Learning in Video and Computer Games
Focus: Increase sales via enhanced gaming experience
Simulators: Many! (e.g., SimCity, Quake, SoF, UT)
Target: Control avatars, unit behaviors
Evidence of commitment:
- Developers: keenly interested in building AIs that might learn, both from the player & environment around them. (GDC03 Roundtable Report)
- Middleware products that support learning (e.g., MASA, SHAI, LearningMachine)
- Long-term investments in learning (e.g., iKuni, Inc.)
- A computer that learns is worth 10 Microsofts. (B. Gates, 2004)
Status:
- Few deployed systems have used learning (Kirby, 2004), e.g.:
  1. Black & White: on-line, explicit (player immediately reinforces behavior)
  2. C&C Renegade: on-line, implicit (agent updates set of legal paths)
  3. Re-volt: off-line, implicit (GA tunes racecar behaviors prior to shipping)
- Problems: Performance, constraints (preventing learning something dumb), trust in learning system

16 Academia: Learning in Interactive Computer Games
Status:
- Publication options (specific to AI & gaming)
  - AAAI symposia and workshops (several)
  - AAAI04 Workshop on Challenges in Game AI
  - International Conference on Computers and Games
  - Journals: J. of Game Development, Int. Computer Games J.
- Game engines (e.g., GameBots, ORTS, RoboCup Soccer Server)
- Use (other) open source engines (e.g., FreeCiv, Stratagus)
Focus: Several research thrusts
- Representation (e.g., Forbus et al., 2001; Houk, 2004; Munoz-Avila & Fisher, 2004)
- Knowledge acquisition (e.g., Hieb et al., 1995)
- Supervised learning of lower-level behaviors (e.g., Geisler, 2002)
- Learning plans (e.g., Fasciano, 1996)
- Learning opponent unit models (e.g., Laird, 2001; Hill et al., 2002)
- Learning to provide advice (e.g., Sweetser & Dennis, 2003)
- Learning hierarchical knowledge (e.g., van Lent & Laird, 1998)
- Learning rule preferences (e.g., Ponsen, 2004)

17 Academia: Learning in Interactive Computer Games (cont.)
Example integrations
Name + Reference | Game Engine | Learning Approach | Tasks
(Goodman, AAAI93) | Bilestoad | Projective visualization | Fighting maneuvers
CAPTAIN (Hieb et al., CCGFBR95) | ModSAF | Multistrategy (e.g., version spaces) | Platoon placement
MAYOR (Fasciano, 96 Dept. of CS, U. Chicago TR 96-05) | SimCity | Case-based reasoning | City development
(Fogel et al., CCGFBR96) | ModSAF | Genetic programming | Tank movements
KnoMic (van Lent & Laird, ICML98) | ModSAF | Rule condition learning in SOAR | Aircraft maneuvers
(Geisler, 2002) | Soldier of Fortune | Multiple (e.g., boosting backprop) | FPS action selection
(Sweetser & Dennis, 2003) | Tubby Terror | Regression | Advice generation
(Chia & Williams, BRIMS03) | TankSoar | Naïve Bayes classification | Tank behaviors
(Ponsen, 2004) | Wargus/Stratagus | Genetic algorithms (dynamic scripting) | Strategic rule selection

18 Summary: Some Additional Challenges with Embedding Learning in Gaming Simulators
1. Low CPU requirements (e.g., in real-time games)
2. Constraining learned knowledge
   - Must not violate expectations
3. Learning & reasoning (e.g., planning)
4. Isolating learning contributions (for evaluation)

19 Specification for Integrating Learning Systems with Gaming Simulators
1. Simplifies integration!
   - Interests ML researchers
   - Interests game developers
2. Learning focus concerns at least three types of models:
   - Task (e.g., learn how to perform, or advise on, a task)
   - Player (e.g., learn a human player's strategies)
   - Game (e.g., learn its objects, their relations & functions)
     - State interpretation/abstraction
3. Learning methods: A wide variety
   - They should be able to output their learned behaviors for inspection (e.g., by game developers)
4. Game engines: Those with challenging learning tasks
   - i.e., large hypothesis spaces, knowledge-intensive
5. Supports reuse via modularity (to be at all feasible)
   - Abstracts interface definitions from game & task models
6. Free (unlike some similar commercial tools)
   - Preferably, open source

20 Outline (cont)
1. Objective
2. Specification of a testbed
3. Intended functionality
   - Interaction
   - Types of use
   - Open issues
4. Example of use
5. Status and Goals

21 TIELT: Testbed for Integrating and Evaluating Learning Techniques
[Architecture diagram. Components: TIELT User; Editors; Knowledge Bases (Game Model Description, Task Descriptions, Game Interface Description, Learning Interface Description, Evaluation Methodology Description); Displays (Prediction Display, Advice Display, Evaluation Display); Game Engine (Stratagus, FreeCiv); Game Player(s); Reasoning & Learning System (Learning System #1 ... Learning System #n); Learned Knowledge]

22 TIELT Knowledge Bases
- Game Interface Description: Defines communication processes with the game engine
- Learning Interface Description: Defines communication processes with the learning system
- Game Model Description: Defines interpretation of the game (e.g., objects, operators, behaviors model, tasks, initial state)
- Task Descriptions: Defines the selected learning and performance tasks (selected from the game model description)
- Evaluation Methodology Description: Defines the empirical evaluation to conduct

23 Data Sources and Targeted Functionalities
Example learning functionalities supported:
1. Learning from observations (e.g., behavioral cloning)
2. Active learning
3. Learning from advice (requires inputs from user)
4. Learning to advise
5. …
Data sources:
1. Game (world) model (possibly incomplete, incorrect)
2. Simulator
   - Passive state observations (e.g., behavioral cloning)
   - Active testing (e.g., apply an action in a state)
3. Humans
   - Advice

24 Example TIELT Usage: Controlling a Game Character
[Architecture diagram as on slide 21, annotated with the flow labels: Raw State, Processed State, Decision, Action]
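The flow labels on this slide (Raw State, Processed State, Decision, Action) suggest a simple pipeline between the game engine and a learning system. The sketch below is only an illustration of that reading, not TIELT's implementation: the field names `pod_x` and `pod_y`, the fixed target, and the `"MovePod dx dy"` string format are all hypothetical.

```python
from typing import Callable, Dict, Tuple

RawState = Dict[str, str]        # hypothetical: the engine reports string fields
ProcessedState = Dict[str, int]  # hypothetical: parsed view handed to the learner
Decision = Tuple[str, int, int]  # hypothetical: (operator name, dx, dy)


def interpret(raw: RawState) -> ProcessedState:
    """Stand-in for the testbed's state processing: parse raw engine fields."""
    return {"x": int(raw["pod_x"]), "y": int(raw["pod_y"])}


def decide(state: ProcessedState) -> Decision:
    """Stand-in for a learning system: step one cell toward a fixed target (3, 3)."""
    dx = (3 > state["x"]) - (3 < state["x"])
    dy = (3 > state["y"]) - (3 < state["y"])
    return ("MovePod", dx, dy)


def to_action(decision: Decision) -> str:
    """Stand-in for action translation: render the decision as an engine command."""
    op, dx, dy = decision
    return f"{op} {dx} {dy}"


def control_step(
    raw: RawState,
    interpret_fn: Callable[[RawState], ProcessedState],
    decide_fn: Callable[[ProcessedState], Decision],
    action_fn: Callable[[Decision], str],
) -> str:
    """One pass: raw state -> processed state -> decision -> action."""
    return action_fn(decide_fn(interpret_fn(raw)))
```

Passing the three stages as functions mirrors the slide's separation of game interface, learning system, and action translation, so any one of them could be swapped without touching the others.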

25 Example TIELT Usage: Advising a Game Player
[Architecture diagram as on slide 21, annotated with the flow labels: Raw State; Processed State; Decision, Reason; Advice, Explanation]

26 Example TIELT Usage: Predicting a Game Player's Actions
[Architecture diagram as on slide 21, annotated with the flow labels: Raw State; Processed State; Prediction, Reason; Prediction, Explanation]

27 Example TIELT Usage: Updating a Game Model
[Architecture diagram as on slide 21, annotated with the flow labels: Raw State, Processed State, Edit]

28 Example TIELT Usage: Building a Task/Player Model
[Architecture diagram as on slide 21, including Learned Knowledge, annotated with the flow labels: Raw State, Processed State, Model]

29 Intended Use Cases
Game Developer:
1. Define/store game engine interface
2. Define/store game model
3. Select learning system & interface
4. Select learning and performance tasks
5. Define (or select) evaluation methodology
6. Run experiments
7. Analyze displayed results
ML Researcher:
1. Define/store learning system interface
2. Select game engine & interface
3. Select game model
4. Select learning and performance tasks
5. Define (or select) evaluation methodology
6. Run experiments
7. Analyze displayed results
Repository of learning systems, learning interface descriptions
Repository of game engines, game interface descriptions

30 Some Open Questions
1. Game Model: What representation?
   - STRIPS operators? Hierarchical task networks? Explicit constraints?
   - How to communicate it to the learning system?
   - Should it instead be maintained in the learning system?
2. What standards for:
   - Game engine message passing
   - Learning system message passing
   - Output format for learned knowledge
3. Support both on-line and off-line studies?
4. What representations for advice and explanations?
5. How to explicitly represent & apply constraints on learned knowledge?
6. How to evaluate TIELT's utility?

31 Outline (cont)
1. Objective
2. Specification of a testbed
3. Intended functionality
   - Interaction
   - Types of use
   - Open issues
4. Example of use
   - Demonstration of initial GUI
   - Simple city placement task
5. Status and Goals

32 Outline (cont)
1. Objective
2. Specification of a testbed
3. Intended functionality
   - Interaction
   - Types of use
   - Open issues
4. Example of use
5. Status and Goals

33 Status and Goals
[Diagram labels: TIELT Specification; TIELT (Initial GUI); Matt Molineaux]

34 Status and Goals: Recent Influences
1. Full Spectrum Command (van Lent et al., 2004)
   - Multiple AI systems, one game engine
2. ORTS (Open Real-Time Strategy (RTS)) project: Open source RTS game engine
   - Free
   - Flexible game specification (via scripts)
   - Hack-free server-side simulation
   - Open message protocol: Players have total control
   - Prefer ORTS to Stratagus?
3. Collaboration with Lehigh University (Asst. Prof. H. Muñoz-Avila)
   - Extended Hierarchical Task Network (HTN) process representation for the Game Model's tasks?
   - Fall 2004 PhD candidate: First to integrate ML with Stratagus
   - Fall 2004 student: Will develop Game Models for us

35 Conclusion
Objective: Support the machine learning community by providing an API for a set of gaming engines, the ability to select a wide variety of learning and performance tasks, and an editor for specifying and conducting an evaluation methodology.
Status:
- Started 12/03, effectively
- Initial GUI implementation
- Many open research questions
Goals:
- 9/04: First complete implementation
- Incrementally integrate with game engines, learning systems
- Document & publicize for use to gain ML interest
- Subsequently, seek military/industry interest
- And game-developer community? Other research communities?

36 Backup Slides

37 TIELT: Initial Vision (DARPA, 11/13/03)
Goal: Wargaming testbed for the machine learning community
- Explore learning techniques in the context of today's latest simulations & video games
- Facilitate exploration of strategies and what-if scenarios
- Provide common platform for evaluating different learning techniques
Technical Approach: Enable insertion of learning/KA techniques into state-of-the-art video combat & strategy games
- Create API for integrating learning into selected video games (e.g., comm. module, socket interface, client-server comms protocol & language)
- Create API that enables learning in computer generated forces (CGF) tools
[Diagram labels: New Learning Techniques; Development Environment; Video Wargaming Testbed; API]

38 API for Isolated Studies
Functionality: Supervised learning using a passive dataset
Performance (Classifier):
- Task: Classification
- Interface:
  - Input: None
  - Output: Common access format (across all tasks & datasets)
Learning:
- Task: Varies (e.g., tree, weight settings)
- Interface:
  - Input: Data instance or set, in a common format (across all tasks & systems)
  - Output: Classification decision

39 API for Cognitive Learning
Functionality: Learning by doing/being told/observation/etc.
Performance (Cognitive System):
- Task: Varied (e.g., planning, design, diagnosis, …, classification)
- Interface:
  - Input: Action
  - Output: Current state
Learning:
- Task: Varies (e.g., rule application conditions)
- Interface:
  - Input: Processed current state
  - Output: Decision
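The two interfaces on this slide can be read as a pair of narrow contracts: the cognitive system takes an action and returns the current state, and the learning side takes a processed state and returns a decision. The sketch below illustrates only that shape. The class and method names (`CognitiveSystem.apply`, `Learner.decide`, `CounterWorld`, `StepToward`) are hypothetical, the "world" is a single integer, and the toy learner follows a fixed rule rather than learning anything.

```python
from typing import List, Protocol


class CognitiveSystem(Protocol):
    """Performance side: input is an action, output is the current state."""

    def apply(self, action: int) -> int: ...


class Learner(Protocol):
    """Learning side: input is a processed current state, output is a decision."""

    def decide(self, processed_state: int) -> int: ...


class CounterWorld:
    """Toy cognitive system (hypothetical): the state is one integer."""

    def __init__(self) -> None:
        self.value = 0

    def apply(self, action: int) -> int:
        self.value += action
        return self.value


class StepToward:
    """Toy decision maker (hypothetical): moves the state toward a goal value."""

    def __init__(self, goal: int) -> None:
        self.goal = goal

    def decide(self, processed_state: int) -> int:
        if processed_state < self.goal:
            return 1
        if processed_state > self.goal:
            return -1
        return 0


def run(world: CognitiveSystem, learner: Learner, steps: int) -> List[int]:
    """Alternate decision and action for a fixed number of steps; return the states seen."""
    state = world.apply(0)  # a no-op action to read the initial state
    trace = [state]
    for _ in range(steps):
        state = world.apply(learner.decide(state))
        trace.append(state)
    return trace
```

Unlike the isolated-study API on the previous slide, the loop here is interactive: each decision changes the state that the next decision sees.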

40 Commercial Game Roles for AI (Laird & van Lent, 2001)
Role | Focus | State-of-the-Art | AI Needs
Tactical Enemies | Challenge human player | Cheats, scripts using FSMs, path planning, expert systems | Situation assessment, user modeling, spatial & temporal reasoning, planning, plan recognition, learning
Partners | Cooperation & coordination w/ human | Scripted responses to specific commands | Speech recognition, NLP, gesture recognition, user modeling, adaptation
Support Characters | Guide/interact with human | Canned responses | NL understanding & generation, path planning, coordination
Strategic Opponents | Develop high-level strategy, allocate resources, & issue unit-level commands | Cheating, etc. | Integrated planning, commonsense reasoning, spatial reasoning, plan recognition, resource allocation
Units | Carry out high-level commands autonomously | FSMs and path planning | Commonsense reasoning & coordination
Commentators | Observe and comment on game play | | NL generation, plan recognition

41 TIELT
[Detailed architecture diagram. Components: User; Editors (Game Interface Editor, Learning Interface Editor, Game Model Editor, Task Editor); Game Model Description; Task Descriptions; Game Interface Description; Learning Interface Description; Evaluation Methodology; Evaluation Settings; Controller; Model Updater; Current State; Stored State; Action Translator (Mapper); Learning Translator (Mapper); Translated Model (Subset); Learning Task; Evaluator; Database Engine; Database; Displays (Perf. Task Display, Advice Display, Evaluation Display); Game Engine (Stratagus, FreeCiv); Learning System(s) (System #1, System #2, ... System #n). Flow labels: Percepts, Actions, Learning Outputs]

42 1. Sensing the Game State
[Diagram components: User; Editors (Game Interface Editor, Game Model Editor); Game Model Description; Game Interface Description; Controller; Model Updater; Action Translator; Current State; Game Engine. Flow labels: Sensors, Updates, Actions]
1. In Game Engine, game begins and the colony pod is created and placed.
2. The Game Engine sends a See sensor message stating where the pod is.
3. The Model Updater receives the sensor message and finds the corresponding message template in the Game Interface Description.
4. The message template provides updates to the Game Model Description, which tell the Current Model that there is a pod at the location See describes.
5. The Model Updater notifies the Controller that the See action event has occurred.
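The sensing steps on this slide amount to a template lookup followed by a state update and a notification. A minimal sketch of that idea follows, under stated assumptions: the message layout (`type`, `object`, `x`, `y`), the `ModelUpdater` class, and the callback-based notification are all hypothetical and are not taken from TIELT itself.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Message = Dict[str, Any]
Template = Callable[[State, Message], None]


def see_template(state: State, msg: Message) -> None:
    """Hypothetical template for a See message: record where an object is."""
    state.setdefault("objects", {})[msg["object"]] = (msg["x"], msg["y"])


class ModelUpdater:
    """Looks up the template for an incoming sensor message, applies it to the
    current state, then notifies the controller that the event occurred."""

    def __init__(self, templates: Dict[str, Template], notify: Callable[[str], None]) -> None:
        self.templates = templates  # stands in for the Game Interface Description
        self.notify = notify        # stands in for the Controller

    def receive(self, state: State, msg: Message) -> None:
        template = self.templates[msg["type"]]  # find the matching message template
        template(state, msg)                    # the template updates the current state
        self.notify(msg["type"])                # tell the controller which event fired


# Usage: the "controller" here is just a list that collects event names.
events: List[str] = []
current_state: State = {}
updater = ModelUpdater({"See": see_template}, events.append)
updater.receive(current_state, {"type": "See", "object": "pod", "x": 4, "y": 7})
```

Keeping templates in a dictionary keyed by message type reflects the slide's point that the interface description, rather than hard-coded logic, determines how a message changes the model.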

43 2. Fetching Decisions from the Learning System
[Diagram components: User; Editors (Learning Interface Editor, Task Editor); Task Descriptions; Learning Interface Description; Controller; Learning Translator; Translated Model (Subset); Action Translator; Current State; Stored State; Learning System(s) (System #1, System #2, ... System #n). Flow label: Learning Outputs]
1. The Controller notifies the Learning Translator that it has received a See message.
2. The Learning Translator finds a city location task which is triggered by the See message. It queries the Controller for the learning mode, then creates a TestInput message to send to the learning system with information on the pod's location and the map from the Current State.
3. The Learning Translator transmits the TestInput message to the appropriate Learning System(s).
4. The Learning System(s) transmit output to the Action Translator.
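The decision-fetching steps can be sketched as a function that maps a triggering event to a task and packages the relevant state into a TestInput message. Everything beyond the names See, TestInput, and TestOutput is an assumption made for illustration: the dictionary layout, the `triggers` table, the `"online"` mode string, and the toy `FirstPlainsLearner`, which applies a fixed rule and does no learning.

```python
from typing import Any, Dict, Optional

State = Dict[str, Any]
Message = Dict[str, Any]


def build_test_input(
    event: str, triggers: Dict[str, str], mode: str, state: State
) -> Optional[Message]:
    """Return a TestInput message if some task is triggered by the event, else None."""
    task = triggers.get(event)  # stands in for the Task Descriptions lookup
    if task is None:
        return None
    return {
        "type": "TestInput",
        "task": task,
        "mode": mode,                     # learning mode, as queried from the controller
        "pod": state["objects"]["pod"],   # pod location from the current state
        "map": state["map"],              # map from the current state
    }


class FirstPlainsLearner:
    """Toy learning system (hypothetical): proposes the first 'plains' cell in
    coordinate order, falling back to the pod's current location."""

    def handle(self, msg: Message) -> Message:
        x, y = msg["pod"]
        for (cx, cy), terrain in sorted(msg["map"].items()):
            if terrain == "plains":
                x, y = cx, cy
                break
        return {"type": "TestOutput", "task": msg["task"], "x": x, "y": y}
```

A translator built this way only forwards the subset of state a task needs, which matches the slide's Translated Model (Subset) component.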

44 3. Acting in the Game World
[Diagram components: User; Editors (Game Interface Editor, Learning Interface Editor); Game Interface Description; Learning Interface Description; Action Translator; Current State; Game Engine; Display (Advice Display). Flow label: Actions]
1. The Action Translator receives a TestOutput message from a Learning System.
2. The Action Translator finds the TestOutput message template and determines that it is associated with the city location task, and builds a MovePod operator (defined by the Current State) with the parameters of TestOutput.
3. The Action Translator determines that the Move Action from the Game Interface Description is triggered by the MovePod Operator and binds Move using information from MovePod, then sends Move to the Game Engine.
4. The Game Engine receives Move and updates the game to move the pod toward its destination, or
5. The Advice Display receives Move and displays advice to a human player on what to do next.
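The acting steps describe two translations (TestOutput to a MovePod operator, then MovePod to a Move action) followed by a choice of destination. The sketch below illustrates that chain. The names TestOutput, MovePod, and Move come from the slide; the dictionary fields, the `"Move x y"` string, the `advise` flag, and the destination labels are hypothetical.

```python
from typing import Any, Dict, Tuple

Message = Dict[str, Any]


def build_operator(test_output: Message) -> Message:
    """Build a MovePod operator from the parameters of a TestOutput message."""
    return {"operator": "MovePod", "x": test_output["x"], "y": test_output["y"]}


def bind_action(operator: Message) -> str:
    """Bind the Move action using information from the MovePod operator."""
    if operator["operator"] != "MovePod":
        raise ValueError(f"no action is triggered by {operator['operator']!r}")
    return f"Move {operator['x']} {operator['y']}"


def dispatch(test_output: Message, advise: bool) -> Tuple[str, str]:
    """Send the bound action either to the game engine or to the advice display."""
    action = bind_action(build_operator(test_output))
    if advise:
        # Advising a human player instead of acting directly.
        return ("advice_display", f"Suggested next step: {action}")
    return ("game_engine", action)
```

For example, `dispatch({"type": "TestOutput", "x": 1, "y": 0}, advise=False)` yields a command for the engine, while `advise=True` routes the same action to the display as a suggestion.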

45 4. Displaying an Operation to the User
[Diagram components: Editors (Task Editor); Task Descriptions; Model; Controller; Evaluator; Current State; Evaluation Display]
1. The Evaluator is triggered by the Controller according to a trigger from the Evaluation Settings.
2. The Evaluator obtains performance metrics from each Task and calculates them on the Current State.
3. The Evaluator sends the new metrics values to the Evaluation Display, which updates with the new information.
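The evaluation steps can be sketched as a trigger check followed by computing each task's metrics on the current state. This is an illustration only: the `settings` layout, the `TurnEnd` event, and the metric names are hypothetical, and metrics are modeled as plain functions of the state.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Metric = Callable[[State], float]


def should_evaluate(event: str, settings: Dict[str, List[str]]) -> bool:
    """Check whether the evaluation settings list this event as a trigger."""
    return event in settings.get("triggers", [])


def evaluate(
    task_metrics: Dict[str, Dict[str, Metric]], state: State
) -> Dict[str, Dict[str, float]]:
    """Compute every task's performance metrics on the current state."""
    return {
        task: {name: metric(state) for name, metric in metrics.items()}
        for task, metrics in task_metrics.items()
    }


# Usage with one hypothetical task and two hypothetical metrics.
metrics = {
    "city_location": {
        "cities_built": lambda s: float(s["cities"]),
        "turns_used": lambda s: float(s["turn"]),
    }
}
if should_evaluate("TurnEnd", {"triggers": ["TurnEnd"]}):
    results = evaluate(metrics, {"cities": 2, "turn": 10})
```

The nested result dictionary is the kind of value an evaluation display could render directly, one row per task and one column per metric.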

46 5. Retrieving States from a Database
[Diagram components: Controller; Learning Translator (Mapper); Current State; Stored State; Database Engine; Database]
1. When in Record mode, the Controller triggers the Database Engine when the state updates.
2. The Database Engine records the Current State in a Database for later use.
3. Later, in Playback mode, the Controller triggers the Database Engine after the Learning System indicates readiness.
4. The Database Engine then queries a Database and retrieves a Stored State.
5. Finally, the Controller notifies the Learning System that an update has arrived and to query the Stored State for message info.
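Record and Playback modes can be sketched as an append-only store of state snapshots. The in-memory `StateDatabase` class below is a hypothetical stand-in for the Database Engine and its Database; it assumes states are plain nested dictionaries that can be deep-copied.

```python
import copy
from typing import Any, Dict, Iterator, List

State = Dict[str, Any]


class StateDatabase:
    """In-memory stand-in for a database of recorded game states."""

    def __init__(self) -> None:
        self._states: List[State] = []

    def record(self, state: State) -> None:
        """Record mode: store a snapshot so later changes to the live state do not alter it."""
        self._states.append(copy.deepcopy(state))

    def playback(self) -> Iterator[State]:
        """Playback mode: yield stored states in the order they were recorded."""
        for stored in self._states:
            yield copy.deepcopy(stored)


# Usage: record two updates of the same live state, then replay them.
db = StateDatabase()
live = {"turn": 1, "objects": {"pod": (4, 7)}}
db.record(live)
live["turn"] = 2
live["objects"]["pod"] = (5, 7)
db.record(live)
replayed = list(db.playback())
```

Copying on both record and playback keeps stored states independent of the live state, which matters for off-line studies that replay the same data to several learning systems.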

