UT^2: Human-like Behavior via Neuroevolution of Combat Behavior and Replay of Human Traces
Jacob Schrum, Igor Karpov, and Risto Miikkulainen
{schrum2,ikarpov,risto}@cs.utexas.edu

Our Approach: UT^2
- Evolve skilled combat behavior
- Restrictions/filters maintain humanness
- Human traces to get unstuck and navigate
- Filter data to get general-purpose traces
- Future goal: generalize to new levels
- Probabilistic judging based on experience
- Also assume that humans judge well

Bot Architecture

Use of Human Traces

Pure Human Trace Demo

Record Human Games
- "Wild" pose data
- "Synthetic" pose data

Index and replay nearest traces
- Index by navpoints (see the sketch below)
  - KD-tree of navpoints: find the nearest navpoint
  - KD-trees of points within Voronoi cells: find the nearest path
- Playback: estimate a distance D, then MoveAlong the path for about D
- Two uses: get unstuck, explore levels
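
A minimal sketch (not the authors' code) of the indexing scheme above, assuming SciPy is available: navpoints go into one KD-tree, and the recorded trace points falling in each navpoint's Voronoi cell go into per-cell KD-trees. The data layout (`traces_by_cell`) and helper names are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

class TraceIndex:
    def __init__(self, navpoints, traces_by_cell):
        self.nav_tree = cKDTree(np.asarray(navpoints))          # find nearest navpoint
        self.traces_by_cell = traces_by_cell                    # cell id -> list of (pos, trace, step)
        self.cell_trees = {c: cKDTree(np.asarray([pos for pos, _, _ in pts]))
                           for c, pts in traces_by_cell.items()}

    def nearest_trace_point(self, bot_pos):
        _, cell = self.nav_tree.query(bot_pos)                  # nearest navpoint / Voronoi cell
        _, i = self.cell_trees[cell].query(bot_pos)             # nearest recorded trace point
        return self.traces_by_cell[cell][i]                     # (pos, trace, step)

def replay_segment(trace, start, distance_d):
    """Collect trace positions covering roughly distance D, to MoveAlong."""
    segment, covered = [trace[start]], 0.0
    for prev, cur in zip(trace[start:], trace[start + 1:]):
        covered += float(np.linalg.norm(np.asarray(cur) - np.asarray(prev)))
        segment.append(cur)
        if covered >= distance_d:
            break
    return segment
```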

Getting unstuck has highest priority

Unstuck Controller
- Mix scripted responses and human traces (see the dispatch sketch below)
- Previous UT^2 used only human traces
- Human traces also used after repeated failures

Stuck Condition        Response
Still                  Move Forward
Collide With Wall      Move Away
Frequent Collisions    Dodge Away
Bump Agent             Dodge Away
Same Navpoint          Human Traces
Off Navpoint Grid      Human Traces
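
A short sketch of the table above as a priority-ordered dispatch (assumed structure, not the authors' code; the sensor predicates are hypothetical names):

```python
RESPONSES = [
    (lambda bot: bot.is_still(),               "MOVE_FORWARD"),
    (lambda bot: bot.colliding_with_wall(),    "MOVE_AWAY_FROM_WALL"),
    (lambda bot: bot.frequent_collisions(),    "DODGE_AWAY"),
    (lambda bot: bot.bumping_agent(),          "DODGE_AWAY"),
    (lambda bot: bot.same_navpoint_too_long(), "HUMAN_TRACES"),
    (lambda bot: bot.off_navpoint_grid(),      "HUMAN_TRACES"),
]

def unstuck_action(bot):
    """Check stuck conditions in order; return a response, or None if not stuck."""
    for condition, response in RESPONSES:
        if condition(bot):
            return response
    return None  # not stuck; lower-priority controllers take over
```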

Traces used within RETRACE with low priority

Prolonged Retracing
- Explore the level like a human
  - Based on synthetic data: a lone human running around collecting items
- Collisions allowed when using RETRACE
  - Humans often bump walls with no problem
- If RETRACE fails (no trace available, or the trace gets the bot stuck), fall through to the PATH module (nav graph); see the sketch below
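
A tiny sketch of the fall-through described above (assumed control flow, not the authors' code): modules are tried in priority order, and the navigation-graph PATH module is the final fallback when RETRACE cannot take control.

```python
def choose_module(bot, modules):
    """modules are ordered by priority, e.g. [UNSTUCK, ..., RETRACE, PATH]."""
    for module in modules:
        if module.wants_control(bot):
            return module
    return modules[-1]  # PATH (nav graph) is the final fallback
```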

Use of Evolution
- An evolved neural network in the Battle Controller defines combat behavior

Constructive Neuroevolution
- Genetic algorithms + neural networks
- Build structure incrementally (complexification)
- Good at generating control policies
- Three basic mutations (no crossover used): perturb weight, add connection, add node (sketched below)
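
A minimal sketch of the three mutation operators, using a toy network genotype assumed only for illustration (not the authors' implementation):

```python
import random

class Network:
    """Toy genotype: node ids plus a dict of weighted links."""
    def __init__(self, num_in, num_out):
        self.nodes = list(range(num_in + num_out))
        self.links = {}                              # (src, dst) -> weight

def perturb_weight(net, sigma=0.3):
    if net.links:
        key = random.choice(list(net.links))
        net.links[key] += random.gauss(0.0, sigma)

def add_connection(net):
    src, dst = random.choice(net.nodes), random.choice(net.nodes)
    net.links.setdefault((src, dst), random.uniform(-1.0, 1.0))

def add_node(net):
    if not net.links:
        return
    src, dst = random.choice(list(net.links))
    weight = net.links.pop((src, dst))               # split an existing link
    new = max(net.nodes) + 1
    net.nodes.append(new)
    net.links[(src, new)] = 1.0                      # roughly preserve behavior
    net.links[(new, dst)] = weight
```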

Battle Controller Outputs
- 6 movement outputs: advance, retreat, strafe left, strafe right, move to nearest item, stand still
- Additional output: jump? (a decoding sketch follows)
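
One plausible way to decode these outputs, assumed here for illustration (the slides do not specify the decoding rule): treat the six movement outputs as mutually exclusive and threshold the jump output separately.

```python
MOVES = ["ADVANCE", "RETREAT", "STRAFE_LEFT", "STRAFE_RIGHT",
         "MOVE_TO_NEAREST_ITEM", "STAND_STILL"]

def decode_outputs(activations, jump_threshold=0.5):
    """activations: 7 network outputs (6 movement + jump)."""
    move = MOVES[max(range(6), key=lambda i: activations[i])]
    jump = activations[6] > jump_threshold
    return move, jump
```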

Battle Controller Inputs
- Pie slice sensors for enemies (sketched below)
- Ray traces for walls/level geometry
- Other miscellaneous sensors for current weapon properties, nearby item properties, etc.
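
A sketch of a pie-slice enemy sensor (an assumed formulation, not the authors' exact sensor set): the angle to each enemy relative to the bot's heading is bucketed into a fixed number of slices, and each slice reports the proximity of its closest enemy.

```python
import math

def pie_slice_sensors(bot_pos, bot_heading, enemy_positions, num_slices=8, max_dist=1500.0):
    """Each slice reports the proximity of the closest enemy inside it (0 = none)."""
    readings = [0.0] * num_slices
    for ex, ey in enemy_positions:
        dx, dy = ex - bot_pos[0], ey - bot_pos[1]
        angle = (math.atan2(dy, dx) - bot_heading) % (2.0 * math.pi)
        idx = min(int(angle / (2.0 * math.pi / num_slices)), num_slices - 1)
        proximity = max(0.0, 1.0 - math.hypot(dx, dy) / max_dist)
        readings[idx] = max(readings[idx], proximity)
    return readings
```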

Battle Controller Inputs (continued)
- Opponent movement sensors: is the opponent performing movement action X?
- Opponents are modeled as moving like the bot itself (an approximation)

Evolving Battle Controller
- Used NSGA-II with 3 objectives: damage dealt, damage received (negated), geometry collisions (negated)
- Evolved in DM-1on1-Albatross: a small level to encourage combat, against one native bot opponent
- High score favored in selection of the final network
- Final combat behavior highly constrained

Playing the judging game

Judging
- When to judge
  - More likely after more interaction
  - More likely as time runs out
  - Judge if a successful judgment is witnessed
- How to judge
  - Assume an equal number of humans and bots
  - Mostly judge probabilistically (a sketch follows)
  - Assume the target is human if it judged correctly
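
A heavily hedged sketch of this judging policy; the slides give no concrete probabilities or thresholds, so the numbers and attribute names below are illustrative only.

```python
import random

def should_judge(interaction_time, time_left, total_time):
    """More likely to judge after more interaction and as the match runs out."""
    p = min(1.0, interaction_time / 30.0)        # illustrative scaling
    p = max(p, 1.0 - time_left / total_time)
    return random.random() < p

def judge_target(target, prior_human=0.5):
    """Targets seen judging correctly are assumed human; otherwise guess."""
    if target.witnessed_correct_judgment:
        return "HUMAN"
    return "HUMAN" if random.random() < prior_human else "BOT"
```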

Results

Judges' Comments: Bot-like
- Too quick to fire initially after first sight
- Ability to stay locked onto a target while dodging
- Lots of jumping
- Knowledge of levels (where to go)
- Aggression with inferior weapons
- Aim is too good most of the time
- Crouching (native bots)

Judges' Comments: Human-like
- Spending time observing
- Running past an enemy without taking a shot
- Incredibly poor target tracking
- Stopping movement to shoot
- Tend to use the Judging Gun more

Insights
- Judges expect opponents of similar skill; our bot was too skilled
- Humans are fallible: would mimicry help?
- Human judges like to observe
- Playing the judging game: judges plan to judge in advance and expect bots to be like judges

Previous Insights
- Botprize 2008, 2009: no judging game
  - Judges set traps: follow me, camping, etc.
- Botprize 2010: judging game
  - Snap decisions were sometimes correct: how?
  - Still setting traps

What's Going On?
- Humans have always been more human. Why?! We're not getting better; we need better understanding.
- Native bots are better! Botprize 2010: 35.3982% humanness. CEC 2011:
- UT^2 results: Botprize 2008: 2/5 judges fooled; Botprize 2009: 1/5 fooled; Botprize 2010: 31.82% humanness; CEC 2011: 30.00% humanness

Future Competitions
- How does the judging game complicate things? Should human-like = judge-like?
- What is our goal?
  - Human-like players for games? But the native bots are already better!
  - Bots that deliberate/observe/ponder? But at the expense of playing skill.

Questions? Jacob Schrum Igor Karpov Risto Miikkulainen {schrum2,ikarpov,risto}@cs.utexas.edu

Auxiliary Slides

Human-like Bot Competition
- Goal: make humans think a bot is human
- Game: Unreal Tournament 2004
- Format same as Botprize: a judging game
- Multiple humans vs. multiple bots; all humans are both judges and players

Judging Game
- Special judging gun replaces the Link Gun
- Primary and alternate fire look identical: primary fire judges the target a bot, alternate fire judges it human
- Correctly judge an opponent: kills the opponent, +10 frags
- Incorrectly judge an opponent: the shooter dies, -10 frags
- Bots can use this gun! (scoring sketched below)
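
A small sketch of the scoring rules listed above (a hypothetical helper, not competition code), shown only to make the payoff structure explicit:

```python
def resolve_judgment(shooter, target, judged_as_bot):
    """Correct judgment kills the target (+10 frags); incorrect kills the shooter (-10)."""
    correct = (judged_as_bot == target.is_bot)
    if correct:
        target.die()
        shooter.frags += 10
    else:
        shooter.die()
        shooter.frags -= 10
    return correct
```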

Action Filtering
- Final combat behavior highly constrained
- Forced lower accuracy for certain weapons
- Forced to stand still sometimes (sniping, not threatened, high ground)
- Prevented from going to unwanted items
- Prevented from strafing/retreating into walls
- Prevented from jumping near walls/opponents
- Prevented from jumping while still
- Etc.
- Evolution constrained to look human (a filtering sketch follows)
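
A sketch of filtering the evolved controller's proposed actions so they look more human. The checks follow the bullets above, but the predicates and action fields are assumed names, not the authors' exact filter implementation.

```python
def filter_action(bot, action):
    """Suppress or override evolved actions that tend to look bot-like."""
    if action.jump and (bot.near_wall() or bot.near_opponent() or bot.is_still()):
        action.jump = False                       # no jumping near walls/opponents or while still
    if action.move in ("STRAFE_LEFT", "STRAFE_RIGHT", "RETREAT") and bot.would_hit_wall(action.move):
        action.move = "STAND_STILL"               # no strafing/retreating into walls
    if bot.sniping() and not bot.threatened() and bot.has_high_ground():
        action.move = "STAND_STILL"               # forced to stand still in safe sniping spots
    return action
```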

Multiobjective Optimization
- Pareto dominance (assuming maximization): v dominates u iff v_i ≥ u_i for all objectives i, and v_i > u_i for at least one i
- Want nondominated points
- NSGA-II used in this work (a dominance-check sketch follows)
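
A direct sketch of the Pareto dominance test stated above, as used conceptually by NSGA-II when sorting the population; the example fitness vectors are illustrative only.

```python
def dominates(x, y):
    """True iff fitness vector x Pareto-dominates y (all objectives maximized)."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))

# Example with the three objectives above: (damage dealt, -damage received, -collisions)
assert dominates((120, -30, -2), (100, -30, -5))
assert not dominates((120, -30, -5), (100, -20, -2))
```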

Future Work
- Human Traces
  - Generalize to unseen levels
  - Make intelligent decisions about when to jump
  - Use to improve following (via supervised learning or evolution)
- Evolution
  - Apply to other control modules
  - Apply to selection between modules
  - Reduce reliance on scripted behavior

Future Work
- Theory of Mind
  - Planned behavior transitions (e.g., a chasing bot expects to enter combat mode)
  - Mimicry: expectation of similarity; match the opponent's level of dodging, aggressiveness, ammo wasting, etc.
  - Establish communication
- Deliberation
  - Take time to acknowledge opponents and aim
  - Observe, think about judging