
SteerBench: a benchmark suite for evaluating steering behaviors. Authors: Singh, Kapadia, Faloutsos, Reinman. Presented by: Jessica Siewert.




1 SteerBench: a benchmark suite for evaluating steering behaviors Authors: Singh, Kapadia, Faloutsos, Reinman Presented by: Jessica Siewert

2 Content of presentation
– Introduction
– Previous work
– The Method
– Assessment

3 Introduction
– Context and motivation
– Steering of agents
– Objective comparison
– Standard?
– Test cases and scoring, user evaluation
– Metric scoring
– Demonstration

4 Introduction – Previous work
– There was not really anything comparable yet (as of November 2008)

5 Introduction – Promises
– Evaluate objectively
– Help researchers
– Working towards a standard for evaluation
– Take into account: cognitive decisions and situation-specific aspects

6 The test cases
– Simple validation scenarios
– Basic one-on-one interactions
– Agent interactions including obstacles
– Group interactions
– Large-scale scenarios
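Each test case essentially specifies, per agent, a start position, an initial direction, and a goal (the same three inputs slide 15 questions). A minimal sketch of how such a scenario spec could be represented; the class names and field layout here are illustrative assumptions, not SteerBench's actual file format:

```python
from dataclasses import dataclass, field

@dataclass
class AgentSpec:
    position: tuple   # (x, y) start position
    direction: tuple  # initial facing direction, as a unit vector
    goal: tuple       # (x, y) target the agent must reach

@dataclass
class TestCase:
    name: str
    agents: list
    obstacles: list = field(default_factory=list)  # e.g. axis-aligned boxes

# A basic one-on-one interaction: two agents walking head-on
# toward each other along the x-axis.
head_on = TestCase(
    name="oncoming-agents",
    agents=[
        AgentSpec(position=(-5, 0), direction=(1, 0), goal=(5, 0)),
        AgentSpec(position=(5, 0), direction=(-1, 0), goal=(-5, 0)),
    ],
)
print(head_on.name, len(head_on.agents))
```

The layered categories on the slide then correspond to how many agents and obstacles a `TestCase` contains, from a single agent up to large-scale crowds.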

7 The user’s opinion
– Rank on overall score across test cases (comparing)
– Rank algorithms based on a single case, or on one agent’s behavior
– Pass/fail
– Visually inspect results
– Examine detailed metrics of the performance

8 The metric
– Number of collisions
– Time efficiency
– Effort efficiency
– Penalties?
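These three metrics can be computed from a sampled agent trajectory. A minimal sketch, with the caveat that the trajectory layout and the effort approximation (path length traveled) are assumptions for illustration, not the paper's exact formulation:

```python
import math

def steering_metrics(positions, dt, collision_events):
    """Compute benchmark-style metrics from a sampled 2D trajectory.

    positions        -- list of (x, y) samples, one per time step
    dt               -- seconds between samples
    collision_events -- number of collisions detected during the run
    """
    # Time efficiency: total time spent reaching the goal.
    total_time = (len(positions) - 1) * dt

    # Effort efficiency, approximated here as total distance traveled:
    # a straight path to the goal would minimize this.
    effort = sum(
        math.dist(a, b) for a, b in zip(positions, positions[1:])
    )

    return {
        "collisions": collision_events,
        "total_time": total_time,
        "effort": effort,
    }

# Example: an agent that detours slightly around an obstacle.
path = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.5), (3.0, 0.0)]
m = steering_metrics(path, dt=0.5, collision_events=0)
print(m["total_time"])  # 1.5 seconds for 3 steps
```

The slide's "Penalties?" question would then amount to how collisions and constraint violations are folded into a single composite score alongside time and effort.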

9 Movies…

10 Developments since then
– Ioannis Karamouzas, Peter Heil, Pascal Beek, Mark H. Overmars, "A Predictive Collision Avoidance Model for Pedestrian Simulation", Proceedings of the 2nd International Workshop on Motion in Games, November 21-24, 2009, Zeist, The Netherlands
– Shawn Singh, Mubbasir Kapadia, Billy Hewlett, Glenn Reinman, Petros Faloutsos, "A modular framework for adaptive agent-based steering", Symposium on Interactive 3D Graphics and Games, February 18-20, 2011, San Francisco, California
– Suiping Zhou, Dan Chen, Wentong Cai, Linbo Luo, Malcolm Yoke Hean Low, Feng Tian, Victor Su-Han Tay, Darren Wee Sze Ong, Benjamin D. Hamilton, "Crowd modeling and simulation technologies", ACM Transactions on Modeling and Computer Simulation (TOMACS), v.20 n.4, p.1-35, October 2010

11 Experiments – Claim recall
– Evaluate objectively
– Help researchers
– Working towards a standard for evaluation

12 Assessment – good things
– All the measured variables seem logical
– (Too?) extensive variable set, with the option to expand
– Customized evaluation
– Cheating not allowed: collision penalties, fail constraint, goal constraint
– Layered set of test cases

13 Assessment
– The measurements all seem to be approximately the same
– Does the user test make the difference? Who are these users?
– "Examine" and "inspect" are vague terms
– What about the objective of objectivity?

14 Assessment
– How good is it to be general? How general or specific is this method?
– Time efficiency vs. effort efficiency
– Should it be blind to the algorithm itself?
– Penalties, fail constraints, and goal constraints are not specified!

15 Assessment – scoring (1/2)
– The test cases are clearly specified
– But it is not specified HOW a good agent SHOULD react, though the authors say such a specification exists
– How can you derive cognitive decisions from only a position, a direction, and a goal?

16 Assessment – scoring (2/2)
– "Scoring is not intended to be a proof of an algorithm’s effectiveness."
– How do you interpret scores, and who wins? "B is slightly better on average, but A has the highest scores."
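The ambiguity the slide points at is easy to reproduce: two aggregation rules over the same per-case scores can crown different winners. A toy illustration with invented numbers, taking higher to mean better as the quote implies:

```python
# Per-test-case scores for two steering algorithms (invented numbers).
scores_a = [9.5, 4.0, 4.0, 4.0]   # one standout case, weak elsewhere
scores_b = [6.0, 6.0, 6.0, 6.0]   # consistently decent

avg_a = sum(scores_a) / len(scores_a)   # 5.375
avg_b = sum(scores_b) / len(scores_b)   # 6.0

print("best single case:", "A" if max(scores_a) > max(scores_b) else "B")
print("best on average: ", "A" if avg_a > avg_b else "B")
```

Without the benchmark stating which aggregation matters, both "A wins" and "B wins" are defensible readings of the same data.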

17 Assessment – final questions
– Can this method become a standard?
– What if someone claims to be so innovative that this standard does not apply to them?
– A nice first try, though!


