Computer Graphics Group Department of Computer Science Sheffield University, UK Paul Richmond Flexible.

1 Computer Graphics Group, Department of Computer Science, Sheffield University, UK
  Paul Richmond, paul@dcs.shef.ac.uk, http://www.dcs.shef.ac.uk/~paul/
  Flexible Agent Based Simulation for Pedestrian Modelling on GPU Hardware
  Paul Richmond, The Department of Computer Science, University of Sheffield, UK
  paul@dcs.shef.ac.uk, www.dcs.shef.ac.uk/~paul
  Richmond Paul, Coakley Simon, Romano Daniela, "Cellular Level Agent Based Modelling on the Graphics Processing Unit (with FLAME GPU)", To appear in the special issue: "Parallel and Ubiquitous methods and tools in Systems Biology" of the international journal: Briefings in Bioinformatics 2010
  Richmond Paul, Coakley Simon, Romano Daniela (2009), "Cellular Level Agent Based Modelling on the Graphics Processing Unit", Proc. of HiBi09 - High Performance Computational Systems Biology, 14-16 October 2009, Trento, Italy
  Richmond Paul, Coakley Simon, Romano Daniela (2009), "A High Performance Agent Based Modelling Framework on Graphics Card Hardware with CUDA", Proc. of 8th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2009), May, 10–15, 2009, Budapest, Hungary
  Richmond Paul, Romano Daniela (2008), "A High Performance Framework For Agent Based Pedestrian Dynamics On GPU Hardware", Proceedings of EUROSIS ESM 2008 (European Simulation and Modelling), October 27-29, 2008, Universite du Havre, Le Havre, France

2 Introduction and Scope
  Agent Based Modelling (ABM)
    Emergence of complex natural behaviour from simple rules
    Individuals are agents with memory
    Update own memory by considering neighbours
  Of Pedestrian Behaviour
    Continuous space mobile agents
    Discrete time steps
  On the GPU
    Why?: Performance and real time visualisation
    Aim is for Flexibility: Want to be able to harness the GPU's power without modellers having to understand GPU programming
    Not continuum based (Treuille 06) or using mobile discrete agents (D’Souza 07)

3 FLAME and FLAME GPU
  What is FLAME (and what FLAME is not)?
    Flexible Large-scale Agent Modelling Environment
    XML Model specification based on the X-Machine (state based agents)
    Template system for generating simulation code
  Why extend FLAME to the GPU?
    Complete modelling environment (beyond that of simple swarms)
    Formal and portable specification technique based on the X-Machine
    Many existing models to be used for benchmarking
  What is FLAME GPU?
    Data parallel implementation of FLAME using CUDA (with real time visualisation)
    Cost effective solution for high performance ABM
    XSLT Driven Templates (rather than the XParser)
  [Diagram labels: XML Schemas, XML Model File, Scripted Behaviour, XSLT Simulation Templates, XSLT Processor, Simulation Code, Simulation Program]

4 Programming the GPU
  Purpose of the GPU
    Data parallel device for operation on streams of data
  Programming for General Purpose Use
    Graphics API Technique: Not ideal
    High Level Alternatives
      Brook GPU (Buck 04): SIMD stream programming extension for C
      Sh (McCool 02): C++ language with a compiler for GPU backends
    Hardware Specific
      Stream SDK: Low level ATI specific native instruction set and high level support with Brook+
      CUDA: NVIDIA programming for GPU using a compiler and a C syntax with extensions
      OpenCL: New and growing standard, limited support
  CUDA
    GPU is a coprocessor to the CPU (with its own global memory)
    Many lightweight parallel threads grouped into regular sized blocks (execution units)
    Threads in the same execution unit perform the same instructions (SIMD)

5 Mapping Agent Functions to the GPU
  __FLAME_GPU_FUNC__ int input_function(
      xmachine_memory_pedestrian* xmemory,
      xmachine_message_pedestrian_location_list* location_messages)
  {
      /* Get the first message */
      xmachine_message_pedestrian_location* location_message =
          get_first_pedestrian_location_message(location_messages);

      /* Repeat until there are no more messages */
      while (location_message)
      {
          /* Process the message */
          if (distance_check(xmemory, location_message))
          {
              updateSteerVelocity(xmemory, location_message);
          }

          /* Get the next message */
          location_message = get_next_pedestrian_location_message(location_message, location_messages);
      }

      /* Update any other xmemory variables */
      xmemory->x += xmemory->vel_x*TIME_STEP;
      ...

      /* Returning 0 keeps the agent alive; a non-zero value removes it */
      return 0;
  }
  Each transition function is wrapped by a GPU kernel
  Each agent is a thread performing the function
  Functions can input and output messages
  Functions can output new agents (agent birth)
  An agent can be removed (agent death) by returning a non-zero value
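
The wrapping described on this slide can be sketched on the CPU in plain C, with a loop standing in for the GPU kernel and one iteration per agent thread. This is only an illustration under stated assumptions: `agent`, `agent_fn`, `move_agent`, `run_agent_function` and the `TIME_STEP` value are hypothetical names chosen here, not the code FLAME GPU generates.

```c
#include <assert.h>
#include <stddef.h>

#define TIME_STEP 0.1f  /* assumed step size, for illustration only */

/* Hypothetical agent memory; not the generated FLAME GPU type. */
typedef struct {
    float x, vel_x;
} agent;

/* An agent function returns 0 to keep the agent, non-zero to remove it. */
typedef int (*agent_fn)(agent *a);

/* Example function: move the agent and flag it dead once it passes x = 10. */
int move_agent(agent *a) {
    a->x += a->vel_x * TIME_STEP;
    return a->x > 10.0f;
}

/* CPU stand-in for the kernel wrapper: each loop iteration plays the role of
   one GPU thread. Returns the number of agents flagged for removal. */
size_t run_agent_function(agent_fn fn, agent *agents, int *dead, size_t n) {
    assert(fn != NULL && agents != NULL && dead != NULL);
    size_t deaths = 0;
    for (size_t i = 0; i < n; ++i) {
        dead[i] = fn(&agents[i]) != 0;
        deaths += (size_t)dead[i];
    }
    return deaths;
}
```

The `dead` flags produced here are the kind of per-agent marker that a later compaction pass could use to drop removed agents from the list.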

6 Implementation
  Techniques used within FLAME GPU
  Avoiding diversity across agents in execution blocks
    Agents are stored and processed in state lists to avoid conditional branching
    Sparse lists are compacted during births, filters and optional message outputs
  Ensure data access is performed efficiently
    Lists are stored using a Structure of Arrays (SoA) rather than an Array of Structures (AoS)
    AoS:
      typedef struct agent{ float x; float y; } xm_memory_agent_list [N];
    SoA:
      typedef struct agent_list{ float x[N]; float y[N]; } xm_memory_agent_list;
  [Figure: list elements indexed 0, 1, 2, 3 … N]
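
A minimal C sketch of the two ideas on this slide, the SoA layout and the compaction of a sparse list. It assumes a small fixed capacity `N` and hypothetical names (`agent_list`, `compact_agents`, `keep`). The scan and scatter run sequentially here; on the GPU each would be a parallel pass, and this is not presented as the FLAME GPU implementation.

```c
#include <assert.h>
#include <stddef.h>

#define N 8  /* assumed list capacity, for illustration only */

/* Structure of Arrays: one contiguous array per agent variable, so that
   neighbouring threads read neighbouring addresses. */
typedef struct {
    float x[N];
    float y[N];
} agent_list;

/* Compact a sparse list: keep[i] != 0 marks a surviving agent.
   An exclusive prefix sum over the flags gives each survivor its output slot.
   Returns the number of survivors written to `out`. */
size_t compact_agents(const agent_list *in, const int *keep, size_t n, agent_list *out) {
    assert(n <= N);
    size_t slot[N];
    size_t running = 0;
    for (size_t i = 0; i < n; ++i) {   /* exclusive scan over the flags */
        slot[i] = running;
        running += keep[i] ? 1u : 0u;
    }
    for (size_t i = 0; i < n; ++i) {   /* scatter survivors to their slots */
        if (keep[i]) {
            out->x[slot[i]] = in->x[i];
            out->y[slot[i]] = in->y[i];
        }
    }
    return running;
}
```

Because every variable lives in its own array, the scatter touches `x` and `y` independently, which is the access pattern the SoA layout is meant to favour.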

7 Message Communication
  Brute Force Communication
    Tile blocks of message lists into shared memory to reduce global memory access (Nyland 07)
    Use of shared memory has roughly an order of magnitude performance impact
  Spatially Partitioned Communication
    Split the environment into a uniform grid based on the message radius
    Each agent reads all messages from each neighbouring partition
    Requires the use of a parallel sort and a boundary matrix
    Roughly 2/3 of messages are outside the message radius, but much better than O(n²)
  Discrete Agent Message Communication (CA)
    Large block of messages loaded into shared memory
    Or use the texture cache to minimise global reads
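
The spatially partitioned scheme can be sketched in C as follows. This is a sequential illustration, assuming a 4 x 4 grid, positions inside the grid, and hypothetical names (`message`, `cell_of`, `build_partitions`, `count_neighbours`). A counting sort stands in for the parallel sort, and the `start` array plays the role of the boundary matrix.

```c
#include <assert.h>
#include <stddef.h>

#define GRID_DIM 4                       /* assumed: 4 x 4 partitions */
#define NUM_CELLS (GRID_DIM * GRID_DIM)
#define MAX_MSGS 64                      /* assumed capacity */

typedef struct { float x, y; } message;  /* hypothetical location message */

/* Partition index of a position; the cell width equals the message radius. */
int cell_of(float x, float y, float radius) {
    int cx = (int)(x / radius), cy = (int)(y / radius);
    if (cx < 0) cx = 0;
    if (cy < 0) cy = 0;
    if (cx >= GRID_DIM) cx = GRID_DIM - 1;
    if (cy >= GRID_DIM) cy = GRID_DIM - 1;
    return cy * GRID_DIM + cx;
}

/* Sort messages by cell. start[c]..start[c + 1] bounds cell c in `sorted`. */
void build_partitions(const message *msgs, size_t n, float radius,
                      message *sorted, size_t start[NUM_CELLS + 1]) {
    assert(n <= MAX_MSGS);
    size_t next[NUM_CELLS];
    for (int c = 0; c <= NUM_CELLS; ++c) start[c] = 0;
    for (size_t i = 0; i < n; ++i)
        start[cell_of(msgs[i].x, msgs[i].y, radius) + 1]++;
    for (int c = 0; c < NUM_CELLS; ++c) start[c + 1] += start[c];
    for (int c = 0; c < NUM_CELLS; ++c) next[c] = start[c];
    for (size_t i = 0; i < n; ++i)
        sorted[next[cell_of(msgs[i].x, msgs[i].y, radius)]++] = msgs[i];
}

/* Count messages within `radius` of (x, y), reading only the 3 x 3 block of
   partitions around the agent's own partition. */
size_t count_neighbours(float x, float y, float radius,
                        const message *sorted, const size_t start[NUM_CELLS + 1]) {
    int home = cell_of(x, y, radius);
    int hx = home % GRID_DIM, hy = home / GRID_DIM;
    size_t count = 0;
    for (int cy = hy - 1; cy <= hy + 1; ++cy) {
        for (int cx = hx - 1; cx <= hx + 1; ++cx) {
            if (cx < 0 || cy < 0 || cx >= GRID_DIM || cy >= GRID_DIM) continue;
            int c = cy * GRID_DIM + cx;
            for (size_t i = start[c]; i < start[c + 1]; ++i) {
                float dx = sorted[i].x - x, dy = sorted[i].y - y;
                if (dx * dx + dy * dy <= radius * radius) ++count;
            }
        }
    }
    return count;
}
```

Each agent inspects at most nine partitions instead of the whole message list, which is where the saving over the brute force approach comes from. The distance test inside the loop is still needed because many messages in those partitions lie outside the radius.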

8 A Pedestrian Model Example
  Inter agent interaction (using spatially partitioned messaging) is based on a hybrid of Reynolds and Social Forces
  Social repulsion force
    Navigates pedestrians to areas of low concentration
    Limited forward vision
      Preference over agents in direct line of sight
      Scaled depending on distance to neighbour
  Close Range Interaction Force
    Very short range with no limited vision
    Acts as collision avoidance

9 Visualisation and Animation Technique
  Agent data is already on the GPU for visualisation
    Need to draw a copy of the model for each agent in the simulation (instancing)
    The model geometry can be stored on the GPU to reduce draw calls
      Only requires a single call per agent
      Each agent is displaced and orientated
  Use Levels of Detail to avoid rendering highly detailed models for every agent
    On the GPU so must remain parallel
    Sort the agents by LOD level and render in groups
  Animation - Very simple
    Interpolate between 2 key frames
    Rotate the model depending on velocity direction
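
The key frame interpolation and the LOD grouping can be sketched in C as below. The three detail levels, the distance thresholds and all function names are assumptions for illustration. The grouping uses a sequential counting sort where the slide implies a parallel sort on the GPU.

```c
#include <assert.h>
#include <stddef.h>

#define LOD_LEVELS 3  /* assumed number of detail levels */

/* Linear blend of two key frames; t in [0, 1] is the position between them. */
void interpolate_keyframes(const float *frame_a, const float *frame_b,
                           float t, float *out, size_t n) {
    assert(t >= 0.0f && t <= 1.0f);
    for (size_t i = 0; i < n; ++i)
        out[i] = frame_a[i] + t * (frame_b[i] - frame_a[i]);
}

/* ASSUMED LOD rule: level 0 (most detail) nearest the camera, with
   hypothetical distance thresholds. */
int lod_level(float camera_distance) {
    if (camera_distance < 10.0f) return 0;
    if (camera_distance < 50.0f) return 1;
    return 2;
}

/* Group agent indices by LOD level so each level can be drawn as one batch.
   first[l]..first[l + 1] bounds level l in `order`. */
void group_by_lod(const float *camera_distance, size_t n,
                  size_t *order, size_t first[LOD_LEVELS + 1]) {
    size_t next[LOD_LEVELS];
    for (int l = 0; l <= LOD_LEVELS; ++l) first[l] = 0;
    for (size_t i = 0; i < n; ++i) first[lod_level(camera_distance[i]) + 1]++;
    for (int l = 0; l < LOD_LEVELS; ++l) first[l + 1] += first[l];
    for (int l = 0; l < LOD_LEVELS; ++l) next[l] = first[l];
    for (size_t i = 0; i < n; ++i)
        order[next[lod_level(camera_distance[i])]++] = i;
}
```

After grouping, a renderer could draw the agents in `order[first[l]]` to `order[first[l + 1] - 1]` with the mesh for level `l`, so the detail level changes only between batches.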

10 Demo
  Agents coloured by LOD

11 Performance Results
  Observables
    Performance dependent on communication radius
      Larger communication radius = fewer partitions = more agents considered per update
    LOD technique has a cost
      Don’t use for small populations
    Very large population sizes possible in real time

12 Environment Collision Avoidance
  Discrete grid of agents to encode the environment
  Static Discrete Agents
    Repulsive forces direct agents away from walls
    Automatically generated in advance
  Continuous Pedestrian Agents read discrete messages
    Apply a collision force
    Displace pedestrian agents by height value

13 Long Range Navigation
  Many agents follow similar paths so a global solution is used
  Fluid flow route for each path through the environment
    Calculated offline in advance by backtracking from the exit point
    Smooth movement around obstacles
  Discrete Agents also responsible for pedestrian birth allocation
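
The idea of backtracking from the exit can be illustrated with a breadth-first search over a small grid. This is a stand-in technique chosen here because the slide does not say how the flow routes were actually computed. The grid size and the names `distance_to_exit` and `flow_direction` are hypothetical.

```c
#include <assert.h>

#define GRID_W 5        /* assumed grid size, for illustration only */
#define GRID_H 4
#define UNREACHED (-1)

static const int step_x[4] = {1, -1, 0, 0};
static const int step_y[4] = {0, 0, 1, -1};

/* Fill dist[] with the number of 4-connected steps from each free cell to the
   exit, searching outward from the exit. wall[i] != 0 blocks a cell. */
void distance_to_exit(const int wall[GRID_W * GRID_H], int exit_x, int exit_y,
                      int dist[GRID_W * GRID_H]) {
    int queue[GRID_W * GRID_H];
    int head = 0, tail = 0;
    assert(exit_x >= 0 && exit_x < GRID_W && exit_y >= 0 && exit_y < GRID_H);
    for (int i = 0; i < GRID_W * GRID_H; ++i) dist[i] = UNREACHED;
    dist[exit_y * GRID_W + exit_x] = 0;
    queue[tail++] = exit_y * GRID_W + exit_x;
    while (head < tail) {
        int cell = queue[head++];
        int cx = cell % GRID_W, cy = cell / GRID_W;
        for (int d = 0; d < 4; ++d) {
            int nx = cx + step_x[d], ny = cy + step_y[d];
            if (nx < 0 || ny < 0 || nx >= GRID_W || ny >= GRID_H) continue;
            int next = ny * GRID_W + nx;
            if (wall[next] || dist[next] != UNREACHED) continue;
            dist[next] = dist[cell] + 1;
            queue[tail++] = next;
        }
    }
}

/* Flow direction for a cell: the step towards the neighbour closest to the
   exit. Writes (0, 0) for the exit, walls and unreachable cells. */
void flow_direction(const int dist[GRID_W * GRID_H], int x, int y,
                    int *dir_x, int *dir_y) {
    int best = dist[y * GRID_W + x];
    *dir_x = 0;
    *dir_y = 0;
    if (best <= 0) return;
    for (int d = 0; d < 4; ++d) {
        int nx = x + step_x[d], ny = y + step_y[d];
        if (nx < 0 || ny < 0 || nx >= GRID_W || ny >= GRID_H) continue;
        int nd = dist[ny * GRID_W + nx];
        if (nd != UNREACHED && nd < best) {
            best = nd;
            *dir_x = step_x[d];
            *dir_y = step_y[d];
        }
    }
}
```

Since the field is computed once in advance, each pedestrian only needs to look up the direction stored for its current cell at run time. The grid-aligned directions produced here would need smoothing to give the smooth movement around obstacles that the slide describes.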


15 Conclusions and Future Work
  Summary
    Flexible agent architecture for the GPU suitable for force models
    Easily extendible
    Massive performance/cost benefits
  Scope for Future Work
    Multi GPU
      Would enable extremely large populations of systems to be simulated
      For spatial partitioning only partition boundaries would need to be communicated between GPU devices
    Improve pedestrian models
      Improved collision detection (more accurate)
      Long range individual path planning without flow grids
      Physically accurate animation and movement
    Much larger models (need appropriate scenarios)

16 References
  A. Treuille, S. Cooper, and Z. Popović, "Continuum crowds," in SIGGRAPH '06: ACM SIGGRAPH 2006 Papers. New York, NY, USA: ACM, 2006, pp. 1160-1168.
  R. M. D’Souza, M. Lysenko, and K. Rahmani. Sugarscape on steroids: simulating over a million agents at interactive rates. In Proceedings of Agent2007, 2007.
  Samuel Eilenberg. Automata, Languages, and Machines. Academic Press, Inc., Orlando, FL, USA, 1974.
  T. Balanescu, A. J. Cowling, H. Georgescu, M. Gheorghe, M. Holcombe, and C. Vertan. Communicating stream x-machines systems are no more than x-machines. Journal of Universal Computer Science, 5(9):494–507, 1999. http://www.jucs.org/jucs_5_9/communicating_stream_x_machines
  Ian Buck, Tim Foley, Daniel Horn, Jeremy Sugerman, Kayvon Fatahalian, Mike Houston, and Pat Hanrahan. Brook for GPUs: stream computing on graphics hardware. ACM Trans. Graph., 23(3):777–786, 2004.
  Michael D. McCool, Zheng Qin, and Tiberiu S. Popa. Shader metaprogramming. In HWWS ’02: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware, pages 57–68, Aire-la-Ville, Switzerland, 2002. Eurographics Association.
  Lars Nyland, Mark Harris, and Jan Prins. Fast n-body simulation with CUDA. In Hubert Nguyen, editor, GPU Gems 3, chapter 31. Addison Wesley Professional, August 2007.

