Presentation on theme: "Multi-core/Cell Game Engine Design"— Presentation transcript:
1 Multi-core/Cell Game Engine Design
Kalloc Studios
Henry Yu, President & CEO
2 Credentials
Worked in the video game industry for nearly 20 years
Lead Programmer for Sierra-Online
Director of Technology for Activision
Technical Director for Electronic Arts/Westwood
Software Director for Angel Studios/Rockstar
Software Engineer Director for THQ
Founded Kalloc Studios in 2006
4 Topics of discussion
Game Industry Trends
Hardware capabilities comparison
System Architecture
Kalloc Studios' mission
5 Game Industry Trends
Consumers demand more realistic visuals, physics interactions and A.I. behaviors
More game content to give a full, immersive experience
Computer hardware has been evolving toward multi-core designs
Faster iteration time to promote rapid game development
6 Hardware comparison of the PlayStation 3 to the Xbox 360
Multi-core vs. Cell-based architecture, with different synchronization models
Hard to utilize the SPU due to its small amount of local memory
DMA transfers are difficult to structure
Slower RSX graphics performance
Memory limitations due to its non-unified memory architecture
Slower Blu-ray ROM data throughput
7 Fundamental System Design
Architecture differences between Xbox 360 and PlayStation 3
8 Current Kalloc Engine Capabilities
Cross platform for Xbox 360, PlayStation 3 and PC
720p and 1080p high definition support
400 or more fully skinned characters with ~5000 polygons and 91 bones (4 weight influences)
50 or more vehicles with ~3000 polygons
Normal mapped characters
All characters with facial animations
Overall polygon throughput of ~24 million polygons per second
Full collision detection with dynamic objects such as characters and vehicles
NPCs driving and responding to collisions
NPCs have reactive behaviors toward the player's actions
9 System Architecture
Local Store and Data Streaming Model
Multi-Threaded Architecture
Graphics Subsystem
Animation System
Physics Components
Asset Pipeline via Live Update System
10 Local Store and Data Streaming Model
The architecture works like a set of arrays: individual game objects, physics objects, render objects, etc., are each allocated in a contiguous chunk of memory reserved for that type of object.
The contiguous chunk of memory can then be transferred by DMA to a PS3 SPU, or even cached in local memory on the PS3.
Keeping objects in contiguous memory is an optimization for the PS3 that also yields performance increases on the Xbox 360, because cache misses are reduced.
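As a rough illustration of the per-type contiguous allocation described on this slide, the sketch below keeps all objects of one type in a single array and exposes the (pointer, byte length) pair that a bulk transfer would need. The names `ObjectPool`, `RenderObject` and `totalRadius` are hypothetical and not taken from the Kalloc engine, and no real DMA call is made.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical render-object record: plain data, so a block of them
// can be copied as raw bytes.
struct RenderObject {
    float position[3];
    float radius;
};

// Fixed-capacity pool: every object of type T lives in one contiguous array.
template <typename T>
class ObjectPool {
public:
    explicit ObjectPool(std::size_t capacity) : capacity_(capacity) {
        storage_.reserve(capacity);
    }

    // Returns the index of the new object. The pool never grows past its
    // reserved capacity, so existing objects are never relocated.
    std::size_t allocate(const T& value) {
        assert(storage_.size() < capacity_);
        storage_.push_back(value);
        return storage_.size() - 1;
    }

    T& at(std::size_t index) { return storage_[index]; }
    std::size_t size() const { return storage_.size(); }

    // Start address and byte length of the live objects: the kind of
    // (pointer, size) pair a bulk transfer to a local store would be given.
    const T* data() const { return storage_.data(); }
    std::size_t byteSize() const { return storage_.size() * sizeof(T); }

private:
    std::size_t capacity_;
    std::vector<T> storage_;
};

// Linear sweep over the pool: neighbors in the loop are neighbors in memory,
// which is what keeps cache misses low on a conventional CPU.
inline float totalRadius(const ObjectPool<RenderObject>& pool) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < pool.size(); ++i) sum += pool.data()[i].radius;
    return sum;
}
```

Handing out indices rather than pointers is one possible design choice here: an index stays meaningful after the block has been copied to another address space, whereas a raw pointer would not.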
12 Multi-threaded Architecture
Thread Based Model and SPU Thread Server implementation for task based architecture
Multi-Threaded Scheduler manages both blocking and non-blocking processes
Multi-stage implementation for data synchronization
N + 1 frame GPU running concurrently with core CPU and SPUs
13 Functional Based multi-threaded Architecture
Functional Based architecture associates one thread with each subsystem. All subsystems are processed simultaneously.
Advantages: Very easy to implement, since tasks do not have to be divided and dependencies do not have to be resolved. Suitable for middleware solutions.
Disadvantages: Uneven distribution of processing power, since one slower task can hold up the rest of the processors, leaving them idle. Mutexes or some other synchronization protection must be used to resolve data dependencies.
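A minimal sketch of the functional model, assuming standard C++ threads rather than any console-specific API: each subsystem gets its own thread, and anything shared between subsystems is guarded by a mutex. `FrameState`, `runFrameFunctional` and `recordUpdate` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical state shared by more than one subsystem.
struct FrameState {
    std::mutex lock;
    int updatesApplied = 0;
};

using Subsystem = std::function<void(FrameState&)>;

// Runs every subsystem on its own thread and waits for all of them.
// The frame takes as long as the slowest subsystem; threads that finish
// early sit idle until the joins complete.
inline void runFrameFunctional(const std::vector<Subsystem>& subsystems,
                               FrameState& state) {
    std::vector<std::thread> threads;
    threads.reserve(subsystems.size());
    for (const Subsystem& subsystem : subsystems)
        threads.emplace_back([&subsystem, &state] { subsystem(state); });
    for (std::thread& t : threads) t.join();
}

// Hypothetical subsystem body: because the subsystems run at the same time,
// every access to shared data has to take the mutex.
inline void recordUpdate(FrameState& state) {
    std::lock_guard<std::mutex> guard(state.lock);
    ++state.updatesApplied;
}
```

Both disadvantages listed on the slide are visible in this shape: the joins wait on the slowest thread, and the lock is required for any data touched by two subsystems.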
14 Task Based multi-threaded Architecture
Task Based architecture uses all threads to process one subsystem at a time. Subsystems are processed in a given order.
Large tasks must be divided into smaller tasks so that they can be distributed across all processors.
Advantages: Extremely even balance of processor power. Virtually eliminates the problem of waiting for the slowest tasks. Because subsystems are processed in a fixed order, many dependencies are removed, allowing data access without mutex locking.
Disadvantages: Difficult to implement, since all tasks must be divided and their dependencies resolved. Hard to use middleware solutions, since this architecture is relatively new.
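The task based model can be sketched as a parallel loop: one subsystem's work is cut into small chunks, and every worker thread pulls chunks from a shared counter until none are left. This is an illustration using standard C++ threads, not the engine's scheduler or its SPU thread server; `parallelFor`, `runFrameTaskBased` and the chunk size of 64 are assumptions.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Splits [0, itemCount) into chunks of chunkSize and lets workerCount threads
// claim chunks from a shared atomic counter until the range is exhausted.
inline void parallelFor(std::size_t itemCount, std::size_t chunkSize,
                        unsigned workerCount,
                        const std::function<void(std::size_t, std::size_t)>& task) {
    assert(chunkSize > 0 && workerCount > 0);
    std::atomic<std::size_t> nextChunk{0};
    auto worker = [&] {
        for (;;) {
            const std::size_t begin = nextChunk.fetch_add(1) * chunkSize;
            if (begin >= itemCount) return;
            task(begin, std::min(begin + chunkSize, itemCount));
        }
    };
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < workerCount; ++i) threads.emplace_back(worker);
    // Joining acts as the barrier between one subsystem and the next.
    for (std::thread& t : threads) t.join();
}

// Subsystems run one after another, and each one uses every worker.
inline void runFrameTaskBased(std::vector<float>& positions,
                              const std::vector<float>& velocities,
                              float dt, unsigned workers) {
    assert(positions.size() == velocities.size());
    // Stage 1: integration. Each chunk writes a disjoint index range,
    // so no mutex is needed.
    parallelFor(positions.size(), 64, workers,
                [&](std::size_t begin, std::size_t end) {
                    for (std::size_t i = begin; i < end; ++i)
                        positions[i] += velocities[i] * dt;
                });
    // A later stage placed here could read positions without locking,
    // because stage 1 has fully completed by this point.
}
```

A production scheduler would keep a persistent worker pool instead of creating threads every frame; threads are created per call here only to keep the sketch short.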
16 Solutions to Data Synchronization
Mutex locks using critical sections
Data separation using multiple stages (e.g. read and write stages)
Local Store Model using ring buffers
Component object level organization to separate data dependency
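One way to read "data separation using multiple stages" is double buffering: during a stage, code only reads one copy of the data and only writes the other, and the two are swapped at the stage boundary. The sketch below is a single-threaded illustration of that access pattern under that assumption; `DoubleBuffered` and `smoothStep` are hypothetical names, and the slides do not say that the engine implements its stages this way.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Two copies of the state: the read stage sees `front`,
// the write stage fills `back`.
struct DoubleBuffered {
    std::vector<float> front;
    std::vector<float> back;

    explicit DoubleBuffered(std::vector<float> initial)
        : front(initial), back(std::move(initial)) {}

    // Stage boundary: what was written becomes readable.
    void swapStages() { std::swap(front, back); }
};

// Example stage: each element becomes the average of itself and its
// neighbors (edges reuse their own value). Element i is computed only from
// `front` and written only to back[i], so the loop could be split among
// threads without a mutex.
inline void smoothStep(DoubleBuffered& state) {
    const std::size_t n = state.front.size();
    assert(state.back.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        const float left = state.front[i == 0 ? 0 : i - 1];
        const float right = state.front[i + 1 == n ? i : i + 1];
        state.back[i] = (left + state.front[i] + right) / 3.0f;
    }
    state.swapStages();
}
```

The cost of this approach is memory, since every double-buffered array exists twice, which matters on hardware with the memory limits described earlier in the talk.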
17 Current Graphics System
720p and 1080p native support
Interleaved vertex format with 16-bit normals and UV data to maximize data throughput
Multi-level Shadow Map to enhance resolution quality
Use of instancing to increase rendering performance
Depth Of Field effect
High Dynamic Range lighting with tone mapping
Particle effects
Hardware instancing for rendering props
Scene graph techniques such as octree and occlusion systems to further optimize large scale rendering
Supports unlimited number of bones for animation
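To make the interleaved vertex format with 16-bit normals and UVs concrete, here is one possible layout with its packing helpers. The slides give no field order, padding or encoding, so `PackedVertex` and the signed/unsigned normalized encodings below are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical interleaved layout: all attributes of one vertex sit side by
// side, so fetching a vertex touches one small span of memory.
struct PackedVertex {
    float         position[3]; // 12 bytes, full precision
    std::int16_t  normal[3];   // signed normalized, [-1, 1] -> [-32767, 32767]
    std::int16_t  pad;         // keeps the struct a multiple of 4 bytes
    std::uint16_t uv[2];       // unsigned normalized, [0, 1] -> [0, 65535]
};
static_assert(sizeof(PackedVertex) == 24, "expected a 24-byte vertex");

// Quantizes a value in [-1, 1] to a signed 16-bit integer (clamping first).
inline std::int16_t packSnorm16(float v) {
    const float clamped = std::max(-1.0f, std::min(1.0f, v));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Inverse of packSnorm16, up to quantization error.
inline float unpackSnorm16(std::int16_t v) {
    return static_cast<float>(v) / 32767.0f;
}

// Quantizes a value in [0, 1] to an unsigned 16-bit integer (clamping first).
inline std::uint16_t packUnorm16(float v) {
    const float clamped = std::max(0.0f, std::min(1.0f, v));
    return static_cast<std::uint16_t>(std::lround(clamped * 65535.0f));
}
```

With full 32-bit floats for the normal and UV the same vertex would take 32 bytes, so this layout moves about a quarter less data per vertex at the cost of a small quantization error.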
18 Animation System
Support for unlimited bones per character
Key frame compression
Quaternion based interpolation
Support for up to 9 channels of animation: rotation, translation and scale
Support for overlaid animations
Procedural animation to minimize number of animations in game
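Quaternion based interpolation between two key frames is commonly done with spherical linear interpolation (slerp). The slides do not say which variant the engine uses, so the following is a generic sketch; the `Quat` type and the 0.9995 threshold for falling back to a normalized linear blend are assumptions.

```cpp
#include <cassert>
#include <cmath>

// Rotation stored as a unit quaternion (x, y, z, w).
struct Quat { float x, y, z, w; };

inline float dot(const Quat& a, const Quat& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

inline Quat normalized(const Quat& q) {
    const float len = std::sqrt(dot(q, q));
    assert(len > 0.0f);
    return {q.x / len, q.y / len, q.z / len, q.w / len};
}

// Interpolates from key frame a (t = 0) to key frame b (t = 1) along the
// shortest arc. Both inputs are expected to be unit quaternions.
inline Quat slerp(const Quat& a, Quat b, float t) {
    float d = dot(a, b);
    if (d < 0.0f) {            // q and -q are the same rotation; take the
        b = {-b.x, -b.y, -b.z, -b.w}; // representation closer to a
        d = -d;
    }
    if (d > 0.9995f) {         // nearly identical: blend linearly, renormalize
        return normalized({a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t,
                           a.z + (b.z - a.z) * t, a.w + (b.w - a.w) * t});
    }
    const float theta = std::acos(d);
    const float s = std::sin(theta);
    const float wa = std::sin((1.0f - t) * theta) / s;
    const float wb = std::sin(t * theta) / s;
    return {a.x * wa + b.x * wb, a.y * wa + b.y * wb,
            a.z * wa + b.z * wb, a.w * wa + b.w * wb};
}
```

For example, interpolating halfway between the identity and a 90 degree rotation about Z gives a 45 degree rotation about Z.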
19 Physics Component
Use of component system to accommodate different physics middleware and custom physics engine: Havok, Bullet and Ageia PhysX
Simple custom physics system
Sphere to sphere, box to box, box to sphere, etc. collisions
2D Grid partition optimizations
Per cell collision detection
Simple vehicle simulation
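The combination of a 2D grid partition with per cell collision tests can be sketched as follows, using sphere to sphere overlap as the narrow-phase test. This is an illustration only: the choice of the XZ plane for the grid, the cell size, and the names `Sphere`, `spheresOverlap`, `cellIndex` and `findCollisions` are assumptions, not details from the slides.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

struct Sphere { float x, y, z, radius; };

// Narrow phase: two spheres overlap when the distance between their centers
// is at most the sum of their radii (compared in squared form).
inline bool spheresOverlap(const Sphere& a, const Sphere& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    const float r = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

inline int cellIndex(float coord, float cellSize) {
    return static_cast<int>(std::floor(coord / cellSize));
}

using IndexPair = std::pair<std::size_t, std::size_t>;

// Broad phase: each sphere is registered in every XZ grid cell its bounds
// touch, then only spheres sharing a cell are tested against each other.
// Returns overlapping pairs as (lower index, higher index).
inline std::set<IndexPair> findCollisions(const std::vector<Sphere>& spheres,
                                          float cellSize) {
    assert(cellSize > 0.0f);
    std::map<std::pair<int, int>, std::vector<std::size_t>> grid;
    for (std::size_t i = 0; i < spheres.size(); ++i) {
        const Sphere& s = spheres[i];
        const int x0 = cellIndex(s.x - s.radius, cellSize);
        const int x1 = cellIndex(s.x + s.radius, cellSize);
        const int z0 = cellIndex(s.z - s.radius, cellSize);
        const int z1 = cellIndex(s.z + s.radius, cellSize);
        for (int cx = x0; cx <= x1; ++cx)
            for (int cz = z0; cz <= z1; ++cz)
                grid[std::make_pair(cx, cz)].push_back(i);
    }
    std::set<IndexPair> hits; // a set removes pairs found in several cells
    for (const auto& cell : grid) {
        const std::vector<std::size_t>& ids = cell.second;
        for (std::size_t a = 0; a < ids.size(); ++a)
            for (std::size_t b = a + 1; b < ids.size(); ++b)
                if (spheresOverlap(spheres[ids[a]], spheres[ids[b]]))
                    hits.insert(std::make_pair(ids[a], ids[b]));
    }
    return hits;
}
```

A 2D grid suits worlds where objects are spread mostly across the ground plane: objects stacked vertically still share a cell, but the sphere test rejects them cheaply.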
20 Instant Asset Update System for Asset Pipeline
Instant refreshing of assets without restarting the engine/game
No intermediate file formats = quick export process
Instant feedback for artists and designers to check for data validity and quality
No overnight build/baking process
Asset sharing between designers, artists or programmers within the network
Built-in support for art outsourcing
Easy DVD/Blu-ray burns for archiving and build delivery
21 Mission of Kalloc Studios
Create a truly next-gen multi-platform game engine that maximizes cutting-edge hardware, such as multi-core and Cell architectures, and the latest graphics rendering capabilities
Create innovative and quality game titles
Train highly motivated talent to become industry specialists