Published by Kyle Skinner
Modified over 3 years ago
© Copyright Khronos Group
Optimizing OpenGL ES Applications
Kristof Beets, 3rd Party Relations Manager, Imagination Technologies
Imagination: World Leader in SoC IP Cores
- Products
  - Silicon and software IP for multimedia and communication
- Customers
  - Global semiconductor, fast-moving fabless businesses and system companies
- People
  - >300, of whom over 75% are highly skilled engineers
- PowerVR MBX: de facto standard for Mobile 3D Graphics
  - In use by 6 of the top 10 semiconductor companies
  - Several products already in the market and many more coming soon…
PowerVR MBX Family
- OpenGL ES 1.x Compliant
- Family Members
  - PowerVR MBX
  - PowerVR MBX Lite
- High Quality, High Performance Texture Filtering
  - Bi-Linear Filtering with MIP-Mapping at Full Speed
- PowerVR Texture Compression: 2bpp and 4bpp
  - Allows higher quality, higher resolution textures for same bandwidth and storage cost
- High Quality, High Performance Anti-Aliasing
- Internal True Color
- DOT3 Per-pixel Lighting
- Optional PowerVR VGP
  - Dedicated programmable Vertex Processing Unit
  - Allows high polygon throughput
  - Advanced features: Skinning, Curved Surfaces, Lighting
PowerVR SGX Family
- OpenGL ES 2.x
- Wireless SGX Family Members
  - SGX510, SGX520, SGX530
  - Sizes ranging from less than 2 mm² to 8 mm² in a 90nm process
- Universal Scalable Shader Engine (USSE)
  - Scalable multi-threaded processing engine
  - Vertex, Pixel, Video, Imaging, Physics, etc. Processing
  - Single Compiler
- Advanced Geometry and Pixel Processing
  - Procedural Geometry, Higher Order Surfaces, etc.
  - Advanced Vertex Shaders
  - Advanced Pixel Shaders such as Parallax bump mapping
  - Advanced Shadow Techniques
  - Stencil Shadows, Shadow maps, etc.
- Programmable Anti-Aliasing
- On-chip Multiple Render Targets (MRTs)
- IEEE 32 Bit Floating Point Internal Accuracy
- Much more…
PowerVR Butterflies Demo
- Demo shows a high number of butterflies in a dynamic flock
  - Demo originally used for Arcade Hardware
  - Illustrates Alpha Blending Capability
  - Illustrates High Number of Textures and Texture Compression
- Performance for "flocking algorithm only":
  - Fully Floating Point Algorithm (Without FPU): 72 FPS
  - Fully Fixed Point Algorithm: 304 FPS
  - Fully Fixed Point Algorithm with ASM Optimizations: 373 FPS
  - Fully Floating Point Algorithm (With FPU): 415 FPS
  - Optimised Algorithm Fully Floating Point (With FPU) FPS
Butterflies Demo: Lessons Learned
- Floating point on a non-floating-point device is SLOW
  - About 6x slower in this case
- Only use Float on a non-float device when ABSOLUTELY required!
  - Non-performance-critical situations, e.g. offline calculations
  - Fixed Point accuracy insufficient
- Use ASM Optimised Fixed Point where required
  - Only the most critical ops need ASM tweaking
- Use Float if the device supports Floating Point
  - E.g. Floating Point Unit has a faster divide op than the Fixed Point Core
- But do your own benchmarking
  - Not all algorithms and platforms are equal...
- Using a smart, efficient, optimised algorithm benefits all cases...
  - Essential for high performance on Mobile HW!
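The fixed-point arithmetic that these lessons refer to can be sketched in portable C. The slides do not state which fixed-point format the demo used, so the 16.16 layout below (the same layout as the OpenGL ES `GL_FIXED` type) and all names such as `fixed_t` and `fixed_mul` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed format: signed 16.16 fixed point (16 integer bits, 16 fraction bits). */
typedef int32_t fixed_t;

#define FIXED_ONE ((fixed_t)65536) /* 1.0 in 16.16 */

/* Convert an integer to fixed point; valid for values in [-32768, 32767]. */
static fixed_t fixed_from_int(int value)
{
    assert(value >= -32768 && value <= 32767);
    return (fixed_t)value * FIXED_ONE;
}

/* Convert back to an integer, truncating toward zero. */
static int fixed_to_int(fixed_t value)
{
    return (int)(value / FIXED_ONE);
}

/* Multiply: widen to 64 bits so the intermediate product cannot overflow,
 * then drop the extra 16 fraction bits. */
static fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * (int64_t)b) / FIXED_ONE);
}

/* Divide: pre-scale the numerator by 1.0 so the quotient keeps 16 fraction bits. */
static fixed_t fixed_div(fixed_t a, fixed_t b)
{
    assert(b != 0);
    return (fixed_t)(((int64_t)a * FIXED_ONE) / b);
}
```

Division by `FIXED_ONE` is used instead of a right shift so that negative values behave identically on every compiler. A hand-tuned version for a specific core, as the slide suggests, would only replace the few hot operations.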
Reducing Graphics API CPU Load
- Every API call introduces overhead which costs valuable CPU cycles
  - Aim to minimize the number of API calls
  - Matrix Ops and Draw Calls can be expensive
- How to reduce the number of API calls?
  - Batching (grouping) allows reduction of the number of API Calls
  - Different Textures can break up Draw Calls
  - Consider using a Texture Atlas / Texture Page
    - One large texture containing several sub-textures
    - This makes it possible to draw multiple objects in a single draw call
- For optimal geometry throughput use Sorted Indexed Triangles
  - Sorting improves memory access patterns
  - Sorting makes optimal use of caches
  - Ideally use strip ordered indexed triangles
  - PowerVR SDK contains Optimised Geometry Exporter and Geometry Optimisation Lib
  - Ideally use Multi_Draw_Arrays Extension
  - Submit multiple strips in a single draw call: minimal API overhead
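As an illustration of the texture atlas idea, the sketch below remaps an object's local texture coordinates into its sub-texture of a larger atlas, so that several objects can share one texture binding. It assumes a uniform grid of equally sized cells; `AtlasRect`, `atlas_cell` and `atlas_remap` are hypothetical names, not part of OpenGL ES or the PowerVR SDK.

```c
#include <assert.h>

/* Normalized rectangle of one sub-texture inside the atlas. */
typedef struct {
    float u0, v0; /* lower corner */
    float u1, v1; /* upper corner */
} AtlasRect;

/* Rectangle of cell (col, row) in an atlas split into cols x rows equal cells. */
static AtlasRect atlas_cell(int col, int row, int cols, int rows)
{
    AtlasRect rect;
    assert(cols > 0 && rows > 0);
    assert(col >= 0 && col < cols && row >= 0 && row < rows);
    rect.u0 = (float)col / (float)cols;
    rect.v0 = (float)row / (float)rows;
    rect.u1 = (float)(col + 1) / (float)cols;
    rect.v1 = (float)(row + 1) / (float)rows;
    return rect;
}

/* Map a local coordinate in [0, 1] to the matching coordinate in atlas space. */
static void atlas_remap(const AtlasRect *rect, float u, float v,
                        float *out_u, float *out_v)
{
    *out_u = rect->u0 + u * (rect->u1 - rect->u0);
    *out_v = rect->v0 + v * (rect->v1 - rect->v0);
}
```

The remap would normally run once at export or load time, not per frame. With filtering and MIP-mapping enabled, neighbouring sub-textures can bleed into each other, so a real atlas usually needs some padding around each cell; that is left out of this sketch.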
Further Polygon Submission Optimisations
- Interleave the per vertex data elements (Position, Normal, Color, etc.)
  - Keep data that belongs together close together in memory!
- Simplify the geometry complexity
  - Use a polygon reduction algorithm
  - Use DOT3 lighting or textures to represent fine detail
- Reduce the size of vertex components
  - Use smaller formats whenever possible
  - E.g. use byte instead of float
- Don't store constants per vertex
  - Use Diffuse, Specular, Factor, etc. Colours
  - Make sure to disable client states that are not required
  - glEnableClientState / glDisableClientState
  - Use Vertex Shader constants if available
- Consider using Level Of Detail (LOD)
  - Don't use 1000s of polygons for an object 10s of pixels on screen
[Figure: comparison images labelled "DOT3" and "No DOT3"]
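One possible way to combine interleaving with smaller component formats is a single packed vertex struct. The layout below is an assumption chosen for illustration (16-bit positions and texture coordinates, 8-bit normals and colours, explicit padding to keep each attribute 4-byte aligned); the right formats depend on the precision each model needs. `PackedVertex` and `FloatVertex` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Baseline: every component stored as a 32-bit float. */
typedef struct {
    float position[3];
    float normal[3];
    float color[4];
    float texcoord[2];
} FloatVertex;

/* Interleaved vertex using smaller formats; all attributes of one vertex
 * sit next to each other in memory. */
typedef struct {
    int16_t position[3]; /* offset 0:  16-bit position */
    int16_t pad0;        /* offset 6:  keeps the next attribute 4-byte aligned */
    int8_t  normal[3];   /* offset 8:  8-bit normal */
    int8_t  pad1;        /* offset 11: padding */
    uint8_t color[4];    /* offset 12: RGBA bytes */
    int16_t texcoord[2]; /* offset 16: 16-bit texture coordinate */
} PackedVertex;

/* Total bytes for a buffer of packed vertices. */
static size_t packed_buffer_bytes(size_t vertex_count)
{
    assert(vertex_count <= SIZE_MAX / sizeof(PackedVertex));
    return vertex_count * sizeof(PackedVertex);
}
```

With this layout, `sizeof(PackedVertex)` is the stride and the `offsetof` value of each member is the attribute's byte offset, which are the values an application would hand to the vertex array pointer calls. On typical platforms the packed vertex takes 20 bytes against 48 for the all-float version.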
Draw Order / Sorting
- No need to sort objects front to back
  - Likely to bottleneck on the CPU due to increase in number of state changes (API overhead)
  - PowerVR Hardware handles HSR (hidden surface removal) efficiently irrespective of depth render order
- Do use High-level Render State Batching
  - Draw all opaque objects first
  - Group by number of Texture Layers
    - E.g. first all Dual Textured Objects and then all Single Textured Objects
  - Draw all Alpha Blended and Alpha Tested Objects last
- Use High-Level Geometry Culling
  - Do not submit the whole world geometry every frame
  - Use Fog to hide sudden pop-in effect
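The render state batching order described on this slide can be sketched as a sort over a list of draw items before submission. The `DrawItem` fields and the function names are hypothetical; a real engine would sort on whatever state is expensive to change on its target.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-object record used only to decide submission order. */
typedef struct {
    int id;             /* object identifier */
    int blended;        /* 1 if alpha blended or alpha tested, 0 if opaque */
    int texture_layers; /* e.g. 2 for dual textured, 1 for single textured */
    int texture_id;     /* texture bound for the object */
} DrawItem;

static int compare_int(int a, int b)
{
    return (a > b) - (a < b);
}

/* Opaque first, then more texture layers first, then grouped by texture. */
static int compare_draw_items(const void *lhs, const void *rhs)
{
    const DrawItem *a = (const DrawItem *)lhs;
    const DrawItem *b = (const DrawItem *)rhs;
    if (a->blended != b->blended)
        return compare_int(a->blended, b->blended);
    if (a->texture_layers != b->texture_layers)
        return compare_int(b->texture_layers, a->texture_layers);
    if (a->texture_id != b->texture_id)
        return compare_int(a->texture_id, b->texture_id);
    return compare_int(a->id, b->id); /* deterministic tie-break */
}

static void sort_for_submission(DrawItem *items, size_t count)
{
    if (count < 2)
        return;
    assert(items != NULL);
    qsort(items, count, sizeof items[0], compare_draw_items);
}
```

Note that the sort key contains no depth value, in line with the advice that front-to-back sorting is unnecessary on this hardware. Blended objects are simply placed last here; any depth ordering they may need among themselves is outside this sketch.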
Let there be Light…
- OpenGL Lighting is quite complex and can thus be CPU & VGP heavy
  - OpenGL implementations need to be conformant…so no shortcuts can be taken!
- Use the simplest light type that works for your application
  - E.g. parallel lights are cheaper than spot lights
- Use the fewest lights that work for your application
- Pre-compute lighting whenever you can
  - Static models with static lights
  - Pre-compute offline and store in color array or textures
- Only enable lighting when needed
  - E.g. on moving objects, or if the light properties are changing
  - Consider caching lighting if an object stays static for a long time
  - Calculate once, use many times
- Could implement your own lighting algorithm
  - Implement exactly the algorithm you need and want
  - Use custom IMG Vertex Program (VGP Lighting) or custom code (CPU Lighting)
  - Can take shortcuts and use hacks... as long as it does the job!
  - Do verify that it's faster and/or better looking than default OpenGL Lighting…
- Consider pixel lighting
  - Light maps (as used by most PC Games instead of Vertex Lighting)
  - DOT3 Per Pixel Lighting
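Pre-computing lighting for a static model under a static parallel light could look like the following sketch, which bakes a simple diffuse term into a per-vertex RGBA byte array. The lighting model (ambient plus clamped N·L, white light, unit-length vectors) is a deliberately simplified assumption and not the full OpenGL lighting equation; `bake_parallel_light` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static float clamp01(float value)
{
    return value < 0.0f ? 0.0f : (value > 1.0f ? 1.0f : value);
}

/* Bake one parallel (directional) light into per-vertex colours.
 * normals:  vertex_count * 3 floats, assumed unit length
 * to_light: unit vector pointing from the surface toward the light
 * ambient:  minimum brightness in [0, 1]
 * out_rgba: vertex_count * 4 bytes, written as grey levels with alpha 255 */
static void bake_parallel_light(const float *normals, size_t vertex_count,
                                const float to_light[3], float ambient,
                                uint8_t *out_rgba)
{
    size_t i;
    assert(ambient >= 0.0f && ambient <= 1.0f);
    for (i = 0; i < vertex_count; ++i) {
        const float *n = &normals[i * 3];
        float n_dot_l = n[0] * to_light[0] + n[1] * to_light[1] + n[2] * to_light[2];
        float intensity = ambient + (1.0f - ambient) * clamp01(n_dot_l);
        uint8_t level = (uint8_t)(clamp01(intensity) * 255.0f + 0.5f);
        out_rgba[i * 4 + 0] = level;
        out_rgba[i * 4 + 1] = level;
        out_rgba[i * 4 + 2] = level;
        out_rgba[i * 4 + 3] = 255;
    }
}
```

The resulting array can be supplied as the colour array with lighting disabled, so the cost is paid once offline or at load time instead of every frame.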
Texturing
- Use Compressed Textures whenever possible!
  - Various formats depending on hardware (DXT, PVRTC, ETC, …)
  - PVRTC2 = 2bpp & PVRTC4 = 4bpp
  - Less bandwidth, less storage, smaller distribution size of the application
  - Don't use palettised textures
  - Lower quality and lower performance than PVRTC2/4
- Alternatively use 16bpp Texture Formats
  - 32bpp is usually overkill on a 16bpp LCD
- Remember special types
  - Luminance I8 and Luminance_Alpha IA88 can be useful
- Always use MIPMapping
  - Ideally use: LINEAR_MIPMAP_NEAREST
  - Only use Trilinear when needed
- Use sensible Texture Sizes
  - No 1024x1024 Textures for objects that cover a quarter of a QVGA screen
  - Do use large compressed textures for Texture Pages/Atlas, even 2048x2048
- Load all Textures up front
  - Before rendering create and load all textures
  - Consider a Warm-up phase which touches all textures once
  - Avoid mid-action texture creation, uploads and/or changes
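To make the storage argument concrete, the sketch below estimates texture memory from the bits per pixel quoted on the slide, with or without a full MIP chain. It is an idealised calculation: real compressed formats store whole blocks, so the smallest MIP levels cost somewhat more than this estimate. `texture_bytes` is a hypothetical helper, not an OpenGL ES call.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes for a single level, rounded up to whole bytes. */
static size_t level_bytes(unsigned width, unsigned height, unsigned bits_per_pixel)
{
    return ((size_t)width * height * bits_per_pixel + 7) / 8;
}

/* Estimated bytes for a texture; if mipmapped is non-zero, every level down
 * to 1x1 is included. */
static size_t texture_bytes(unsigned width, unsigned height,
                            unsigned bits_per_pixel, int mipmapped)
{
    size_t total = 0;
    assert(width > 0 && height > 0 && bits_per_pixel > 0);
    for (;;) {
        total += level_bytes(width, height, bits_per_pixel);
        if (!mipmapped || (width == 1 && height == 1))
            break;
        if (width > 1)
            width /= 2;
        if (height > 1)
            height /= 2;
    }
    return total;
}
```

For a 256x256 texture without MIP levels this gives 262144 bytes at 32bpp, 131072 at 16bpp, 32768 at 4bpp and 16384 at 2bpp, which is the 8x to 16x saving that motivates the compressed formats.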
Multi-texture vs Multi-pass
- Use Multi-Texturing over Multi-Pass!
  - Saves draw calls
  - Considerably reduces vertex processing work
  - Saves render state changes
  - Reduces driver overhead and thus CPU Load
  - Avoids potential Z fighting issues
  - Subsequent passes with e.g. lighting disabled can yield different depth values
[Figure: Multi-Pass (2 quads, 1 texture each) compared with Multi-Texture (1 quad, 2 textures in 1 go)]
[Figure: Quake 3 with light maps only, and Quake 3 with light maps + base map, drawn with a single geometry pass, possible through Multi-Texturing]
Maintain CPU and GPU Parallelism
- Normally CPU and 2D/3D Graphics Core work in parallel…
- … but some ops can break this parallelism!
- Do NOT attempt to access the color buffer directly
  - CPU will stall until HW completes the render
  - And the GPU stalls while the CPU does its work
  - Results in lost CPU and GPU performance
  - Avoid glReadPixels(), glCopyTexImage2D(), glCopyTexSubImage2D()
- Find workarounds to avoid accessing the color buffer directly
  - E.g. use a ray casting algorithm for a lens flare effect instead of glReadPixels()
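One way to read the lens flare suggestion is to test on the CPU whether anything blocks the line from the camera to the light, instead of reading back pixels. The sketch below assumes occluders are approximated by bounding spheres; that choice, and the names `Vec3`, `Sphere`, `segment_hits_sphere` and `flare_visible`, are illustrative assumptions and not taken from the slides.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 center; float radius; } Sphere;

static Vec3 vec3_sub(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static float vec3_dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Returns 1 if the segment from eye to light passes through the sphere. */
static int segment_hits_sphere(Vec3 eye, Vec3 light, const Sphere *sphere)
{
    Vec3 dir = vec3_sub(light, eye);
    Vec3 to_center = vec3_sub(sphere->center, eye);
    float length_sq = vec3_dot(dir, dir);
    float t = 0.0f;
    Vec3 closest, offset;
    if (length_sq > 0.0f) {
        /* Parameter of the point on the segment closest to the centre. */
        t = vec3_dot(to_center, dir) / length_sq;
        if (t < 0.0f) t = 0.0f;
        if (t > 1.0f) t = 1.0f;
    }
    closest.x = eye.x + t * dir.x;
    closest.y = eye.y + t * dir.y;
    closest.z = eye.z + t * dir.z;
    offset = vec3_sub(sphere->center, closest);
    return vec3_dot(offset, offset) <= sphere->radius * sphere->radius;
}

/* Returns 1 if no occluder blocks the light, i.e. the flare should be drawn. */
static int flare_visible(Vec3 eye, Vec3 light, const Sphere *occluders, int count)
{
    int i;
    assert(count >= 0);
    for (i = 0; i < count; ++i) {
        if (segment_hits_sphere(eye, light, &occluders[i]))
            return 0;
    }
    return 1;
}
```

Because the test runs entirely on the CPU against scene data the application already holds, the GPU never has to finish the frame early, which keeps the two processors working in parallel.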
Java 3D Graphics
- M3G (JSR-184) layered on top of OpenGL ES functionality
  - OpenGL ES performance recommendations remain valid:
  - Minimise API calls, especially geometry draw calls
  - Use Optimised Triangle Strips
    - Make sure your M3G Exporter tool does a good job…
  - Batching
    - E.g. use Group object to bundle meshes
  - Always flag opaque objects as opaque
  - Avoid mid-scene texture uploads/changes
  - Etc.
- Java makes it easy to mix MIDP 2D and JSR-184 based 3D
  - Do NOT mix 2D and 3D operations within the same frame
  - Majority of current implementations use CPU for 2D and GPU for 3D
  - E.g. no MIDP Text Drawing, no Filled Rectangles, etc. within a 3D Frame
  - Future Java implementations will solve this performance issue
Join the PowerVR Insider Program
- PowerVR Technical Support & Co-Marketing Programme
  - Direct Technical Support through , phone & on-site
  - Assure Optimal Compatibility
  - Highest Possible Performance
  - Leading Image Quality
  - Extensive Support for Key Partners
    - Including Middleware Vendors, JAVA VM & JSR Vendors, Benchmarks, Launch Titles
  - Free SDKs including sample code, documentation and extensive toolset
  - Joint Marketing Activities
  - Press Releases, Joint Event Participation, Website presence, etc.
- PowerVR Insider brings the whole ecosystem around 3D Graphics together
  - From Software Developers to Mobile Phone OEMs
  - Provide introductions between PowerVR Insiders
  - Assure co-operation between PowerVR Insiders
- To join send to:
PowerVR MBX Content
- Selection of available content: 3D Golf, 3DMarkMobile06, Bling My Ride, Chopper Fight, Cube Engine, Enigmo, Everybody's Golf Mobile 2, GeoRallyEx, Interstellar Flames, Jackpot Casino, Kastor Platform, Onimusha: Curtain of Darkness, Quake III CE, Quake Mobile + Expansion Packs, Ridge Racer Mobile, Scaleform VGx, Speed Sphere, SSX III, Stuntcar Extreme, The Lost Sister, Tin Star, Tony Hawk Pro Skater, Tony Hawk's Pro Skater 2, ToyGolf, Vijay Singh Pro Golf 2005, Virtual Pool Mobile, VIVID UI, VIVID Message, Xmen Legends, Yeti3D Engine
- And more than 73 native 3D-Game Titles on SKTelecom GXG Services
- Middleware + All available content: Synergenix Mophun, EA/Criterion Renderware, TAO Intent Game Player
Example: Virtual Pool Mobile by Celeris
[Figure: Software Version compared with the OpenGL-ES PowerVR MBX Hardware Accelerated Version]
- High-detail 3D Polygonal Background
- High Quality Texture Filtering & Increased Texture resolution
- Reflection Mapping
- Alpha-Blended Menu
- Increased Performance
- Higher Screen Resolution & Increased Polygon Counts
Example: Quake Mobile by Pulse Interactive
- Quake III Arena also already available…
Any Questions?