COMP9018 - Advanced Graphics: Performance

Performance Optimisation in OpenGL
Some quotes on optimisation:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." - Donald Knuth
"Rules of Optimization: Rule 1: Don't do it. Rule 2 (for experts only): Don't do it yet." - M.A. Jackson

But in graphics...
Frequently, performance is critical to utility/value. Working on the edge of the possible. So may have to optimise. Systems are engineered for optimisation.

Making things run faster
Approaches to optimisation:
– Use faster hardware
– Right data in the right place at the right time
– Getting rid of redundant calculations
– Tricking the eye ("close enough is good enough")
– Trading space for time
– Not drawing (elimination of what wouldn't be seen anyway)
– Writing it in assembly/C (but very rarely, usually last)

Hardware acceleration
Can be a good option. Problem: the price-performance curve is exponential.

More on hardware acceleration
Implication: it's easy to get very good performance using hardware acceleration, but it gets extremely expensive when trying to obtain excellent performance. Don't forget Moore's law. The interaction between long development times and Moore's law means the problem sometimes "fixes itself".

Right data in the right place
One of the best techniques. The basis of caching. Exploits "locality": a program is likely to reuse the same information again and again.
Two types:
– Temporal (the same data is reused soon)
– Spatial (data stored nearby is used next)
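
Spatial locality can be made concrete with a small sketch. The grid size, fill pattern and function names below are illustrative assumptions, not part of the lecture. Both functions compute the same sum; only the order of memory accesses differs. Any timing gap between them depends on the machine and is not measured here.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 256
#define COLS 256

/* Hypothetical demo data; static so it is zero-initialised and off the stack. */
static int grid[ROWS][COLS];

/* Fill the grid with a simple repeating pattern. */
void fill_grid(void)
{
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            grid[r][c] = (int)((r + c) % 7);
}

/* Row-major walk: consecutive reads are adjacent in memory,
 * because C stores each row contiguously. */
long sum_row_major(void)
{
    long sum = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            sum += grid[r][c];
    return sum;
}

/* Column-major walk: consecutive reads are COLS ints apart in memory,
 * so each cache line fetched is used less before it may be evicted. */
long sum_col_major(void)
{
    long sum = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            sum += grid[r][c];
    return sum;
}

/* Both orders visit the same elements, so the results must agree. */
void check_orders_agree(void)
{
    fill_grid();
    assert(sum_row_major() == sum_col_major());
}
```

Timing the two walks on a large grid (for example with clock() from <time.h>) is one way to see whether the access order matters on a given machine.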

Another way to think of OpenGL
OpenGL can be thought of as a client-server architecture.
Some examples of client-server:
– The Web
– X windows
When did we ever say that the client and server were on the same machine? OpenGL can run on a network.

Client-server concept
The program that makes API calls is the client. The OpenGL implementation is the server. The client sends requests to the server. Client and server may be different machines, e.g. the client is a big mainframe spewing OpenGL commands and the server is a PC with hardware acceleration. Still convenient to think of it as client = my program, server = OS/driver/graphics card.

Client-server concept
The client-server concept is still useful on a single machine. Intuition: the client is your program, the server is your graphics card. Why is it a useful concept? It is important from a performance point of view: performance differs depending on whether data is stored at the client or at the server.

Right place at the right time
This is where the client-server stuff comes in. Now have graphics cards with 512MB on board. What use is it? Once data is on the graphics card, everything is faster. Problem: once it's on the graphics card, it can't (easily) be modified.

Display lists
A very simple way to speed up OpenGL. Idea: take almost any sequence of OpenGL commands and package them up; then you can use them like macros. Other libraries have similar concepts, e.g. DX has "execute buffers".

When and why
Why?
– Convenience: gives something akin to a function-calling structure, but more efficient.
– Efficiency: hardware can optimise, function call overhead is reduced, data can live on the graphics card.
When?
– What you want to render is unlikely to change
– When you are reusing structure
– When you need speed

Initialisation
3 steps: initialise, define, use.
Get a display list ID (actually an unsigned int) using glGenLists(range). Can request more than one list at a time: range is the number of contiguous list IDs wanted, and the call returns the first of them. Returns 0 if none are available.

Definition
Like glBegin() and glEnd():
glNewList(index, GL_COMPILE);
... code for rendering things ...
glEndList();
Instead of GL_COMPILE, can be GL_COMPILE_AND_EXECUTE.

Use
To render stuff, use glCallList(index).
IMPORTANT NOTES:
– Almost anything can go in a display list: matrix ops, material defs, textures, geometry, lights, whatever...
– Display lists COPY data: you can't modify the data once it's in a display list, even if it was passed by reference (e.g. if you use gl*fv(object), the list won't notice when object changes).
– Display lists affect and are affected by the current matrix stack values!

What CAN'T you call in a display list?
Some things are not allowed:
– Anything that asks about the current state.
– Anything that changes the rendering mode.
– Anything that makes or deletes a list (but calling another display list is fine; can use this to build a hierarchy).

Code example
Look at nodisplaylist.c vs displaylist.c.
Conclusion:
– Likely to be much faster, since data lives on graphics card.
– Not much effort.

Redundant calculations
Also a very important optimisation technique. Closely related to the locality idea.

Redundant calculations
An example: vertex arrays. Consider rendering a cube in OpenGL:
glBegin(GL_QUADS);
glVertex3f(x0,y0,z0); glVertex3f(x1,y1,z1); glVertex3f(x2,y2,z2); glVertex3f(x3,y3,z3);
glEnd();
glBegin(GL_QUADS);
glVertex3f(x1,y1,z1); glVertex3f(x5,y5,z5); glVertex3f(x6,y6,z6); glVertex3f(x2,y2,z2);
glEnd();

Question
How many points are transformed and lit in the previous rendering of the cube? How many points would minimally have to be transformed and lit? What fraction of the calculations is wasted?
Answers: 24 (6 faces x 4 vertices), 8 (the corners of the cube), 67 per cent (16 of the 24 are repeats).

Huge waste!
The same calculations are repeated. How to solve? Use an indexed face set data structure. It consists of two lists:
– A list of coordinates.
– A list of polygons = a list of lists of vertex indices.

Cube example
float vertices[][3] = {{x0,y0,z0}, {x1,y1,z1}, {x2,y2,z2},..., {x7,y7,z7}};
int faces[][4] = {{0,1,2,3}, {1,5,6,2},..., {4,5,6,7}};
But what about other data, e.g. surface normals? Need to store them too.
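
A compilable sketch of an indexed face set for a cube, assuming a unit cube and one illustrative vertex numbering (the lecture's own coordinates and full face list are elided, so the values and function names here are hypothetical). It counts how many vertices a face-by-face rendering submits against how many distinct vertices exist.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VERTS 8
#define NUM_FACES 6
#define VERTS_PER_FACE 4

/* Unit cube corners (illustrative numbering): for vertex i,
 * x is bit 0 of i, y is bit 1 and z is bit 2. */
static const float cube_vertices[NUM_VERTS][3] = {
    {0, 0, 0}, {1, 0, 0}, {0, 1, 0}, {1, 1, 0},
    {0, 0, 1}, {1, 0, 1}, {0, 1, 1}, {1, 1, 1}
};

/* Each face lists its four corner indices in order around the quad. */
static const int cube_faces[NUM_FACES][VERTS_PER_FACE] = {
    {0, 2, 3, 1}, /* z = 0 */
    {4, 5, 7, 6}, /* z = 1 */
    {0, 1, 5, 4}, /* y = 0 */
    {2, 6, 7, 3}, /* y = 1 */
    {0, 4, 6, 2}, /* x = 0 */
    {1, 3, 7, 5}  /* x = 1 */
};

/* Look up the coordinates of corner k of a face through its index. */
const float *face_corner(size_t face, size_t k)
{
    assert(face < NUM_FACES && k < VERTS_PER_FACE);
    return cube_vertices[cube_faces[face][k]];
}

/* Vertices processed when every face sends all of its corners. */
size_t submitted_vertices(void)
{
    return (size_t)NUM_FACES * VERTS_PER_FACE;
}

/* Distinct vertices actually referenced by the face list. */
size_t unique_vertices(void)
{
    int seen[NUM_VERTS] = {0};
    size_t unique = 0;
    for (size_t f = 0; f < NUM_FACES; f++) {
        for (size_t k = 0; k < VERTS_PER_FACE; k++) {
            int v = cube_faces[f][k];
            assert(v >= 0 && v < NUM_VERTS);
            if (!seen[v]) {
                seen[v] = 1;
                unique++;
            }
        }
    }
    return unique;
}

/* Share of per-vertex work that repeats an earlier calculation. */
double wasted_fraction(void)
{
    double submitted = (double)submitted_vertices();
    return (submitted - (double)unique_vertices()) / submitted;
}
```

With this data, submitted_vertices() is 24 and unique_vertices() is 8, matching the 24 and 8 quoted for the cube question. Per-vertex data such as normals could be stored in a parallel array indexed the same way.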

Problem: needs API support
To do this efficiently, the API needs to support such an approach. Any good graphics API (e.g. OpenGL, DX8, Inventor, VRML97, etc.) supports this, under various names. In OpenGL, it is called a vertex array.

Using Vertex Arrays
Can have up to 6 different arrays, for:
– Vertex coordinates
– Normals
– Colours
– Texture coordinates
– A few other funky ones: index, edge flag
Enable whichever arrays you need: glEnableClientState(GL_VERTEX_ARRAY)

Step 2
After initialising, tell it where the data lives, e.g. glVertexPointer(size, type, stride, vertices);
Size is the number of values per vertex (typ. 2, 3 or 4).
Type = GL_FLOAT or whatever.
Stride is for more funky stuff (e.g. interleaved arrays): it is the byte offset between consecutive vertices, with 0 meaning tightly packed.
Similar calls: glNormalPointer, glTexCoordPointer etc.
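
How a stride picks elements out of an interleaved array can be sketched in plain C, without calling OpenGL. The helper attribute_at, the position-then-normal layout and the sample data are assumptions made for illustration. The rule the sketch follows is the OpenGL one: stride is a byte offset between consecutive elements, and 0 means tightly packed.

```c
#include <assert.h>
#include <stddef.h>

/* Returns a pointer to element i of a float attribute array.
 * base:   address of the first value of the first element
 * size:   number of floats per element (e.g. 3 for x,y,z)
 * stride: bytes between consecutive elements; 0 means tightly packed */
const float *attribute_at(const void *base, int size, size_t stride, size_t i)
{
    assert(base != NULL && size > 0);
    if (stride == 0)
        stride = (size_t)size * sizeof(float);
    return (const float *)((const unsigned char *)base + i * stride);
}

/* Hypothetical interleaved data: position (x,y,z) followed by normal (nx,ny,nz). */
static const float interleaved[] = {
    0.0f, 0.0f, 0.0f,   0.0f, 0.0f, 1.0f,
    1.0f, 0.0f, 0.0f,   0.0f, 0.0f, 1.0f,
    1.0f, 1.0f, 0.0f,   0.0f, 0.0f, 1.0f
};

/* Each vertex occupies six floats, so both attributes share this stride. */
#define INTERLEAVED_STRIDE (6 * sizeof(float))

/* Positions start at the beginning of each vertex record. */
const float *position_at(size_t i)
{
    return attribute_at(interleaved, 3, INTERLEAVED_STRIDE, i);
}

/* Normals start three floats into each vertex record. */
const float *normal_at(size_t i)
{
    return attribute_at(interleaved + 3, 3, INTERLEAVED_STRIDE, i);
}
```

In real OpenGL code the same base pointers and stride would be handed to glVertexPointer and glNormalPointer; the lookup itself is then done by the implementation.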

Step 3: Access the data
Lots of different ways to call. Simplest: glArrayElement(index). The action depends on what's enabled, but let's say only vertex arrays are enabled. Then this looks up index in the last array glVertexPointer was called on (say the entry is x) and does glVertex3fv(x). If normal arrays were also enabled (and the normal for index was y), this would do: glNormal3fv(y); glVertex3fv(x);
NOTE: belongs between glBegin and glEnd.

Bunches of indices
Can also give multiple points at once: use glDrawElements(mode, count, type, indices).
Mode is GL_LINES, GL_POLYGON, etc.
Count is the number of indices.
Type is usually GL_UNSIGNED_INT.
NOTE: does NOT go between a glBegin/glEnd.

glDrawElements
Functionally equivalent to:
glBegin(mode);
for (i = 0; i < count; i++)
    glArrayElement(indices[i]);
glEnd();
glDrawRangeElements() is similar, but you also specify the range (start, end) within which all the indices fall.

What does OpenGL do?
Can cache previously transformed vertices.
Can use glDrawRangeElements to tell OpenGL which range of the vertex array the indices will touch.
glDrawElements can draw lots of objects. Example: if all polys have four vertices, then use GL_QUADS instead and give a list of 24 indices.
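
The benefit of caching transformed vertices can be estimated with a small simulation. This is only a sketch: real hardware caches vary, and the FIFO replacement policy, the cache sizes and the cube index list below are assumptions chosen for illustration. The function counts how many vertex transforms are needed to process an index list when recently transformed vertices can be reused.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CACHE 32

/* Hypothetical index list for a cube drawn as six quads. */
static const unsigned cube_indices[24] = {
    0, 2, 3, 1,   4, 5, 7, 6,   0, 1, 5, 4,
    2, 6, 7, 3,   0, 4, 6, 2,   1, 3, 7, 5
};

/* Counts transforms for an index list, assuming a FIFO cache that holds
 * cache_size transformed vertices. cache_size 0 models no caching at all. */
size_t count_transforms(const unsigned *indices, size_t count, size_t cache_size)
{
    unsigned cache[MAX_CACHE];
    size_t used = 0;       /* number of valid cache entries */
    size_t next = 0;       /* slot that the next miss overwrites */
    size_t transforms = 0;

    assert(indices != NULL && cache_size <= MAX_CACHE);

    for (size_t i = 0; i < count; i++) {
        int hit = 0;
        for (size_t k = 0; k < used; k++) {
            if (cache[k] == indices[i]) {
                hit = 1;
                break;
            }
        }
        if (hit)
            continue;      /* reuse the earlier result */

        transforms++;      /* miss: the vertex must be transformed */
        if (cache_size == 0)
            continue;
        cache[next] = indices[i];
        next = (next + 1) % cache_size;
        if (used < cache_size)
            used++;
    }
    return transforms;
}
```

For this index list, count_transforms(cube_indices, 24, 0) gives 24 and a cache of 8 entries gives 8, the two extremes from the cube question. Smaller caches land in between, which is one reason the order of indices can matter.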

Vertex Buffer Objects
"Right stuff at right time". Problem: vertex arrays are client side. How to speed up? Put the vertex array on the server side? What is the disadvantage?

OpenGL 2.0
Vertex buffer objects: very new. Idea: push the vertex array to the server. Simple to use:
– glGenBuffers() to ask for a buffer
– glBindBuffer() to make it the current buffer
– glBufferData() specifies the data
– Then use the usual commands

glBufferData
glBufferData(target, size, data, usage)
Most are obvious, but usage?
– Used to control how the buffer gets treated
– Static vs stream vs dynamic
– Read vs copy vs draw
Can also use glMapBuffer to modify the data.

Code example
See vertexarray.c. Note: can mix and match normal with vertex arrays.

Practical implications
You CAN use display lists and vertex arrays at the same time, but it's a bit tricky. When you change data in a vertex array and render immediately, that's fine. But with a display list, the data is copied. Example: say you have a creature with a constantly moving body. Can't use a display list for that. But can use one for, say, a helmet or a head.

Space-time tradeoff
Sometimes, can use more space to make an algorithm faster, or vice versa. E.g. can sometimes precompute values if they will be reused a lot.
Trading space for time example: precomputing sin/cos tables.
Trading time for space example: compressed textures (but really still about time).
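
A minimal sketch of the sin/cos table idea, assuming a resolution of one degree (the table size and function names are illustrative choices). To stay free of any maths library dependency, the tables are filled with the angle-addition recurrence from hard-coded values of sin(1 degree) and cos(1 degree); calling sin() and cos() once per entry would be the more usual way to build them.

```c
#include <assert.h>

#define TABLE_SIZE 360

static double sin_table[TABLE_SIZE];
static double cos_table[TABLE_SIZE];
static int tables_ready = 0;

/* Fills both tables once, using
 *   sin(a + 1) = sin(a)cos(1) + cos(a)sin(1)
 *   cos(a + 1) = cos(a)cos(1) - sin(a)sin(1)   (angles in degrees) */
void build_trig_tables(void)
{
    const double s1 = 0.017452406437283512; /* sin(1 degree) */
    const double c1 = 0.99984769515639124;  /* cos(1 degree) */

    sin_table[0] = 0.0;
    cos_table[0] = 1.0;
    for (int d = 1; d < TABLE_SIZE; d++) {
        sin_table[d] = sin_table[d - 1] * c1 + cos_table[d - 1] * s1;
        cos_table[d] = cos_table[d - 1] * c1 - sin_table[d - 1] * s1;
    }
    tables_ready = 1;
}

/* Maps any whole number of degrees into the range 0..359. */
static int wrap_degrees(int degrees)
{
    int d = degrees % TABLE_SIZE;
    return d < 0 ? d + TABLE_SIZE : d;
}

/* Lookups replace a function evaluation with one array read. */
double table_sin(int degrees)
{
    assert(tables_ready);
    return sin_table[wrap_degrees(degrees)];
}

double table_cos(int degrees)
{
    assert(tables_ready);
    return cos_table[wrap_degrees(degrees)];
}

/* Helper for comparing table values against known results. */
int nearly_equal(double a, double b)
{
    double diff = a - b;
    return diff < 1e-9 && diff > -1e-9;
}
```

The cost is 720 doubles of memory and a fixed angular resolution; whether a lookup actually beats computing the value directly depends on the hardware, so it is worth profiling before adopting it.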

Tricking the eye
Lots of examples in what you've already studied. E.g. Gouraud shading is nonsense theoretically. Strictly, Gouraud shading should be perspective-corrected. The error is not noticeable for Gouraud shading, but IS noticeable for texture maps.

Not rendering things
Back face culling: not drawing polygons facing away from us. Easy to enable in OpenGL: glEnable(GL_CULL_FACE).
But lots of other examples: e.g. using visibility trees (similar to BSP trees) and portal systems to cut back on polygons. Any coincidence games are indoors? (more later)
Also the multires stuff and LOD (more later).
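
The test behind back face culling can be sketched in plain C. This assumes OpenGL's default settings, where counter-clockwise winding marks a front face and back faces are the ones culled. The Vec2 type and function names are hypothetical, and the triangle is taken to be already projected to 2D with y pointing up.

```c
#include <assert.h>

typedef struct {
    float x;
    float y;
} Vec2;

/* Twice the signed area of triangle (a, b, c):
 * positive for counter-clockwise winding, negative for clockwise. */
float signed_area2(Vec2 a, Vec2 b, Vec2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

/* Returns 1 when the triangle would be discarded as back-facing. */
int is_culled(Vec2 a, Vec2 b, Vec2 c)
{
    return signed_area2(a, b, c) < 0.0f;
}

/* Usage: the same three points in opposite orders give opposite answers. */
void culling_demo(void)
{
    Vec2 a = {0.0f, 0.0f};
    Vec2 b = {1.0f, 0.0f};
    Vec2 c = {0.0f, 1.0f};

    assert(!is_culled(a, b, c)); /* counter-clockwise: kept */
    assert(is_culled(a, c, b));  /* clockwise: culled */
}
```

The check is a handful of multiplications per polygon, which is why culling is cheap compared with rasterising a face that would never be seen.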

Rewriting code
Usually the last resort. Usually the big gains are in algorithmic improvement, not in rewriting code more efficiently or re-implementing in C/assembly. Assembly is less significant with RISC processors. Very time-consuming both initially and long-term.

Profiling
Profiling is analysing software as it runs to see how much time is spent executing different parts of the code. General observation: 90 per cent of the time is spent executing 10 per cent of the code. Pointless optimising the wrong thing. Example: say you improve the code outside the top 10 per cent by 100 per cent (make it twice as fast). That code accounts for only 10 per cent of the run time, so the program only runs 5 per cent faster.
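
The arithmetic behind the example can be written as a one-line function (the name and parameters are illustrative). It assumes "improved by 100 per cent" means the affected code runs twice as fast, and it returns the new run time as a fraction of the old one.

```c
#include <assert.h>

/* fraction: share of the original run time spent in the improved code (0..1)
 * speedup:  how many times faster that code becomes (2.0 = twice as fast)
 * returns:  new total run time relative to the original (1.0 = unchanged) */
double time_after_speedup(double fraction, double speedup)
{
    assert(fraction >= 0.0 && fraction <= 1.0);
    assert(speedup > 0.0);
    return (1.0 - fraction) + fraction / speedup;
}
```

Doubling the speed of code that takes 10 per cent of the time gives 0.90 + 0.10 / 2 = 0.95 of the original run time, the 5 per cent gain quoted above. The same doubling applied to the hot code that takes 90 per cent of the time gives 0.10 + 0.90 / 2 = 0.55.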

Bottlenecks
Profiling frequently reveals the bottleneck (the thing that slows everything down). The type of bottleneck suggests the solution. Typical bottlenecks:
– Fill-limited: rasterising/texturing polygons. Occurs with software renderers.
– Geometry-limited: calculations of geometry. Too many polygons/vertices.
– Client-side limited: calculations on the client side (e.g. of vertex/texture coordinates). Code optimisation? Maybe.

Demo: profiling