Async Workgroup Update. Barthold Lichtenbelt, OpenGL Siggraph BOF 2006.

Presentation transcript:

Async Workgroup Update Barthold Lichtenbelt

Goals
Provide synchronization framework for OpenGL
  - Provide base functionality as defined in NV_fence and GL2_async_core
  - Build a framework for future, more complex functionality, some of which is discussed in GL2_async_core
  - Initially support synchronization between CPU and GPU
  - Support synchronization across multiple OpenGL contexts
Resulted in GL_ARB_sync spec
  - Finished April
  - Posted draft to opengl.org for feedback
  - Not quite official ARB extension yet

Functionality overview
ARB_sync provides synchronization primitives
  - Can be tested, set and waited upon
Specifically, a “Fence Synchronization Object” and corresponding Fence command
Fence completion allows for partial glFinish
  - All commands prior to the fence are forced to complete before control is returned to caller
Fence Sync Objects can be shared across contexts
  - Allows for synchronization of OpenGL command streams across contexts
New data type: GLtime represents intervals in nanoseconds
  - 64 bit integer, same encoding as UST counter in OpenML
  - Accuracy implementation dependent, precision in nanoseconds
If you have used the Windows Event model, this will feel familiar
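To make the nanosecond time type concrete, here is a minimal sketch in C. The name ToyTime and both helper functions are hypothetical, and treating the value as unsigned is an assumption; the slide only says "64 bit integer".

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: an unsigned 64-bit nanosecond count stands in for GLtime. */
typedef uint64_t ToyTime;

#define NS_PER_MS 1000000ULL
#define NS_PER_S  1000000000ULL

/* Convert milliseconds to a nanosecond interval, guarding against overflow. */
ToyTime toy_time_from_ms(uint64_t ms) {
    assert(ms <= UINT64_MAX / NS_PER_MS);
    return ms * NS_PER_MS;
}

/* Whole seconds contained in a nanosecond interval. */
uint64_t toy_time_to_seconds(ToyTime t) {
    return t / NS_PER_S;
}
```

The width matters: a 32-bit nanosecond count would wrap after roughly 4.29 seconds, while a 64-bit count covers more than 500 years.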

Synchronization model in ARB_sync 1/2
A “sync object” is a primitive used for synchronization between CPU and GPU, CPU, or ‘something else’.
  - Sync object has state: type, condition, status
A sync object’s status can be signaled or non-signaled
  - When created, status is signaled unless a flag is set, in which case it is non-signaled
A “fence sync object” is a specific type of sync object
  - Provides partial finish semantics
  - Only type of sync object currently defined
A “fence” is a token inserted in the GL command stream
  - A sync object is not inserted into the command stream
  - Fence has no state
A fence is associated with a fence sync object.
  - Multiple fences can be associated with the same sync object
When a fence is inserted in the command stream, the status of its sync object is set to non-signaled
A fence, once completed, will set the status of its sync object to signaled
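The separation between stateless fences and stateful sync objects can be sketched as a toy model in C. Everything here (SyncObject, CommandStream, the stream_* functions, the fixed capacity) is hypothetical and only mirrors the rules listed on the slide; it is not the GL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types are illustrative, not the GL API. */
typedef enum { SYNC_UNSIGNALED = 0, SYNC_SIGNALED = 1 } SyncStatus;

typedef struct {
    SyncStatus status;      /* all mutable state lives on the sync object */
} SyncObject;

typedef enum { CMD_DRAW, CMD_FENCE } CmdKind;

typedef struct {
    CmdKind kind;
    SyncObject *sync;       /* set only for CMD_FENCE; the fence has no state */
} Command;

#define STREAM_CAPACITY 16

typedef struct {
    Command cmds[STREAM_CAPACITY];
    size_t head;            /* next command the simulated GPU executes */
    size_t tail;            /* next free slot */
} CommandStream;

/* Signaled on creation unless the caller asks for the unsignaled state. */
SyncObject sync_create(bool start_unsignaled) {
    SyncObject s = { start_unsignaled ? SYNC_UNSIGNALED : SYNC_SIGNALED };
    return s;
}

/* Queue a draw command; returns false if the stream is full. */
bool stream_draw(CommandStream *cs) {
    assert(cs != NULL);
    if (cs->tail == STREAM_CAPACITY) return false;
    cs->cmds[cs->tail++] = (Command){ CMD_DRAW, NULL };
    return true;
}

/* Inserting a fence resets its sync object to unsignaled. */
bool stream_fence(CommandStream *cs, SyncObject *s) {
    assert(cs != NULL && s != NULL);
    if (cs->tail == STREAM_CAPACITY) return false;
    cs->cmds[cs->tail++] = (Command){ CMD_FENCE, s };
    s->status = SYNC_UNSIGNALED;
    return true;
}

/* Execute one command; a completed fence signals its sync object.
   Returns false when the stream is empty. */
bool stream_step(CommandStream *cs) {
    assert(cs != NULL);
    if (cs->head == cs->tail) return false;
    Command c = cs->cmds[cs->head++];
    if (c.kind == CMD_FENCE) c.sync->status = SYNC_SIGNALED;
    return true;
}
```

In this model, several fences may point at the same SyncObject, and each one sets the status to signaled when it completes, as the slide describes.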

Synchronization model in ARB_sync 2/2
A wait function waits on a sync object, not on a fence
A poll function polls a sync object, not a fence
A wait function called on a sync object in the non-signaled state will block. It unblocks when the sync object transitions to the signaled state.

Example – RTT with two contexts
Context A
    sync_objectA = glCreateSync(attrib);
    glFence(sync_objectA);
    glFlush(); // Prevent deadlock
Context B
    glClientWaitSync(sync_objectA, 0, GL_FOREVER);
    glBindTexture(….); // Just rendered
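Why the flush in Context A prevents a deadlock can be illustrated with a small simulation in C. This is a sketch under a stated assumption: commands recorded by a context stay in a client-side buffer until an explicit flush, so a fence that was never flushed can never complete. ToyContext, ToySync and all functions are hypothetical names; real drivers may flush at other times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAP 8

typedef struct { bool signaled; } ToySync;

typedef struct {
    ToySync *buffered[TOY_CAP];   /* fences recorded but not yet sent to the GPU */
    int n_buffered;
    ToySync *submitted[TOY_CAP];  /* fences the simulated GPU can see */
    int n_submitted;
    int next;                     /* index of the next submitted fence to complete */
} ToyContext;

/* Record a fence in the context's buffer and reset its sync object. */
bool ctx_fence(ToyContext *c, ToySync *s) {
    assert(c != NULL && s != NULL);
    if (c->n_buffered == TOY_CAP) return false;
    s->signaled = false;
    c->buffered[c->n_buffered++] = s;
    return true;
}

/* Hand all buffered fences to the simulated GPU. */
void ctx_flush(ToyContext *c) {
    assert(c != NULL);
    assert(c->n_submitted + c->n_buffered <= TOY_CAP);
    for (int i = 0; i < c->n_buffered; i++)
        c->submitted[c->n_submitted++] = c->buffered[i];
    c->n_buffered = 0;
}

/* The GPU completes one submitted fence per tick; false if it has no work. */
bool gpu_tick(ToyContext *c) {
    assert(c != NULL);
    if (c->next == c->n_submitted) return false;
    c->submitted[c->next++]->signaled = true;
    return true;
}

/* Bounded stand-in for a wait issued from another context:
   gives up after max_ticks instead of blocking forever. */
bool wait_bounded(ToyContext *producer, const ToySync *s, int max_ticks) {
    assert(producer != NULL && s != NULL);
    for (int t = 0; t < max_ticks && !s->signaled; t++)
        gpu_tick(producer);
    return s->signaled;
}
```

Without the call to ctx_flush, wait_bounded never sees the fence complete, which corresponds to Context B blocking forever in the example.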

OS specific functionality
Convert sync object to the window system native event primitive
  - Allows applications to synchronize all events in a system using one API
All operations on the sync object are reflected in the OS event and vice versa
Both the sync object and the OS event are valid to use in your code
On Windows, convert to an Event
    HANDLE wglConvertSyncToEvent(object sync);
  - Need to specify, when sync object is created, that it can be converted to OS event
  - Separate extension: WGL_ARB_sync_event
On Unix, convert to a file descriptor, X event or semaphore?
  - Still TBD

Possible future functionality
Add a WaitForMultipleSync(uint *sync_objects, ….) command
  - Synchronize with multiple sync objects at once
Add a “payload” to a fence
  - For example, the time it completed
Allow one GPU stream to wait for another GPU stream
  - WaitSync(sync_object);
A sync object whose status will pulse with every vblank
A sync object that can signal when data binding has completed
  - As opposed to when rendering has completed using the data

Example – Streaming video processing
Loop
    Draw frame 1 // To a FBO, for example
    glFence(sync_object1); // Inserts a fence in the command stream
    Draw frame 2
    glFence(sync_object2);
    while (glClientWaitSync(sync_object1, 0, 0) != GL_ALREADY_SIGNALED)
        // App uses CPU cycles instead of blocking
    Read back data in frame 1
    while (glClientWaitSync(sync_object2, 0, 0) != GL_ALREADY_SIGNALED)
        // App uses CPU cycles instead of blocking
    Read back data in frame 2

Variation with asynchronous read back
Loop
    Draw frame 1 // To a FBO, for example
    Read back frame 1 into PBO 1 // Asynchronous readback
    glFence(sync_object1); // Inserts a fence in the command stream
    Draw frame 2
    Read back frame 2 into PBO 2
    glFence(sync_object2);
    while (glClientWaitSync(sync_object1, 0, 0) != GL_ALREADY_SIGNALED)
        // App uses CPU cycles instead of blocking
    glMapBuffer(…); // Access the data of frame 1 in PBO 1
    while (glClientWaitSync(sync_object2, 0, 0) != GL_ALREADY_SIGNALED)
        // App uses CPU cycles instead of blocking
    glMapBuffer(…); // Access the data of frame 2 in PBO 2
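The polling pattern in the two streaming examples can be sketched as a deterministic simulation in C. This is a toy under stated assumptions: each frame has its own fence, the simulated GPU completes fences in submission order, and one unit of CPU work is done per unsuccessful poll. ToyFence, toy_poll, toy_gpu_tick and process_frames are hypothetical names, not GL calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy fence: counts the GPU ticks still needed; signaled when it reaches 0. */
typedef struct { int ticks_left; } ToyFence;

/* Non-blocking status test, like a wait with a zero timeout. */
static bool toy_poll(const ToyFence *f) {
    return f->ticks_left == 0;
}

/* Advance the simulated GPU by one tick; work completes in submission order. */
static void toy_gpu_tick(ToyFence fences[], int n) {
    for (int i = 0; i < n; i++) {
        if (fences[i].ticks_left > 0) {
            fences[i].ticks_left--;
            return;
        }
    }
}

/* Poll each frame's own fence, doing CPU work while it is unsignaled.
   Records the order in which frames become readable and returns the
   number of CPU work units done while waiting. */
int process_frames(ToyFence fences[], int n, int order[]) {
    assert(fences != NULL && order != NULL && n >= 0);
    int cpu_work = 0;
    int done = 0;
    for (int i = 0; i < n; i++) {
        while (!toy_poll(&fences[i])) {
            cpu_work++;                 /* useful work instead of blocking */
            toy_gpu_tick(fences, n);
        }
        order[done++] = i;              /* frame i is now safe to read back */
    }
    return cpu_work;
}
```

The point of the pattern is that the application stays busy while the GPU finishes each frame, and it only touches a frame's data after that frame's fence has been observed as signaled.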

Differences with GL_NV_fence
No separation of sync objects and fences in NV_fence
  - NV version only has fence objects
  - Fence object has state
Creation of sync object and inserting a fence in one command
  - SetFenceNV creates and inserts a fence (old object model)
NV_fence objects are not shared across contexts

API Overview 1/2
Create a sync attribute object
    object CreateSyncAttrib();
  - SYNC_TYPE has to be FENCE
  - SYNC_CONDITION has to be SYNC_PRIOR_COMMANDS_COMPLETE
  - SYNC_STATUS SIGNALED or UNSIGNALED
Create the sync object
    object CreateSync(object attrib);
Insert a fence, associated with a sync object, into command stream
    void Fence(object sync);

API Overview 2/2
Wait or test the status of a fence sync object
    enum ClientWaitSync(object sync, uint flags, time timeout);
  - Blocks until sync is signaled or timeout has expired
  - If timeout == 0, does not block, returns the status of sync
  - If timeout == FOREVER, call does not time out
  - Optionally will flush before blocking
  - Returns one of three values: ALREADY_SIGNALED, TIMEOUT_EXPIRED, CONDITION_SATISFIED
Signal or unsignal a sync object
    void SignalSync(object sync, enum mode);
  - If status transitions from unsignaled to signaled, ClientWaitSync will unblock
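The three return values of the wait call can be illustrated with a single-threaded simulation in C. This is a sketch, not the extension's implementation: blocking is modeled by advancing a simulated clock, and the names ToySync, toy_client_wait, WaitResult and TIMEOUT_FOREVER are hypothetical. It also assumes that a zero-timeout poll of an unsignaled object reports TIMEOUT_EXPIRED; the slide only says that the status is returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    WAIT_ALREADY_SIGNALED,     /* signaled before the call was made */
    WAIT_TIMEOUT_EXPIRED,      /* still unsignaled when the timeout ran out */
    WAIT_CONDITION_SATISFIED   /* became signaled during the wait */
} WaitResult;

/* Hypothetical stand-in for the FOREVER timeout value. */
#define TIMEOUT_FOREVER UINT64_MAX

/* Toy sync object: becomes signaled once the clock reaches signal_at_ns. */
typedef struct { uint64_t signal_at_ns; } ToySync;

/* Simulated wait: *now_ns is advanced instead of actually blocking. */
WaitResult toy_client_wait(const ToySync *s, uint64_t *now_ns,
                           uint64_t timeout_ns) {
    assert(s != NULL && now_ns != NULL);
    if (*now_ns >= s->signal_at_ns)
        return WAIT_ALREADY_SIGNALED;

    uint64_t remaining = s->signal_at_ns - *now_ns;
    if (timeout_ns == TIMEOUT_FOREVER || remaining <= timeout_ns) {
        *now_ns = s->signal_at_ns;      /* "block" until the signal arrives */
        return WAIT_CONDITION_SATISFIED;
    }

    *now_ns += timeout_ns;              /* "block" for the whole timeout */
    return WAIT_TIMEOUT_EXPIRED;
}
```

With a timeout of 0 the simulated clock never advances, which corresponds to the non-blocking status test described above.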
