Presentation is loading. Please wait.

Presentation is loading. Please wait.

Gameworks.nvidia.com | GDC 2015 Nathan Reed — Developer Technology Engineer, NVIDIA Dean Beeler — Software Engineer, Oculus VR Direct: How NVIDIA Technology.

Similar presentations


Presentation on theme: "Gameworks.nvidia.com | GDC 2015 Nathan Reed — Developer Technology Engineer, NVIDIA Dean Beeler — Software Engineer, Oculus VR Direct: How NVIDIA Technology."— Presentation transcript:

1 gameworks.nvidia.com | GDC 2015 Nathan Reed — Developer Technology Engineer, NVIDIA Dean Beeler — Software Engineer, Oculus VR Direct: How NVIDIA Technology Is Improving the VR Experience

2 gameworks.nvidia.com | GDC 2015 Nathan Reed NVIDIA DevTech — 2 yrs Previously: game graphics programmer at Sucker Punch Dean Beeler Oculus — 2 yrs Previously: emulation, drivers, mobile dev, kernel Who We Are

3 gameworks.nvidia.com | GDC 2015 Headset design Input Rendering performance Experience design Hard Problems of VR

4 gameworks.nvidia.com | GDC 2015 Latency Scott W. Vincent Motion to photons in ≤ 20 ms Franklin Heijnen

5 gameworks.nvidia.com | GDC 2015 Stereo Rendering Two eyes, same scene

6 gameworks.nvidia.com | GDC 2015 Various NV hardware & software technologies Targeted at VR rendering performance Reduce latency Accelerate stereo rendering What Is VR Direct?

7 gameworks.nvidia.com | GDC 2015 VR Direct Components In This Talk Asynchronous Timewarp VR SLI

8 gameworks.nvidia.com | GDC 2015 Latency Frame Queuing Timewarp Late-Latching Constants Asynchronous Timewarp

9 gameworks.nvidia.com | GDC 2015 Frame Queuing CPU queue GPU Scanout Time Frame NFrame N+1… Frame NFrame N+1Frame N−1… Frame N…Frame N−1Frame N+1 Frame N…Frame N−1 Frame N…Frame N−1 Frame N+1 …

10 gameworks.nvidia.com | GDC 2015 Frame Queuing CPU GPU Scanout Time Frame NFrame N+1… Frame NFrame N−1 Frame N…Frame N−1 Frame N+1

11 gameworks.nvidia.com | GDC 2015 Timewarp

12 gameworks.nvidia.com | GDC 2015 Very effective at reducing latency...of rotation! Fortunately, that’s the most important Doesn’t help translation! Doesn’t help other input latency Doesn’t help if vsync is missed Timewarp Pros & Cons

13 gameworks.nvidia.com | GDC 2015 Timewarp Pipeline Bubbles CPU GPU Scanout Time Frame N Frame N−1 Timewarp Wait Idle Frame N+1 Vsync

14 gameworks.nvidia.com | GDC 2015 Late-Latching Constants CPU GPU Scanout Time Frame N … Frame N−1 Timewarp Frame N+1Wait Frame N+1 Vsync …

15 gameworks.nvidia.com | GDC 2015 Update constants after render commands queued NO_OVERWRITE / persistently-mapped buffer GPU sees latest data when it renders Still doesn’t help with missed vsync Late-Latching Constants

16 gameworks.nvidia.com | GDC 2015 Asynchronous Timewarp CPU GPU Scanout Timewarp Time Frame N … Frame N−1 Frame N+1 … Vsync …

17 gameworks.nvidia.com | GDC 2015 GPU Space Vs Time Time GPU Resources (Space)

18 gameworks.nvidia.com | GDC 2015 Main Rendering Space-Multiplexing Timewarp Time GPU Resources (Space) VsyncVsync

19 gameworks.nvidia.com | GDC 2015 Time-Multiplexing Main Rendering Time GPU Resources (Space) Vsync Timewarp Vsync

20 gameworks.nvidia.com | GDC 2015 Prevents worst case: stuck image on headset Patches up occasional stutters Doesn’t help translation Doesn’t help other input latency Doesn’t help animation stuttering due to low FPS Async Timewarp Pros & Cons

21 gameworks.nvidia.com | GDC 2015 NV driver supports high-priority graphics context Time-multiplexed — takes over entire GPU Main rendering → normal context Timewarp rendering → high-pri context High-Priority Context

22 gameworks.nvidia.com | GDC 2015 Async Timewarp With High-Pri Context Render thread GPU Warp thread Time Frame N …Frame N+1 … Vsync Preempt Vsync Preempt

23 gameworks.nvidia.com | GDC 2015 Fermi, Kepler, Maxwell: draw-level preemption Can only switch at draw call boundaries! Long draw will delay context switch Future GPU: finer-grained preemption Preemption

24 gameworks.nvidia.com | GDC 2015 NvAPI_D3D1x_HintCreateLowLatencyDevice() Applies to next D3D device created Fermi, Kepler, Maxwell / Windows Vista+ NDA developer driver available now Direct3D High-Priority Context

25 gameworks.nvidia.com | GDC 2015 EGL_IMG_context_priority Adds priority attribute to eglCreateContext Available on Tegra K1, X1 Including SHIELD console Only for EGL (Android) at present WGL (Windows), GLX (Linux) to come OpenGL High-Priority Context

26 gameworks.nvidia.com | GDC 2015 Still try to render at headset native framerate! Async timewarp is a safety net Hide occasional hitches / perf drops Not for upsampling framerate Developer Guidance

27 gameworks.nvidia.com | GDC 2015 Avoid long draw calls Current GPUs only preempt at draw call boundaries Async timewarp can get stuck behind long draws Split up draws that take >1 ms or so E.g. heavy postprocessing Split into screen-space tiles Developer Guidance

28 gameworks.nvidia.com | GDC 2015 Translation warping Using depth buffer, layered images, etc. Motion extrapolation Using velocity buffer GSYNC Tricky with low-persistence display Future Work

29 gameworks.nvidia.com | GDC 2015 Reduce queued frames to 1 Timewarp: adjusts rendered image for late head rotation Async timewarp: safety net for missed vsync NVIDIA enables async timewarp via high-pri context Latency TL;DR

30 gameworks.nvidia.com | GDC 2015 Stereo Rendering Multiview Rendering VR SLI

31 gameworks.nvidia.com | GDC 2015 Frame Pipeline Which stages must be done twice for stereo? Find visible objects Submit render commands Driver internal work CPU Transform geometry Rasterization Shading GPU

32 gameworks.nvidia.com | GDC 2015 Flexibility vs Optimizability More flexible — all stages separate Left Right

33 gameworks.nvidia.com | GDC 2015 Flexibility vs Optimizability More optimizable — some stages shared Left Right Shared

34 gameworks.nvidia.com | GDC 2015 Almost the same visible objects Almost the same render commands Almost the same driver internal work Almost the same geometry rendered Stereo Views

35 gameworks.nvidia.com | GDC 2015 Cubemaps: 6 faces Shadow maps Several lights in one scene Slices of a cascaded shadow map Light probes for GI Many probe positions in one scene Other Multi-View Scenarios

36 gameworks.nvidia.com | GDC 2015 Submit scene render commands once All draws, states, etc. broadcast to all views API support for limited per-view state Saves CPU rendering cost Maybe GPU too — depending on impl! Multiview Rendering

37 gameworks.nvidia.com | GDC 2015 Shader Multiview API VS Tess & GS VS Tess & GS RastPS RastPS ViewID = 0 ViewID = 1

38 gameworks.nvidia.com | GDC 2015 Hardware Multiview APIVS Tess & GS RastPS RastPS ViewMatrix[0] ViewMatrix[1]

39 gameworks.nvidia.com | GDC 2015 Shading Reuse APIVS Tess & GS RastPS RastPS Share work

40 gameworks.nvidia.com | GDC 2015 VR SLI API Left Right Shared command stream

41 gameworks.nvidia.com | GDC 2015 Interlude: AFR SLI CPU GPU0 GPU1 Scanout Time … NN−2 N+1N−1 N+2 N+3 … NN+1N+2… … NN+1N+2N−1…

42 gameworks.nvidia.com | GDC 2015 VR SLI CPU GPU0 GPU1 Scanout … N leftN−2 L NN+1N+2… NN+1N+2N−1… N+1 L… N rightN−2 RN+1 R… Time

43 gameworks.nvidia.com | GDC 2015 GPU 1 MemoryGPU 0 Memory VR SLI Same resources & commands

44 gameworks.nvidia.com | GDC 2015 VR SLI APIEngine Per-GPU state: Constant buffers Viewports

45 gameworks.nvidia.com | GDC 2015 VR SLI Blit GPU1 → GPU0 over PCIe bus

46 gameworks.nvidia.com | GDC 2015 View-independent work (e.g. shadow maps) is duplicated Scaling depends on proportion of view-dependent work VR SLI Scaling

47 gameworks.nvidia.com | GDC 2015 Blitting between GPUs uses PCIe bus PCIe 2.0 x16: ~8 GB/sec = ~1 ms / eye view PCIe 3.0 x16: ~16 GB/sec = ~0.5 ms / eye view Dedicated copy engine Non-dependent rendering can continue during blit Cross-GPU Blit

48 gameworks.nvidia.com | GDC 2015 Distortion vs SLI Distortion before or after cross-GPU blit? After Lower latency Future-compatible with Oculus SDK updates Before Distortion uses both GPUs 40% less data to transfer

49 gameworks.nvidia.com | GDC 2015 Currently D3D11 only Fermi, Kepler, Maxwell / Windows 7+ Developer driver available now OpenGL and other APIs: to come API Availability

50 gameworks.nvidia.com | GDC 2015 Teach your engine the concept of a “multiview set” Related views that will be rendered together Currently: for (each view) find_objects(); for (each object) update_constants(); render(); Developer Guidance

51 gameworks.nvidia.com | GDC 2015 Multiview: find_objects(); for (each object) for (each view) update_constants(); render(); Developer Guidance

52 gameworks.nvidia.com | GDC 2015 Keep track of which render targets store stereo data May need to be marked or set up specially Or allocated as a texture array, etc. Keep track of sync points Where you need all views finished before continuing May need to blit between GPUs Developer Guidance

53 gameworks.nvidia.com | GDC 2015 Multiview: submit scene once, save CPU overhead Requires some engine integration Range of possible implementations Trade off flexibility vs optimizability VR SLI: a GPU per eye Stereo Rendering TL;DR

54 gameworks.nvidia.com | GDC 2015 Variety of VR-related APIs coming in near future Reduce latency Reduced frame queuing Enable async timewarp & other improvements Accelerate stereo rendering Multiview APIs VR SLI VR Direct Recap

55 gameworks.nvidia.com | GDC 2015 Fermi, Kepler, Maxwell D3D11: context priorities and VR SLI NDA developer driver available now Android: EGL_IMG_context_priority Other APIs/platforms: to come VR Direct API Availability

56 gameworks.nvidia.com | GDC 2015 All this stuff is hot out of the oven! Will need more iterations before it settles See what works, revise APIs as needed Consolidate & standardize across industry What Next?

57 gameworks.nvidia.com | GDC us: Slides will be posted: https://developer.nvidia.com/gdc-2015 Questions & Comments?


Download ppt "Gameworks.nvidia.com | GDC 2015 Nathan Reed — Developer Technology Engineer, NVIDIA Dean Beeler — Software Engineer, Oculus VR Direct: How NVIDIA Technology."

Similar presentations


Ads by Google