Fu-Chung Huang and Ravi Ramamoorthi, University of California, Berkeley


Sparsely Precomputing the Light Transport Matrix for Real Time Rendering
Fu-Chung Huang and Ravi Ramamoorthi, University of California, Berkeley

(Hi, good morning everyone, and thanks for coming to my talk this early.) My paper is "Sparsely Precomputing the Light Transport Matrix for Real Time Rendering"; I'm Fu-Chung from UC Berkeley.

Precomputed Radiance Transfer (PRT) [Sloan et al. 02] [Ng et al. 03, 04] [Liu et al. 04] [Wang et al. 04] is a recent development for real-time rendering. It delivers natural environment lighting, intricate shadows, and glossy materials by precomputation, making rapid prototyping of lighting design possible for games and movies. Although we can achieve interactive-rate rendering, the time spent in precomputation is very long. In this case, the precomputation takes 13 hours 22 minutes, and that time becomes a bottleneck that prevents people from using the method. Our goal in this paper is to accelerate the precomputation, so we first want to define the problem.

The Problem

To render this scene, we have to consider the reflectance equation [Ng et al. 04]. For a point x (the red dot), the outgoing radiance B(x) is given by an integral over all directions of the surrounding lighting L, the visibility V, and the BRDF: B(x) = ∫ L(ω) V(x, ω) ρ(x, ω) dω. The key insight is that the lighting changes dynamically and is separable from the rest of the equation, so we can combine the information relevant only to the static scene by precomputing V and the BRDF into the transport function T. Then, in real time, we just need to multiply the lighting with the transport function. Notice that by the nature of precomputation we are also limited to static geometry. Beyond that, what is the problem with precomputing the transport function T?
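To make the separation concrete, here is a toy sketch (sizes and values are made up; a real scene has on the order of 50K vertices and 24K directions): once the static visibility and BRDF terms are folded into T offline, relighting under any new environment is a single matrix-vector multiply.

```python
import numpy as np

# Toy sizes: 4 vertices, 8 light directions.
rng = np.random.default_rng(0)
visibility = (rng.random((4, 8)) > 0.3).astype(float)  # V(x, w): 1 if unoccluded
brdf_cos = rng.random(8)                               # BRDF * cosine term per direction
d_omega = 4.0 * np.pi / 8                              # solid angle per direction sample

# Offline precomputation: fold everything static into the transport matrix T.
T = visibility * brdf_cos * d_omega                    # 4 x 8

# Runtime: relighting under a new environment is one matrix-vector multiply.
L = rng.random(8)                                      # dynamic environment lighting
B = T @ L                                              # outgoing radiance per vertex
```

The precomputation cost is in filling T: one ray per cell, which is exactly what this paper attacks.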

Storage and Time

Assume a typical scene with 50K vertices (rows), each with 24K angular directions to sample (columns): that is 1.2B rays to trace and 1.2B cells to store. The immediate problem is how to compress the data. For example, a wavelet transform reduces the coefficients from several thousand to a few hundred, and they can be further quantized to 6-8 bits. In the other direction, exploiting spatial coherence, Clustered PCA groups vertices into small clusters and compresses each with very few PCA bases. Combined, these give a 10x to 50x compression rate, but they require additional computation, and notice: even though the storage is reduced, the precomputation time is not reduced at all! So the question is: can we find coherence at the sampling stage to reduce the sampling time?
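The CPCA idea can be sketched in a few lines (a toy, not the paper's implementation): a cluster of transport rows that is nearly low-rank is stored as a mean, a few PCA basis rows, and per-vertex coefficients.

```python
import numpy as np

rng = np.random.default_rng(8)
# A cluster of 500 transport rows that is approximately rank-4 plus small noise.
basis = rng.random((4, 1000))
cluster = rng.random((500, 4)) @ basis + 1e-3 * rng.standard_normal((500, 1000))

mean = cluster.mean(axis=0)
U, s, Vt = np.linalg.svd(cluster - mean, full_matrices=False)
k = 4                                                  # PCA bases kept per cluster
approx = mean + (U[:, :k] * s[:k]) @ Vt[:k]            # reconstructed cluster

# Storage: 500*1000 raw cells vs. 500*k coefficients + k*1000 basis rows + 1000 mean.
ratio = (500 * 1000) / (500 * k + k * 1000 + 1000)
rel_err = np.linalg.norm(approx - cluster) / np.linalg.norm(cluster)
```

Note the catch the slide points out: every cell of `cluster` still had to be sampled before compression, so the compression saves storage but not precomputation time.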

Precomputation Time: Buddha Scene

The answer is yes: we can render at the same quality with reduced precomputation time. In this example, sparse sampling takes 2 hours 36 minutes instead of the 13 hours 22 minutes of full sampling, a 5x speed-up in precomputation.

Precomputation Time: Bunny Scene

Here is another example with the bunny model: sparse sampling takes 3 hours 25 minutes versus 13 hours 8 minutes for full sampling, again around a 4x speed-up in precomputation. Before presenting our algorithm, I first want to review some important literature.

Outline

- Motivation / Introduction
- Related Work
- Algorithm
- Results
- Conclusion / Future Work

Related Work: Precomputation-Based Rendering

The idea of PRT can be traced back 15 years [Nimeroff et al. 94] [Dorsey et al. 95], and it was formally introduced by Sloan et al. [Sloan et al. 02, 03], who used spherical harmonics to achieve real-time low-frequency shadows and later proposed CPCA compression. Ng et al. [Ng et al. 03, 04] used a non-linear wavelet approximation to achieve all-frequency shadows, and later introduced the notion of the triple product integral. Wang et al. [Wang et al. 04, 06] showed that glossy materials are also possible with all-frequency shadows. These are just representative seminal papers; there is still a lot of related work at SIGGRAPH and EGSR every year. So what is the problem with them? These methods keep adding new functionality for real-time rendering, but the sampling time is still not reduced.

Related Work: Compressive Sensing

Compressive sensing is an interesting line of research on sparsely sampling data. Candès et al. [Candes et al. 06] [Candes and Tao 06] showed that a signal of length N can be recovered from k log N samples if it has a k-sparse representation. The result has been applied to appearance acquisition by Peers et al. [Peers et al. 09] and Sen and Darabi [Sen and Darabi 09] (Sen and Darabi also have a new paper appearing later in this session). However, the approach is not applicable to PRT, because there is no random-pattern sampling in a virtual scene: ray tracing must sample one direction at a time. So we need something that works with point samples.

Related Work: Row-Column Sampling

Row-column sampling is a method we can build on: it samples certain rows and columns of the matrix, then recovers or approximates the entire matrix. Hasan et al. [Hasan et al. 07] sparsely sample the matrix for the many-light problem: they first sample some rows, and by finding clusters they use the cluster centers to represent the matrix. Wang et al. [Wang et al. 09] use the Nystrom method: with certain rows R and columns C sampled, along with their intersection block A, the full matrix B can be recovered as B = C A⁺ R. These methods are the closest references to our work, but they use a fixed column selection across all rows.
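The Nystrom-style recovery is worth seeing concretely (a toy sketch, not Wang et al.'s code): when the matrix is exactly low-rank and the sampled intersection block captures that rank, C A⁺ R reconstructs the matrix exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
# Build an exactly rank-2 matrix so the reconstruction is exact.
M = rng.random((20, 2)) @ rng.random((2, 30))

rows = [0, 5, 11]           # sampled row indices
cols = [2, 7, 19]           # sampled column indices
R = M[rows, :]              # sampled rows
C = M[:, cols]              # sampled columns
A = M[np.ix_(rows, cols)]   # intersection block

M_hat = C @ np.linalg.pinv(A) @ R   # Nystrom-style recovery: B = C A+ R
```

For a genuinely higher-rank light transport matrix the recovery is only approximate, which is why the fixed column selection becomes a limitation.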

Related Work: Hierarchical and Sparse Sampling

Finally, I want to mention other work in offline rendering. In the sense of sparse sampling plus interpolation, our method is related to irradiance caching and hierarchical methods [Kontkanen et al. 06] [Hasan et al. 07] [Lehtinen et al. 08] [Krivanek and Gautron 09]. However, we focus on drastically changing high-frequency shadows, while those methods focus on smooth indirect illumination. Adaptive methods [Guo 98] [Krivanek et al. 04] such as adaptive remeshing can be seen as complementary to ours. These methods exploit spatial coherence, but the angular coherence is not utilized.

Outline: Algorithm

After reviewing previous work, let's talk about our method.

Algorithm Outline

- Overview
- Dense Vertex Sampling
- Sparse Vertex Sampling
- Integrating Clustered PCA

I will start with a simple overview.

Overview: Two-Phase Sampling

Our method uses a two-phase sampling strategy that separates the scene into two kinds of vertices. The first kind, called dense vertices, have their angular directions sampled fully or densely. The second kind, called sparse vertices, have their angular directions sampled sparsely; using information from neighboring dense vertices, we can reconstruct them later. Around 20-25% of the vertices are dense, with an angular sampling rate of around 30%. The remaining 75-80% are sparse, with an angular sampling rate of only 5-7%. We can also view the problem in a row-column sampling way.

Overview: Two-Phase Sampling in the Row-Column Sense

Given the matrix, we first sample 20-25% of the rows as dense vertices, with a column sampling rate of about 30%, which accounts for 6-7% of the total cost. The remaining 75-80% of the rows are sparse vertices; we sample their columns at 5-7%, for a total cost of 4-5%. In the next few slides I will explain the sampling for dense vertices.
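The total cost is just the product of spatial and angular rates; a quick check of the arithmetic, using the midpoints of the quoted ranges:

```python
# Dense vertices: ~22.5% of the rows, each sampled at ~30% of the directions.
dense_cost = 0.225 * 0.30            # fraction of all matrix cells -> 0.0675
# Sparse vertices: ~77.5% of the rows, each sampled at ~6% of the directions.
sparse_cost = 0.775 * 0.06           # -> 0.0465
total = dense_cost + sparse_cost     # -> ~0.114, i.e. the ~11% overall rate
```

This matches the ~11% sampling rate reported in the performance tables later.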

Algorithm Outline: Dense Vertex Sampling

In this section, I will describe, for dense vertices: where to place them, how to choose them, and which angular directions to trace rays in.

Dense Vertex Distribution

We begin with an observation from CPCA: the cluster sizes are not uniform. On further inspection, we find that large clusters have low rank and small clusters have high rank. This hints that we should place more samples in high-rank areas. But how? Without prior knowledge, we don't have the rank information in advance.

Dense Vertex Spatial Sampling: Sampling by Exploration

We gather the rank information during the sampling iterations. In the first iteration we sample uniformly, and use those samples to compute a local rank, which assigns each vertex a probability for the next iteration's sampling. Notice the red box in the slide, the high-rank area: it becomes more likely to be sampled. In the second iteration we again compute ranks to set the next iteration's probabilities, and so on. In the zoom-up you can see the samples concentrating in high-rank regions. Having described how to sample spatially, I will next describe which angular directions to sample.
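The exploration loop can be sketched as follows (a toy 1D version with invented sizes and windows; the paper's actual rank estimator and neighborhoods differ): uniform sampling first, then each iteration re-weights the sampling probability by the numerical rank of nearby sampled transport rows.

```python
import numpy as np

rng = np.random.default_rng(2)

def local_rank(rows, tol=1e-6):
    """Numerical rank of the transport rows sampled near a vertex."""
    s = np.linalg.svd(rows, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# Toy transport matrix over 100 vertices: the first half is a smooth
# (rank-1) region, the second half a complex (full-rank) region.
T = np.vstack([np.outer(np.ones(50), rng.random(16)),
               rng.random((50, 16))])

chosen = list(rng.choice(100, 10, replace=False))      # iteration 1: uniform
for _ in range(3):                                     # iterations 2..4
    prob = np.ones(100)
    for v in range(100):                               # local rank -> probability
        near = [c for c in chosen if abs(c - v) < 15]
        if len(near) >= 2:
            prob[v] = local_rank(T[near, :])
    prob = prob / prob.sum()
    chosen += list(rng.choice(100, 10, replace=False, p=prob))
```

Samples tend to concentrate in the high-rank half, which is the behavior the slide illustrates.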

Dense Adaptive Angular Sampling

We use adaptive angular sampling to find features. In the first pass we sample regularly at a lower resolution. In the second pass, for every remaining direction, we check the sampled values in a surrounding window; if those values are inconsistent, we sample that direction too. This way, samples land around the features, giving around 50-70% savings.
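The two-pass idea in miniature (the ray tracer is replaced by a stand-in function with one hard shadow edge; window size and threshold are invented for illustration):

```python
import numpy as np

def trace(d):
    """Stand-in for tracing one ray: a hard shadow edge at direction 40."""
    return 1.0 if d >= 40 else 0.0

n_dirs = 64
values = np.full(n_dirs, np.nan)

# Pass 1: regular coarse sampling, every 4th direction.
for d in range(0, n_dirs, 4):
    values[d] = trace(d)

# Pass 2: fill in a skipped direction only if its window endpoints disagree.
for d in range(n_dirs):
    if np.isnan(values[d]):
        lo = 4 * (d // 4)
        hi = lo + 4
        left, right = values[lo], values[hi] if hi < n_dirs else values[lo]
        if abs(left - right) > 1e-3:           # inconsistent -> likely a feature
            values[d] = trace(d)

sampled = int(np.sum(~np.isnan(values)))        # 19 of 64 directions traced
```

Here 19 of 64 directions are traced, roughly the 70% saving the slide quotes, while every direction near the shadow edge is still captured.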

Algorithm Outline: Sparse Vertex Sampling

After sampling the dense vertices in the spatial and angular domains, I want to show how to deal with the sparse vertices: angular sampling and local reconstruction.

Sparse Vertex Angular Sampling

Remember from the overview: for a sparse vertex we sample the angular directions sparsely and use the light transport from its neighboring dense vertices for reconstruction. The first problem is determining the sparse set of important angular features. We start from the neighboring dense vertices: from the earlier feature-adaptive sampling, we know where their features are, so we select from the union of all those feature directions. But there are too many of them, so we use the variances of the neighboring angular features to cut off the unimportant ones, arriving at a weighted selection of 5-7% of the directions. Once we have selected this sparse set of angular features, I will describe how to reconstruct the transport function.
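A toy version of the selection step (feature sets, values, and the cut-off count are all invented; the paper's weighting is more involved): union the neighbors' feature directions, then keep the highest-variance ones.

```python
import numpy as np

rng = np.random.default_rng(4)
# Feature directions detected at three neighboring dense vertices (hypothetical),
# out of 24 angular directions.
neighbor_features = [{3, 7, 12}, {7, 12, 20}, {7, 14}]
T_dense = rng.random((3, 24))            # the neighbors' transport values
T_dense[:, 7] = [0.0, 1.0, 0.0]          # direction 7 varies strongly across neighbors

candidates = sorted(set().union(*neighbor_features))   # the union: still too many
variances = T_dense[:, candidates].var(axis=0)
order = sorted(zip(candidates, variances), key=lambda t: -t[1])
keep = [c for c, _ in order[:3]]         # variance cut-off: keep only the top few
```

Directions where the neighbors disagree (here direction 7) survive the cut, since disagreement is exactly where a sparse sample is informative.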

Sparse Vertex Reconstruction

We use a very simple linear system for reconstruction. The assumption is that the transport function of a sparse vertex is some linear combination of those of its neighboring dense vertices, so the values at the sampled directions act as constraints: they must equal the same linear combination of the dense neighbors' values at those directions. By solving for the combination weights alpha, we can recover the full light transport. The final question is how many neighbors we should use.
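The linear system in miniature (toy sizes; a least-squares solve stands in for the solver, with the L1 variant discussed next): the few traced directions constrain alpha, and alpha then reproduces all directions.

```python
import numpy as np

rng = np.random.default_rng(5)
n_dirs, n_neighbors = 24, 4
T_dense = rng.random((n_neighbors, n_dirs))      # neighbors' full transport rows

# Ground truth: the sparse vertex really is a combination of its neighbors.
alpha_true = np.array([0.5, 0.3, 0.2, 0.0])
T_sparse_full = alpha_true @ T_dense

sampled = [1, 4, 7, 9, 15, 20]                   # the few directions actually traced
A = T_dense[:, sampled].T                        # 6 equations, 4 unknowns
b = T_sparse_full[sampled]
alpha, *_ = np.linalg.lstsq(A, b, rcond=None)    # solve for the weights

T_reconstructed = alpha @ T_dense                # recover all 24 directions
```

When the linear-combination assumption holds exactly, six sampled directions recover all twenty-four.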

Sparse Vertex Reconstruction: How Many Neighbors?

We experimented with increasing the radius to include more neighbors. Ideally, with full information on how to interpolate the sparse vertex, the error would keep decreasing (the green curve). In reality, least squares gives the blue curve, with increasing error, because it overfits the sparse constraints. Inspired by recent developments in sparse reconstruction, we use an L1 solver [Kim et al. 07] instead, which gives the lower-error red curve. And because of the sparse nature of the solution, we do not need to specify an exact radius.

Algorithm Outline: Integrating Clustered PCA

After the sampling for dense and sparse vertices, we further found that the algorithm fits nicely with CPCA.

Clustered PCA

The original CPCA algorithm [Sloan et al. 03] incrementally adds PCA bases to avoid local minima: for each PCA basis, it runs LBG clustering iterations, and this nested loop over all vertices becomes very expensive. Our observation is that sparse vertices are, by assumption, linear combinations of dense vertices. So we (1) run the nested clustering loop on the dense vertices only, (2) assign each sparse vertex to its nearest cluster, and (3) run the inner loop over all vertices just once. You can see the dense-only clustering and the final result on the slide. This gives a great speed-up, which we will show later.
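A schematic of the modified pipeline (plain k-means stands in here for CPCA's LBG/PCA clustering; sizes are invented): the expensive loop touches only the dense rows, and everything else is one cheap assignment plus one refinement pass.

```python
import numpy as np

rng = np.random.default_rng(7)

def kmeans(X, k, n_iter=20):
    """Plain k-means, standing in for CPCA's nested LBG clustering loop."""
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

dense = rng.random((60, 8))      # fully sampled transport rows
sparse = rng.random((140, 8))    # reconstructed transport rows

# 1. Run the expensive clustering loop on the dense vertices only.
centers = kmeans(dense, k=4)

# 2. Assign every vertex (dense and sparse) to its nearest existing cluster.
all_rows = np.vstack([dense, sparse])
labels = np.argmin(((all_rows[:, None] - centers[None]) ** 2).sum(-1), axis=1)

# 3. A single refinement pass over all vertices, instead of the full nested loop.
for j in range(4):
    if np.any(labels == j):
        centers[j] = all_rows[labels == j].mean(axis=0)
```

Since the sparse rows lie near combinations of dense rows, clustering the dense rows alone already places the centers well, which is why the single final pass suffices.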

Outline: Results

Here I will show the error analysis and the performance.

Analysis: Angular Sampling

We first compare our angular sampling strategy with previous work: Wang et al. [Wang et al. 09] and Hasan et al. [Hasan et al. 07]. Both methods are non-adaptive, meaning their angular selection is fixed, so many directional samples are wasted. Our method uses adaptive sampling, so samples go to the truly important directions.

Analysis: L2 Error for Bunny (35K + 29K vertices)

The bunny is a scanned model, so its vertices are distributed evenly. There is some higher error at the seams, but in general the error is low everywhere and not noticeable.

Analysis: L2 Error for Horse (8.5K + 29K vertices)

The horse, on the other hand, is a fairly simplified mesh: a low-resolution, man-made model whose vertex distribution is not uniform, so missing an important region can be costly. However, the error is still low in general and not noticeable in the rendering.

Performance

Precomputation time only; rendering is real-time with the same quality.

| Model  | Full Sampling | Sparse Sampling | Speed Up | CPCA | New CPCA | CPCA Speed Up |
|--------|---------------|-----------------|----------|------|----------|---------------|
| Horse  | 4h 2m         | 54m             | 4.47x    | 52m  | 4m       | 12.6x         |
| Dancer | 5h 40m        | 1h 15m          | 4.77x    | 45m  | 3m       |               |
| Bunny  | 13h 8m        | 3h 26m          | 3.83x    | 81m  | 6m       | 12.8x         |

The fully sampled versions took 4 to 13 hours. Our sparse method only samples about 11%, reducing the time to 1-3 hours, a 3-4x speed-up. For CPCA we achieve about a 12x speed-up.

Performance for Glossy Objects

We use the in-out factorization for glossy BRDFs [Liu et al. 04] [Wang et al. 04].

| Model     | Size | Full Sampling | Sparse Sampling | Rate   | Speed Up |
|-----------|------|---------------|-----------------|--------|----------|
| Armadillo | 55K  | 10h 7m        | 2h 17m          | 11.43% | 4.75x    |
| Buddha    |      | 13h 22m       | 2h 36m          | 11.11% | 5.12x    |
| Dragon    |      | 10h 31m       | 2h 1m           | 11.16% | 5.22x    |
| Bench     | 50K  | 7h 7m         | 1h 56m          | 18.54% | 3.67x    |

The scenes range from 50K to 55K vertices, with full precomputation times of 7 to 13 hours; with our method we can sample each in about 2 hours. Notice that the bench scene has many high-frequency features, so we send more samples to that particular scene. Overall we are still 3-5x faster.

Precomputation Time: Bench Scene

This is the bench scene I just mentioned, one extreme scene we use as a stress test. It has many high-frequency features, view-dependent highlights, and intricate shadows. Although this is a very difficult scene for sparse sampling, we can still precompute it within 2 hours (1 hour 56 minutes versus 7 hours 7 minutes for full sampling), a 3.6x speed-up.

Detailed Timing

Here is the system time breakdown: most of the cost is spent on ray tracing, and you can see that our overhead is minimal. In addition, although in our case 10% sampling accounts for a 5x speed-up, those rays go to the important regions that actually need more attention, just as the algorithm is designed to do.

Conclusion

Current PRT research still focuses on new real-time functionality, but the precomputation remains the bottleneck that prevents people from using it. In this paper we propose an adaptive and sparse sampling scheme that exploits both spatial and angular coherence, and we show that Clustered PCA compression can be integrated with the new scheme. Finally, we demonstrate a 5x speed-up in sampling and a 12x speed-up in computing the CPCA.

Future Work

Since most of our system time is still ray tracing, GPU acceleration could help, and we can expect interactive precomputation. That would enable new capabilities: even faster rapid prototyping and lighting design, and, with temporal coherence, dynamic scenes. We would also like to see a general theory of sparse sampling in the rendering community, since most parameters are currently tuned heuristically. In a broader context, sparse sampling is also important in appearance acquisition and offline rendering, and we would like to see more of it there.

The End

Acknowledgements: Ryan S. Overbeck; the anonymous reviewers; NSF CAREER Grant IIS-0924968; ONR PECASE N0001409-1-0741; Intel; NVIDIA; Adobe; Pixar.