Published by Kasey Addis. Modified over 2 years ago.

1
Numerical-Precision-Optimized Volume Rendering
Ingmar Bitter, Neophytos Neophytou, Klaus Mueller, Arie Kaufman

7
Outline
Numerical precision - a rendering resource
Fixed-point arithmetic
Reverse order precision analysis
–Compositing, shading, gradients, classification, sampling/splatting, sample/splat location
Results
Conclusions

10
Numerical Precision: A Resource
Double precision computation for all – ideal?
–slower than all other alternatives
–not possible on graphics cards (at least for now)
–expensive on custom chip implementations
–and most importantly: not needed to create best possible images!!
Reasons:
–predominantly 8-bit displays (per channel)
–limited range intervals throughout

11
Current Status
Stable volume rendering pipeline: both CPU and GPU [LL94, Lev88, MJC02, Wes90, EKE01, RSEB00]
Interpolation before classification, even for splatting [MMC99]
Caching optimized for volume rendering [Kni00, LCCK02, PSL98]
Precision-limited rendering systems: ATI, NVidia, VolumePro [PHK99], VizardII [MKW02], UltraVis [Kni00]
Completely fixed: final output image display bit precision
–8 bits per RGB color channel on CRTs and LCDs
–8 bits max in DVI standard
–SGI's 12-bit color displays are nearly extinct
–Radiologists’ requirements are not mass market, same analysis applies

12
OpenGL Arithmetic: $1^2 = 1$?
Representation: [0, 255], a = b = 255
Computation: a × b >> 8 = 254 (wrong)
–1 mult, 1 shift
Alternatively [Bli95]:
–tmp = a × b + 128; result = (tmp + (tmp >> 8)) >> 8 = 255 (correct)
–1 mult, 2 adds, 2 shifts
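
The two computations on this slide can be checked directly. The following is a minimal Python sketch; the function names `mul8_truncating` and `mul8_rounded` are illustrative and do not come from the slides.

```python
def mul8_truncating(a, b):
    """Multiply two [0, 255] values with a plain shift: 1 mult, 1 shift."""
    return (a * b) >> 8


def mul8_rounded(a, b):
    """Multiply two [0, 255] values with the form cited as [Bli95]:
    1 mult, 2 adds, 2 shifts."""
    tmp = a * b + 128
    return (tmp + (tmp >> 8)) >> 8


print(mul8_truncating(255, 255))  # 254: "1 * 1" comes out slightly below 1
print(mul8_rounded(255, 255))     # 255: "1 * 1" stays 1
```

The truncating version treats 255 as if it were 256, which is why full intensity times full intensity loses one step.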

13
OpenGL Arithmetic: $1^2 = 1$?
Representation: fixed-point I.Fb
–I.Fb = I integer bits, F fraction bits
–for an 8-bit 1.7b fixed-point number, a = b = 1 is stored as $1_{1.7b} = 128$
Computation: $a_{1.7b} \times b_{1.7b} \gg 7 = 128$ (correct)
–1 mult, 1 shift
–one fewer bit of resolution, but OK (we will see)
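
As a rough illustration of the I.Fb convention, the sketch below converts to fixed point and multiplies with a single shift. The helper names `to_fixed` and `fixed_mul` are assumptions made for this example, and the multiply truncates instead of rounding.

```python
def to_fixed(value, frac_bits):
    """Convert a real value to an integer holding frac_bits fractional bits."""
    return int(round(value * (1 << frac_bits)))


def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point numbers that share frac_bits: 1 mult, 1 shift."""
    return (a * b) >> frac_bits


one = to_fixed(1.0, 7)           # 1.0 in 1.7b
print(one)                       # 128
print(fixed_mul(one, one, 7))    # 128, so 1 * 1 == 1 exactly
```

Because 1.0 is a power of two in this representation, the shift divides by exactly the right amount and no correction term is needed.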

15
Reverse Order Precision Analysis
Unified ray casting and splatting pipelines
Composite creates the final image
Precision requirements propagate backwards
[Figure: pipeline stages. Ray casting: Sample Location, Sample. Splatting: Splat Location, Splat. Shared: Classify, Gradient, Shade, Composite.]

17
Compositing - Math
Pre-(alpha)-multiplied colors:
–C denotes the premultiplied color $\alpha C = (\alpha R, \alpha G, \alpha B)$
Alpha correction (r samples per unit):
–$T_{corrected} = (1 - \alpha)^r$
With back-to-front compositing:
–$C_{CompositingBuffer} \leftarrow C_{CompositingBuffer} \times T_{corrected} + C_{front}$
–$T_{CompositingBuffer} \leftarrow T_{CompositingBuffer} \times T_{corrected}$; $\alpha_{CompositingBuffer} = 1 - T_{corrected}$
–perform multiplication N times per pixel
–correct solution needs N × F × r bits precision
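
A floating-point reference for the back-to-front update may make the equations easier to follow. This is only a sketch: the function names are hypothetical, a single color channel is used, the exponent r is applied as written on the slide, and each sample color is assumed to be premultiplied already.

```python
def corrected_transparency(alpha, r):
    """T_corrected = (1 - alpha)^r, following the slide's notation."""
    return (1.0 - alpha) ** r


def composite_back_to_front(samples, r=1):
    """Composite (premultiplied color, alpha) pairs ordered back to front.

    Returns the accumulated premultiplied color and the accumulated
    transparency of the compositing buffer.
    """
    c_buf, t_buf = 0.0, 1.0
    for c_front, alpha in samples:
        t = corrected_transparency(alpha, r)
        c_buf = c_buf * t + c_front   # C_buf <- C_buf * T_corrected + C_front
        t_buf = t_buf * t             # T_buf <- T_buf * T_corrected
    return c_buf, t_buf


print(composite_back_to_front([(0.5, 0.5), (0.25, 0.5)]))
```

Each sample costs one multiply per channel plus one for the transparency, which is the repeated multiplication whose precision the following slides analyze.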

18
Compositing – Precision Theory
8-bit destination resolution
–therefore all partial results can be rounded
–drop all bits not contributing to the 8 most significant bits (MSB)
Adding $N = 2^p$ samples
–allows 8+p bits to influence the 8 MSB
Conversion from $\alpha_{CompositingBuffer} C$ to C for display (division)
–allows 8+p more bits to influence the 8 MSB
Conversion from $\alpha_{corrected} C$ to C for display
–allows r times as many bits to influence the 8 MSB
Sufficient resolution is: r × 2 × (8+p) for C, r × (8+p) for α
–32/16 bits for C/$\alpha_{CompositingBuffer}$ for volumes and no super-sampling
–608 bits for ×2048 volumes and 16 samples per voxel
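
The bit counts follow directly from the two formulas. A small sketch, with a hypothetical function name: p = 8 with r = 1 reproduces the 32/16-bit figure and p = 11 with r = 16 reproduces the 608-bit figure. Which volume sizes those values of p correspond to is an inference from $N = 2^p$, not something the surviving slide text states.

```python
def sufficient_bits(p, r=1):
    """Return (bits for C, bits for alpha) when compositing N = 2**p samples
    with r samples per unit, per the slide's formulas."""
    color_bits = r * 2 * (8 + p)
    alpha_bits = r * (8 + p)
    return color_bits, alpha_bits


print(sufficient_bits(8))        # (32, 16)
print(sufficient_bits(11, 16))   # (608, 304)
```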

23
Compositing – Precision Practice
No alpha correction (r = 1): 2 × (8+p) bits
Iso-surface rendering using “old fashioned” OpenGL:
–store not αC but C in frame buffer: (8+p)
–bright colors: 5+p
–at most 8 non-zero samples per ray (p=3): 5+3=8 bits
–standard 24 bit RGBA frame buffer is adequate
Fog visualization
–what matters is the ability to see objects through volumetric fog (substance with low opacity)
–visual experiments show 15 fractional bits are sufficient

29
Compositing – Conclusion
Preferred bit-aware back-to-front compositing equations:
$(\alpha C)_{1.15b} \leftarrow (\alpha C)_{1.15b} \times T_{sample,1.15b} + C_{sample,1.15b}$
$T_{1.15b} \leftarrow T_{1.15b} \times T_{sample,1.15b}$
[Figure: least-significant-bit fog at various bit precisions; dataset, r = 2]
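
The 1.15b equations map onto integer arithmetic as sketched below. The names are illustrative, a single color channel is shown, and adding half an LSB before the shift is an assumption of this example, not something the slide specifies.

```python
FRAC_BITS = 15
ONE = 1 << FRAC_BITS          # 1.0 in 1.15b


def fx_mul(a, b):
    """1.15b multiply; the +half-LSB rounding is an assumption of this sketch."""
    return (a * b + (ONE >> 1)) >> FRAC_BITS


def composite_back_to_front_fx(samples):
    """samples: (premultiplied color, transparency) pairs in 1.15b, back first."""
    c_buf, t_buf = 0, ONE
    for c_sample, t_sample in samples:
        c_buf = fx_mul(c_buf, t_sample) + c_sample   # (aC) <- (aC) * T_s + C_s
        t_buf = fx_mul(t_buf, t_sample)              # T    <- T * T_s
    return c_buf, t_buf


half, quarter = ONE // 2, ONE // 4
print(composite_back_to_front_fx([(half, half), (quarter, half)]))
```

With these inputs the result is 0.5 for the color and 0.25 for the transparency, matching the floating-point calculation.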

30
Shading - Math
Phong:
$C_{color} = k_{ambient} O_{objectColor} I_{lightIntensity} + k_{diffuse} O \sum_i \{ I_i (N \cdot L_i) \} + k_{specular} \sum_i \{ I_i (R \cdot L_i)^r \}$
$k \in [0,1]$, $k_{ambient} + k_{diffuse} + k_{specular} = 1$
$O_{objectColor}$ (8 bit) and $I_{lightIntensity} \in [0,1]$
$N \cdot L_i$ and $R \cdot L_i \in [-1,1]$, but $\in [0,1]$ after clamping
Phong $C_{color} \in [0,1]$ (possibly clamping $\sum_i$)

31
Shading - Analysis
Phong $C_{color}$ needs to be as precise as 1.15b
Use 16.16b for all multiplications [0,1) × [0,1]
–sufficient precision and no overflow

33
Shading – New Computation
Replace specular exponentiation with recursive multiplies
–repeatedly multiply number with itself
–works for all exponents $r = 2^n$
–when $r = 2^6$ (16 bit precision), then max error < 0.005%
–better results than Knittel’s parabola approximation
[Figure: comparison of pow, $r = 2^n$, and Knittel’s parabola]
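
Repeated squaring in 16.16b could look like the sketch below. The function names are hypothetical and the multiply truncates, so this illustrates the idea without reproducing the error figure quoted on the slide.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS          # 1.0 in 16.16b


def fx_mul(a, b):
    """16.16b multiply with truncation."""
    return (a * b) >> FRAC_BITS


def pow_2n(x_fx, n):
    """Raise a 16.16b value in [0, 1] to the power 2**n with n multiplies."""
    for _ in range(n):
        x_fx = fx_mul(x_fx, x_fx)     # square: the exponent doubles each pass
    return x_fx


x = int(0.9 * ONE)
print(pow_2n(x, 6) / ONE, 0.9 ** 64)  # fixed-point result next to the float reference
```

An exponent of 64 costs six multiplies, and the exponent is restricted to powers of two.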

34
Shading - Conclusion
Preferred bit-aware Phong shading equation:
$C_{16.16b} = k_{ambient,16.16b} \, O_{objectColor,0.8b} \, I_{light,16.16b} + k_{diffuse,16.16b} \, O_{0.8b} \sum_i \{ I_{i,16.16b} (N_{16.16b} \cdot L_{i,16.16b}) \} + k_{specular,16.16b} \sum_i \{ I_{i,16.16b} (R_{16.16b} \cdot L_{i,16.16b})^{2^n} \}$

35
Gradients - Math
$G_x = 0.5 \times (\mathrm{sample}(x+1,y,z) - \mathrm{sample}(x-1,y,z))$
$G_y = 0.5 \times (\mathrm{sample}(x,y+1,z) - \mathrm{sample}(x,y-1,z))$
$G_z = 0.5 \times (\mathrm{sample}(x,y,z+1) - \mathrm{sample}(x,y,z-1))$
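
The central-difference gradient translates to code almost literally. In this sketch, `sample` is any callable returning the scalar value at integer coordinates; both it and the function name are assumptions made for the example.

```python
def central_difference_gradient(sample, x, y, z):
    """Central-difference gradient of a scalar field at (x, y, z)."""
    gx = 0.5 * (sample(x + 1, y, z) - sample(x - 1, y, z))
    gy = 0.5 * (sample(x, y + 1, z) - sample(x, y - 1, z))
    gz = 0.5 * (sample(x, y, z + 1) - sample(x, y, z - 1))
    return gx, gy, gz


def linear_field(x, y, z):
    """Test field whose exact gradient is (2, 3, -1) everywhere."""
    return 2 * x + 3 * y - z


print(central_difference_gradient(linear_field, 5, 5, 5))
```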

36
Gradients - Analysis
$G = G_{1.Fb}$
Discrete nearest gradient vector neighbors
–$\sin\varphi = 1/2^F$, $\sin\varphi \approx \varphi \rightarrow \varphi \approx 1/2^F$
Maximum error for specular intensity, large r
–r = 64: $1^{64} \neq 1$, but $1^{64} = (1 - 1/2^F)^{64}$
–error of 22%, 6.1%, 1.6%, 0.4% for F of 8, 10, 12, 14
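
The error percentages can be reproduced from the expression $(1 - 1/2^F)^{64}$. A short sketch, with a hypothetical function name:

```python
def specular_error(frac_bits, r=64):
    """Relative shortfall of (1 - 1/2**F)**r from the ideal value 1."""
    return 1.0 - (1.0 - 1.0 / (1 << frac_bits)) ** r


for f in (8, 10, 12, 14):
    print(f, f"{100 * specular_error(f):.2f}%")
```

The printed values round to the percentages listed on the slide for F = 8, 10, 12 and 14.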

37
Gradients - Analysis
sized spheres with Phong highlights
4, 6, 8, 10, 12, 14 bit gradients
Diffuse artifacts for 4 and 6 bits
Specular artifacts up to 10 bits

38
Gradients - Conclusion
Thus, 12 bits dynamic range is needed
Now consider normalization:
–reduces I.Fb to 1.Fb
–up to I bits will be added to the fractional part
Volume samples often have 12 bits
$G_{x,y,z}$ with 12.12b minimum representation
$G_{x,y,z}$ with 16.16b preferred representation
–leaves room for interpolation bits in normalization

39
Classification – Prelims and Recaps
Use of T instead of α is more efficient in compositing operation
Largest visual precision/quantization error occurs at high transparencies (low opacities)
–need more bits for T than for C, just to be sure
Want transfer function lookup table to be cache-friendly
–power-of-2 RGBA-tuple alignment
Would like to use pre-integrated classification for color and opacity transfer functions [EKE01, MGS02]

44
Classification - Math
Desired lookup table entries: $R_{1.8b}$ $G_{1.8b}$ $B_{1.8b}$ $T_{1.16b}$ (5.5 bytes)
Common lookup table entries: $R_{0.8b}$ $G_{0.8b}$ $B_{0.8b}$ $\alpha_{0.8b}$ (4 bytes)
Better lookup table entries: $R_{0.8b}$ $G_{0.8b}$ $B_{0.8b}$ $\sqrt{\alpha}_{0.8b}$
–spreads low α
Computed lookup after $T = 1 - (\sqrt{\alpha})^2$: $R_{0.8b}$ $G_{0.8b}$ $B_{0.8b}$ $T_{1.16b}$
–squaring doubles precision
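
A sketch of the square-root opacity encoding follows. The names are hypothetical, and mapping the 8-bit code 255 to an opacity of exactly 1 is an assumption about the scaling that the slide does not spell out.

```python
import math

T_FRAC_BITS = 16
T_ONE = 1 << T_FRAC_BITS      # 1.0 in 1.16b
MAX_SQ = 255 * 255            # square of the largest 8-bit code


def encode_sqrt_alpha(alpha):
    """Store sqrt(alpha) as an 8-bit code (assumes code 255 means alpha = 1)."""
    return int(round(math.sqrt(alpha) * 255))


def decode_transparency(sqrt_alpha_code):
    """Recover T = 1 - sqrt(alpha)^2 in 1.16b from the 8-bit table entry."""
    alpha_fx = (sqrt_alpha_code * sqrt_alpha_code * T_ONE + MAX_SQ // 2) // MAX_SQ
    return T_ONE - alpha_fx


smallest = T_ONE - decode_transparency(1)   # smallest non-zero opacity, 16-bit units
print(smallest / T_ONE, 1 / 255)            # far finer than a linear 8-bit alpha step
```

Squaring the 8-bit code yields a 16-bit opacity, so low opacities get much finer steps while the table entry stays at 4 bytes.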

45
Classification - Conclusion
Preferred bit-aware lookup table entries: $R_{0.8b}$ $G_{0.8b}$ $B_{0.8b}$ $\sqrt{\alpha}_{0.8b}$
[Figure: foot with least-significant-thin-tissue-fog; panels: $\alpha_{0.8b}$, $\sqrt{\alpha}_{0.8b}$, $\alpha_{0.16b}$]

46
Sample Interpolation - Math
$\mathrm{sample} = \mathrm{voxel}_0 \times (1 - w) + \mathrm{voxel}_1 \times w$
$\mathrm{sample} = w \times (\mathrm{voxel}_1 - \mathrm{voxel}_0) + \mathrm{voxel}_0$
Requirements:
–$G_{x,y,z}$, derived from samples, need 12 bit dynamic range
–samples need 12 bit values for transfer function lookup
–cover both low and high dynamic range neighborhoods
Therefore, $\mathrm{sample}_{12.12b}$ is a minimum requirement
–integer part comes from voxels: $\mathrm{voxel}_{12.0b}$
–fractional part comes from interpolation: $w_{1.12b}$

47
Sample Interpolation - Conclusion
Preferred bit-aware sample interpolation:
$\mathrm{sample}_{12.12b} = w_{1.12b} \times (\mathrm{voxel}_{1,12.0b} - \mathrm{voxel}_{0,12.0b}) + \mathrm{voxel}_{0,12.0b}$
Splats start on voxels, need no interpolation:
$\mathrm{splat}_{12.0b} = \mathrm{voxel}_{12.0b}$
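
In integer arithmetic the one-multiply interpolation could be written as follows. The function name is illustrative; voxels are plain 12-bit integers and the weight is a 1.12b value between 0 and 4096.

```python
W_FRAC_BITS = 12
W_ONE = 1 << W_FRAC_BITS      # 1.0 in 1.12b


def interpolate_sample(voxel0, voxel1, w):
    """Return a 12.12b sample: w * (voxel1 - voxel0) + voxel0.

    The product already carries 12 fractional bits, so voxel0 is shifted
    up to the same scale before the add.
    """
    return w * (voxel1 - voxel0) + (voxel0 << W_FRAC_BITS)


s = interpolate_sample(0, 4095, W_ONE // 2)
print(s, s / W_ONE)           # halfway between voxel values 0 and 4095
```

No bits are discarded: the 12 integer bits come from the voxels and the 12 fractional bits from the weight.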

48
Sample Location - Math
k-th sample location $= \mathrm{startPos} + \sum_k V_{inc}$
Perspective rays need to differ enough to allow 1024 rays across 60 degrees, or 0.05°
–$\sin\varphi = (k \cdot 1/2^F) / k$, $\sin\varphi \approx \varphi \rightarrow \varphi \approx 1/2^F$
–F = 6, 12, 16 → φ = 0.9°, 0.05°, °
Also, need to address 2048 slices (integer positions) → 11 bits
Thus, need overall 11.12b

49
Sample Location - Conclusion
Preferred bit-aware sample location:
–perspective projection: $\mathrm{sampleLocation}_{11.12b} = \mathrm{startPos}_{11.12b} + \sum V_{inc,1.12b}$
–parallel projection: $\mathrm{sampleLocation}_{11.6b}$ (0.9° OK)
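
Incremental stepping along a ray in 11.12b can be sketched in one dimension as below. The helper names are assumptions, and a real renderer would step x, y and z together.

```python
FRAC_BITS = 12


def to_fixed(value):
    """Convert a real coordinate to fixed point with 12 fractional bits."""
    return int(round(value * (1 << FRAC_BITS)))


def sample_locations(start_pos, v_inc, count):
    """Return count positions: start_pos plus the running sum of v_inc."""
    positions = []
    pos = start_pos
    for _ in range(count):
        positions.append(pos)
        pos += v_inc              # one integer add per step
    return positions


locs = sample_locations(to_fixed(10.0), to_fixed(0.5), 5)
print([(p >> FRAC_BITS, p & ((1 << FRAC_BITS) - 1)) for p in locs])
```

The shift recovers the integer slice index and the mask recovers the fractional offset that serves as the interpolation weight.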

50
Splat Scan Conversion - Math
Splats project onto image grid → reverse rays
Allow as many as 2048 splat rays across 60 degrees, or °
Hence, twice the ray casting precision
–one extra fractional bit: F = 13
Also address 2048 slices (11 bits)
Thus, need overall 11.13b

51
Splat Scan Conversion - Conclusion
Preferred bit-aware splat scan conversion:
$\mathrm{splatLocation}_{11.13b} = \mathrm{startVoxelPos}_{11.13b} + \sum V_{inc,1.13b}$
Splats are usually pre-transformed and stored in bucket lists (one per sheet-buffer)
Preferred voxel location sheet buffer format: $x_{11.13b}$ $u_{8.0b}$ $y_{11.13b}$ $v_{8.0b}$ (64 bits total)
–x, y: location on splat plane
–u: index into pre-integrated splat table
–v: voxel value
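
One way to pack the four fields into a single 64-bit word is sketched below. The slide gives only the field widths (24 + 8 + 24 + 8 bits); the bit positions chosen here, with x in the most significant bits, and the function names are assumptions.

```python
X_BITS, U_BITS, Y_BITS, V_BITS = 24, 8, 24, 8   # 11.13b, 8.0b, 11.13b, 8.0b


def pack_entry(x, u, y, v):
    """Pack x (11.13b), u (8.0b), y (11.13b), v (8.0b) into one 64-bit integer."""
    if not (0 <= x < 1 << X_BITS and 0 <= y < 1 << Y_BITS):
        raise ValueError("x and y must fit in 24 bits")
    if not (0 <= u < 1 << U_BITS and 0 <= v < 1 << V_BITS):
        raise ValueError("u and v must fit in 8 bits")
    return (x << 40) | (u << 32) | (y << 8) | v


def unpack_entry(word):
    """Inverse of pack_entry."""
    x = (word >> 40) & ((1 << X_BITS) - 1)
    u = (word >> 32) & ((1 << U_BITS) - 1)
    y = (word >> 8) & ((1 << Y_BITS) - 1)
    v = word & ((1 << V_BITS) - 1)
    return x, u, y, v


word = pack_entry(x=(100 << 13) | 4096, u=17, y=200 << 13, v=255)
print(hex(word), unpack_entry(word))
```

Keeping each entry at exactly 64 bits matches the power-of-2 alignment that the classification slides ask for in lookup tables.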

52
Results
Summary of minimum precision requirements

Rendering Stage        Input     Output
Sample locations       N/A       11.12b
Sample interpolation   12.00b    12.12b
Classification         12.00b    4× 0.8b
Gradients              12.12b    1.12b
Shading                1.12b     1.15b
Compositing            1.15b

53
Results
Restricted iso-surface rendering:
–texture map volume rendering can be done using plain OpenGL or DirectX and 8 bit frame buffers
General volume rendering, all pipeline stages:
–32 bit single precision floating point format
–16.16b fixed point format (up to 4x faster in our tests)
–Pentium allows 2 simple 32-bit integer ops per clock cycle

54
Conclusions
8 bits per RGB channel on final display
Analysis of requirements by back propagation
Sufficient precision computations using
–either 32 bits single precision floating point format
–or 16.16b fixed point format
Voxel location sheet buffer: $x_{11.13b}$ $u_{8.0b}$ $y_{11.13b}$ $v_{8.0b}$
Transfer functions stored as: $R_{0.8b}$ $G_{0.8b}$ $B_{0.8b}$ $\sqrt{\alpha}_{0.8b}$
Compositing/fragment buffer: $R_{1.15b}$ $G_{1.15b}$ $B_{1.15b}$ $T_{1.15b}$

55
Acknowledgements
Hewlett Packard Laboratories
ONR grant N
NSF CAREER grant ACI
DOE grant MO-068
Thanks to Tom Malzbender and Michael Meissner for technical discussions.
Thanks to Ronald Summers for resources.
