1
**Spherical Skinning with Dual-Quaternions and QTangents**

Ivo Zoltan Frey Crytek R&D

2
**Goals**

- #1 Improve performance by reducing the shader constant requirements for joint transformations
  - 30% shader constants reduction
- #2 Reduce the memory footprint of skinned geometry
  - 22% vertex memory reduction
  - 29% for static geometry

3
Skinned Geometry

4
**Goal #1**

Improve performance by reducing the shader constant requirements for joint transformations

- Skinned geometry requires multiple passes
- Motion Blur requires twice the transformations
- The number of required shader constants affects the performance of a single pass

5
**Skinning with Quaternions**

- ~30% fewer shader constants consumed compared to 4x3 packed matrices
- Quaternion Linear Skinning
  - Accumulated transformations don’t work for positions
  - Explosion of vertex instructions
- Quaternion Spherical Skinning [HEIJL04]
  - Extra vertex attribute required
  - Doesn’t handle more than 2 influences per vertex well
- Dual-Quaternion Skinning [KCO06] [KCZO08]
  - Increase in vertex instructions

[HEIJL04] also requires careful re-rigging of the skinned geometry.

6
**Dual-Quaternion Skinning [KCO06] [KCZO08]**

Compared to Linear Skinning with matrices:

- Accumulation of transformations is faster
- Applying the transformation is slower
- With enough influences per vertex it becomes faster overall

The reduction of shader constants was a win over the cost of the extra vertex instructions.

We offload antipodality handling to the CPU by making sure the child is in the same hemisphere as the parent when we generate the dual-quaternion transformations.

```cpp
for (int i = 0; i < jointCount; ++i)
{
    int p = GetJointParentIndex(i);
    if (p < 0)
        continue; // root joints have no parent to align with

    DualQuat& dq = GetJointTransformation(i);
    const DualQuat& pdq = GetJointTransformation(p);

    // A negative dot product means the child lies in the opposite
    // hemisphere, so replace it with its equivalent negated form.
    if (dot(dq, pdq) < 0.0f)
        SetJointTransformation(i, -dq);
}
```

7
**From Linear to Spherical**

- Geometry needs to be rigged differently
  - And you will still need your helper joints
- Riggers and Animators need to get used to it
  - Some will love it, others will hate it
  - Most will keep changing their mind
- You might have to write skinning plug-ins for third-party authoring software
  - Some recent authoring packages have adopted Dual-Quaternion Skinning out of the box

Although Spherical Skinning techniques claim to reduce the need for deformation helper joints, we found that our production geometry still requires them to reach our target quality. Given enough processing power, the ideal solution would be to choose freely between Linear and Spherical Skinning on a per-joint basis.

8
**Goal #2**

Reduce the memory footprint of skinned geometry

- We are now developing on consoles, every byte counts!
- A more compact vertex format will also lead to better performance
- Do not sacrifice quality in the process!

9
**Tangent Frames**

- Tangent Frames were the biggest vertex attribute after our trivial memory optimizations
- In further optimizing them we need to ensure that
  - They keep being efficiently transformed by Dual-Quaternions
  - All our Normal Maps keep working as they are

Our trivial memory optimizations mostly involved moving from 32-bit floating-point formats to 16-bit floating-point and integer formats.

10
**About Tangent Frames**

- Please make them orthogonal!
- If they are not, you are introducing skewing
  - You can’t use a transpose to invert the frame matrix
  - You need a full matrix inversion
  - This will also prevent you from using some compression techniques!

Most geometry in modern productions is made of unique UV atlases, at times shared through mirroring to mitigate memory constraints. For performance reasons the welding of vertices is also encouraged, which at times sacrifices smoothing accuracy across vertex normals. As long as the exact inverse of the transformation used to bring object-space normals into tangent space is used to bring them back into object space, the frame does not need to follow the texture space exactly. Geometry where following the texture space more accurately is desirable mostly turns out to yield an orthogonal frame already.

11
**Compressed Matrix Format**

- Vertex attributes contain two of the frame’s vectors and a reflection scalar
- The frame’s third vector is rebuilt from a cross product of the given vectors and a multiplication with the reflection scalar

```
normal = cross(tangent, biTangent) * s
```

| Tangent | BiTangent | Reflection |
|---|---|---|
| x y z | x y z | s |

This is the industry standard for storing a compressed form of Tangent Frames. Note that if skewing is present on the normal (e.g. resulting from vertex welding for performance optimization), this compression technique won’t yield correct results!
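As a rough illustration of the reconstruction described on this slide, the following C++ sketch rebuilds the normal on the CPU. The `Vec3` type and the `rebuildNormal` name are hypothetical helpers introduced here for illustration; they do not come from the original deck.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Standard right-handed cross product.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Rebuilds the third frame vector from the two stored ones.
// s is the reflection scalar and is assumed to be exactly +1 or -1.
Vec3 rebuildNormal(const Vec3& tangent, const Vec3& biTangent, float s) {
    assert(s == 1.0f || s == -1.0f);
    const Vec3 n = cross(tangent, biTangent);
    return { n.x * s, n.y * s, n.z * s };
}
```

With an orthonormal tangent and bitangent the result is already unit length; with a skewed frame it is not the original normal, which is the limitation the note above warns about.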

12
**Tangent Frames With Quaternions**

Quaternion to Matrix conversion

```
t = transform(q, vec3(1, 0, 0))
b = transform(q, vec3(0, 1, 0))
n = transform(q, vec3(0, 0, 1))
```

Quaternions don’t natively contain reflection information

13
**Bringing Reflection Into the Equation**

Similarly to the compressed matrix format, we can introduce reflection with a scalar value

```
t = transform(q, vec3(1, 0, 0))
b = transform(q, vec3(0, 1, 0))
n = transform(q, vec3(0, 0, 1)) * s
```
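The conversion above can be sketched on the CPU as follows. This is a minimal sketch that assumes a unit quaternion stored as `(x, y, z, w)` with `w` as the scalar element; the `Vec3`, `Quat`, `Frame` and `quatToFrame` names are hypothetical, and `transform` uses the common `v + 2 * cross(q.xyz, cross(q.xyz, v) + q.w * v)` rotation formula.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };   // xyz = vector part, w = scalar part
struct Frame { Vec3 t, b, n; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Rotates v by the unit quaternion q.
Vec3 transform(const Quat& q, const Vec3& v) {
    const Vec3 u = { q.x, q.y, q.z };
    const Vec3 c = cross(u, v);
    const Vec3 inner = { c.x + q.w * v.x, c.y + q.w * v.y, c.z + q.w * v.z };
    const Vec3 outer = cross(u, inner);
    return { v.x + 2.0f * outer.x, v.y + 2.0f * outer.y, v.z + 2.0f * outer.z };
}

// Builds the tangent frame from a quaternion and a reflection scalar (+1 or -1).
Frame quatToFrame(const Quat& q, float s) {
    assert(s == 1.0f || s == -1.0f);
    Frame f;
    f.t = transform(q, { 1.0f, 0.0f, 0.0f });
    f.b = transform(q, { 0.0f, 1.0f, 0.0f });
    const Vec3 n = transform(q, { 0.0f, 0.0f, 1.0f });
    f.n = { n.x * s, n.y * s, n.z * s };
    return f;
}
```

For the identity quaternion with `s = -1` this yields the identity tangent and bitangent and a normal pointing along negative z, which is the mirrored frame a plain rotation cannot express.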

14
**Tangent Frame Format Memory Comparison**

| Format | Layout |
|---|---|
| Compressed Matrix | Tangent (x y z), BiTangent (x y z), Reflection (s) |
| Quaternion | Quaternion (x y z w), Reflection (s) |

15
**Our Quaternion Properties**

- They are normalized: length(q) == 1
- And they are sign invariant: q == -q (both represent the same rotation)

16
**Quaternion Compression**

We can compress a Quaternion down to three elements by making sure one of them is greater than or equal to zero

```
if (q.w < 0) q = -q
```

We can then rebuild the missing element with

```
q.w = sqrt(1 - dot(q.xyz, q.xyz))
```
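A small CPU-side round trip of this compression might look like the sketch below. The `Quat` and `PackedQuat` types and both function names are hypothetical; the clamp before the square root is an addition made here to guard against rounding pushing the operand slightly below zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Quat { float x, y, z, w; };     // xyz = vector part, w = scalar part
struct PackedQuat { float x, y, z; };  // w is dropped and rebuilt on load

// Drops w after forcing it to be non-negative. q is assumed to be normalized.
PackedQuat compressQuat(Quat q) {
    assert(std::fabs(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w - 1.0f) < 1e-3f);
    if (q.w < 0.0f) {
        q = { -q.x, -q.y, -q.z, -q.w };  // same rotation, opposite sign
    }
    return { q.x, q.y, q.z };
}

// Rebuilds w from the unit-length constraint; w is non-negative by construction.
Quat decompressQuat(const PackedQuat& p) {
    const float d = p.x * p.x + p.y * p.y + p.z * p.z;
    const float w = std::sqrt(std::max(0.0f, 1.0f - d));
    return { p.x, p.y, p.z, w };
}
```

Because the sign of `w` is fixed by the compression step, it carries no information afterwards, which is why this scheme still needs a separate reflection scalar.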

17
**Tangent Frame Format Memory Comparison**

| Format | Layout |
|---|---|
| Compressed Matrix | Tangent (x y z), BiTangent (x y z), Reflection (s) |
| Quaternion | Quaternion (x y z w), Reflection (s) |
| Compressed Quaternion | Quaternion (x y z), Reflection (s) |

Note that if 16-bit formats are used, the Compressed Matrix format will require 8 elements while the Quaternion format will require 6 elements.

18
**Instruction Cost**

| Step | Instructions | Count |
|---|---|---|
| Quaternion decompression | mov, dp3, add, rsq, rcp | 5 |
| Quaternion to Tangent and BiTangent | add, mul, mad, mad, mad, mad | 6 |
| Normal and Reflection computation | mul, mad, mul | 3 |

Total: 11 for Tangent, BiTangent and Reflection; 14 for full Tangent Frame

19
**Avoiding Quaternion Compression**

- Isn't there a way to encode the reflection scalar in the Quaternion, instead of compressing it?
- Remember, Quaternions are sign invariant: q == -q
- We can arbitrarily decide whether one of its elements has to be negative or positive!

20
**Encoding Reflection**

First we initialize the Quaternion by making sure q.w is always positive

```
if (q.w < 0) q = -q
```

If we then require reflection, we make q.w negative by negating the entire Quaternion

```
if (reflection < 0) q = -q
```

21
**Decoding Reflection**

All we have to do in order to decode our reflection scalar is to check the sign of q.w

```
reflection = q.w < 0 ? -1 : +1
```

As for the Quaternion itself, we can use it as it is!

```
q = q
```

Note that the comparison with “< 0” returns false for floating-point values that are “-0”. On the CPU the sign bit needs to be explicitly set and checked to avoid the need for the bias described in the upcoming slides. This is purposely left wrong in order to lead into the upcoming singularity handling slides.
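To make the note about “-0” concrete, here is a hedged CPU-side sketch that sets and reads the sign bit explicitly through `std::signbit` instead of comparing against zero. The `Quat` type and the `encodeReflection` / `decodeReflection` names are hypothetical, and the sketch assumes IEEE 754 floats compiled without fast-math style options that ignore signed zeros.

```cpp
#include <cassert>
#include <cmath>

struct Quat { float x, y, z, w; };   // xyz = vector part, w = scalar part

Quat negate(const Quat& q) { return { -q.x, -q.y, -q.z, -q.w }; }

// Stores the reflection (+1 or -1) in the sign bit of q.w.
Quat encodeReflection(Quat q, float reflection) {
    assert(reflection == 1.0f || reflection == -1.0f);
    if (std::signbit(q.w)) q = negate(q);     // sign bit cleared, even for -0
    if (reflection < 0.0f) q = negate(q);     // sign bit set, even when w is 0
    return q;
}

// Reads the sign bit rather than comparing against zero, so -0 counts as negative.
float decodeReflection(const Quat& q) {
    return std::signbit(q.w) ? -1.0f : 1.0f;
}
```

For a quaternion such as `(1, 0, 0, 0)` with reflection, the encoded `w` is `-0`: `decodeReflection` returns -1, while the plain `q.w < 0` test from the slide would report +1. This only helps while the value stays a float; once it is quantized to an integer format the sign of zero is lost.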

22
**Instruction Cost**

| Step | Instructions | Count |
|---|---|---|
| Reflection decoding | slt, mad | 2 |
| Quaternion to Tangent and BiTangent | add, mul, mad, mad, mad, mad | 6 |
| Normal and Reflection computation | mul, mad, mul | 3 |

Total: 8 for Tangent, BiTangent and Reflection; 11 for full Tangent Frame

23
**Tangent Frame Transformation with Dual-Quaternion**

Quaternion-Vector transformation

```hlsl
float3x3 frame;
frame[0] = transform_quat_vec(skinningQuat, vertex.tangent.xyz);   // 6
frame[1] = transform_quat_vec(skinningQuat, vertex.biTangent.xyz); // 6
frame[2] = cross(frame[0], frame[1]);                              // 2
frame[2] *= vertex.tangent.w;                                      // 1
// 15 instructions
```

Quaternion-Quaternion transformation

```hlsl
float4 q = transform_quat_quat(skinningQuat, vertex.qTangent);     // 5
float3x3 frame = quat_to_mat(q);                                   // 8
frame[2] *= vertex.qTangent.w < 0 ? -1 : +1;                       // 3
// 16 instructions
```

Although the Quaternion-Quaternion transformation is 1 instruction longer, it ends up being faster since it requires one fewer vertex attribute.

```hlsl
float3 transform_quat_vec(const float4 quat, const float3 vec)
{
    // mad + mul + mad + mul + mad + mad
    return vec + cross(quat.xyz, cross(quat.xyz, vec) + quat.w * vec) * 2.0;
}

float4 transform_quat_quat(const float4 q, const float4 p)
{
    float4 c, r;
    // mul + mad : 2
    c.xyz = cross(q.xyz, p.xyz);
    // dot : 1
    c.w = -dot(q.xyz, p.xyz);
    // mad : 1
    r = p * q.w + c;
    // mad : 1
    r.xyz = q.xyz * p.w + r.xyz;
    return r;
}

float3x3 quat_to_mat(const float4 quat)
{
    // add : 1 alu
    float4 q2 = quat + quat;
    // mul : 1 alu
    float3 r2 = q2.w * float3(-1.0, 0.0, 1.0);
    // mad : 1 alu each
    float3 r0 = quat.wzy * r2.zxz + float3(-1.0, 0.0, 0.0);
    r0 = quat.x * q2.xyz + r0;
    float3 r1 = quat.zwx * r2.zzx + float3(0.0, -1.0, 0.0);
    r1 = quat.y * q2.xyz + r1;
    // mul + mad : 2 alu
    r2 = cross(r0, r1);
    return float3x3(r0, r1, r2);
}
```
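The reason the Quaternion-Quaternion path can replace the per-vector path is that rotating by a product of two quaternions equals applying the two rotations in sequence. The C++ sketch below checks that property on the CPU; it assumes unit quaternions stored as `(x, y, z, w)` and uses hypothetical `rotate` and `mul` helpers that mirror the structure of the shader functions, not the engine's actual code.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };   // xyz = vector part, w = scalar part

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Rotates v by the unit quaternion q: v + 2 * cross(q.xyz, cross(q.xyz, v) + q.w * v).
Vec3 rotate(const Quat& q, const Vec3& v) {
    const Vec3 u = { q.x, q.y, q.z };
    const Vec3 c = cross(u, v);
    const Vec3 inner = { c.x + q.w * v.x, c.y + q.w * v.y, c.z + q.w * v.z };
    const Vec3 outer = cross(u, inner);
    return { v.x + 2.0f * outer.x, v.y + 2.0f * outer.y, v.z + 2.0f * outer.z };
}

// Hamilton product q * p: the resulting rotation applies p first, then q.
Quat mul(const Quat& q, const Quat& p) {
    const Vec3 qv = { q.x, q.y, q.z };
    const Vec3 pv = { p.x, p.y, p.z };
    const Vec3 c = cross(qv, pv);
    const float d = qv.x * pv.x + qv.y * pv.y + qv.z * pv.z;
    return { q.w * p.x + p.w * q.x + c.x,
             q.w * p.y + p.w * q.y + c.y,
             q.w * p.z + p.w * q.z + c.z,
             q.w * p.w - d };
}
```

Calling `rotate(mul(skin, qTangent), axis)` gives the same vector as `rotate(skin, rotate(qTangent, axis))`, so a single quaternion product followed by one conversion covers all three frame vectors. The reflection sign still has to be read from the original vertex quaternion, as the shader snippet does with `vertex.qTangent.w`, because the product does not preserve it.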

24
**QTangent Definition**

A Quaternion whose scalar element’s sign encodes the Reflection

25
**Stress-Testing QTangents**

By making sure we throw our most complex geometry at it!

28
Singularity Found!

29
Singularity Found! At times the most complex cases pass, while the simplest fail!

30
**Singularity**

- Our singularities manifest themselves when the Quaternion’s scalar element is equal to zero
- (Slide example: a Matrix and its corresponding Quaternion with a zero scalar element; the values did not survive extraction.)
- This means the Tangent Frame’s surface is perpendicular to one of the identity’s axes

A box aligned with the identity’s axes and requiring reflection will have 3 of its 6 faces manifest our singularity.

31
**Floating-Point Standards**

- So what happens when the Quaternion’s scalar element is 0?
- The IEEE Standard for Floating-Point Arithmetic does differentiate between -0 and +0, so we should be fine!
- However, GPUs don’t always comply with this standard, at times for good reasons

32
**GPUs Floating-Point “Standards”**

- GPUs allow vertex attributes to be specified as integers representing normalized unit scalars
  - They are then resolved into Floating-Point values
- Integers don’t differentiate between -0 and +0, thus this information is lost in the process

How GPUs handle floating-point -0/+0 can also vary widely from hardware to hardware.

33
**Handling Singularities**

- In order to use integers to encode reflection, we need to ensure that q.w is never zero
- When we find q.w to be zero, we need to apply a bias

34
**Defining Our Bias Constant**

- We define our bias constant as the smallest value that will satisfy q.w != 0
- If we are using an integer format, this value is given by

```
bias = 1 / (2^(BITS-1) - 1)
```

35
**Applying the Bias Constant**

We need to apply our bias to each Quaternion satisfying q.w < bias, and while doing so we make sure our Quaternion stays normalized

```
if (q.w < bias)
{
    q.xyz *= sqrt(1 - bias*bias)
    q.w = bias
}
```
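The bias definition and its application can be sketched together as below. This is an illustrative sketch under several assumptions: the quaternion is normalized with `q.w >= 0` when the bias is applied (that is, before the reflection sign is folded in), the target format is a signed normalized 16-bit integer, and the names `biasForBits`, `applyBias` and `toSnorm16` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Quat { float x, y, z, w; };   // xyz = vector part, w = scalar part

// Smallest non-zero magnitude of a signed normalized integer with the given
// bit count: 1 / (2^(bits-1) - 1).
float biasForBits(int bits) {
    assert(bits >= 2 && bits <= 31);
    return 1.0f / static_cast<float>((1 << (bits - 1)) - 1);
}

// Pushes q.w up to the bias when it is too small, rescaling xyz as on the slide.
Quat applyBias(Quat q, float bias) {
    if (q.w < bias) {
        const float scale = std::sqrt(1.0f - bias * bias);
        q.x *= scale;
        q.y *= scale;
        q.z *= scale;
        q.w = bias;
    }
    return q;
}

// Quantizes a value in [-1, 1] to a signed normalized 16-bit integer.
std::int16_t toSnorm16(float v) {
    return static_cast<std::int16_t>(std::lround(v * 32767.0f));
}
```

After the bias, a quaternion such as `(1, 0, 0, 0)` quantizes its scalar element to 1 rather than 0, so negating it for reflection produces -1 and the sign survives the integer round trip.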

36
**QTangents with Skinned Geometry**

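| Attribute | Format | Size |
|---|---|---|
| Position | 4 float16 | 8 bytes |
| TexCoord | 2 float16 | 4 bytes |
| Tangent | 4 int16 | 8 bytes |
| BiTangent | 4 int16 | 8 bytes |
| SkinIndices | 4 uint8 | 4 bytes |
| SkinWeights | 4 uint8 | 4 bytes |

Replacing the Tangent and BiTangent attributes with a single 4 int16 QTangent:

- From 36 bytes to 28 bytes per vertex
  - ~22% memory saved
- No overhead with Dual-Quaternion Skinning
- ~8 instruction overhead with Linear Skinning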

37
**QTangents with Static Geometry**

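| Attribute | Format | Size |
|---|---|---|
| Position | 4 float16 | 8 bytes |
| TexCoord | 2 float16 | 4 bytes |
| Tangent | 4 int16 | 8 bytes |
| BiTangent | 4 int16 | 8 bytes |

Replacing the Tangent and BiTangent attributes with a single 4 int16 QTangent:

- From 28 bytes to 20 bytes per vertex
  - ~29% memory saved
- ~8 instruction overhead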
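The byte counts quoted for skinned and static geometry can be checked with a few plain structs. This is only a sketch: the struct and field names are hypothetical, `std::uint16_t` stands in for 16-bit float storage, and the QTangent is assumed to occupy 4 int16 values as the 8-byte saving implies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct SkinnedVertexTangents {
    std::uint16_t position[4];    // 4 float16 stored as raw 16-bit values
    std::uint16_t texCoord[2];    // 2 float16
    std::int16_t  tangent[4];
    std::int16_t  biTangent[4];
    std::uint8_t  skinIndices[4];
    std::uint8_t  skinWeights[4];
};

struct SkinnedVertexQTangent {
    std::uint16_t position[4];
    std::uint16_t texCoord[2];
    std::int16_t  qTangent[4];    // replaces tangent + biTangent
    std::uint8_t  skinIndices[4];
    std::uint8_t  skinWeights[4];
};

struct StaticVertexTangents {
    std::uint16_t position[4];
    std::uint16_t texCoord[2];
    std::int16_t  tangent[4];
    std::int16_t  biTangent[4];
};

struct StaticVertexQTangent {
    std::uint16_t position[4];
    std::uint16_t texCoord[2];
    std::int16_t  qTangent[4];
};

static_assert(sizeof(SkinnedVertexTangents) == 36, "skinned, before");
static_assert(sizeof(SkinnedVertexQTangent) == 28, "skinned, after");
static_assert(sizeof(StaticVertexTangents) == 28, "static, before");
static_assert(sizeof(StaticVertexQTangent) == 20, "static, after");

// Percentage of memory saved when going from `before` to `after` bytes.
double percentSaved(std::size_t before, std::size_t after) {
    assert(before > 0 && after <= before);
    return 100.0 * static_cast<double>(before - after) / static_cast<double>(before);
}
```

`percentSaved(36, 28)` is about 22.2 and `percentSaved(28, 20)` is about 28.6, in line with the ~22% and ~29% figures on the slides.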

38
**Future Developments**

- Quaternions across polygons
  - Interpolating Quaternions across polygons and making use of them at the pixel level
- Quaternions in G-Buffers
  - Encoding the whole Tangent Frame instead of just Normals
  - Can open doors to more Deferred techniques
    - Anisotropic Shading
    - Directional blur along Tangents

39
**Special Thanks**

Ivo Herzeg, Michael Kopietz, Sven Van Soom, Tiago Sousa, Ury Zhilinsky

Chris Kay, Andreas Kessissoglou, Mathias Lindner, Helder Pinto, Peter Söderbaum

Crytek

40
**References**

- [HEIJL04] Heijl, J., "Hardware Skinning with Quaternions", Game Programming Gems 4, 2004
- [KCO06] Kavan, L., Collins, S., O'Sullivan, C., "Dual Quaternions for Rigid Transformation Blending", Technical report TCD-CS, 2006
- [KCZO08] Kavan, L., Collins, S., Zara, J., O'Sullivan, C., "Geometric Skinning with Approximate Dual Quaternion Blending", ACM Trans. Graph., 2008

41
Questions?
