# Essential Mathematics for Games Programmers (Fixed/Float Tutorial)

Essential Mathematics for Games Programmers (Fixed/Float Tutorial)
Lars Bishop

Essential Math for Games
Number Spaces
- Cardinal – Positive numbers, no fractions
- Integer – Pos., neg., zero, no fractions
- Rational – Fractions
- Irrational – Non-repeating decimals (π, e)
- Real – Rationals + irrationals
- Complex – Real + multiple of √-1 (a+bi)

Numerical Representations
- In graphics, we often deal with Real numbers (π, 1.5, etc.)
- Unlike integers, Real numbers have fractional components
- There are several common ways of representing these on a computer

Approximating Reals
- When we write a Real number on paper, we generally write only a few digits past the decimal point: 1.5, 3.45
- Most reals cannot be represented exactly by a few fractional digits
- Any written number has inherent precision

Finite Representations
- Any number in a computer has a finite representation
- As a result, no such representation can represent every number exactly
- When representing numbers, we will always be coping with these limitations
- We need to understand and limit error

Fixed Point Numbers
- Use integer-like representation
- Assume that the least-significant bit is some negative power of 2 (⅛, ¼, etc.)
- The “binary point” is in the middle of the number, not after the least-significant bit
- For example, an 8-bit number with 4 fractional bits:

| Bit | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|
| Value | 8 | 4 | 2 | 1 | 1/2 | 1/4 | 1/8 | 1/16 |

Fixed-Point Nomenclature
- Applications can adjust their precision and range by moving the “binary point”
- If a fixed point number has M bits of integral precision and N bits of fractional precision, it is called an M.N number
- This is pronounced “M dot N”
- A 32-bit integer would be “32 dot 0”

Fixed Point Benefits
- Can represent fractional values with only integer arithmetic
- Simple to use and understand
- Addition and Subtraction are the same as for integers
- Multiplication and Division are only slightly different from their integer siblings

Fixed-point vs. Floating-point
- A 32-bit fixed-point number actually has more inherent precision than a 32-bit floating-point number!
- Floating-point trades accuracy for range by using some bits for the exponent
- “Floating point is for the lazy” - John von Neumann (paraphrased)

Floating-point ↔ Fixed-point
- Assuming an M.N fixed-point system:
  - IntToFix(i) = i << N
  - FloatToFix(f) = (int)(f * 2.0^N)
  - Look out for overflow on these!
  - FixToInt(i) = i >> N
  - FixToFloat(i) = ((float)i) * 2.0^(-N)
  - Conversion to int can lose precision
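
The conversions above can be sketched in C. This is only an illustration under stated assumptions: a 16.16 format stored in an `int32_t`, with hypothetical names (`fix16`, `FIX_N`, `int_to_fix`, and so on) that do not come from the slides. It performs no overflow checks, and right-shifting a negative value is implementation-defined in C (most compilers shift arithmetically).

```c
#include <stdint.h>

/* Hypothetical 16.16 format: M = 16 integer bits, N = 16 fractional bits. */
#define FIX_N 16
typedef int32_t fix16;

/* IntToFix: i << N, written as a multiply so negative inputs are well defined. */
static fix16 int_to_fix(int32_t i) { return (fix16)(i * (1 << FIX_N)); }

/* FloatToFix: scale by 2^N, then truncate toward zero. */
static fix16 float_to_fix(float f) { return (fix16)(f * (float)(1 << FIX_N)); }

/* FixToInt: drops the fractional bits, so precision is lost. */
static int32_t fix_to_int(fix16 x) { return x >> FIX_N; }

/* FixToFloat: scale by 2^-N. */
static float fix_to_float(fix16 x) { return (float)x / (float)(1 << FIX_N); }
```

For instance, `float_to_fix(2.75f)` followed by `fix_to_int` yields 2: the fractional 0.75 is discarded.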

Basic Fixed-Point Math
- For A and B, both M.N fixed point values:
  - A+B = (int)A + (int)B
  - A-B = (int)A - (int)B
- Here (int)A is simply the fixed point value treated bitwise as an integer
- This works because the binary points line up when A and B are both M.N

Multiplication – Basic Idea
- Multiplication is a bit more complex, but it is analogous to the grade-school base 10 trick:
- We multiply the numbers as integers, and then slide the decimal point to the left by the total number of fractional digits (1+1=2 for two operands with one fractional digit each)

Multiplication
- We want to compute 0.5f x 1.0f = 0.5f
- In 4.4 fixed-point, this is 0000.1000 (8) x 0001.0000 (16)
- After the integer multiply, we get 1000 0000 (8x16=128), which is an 8.8 value
- Then, we need to shift right by 4 (not 8: we want a 4.4 result, not an 8.0 integer result), giving 0000.1000, which is 0.5
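
As a rough sketch of this 4.4 multiply, assuming unsigned 4.4 values held in a `uint8_t` and a 16-bit intermediate (the name `mul_4_4` is made up for illustration):

```c
#include <stdint.h>

/* Multiply two unsigned 4.4 values. The raw product is an 8.8 value,
   so it is held in 16 bits before being shifted back down to 4.4. */
static uint8_t mul_4_4(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)((uint16_t)a * (uint16_t)b); /* 8.8 intermediate */
    return (uint8_t)(wide >> 4);  /* back to 4.4; bits above the low 8 are discarded */
}
```

With these definitions, `mul_4_4(0x08, 0x10)` (0.5 x 1.0) gives `0x08` (0.5), and `mul_4_4(0x18, 0x18)` (1.5 x 1.5) gives `0x24` (2.25).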

Challenges with Fixed Point
- Range (Overflow)
  - In the previous multiplication example, if we compute 1.0x1.0=1.0 using the given method, the intermediate value overflows
- Precision (Underflow)
  - If we pre-shift numbers down to avoid overflow, we can end up shifting to 0
- Extra shifting required for mul/div

Fixed Point Help
- Some CPUs have special instructions
- ARM9 (common in handhelds)
  - Has a 32-bit x 32-bit → 64-bit multiply
  - Also has similar x 32 → 64
  - Can avoid overflow or underflow in intermediate values
  - Allows a free shift in every ALU operation
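
The benefit of a wide multiply can be sketched in portable C. This assumes a 16.16 format in an `int32_t`; both function names are hypothetical. The first version keeps the full 64-bit product, the second pre-shifts each operand by 8 bits so the product fits in 32 bits. The caller is still responsible for the final result fitting in 16.16.

```c
#include <stdint.h>

/* 16.16 multiply with a 64-bit intermediate, as a 32x32->64 multiply allows.
   Division by 2^16 is used instead of >> so negative products are well defined
   (it truncates toward zero rather than toward minus infinity). */
static int32_t fix16_mul(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;  /* 32.32 product, cannot overflow */
    return (int32_t)(wide / 65536);
}

/* Pre-shift variant: avoids the wide product but discards the low 8 bits
   of each operand, so small values can collapse to zero. */
static int32_t fix16_mul_preshift(int32_t a, int32_t b)
{
    return (a >> 8) * (b >> 8);
}
```

Multiplying `0x80` (1/512) by `0x10000` (1.0) shows the difference: `fix16_mul` returns `0x80`, while `fix16_mul_preshift` returns 0 because the small operand is shifted away.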

Fixed Point Summary
- Allows fractional math to be done quickly with integer hardware
- Requires careful range and precision analysis
- Is the only option on many embedded and handheld devices, which don’t have FPU hardware

Floating Point Numbers
- Used to represent general non-integers
- Often thought of as the set of Reals
  - This is far from the truth, as you’ll see
- Most of this discussion will be about IEEE 32-bit single-precision
  - Known in C/C++ as float
  - Mention of doubles later

FP and Scientific Notation
- FP is analogous to scientific notation
- Scientific notation is: ± D.DDDD x 10^E, where
  - D.DDDD has a nonzero leading digit
  - D.DDDD has a fixed number of fractional digits
  - E is a signed integer value

FP/Sci Notation Range/Precision
- Precision is not fixed:
  - Precision of 1.000x10^-2 is 100 times more fine-grained than that of 1.000x10^0
- Precision and range are related
  - The larger the number, the less precise
- 32-bit int is not a subset of float
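
The last point can be checked directly. The helper below is a hypothetical sketch (the name `survives_float` is not from the slides) that assumes IEEE single-precision `float` and IEEE `double`. It converts a 32-bit integer to `float` and compares the result against the original, using `double` for the comparison because `double` holds every 32-bit integer exactly.

```c
#include <stdint.h>

/* Returns 1 if the 32-bit integer is exactly representable as a float. */
static int survives_float(int32_t i)
{
    volatile float f = (float)i;   /* volatile forces rounding to float precision */
    return (double)f == (double)i;
}
```

Since a float carries 24 significant bits, 16777216 (2^24) survives the round trip but 16777217 does not.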

FP/Sci Notation Components
- Sign
- Mantissa
  - Normalized – has a nonzero integer digit
  - Limited precision – fixed number of digits
- Exponent
  - Chosen so that the mantissa is normalized

Sign
- FP numbers have an explicit sign bit
- Float has both 0.0f and -0.0f
  - By the standard, 0.0f = -0.0f
  - But the bits are different
- Do not memcmp floats!
  - This is only one of many reasons
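
A small sketch of why `memcmp` is the wrong tool here, assuming an IEEE 754 platform; the two helper names are invented for illustration:

```c
#include <string.h>

/* Value comparison: follows the floating-point rules, so +0 equals -0. */
static int equal_by_value(float a, float b) { return a == b; }

/* Bitwise comparison: +0 and -0 differ in the sign bit, so this reports "different". */
static int equal_by_bits(float a, float b) { return memcmp(&a, &b, sizeof a) == 0; }
```

`equal_by_value(0.0f, -0.0f)` is true while `equal_by_bits(0.0f, -0.0f)` is false.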

Exponent
- Like the exponent in scientific notation
- But, the exponent’s base is 2, not 10
- Stored as a biased number: ExponentBits = Exponent + Bias
- For float, Bias = 127
- For normalized floats with a 1.MantissaBits mantissa, -126 ≤ Exponent ≤ 127
- Exponent term is 2^Exponent = 2^(ExponentBits - Bias)

Mantissa
- Represented as 1-dot-23 fixed point
  - Store the 23 fractional bits explicitly
  - Integer bit is implied (“hidden bit”)
- Generally, mantissa is normalized
  - In other words, mantissa is 1.MantissaBits
  - Similar to the scientific notation standard
- In the smallest numbers, the integer bit is assumed to be 0

Binary Representation
- Put together, the representation is:
  - S = Sign bit
  - E = Exponent bits (8)
  - M = Fractional mantissa bits (23)
- We will write this as A = (S_A, E_A, M_A)
- Bit layout: S EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM
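
One way to inspect these fields in C is sketched below. It assumes `float` is a 32-bit IEEE 754 single; the `float_fields` struct and `decode_float` function are hypothetical names. `memcpy` is used to read the bits because it avoids pointer-aliasing problems.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t sign, exponent, mantissa; } float_fields;

/* Split a float into its S, E, and M bit fields. */
static float_fields decode_float(float f)
{
    uint32_t bits;
    float_fields out;
    memcpy(&bits, &f, sizeof bits);       /* assumes sizeof(float) == 4 */
    out.sign     = bits >> 31;            /* 1 bit */
    out.exponent = (bits >> 23) & 0xFFu;  /* 8 bits, stored with the bias */
    out.mantissa = bits & 0x7FFFFFu;      /* 23 fractional bits, hidden bit not stored */
    return out;
}
```

For 1.0f this gives S=0, E=127, M=0; for -2.5f (which is -1.25 x 2^1) it gives S=1, E=128, M=0x200000.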

Special Values
- 0: S=0, E=All 0s, M=All 0s
- -0: S=1, E=All 0s, M=All 0s
- +∞: S=0, E=All 1s, M=All 0s
  - Ex: 1.0f / 0.0f = +∞
- −∞: S=1, E=All 1s, M=All 0s
  - Ex: -1.0f / 0.0f = −∞

Not a Number
- Represents undefined results
  - 0.0f / 0.0f = NaN
  - ACOS(2.0f) = NaN
- Two kinds – Quiet and Signaling
  - Quiet can be passed on to other ops
  - Signaling traps the code
- NaN: E=All 1s, M=Not all 0s
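
A brief sketch of producing and detecting a quiet NaN, assuming IEEE 754 arithmetic with default (non-trapping) exception handling. Both function names are made up for this example.

```c
/* Compute 0.0f / 0.0f at run time; volatile keeps the compiler from folding it away. */
static float make_nan(void)
{
    volatile float zero = 0.0f;
    return zero / zero;
}

/* A NaN compares unequal to everything, including itself. */
static int is_nan_by_compare(float x) { return x != x; }
```

The self-comparison test is the classic trick; the standard `isnan` macro from `<math.h>` is the more readable choice in real code. Note that aggressive "fast math" compiler options may break either approach.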

Very Small Numbers
- What happens when we run out of smaller and smaller exponents?
- Could flush to zero
- But, this can lead to the following problem:
  - X-Y=0 does not imply X=Y!
- Need to gradually underflow to zero
- FP does this by allowing denormals

Denormals
- A denormal is an FP number whose hidden mantissa bit is 0, not 1
- Indicated by E=0, M=Not all 0s
- For float, a denormal is equal to (-1)^S x 2^-126 x 0.MantissaBits
- This allows precision to gradually roll off to zero
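
The E=0, M≠0 test can be sketched with bit operations, assuming 32-bit IEEE 754 floats and a CPU that is not running in a flush-to-zero mode. `float_from_bits` and `is_denormal` are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Build a float from a raw 32-bit pattern. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* A denormal has an all-zero exponent field and a nonzero mantissa field. */
static int is_denormal(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return ((bits >> 23) & 0xFFu) == 0 && (bits & 0x7FFFFFu) != 0;
}
```

`float_from_bits(1)` is the smallest positive denormal, while `float_from_bits(0x00800000)` is the smallest normalized float. The standard `fpclassify` macro in `<math.h>` reports the same distinction as `FP_SUBNORMAL`.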

To add two positive floating point numbers A (E_A, M_A) and B (E_B, M_B):
1. Swap as needed so A has the greater exponent, i.e. E_B ≤ E_A
2. Shift M_B to the right by E_A - E_B bits
3. Add M_A + M_B and use as the new mantissa
4. Adjust the new exponent up or down to re-normalize the result
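
These steps can be sketched on an unpacked representation. Everything here is an assumption made for illustration: the `unpacked` struct keeps the hidden bit explicit in a 24-bit mantissa, only positive normalized inputs are handled, and bits shifted out are truncated instead of rounded as real hardware would.

```c
#include <stdint.h>

/* Hypothetical unpacked positive float: value = m * 2^(e - 23),
   with the hidden bit explicit, so m lies in [2^23, 2^24). */
typedef struct { int e; uint32_t m; } unpacked;

static unpacked fp_add(unpacked a, unpacked b)
{
    unpacked r;
    int shift;
    uint32_t mb;

    if (a.e < b.e) { unpacked t = a; a = b; b = t; }  /* 1: a gets the larger exponent */
    shift = a.e - b.e;
    mb = (shift < 32) ? (b.m >> shift) : 0;           /* 2: align b; low bits fall off */
    r.e = a.e;
    r.m = a.m + mb;                                   /* 3: add the mantissas */
    if (r.m >= (1u << 24)) { r.m >>= 1; r.e += 1; }   /* 4: renormalize */
    return r;
}
```

With 1.0 written as `{0, 0x800000}`, adding it to itself gives `{1, 0x800000}` (2.0). Adding 1.0 to `{24, 0x800000}` (2^24) shifts the smaller mantissa entirely to zero and returns the larger operand unchanged.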

FP Add Notes
- Not a simple process (even for pos #’s)
- If A>>B, then B can be shifted to 0
- Repeatedly adding small numbers to an accumulator (i.e. A+=B) can gradually lead to huge error
- At some point, A stops growing, no matter how many times B is added!
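
The stalled accumulator is easy to reproduce. The sketch below assumes IEEE single precision with round-to-nearest; `accumulate` is a hypothetical helper.

```c
/* Add b to a the given number of times, rounding to float after every step. */
static float accumulate(float a, float b, int times)
{
    volatile float sum = a;   /* volatile forces each partial sum to float precision */
    int i;
    for (i = 0; i < times; ++i)
        sum = sum + b;
    return sum;
}
```

Starting from 16777216.0f (2^24), adding 1.0f a thousand times still returns 16777216.0f, because each individual sum rounds back to the same value. Starting from 0.0f, the same loop returns 1000.0f as expected.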

Fun with Floats – The Real World
- Can’t discuss every FP issue here: this is an example of why you should care
- A 3D engine saw a spike in basic FP code
  - Code (SLERP) was +, -, * only
  - No loops, no complex functions
- What could be the problem?

Breaking Down the Problem
- Input values looked valid (no NaN)
- After a “while”, in a demo, the spike hit
- We saved the values, along with timing
- In a small app, ran the slow and fast cases in tight loops
- The slow cases all had some tiny numbers (~1.0x10^-43)

Tiny ≠ 0.0f
- Slow cases were denormals
- We assumed these numbers to be 0
- But FPU was taking care to be accurate
- Denormal ops seemed to be slow

Denormal Performance
- Did some more timing tests
- Even loading a denormal was slow!
- Pentium takes a big hit on denormals
  - True even with exceptions masked!
  - FPU pipeline gets flushed on denormals
- Little things matter

“Don’t Doubles Solve this?”
- Doubles do help with most range and many precision problems. But:
  - Need twice the memory of floats (duh)
  - Frequently, significantly slower than floats
  - Some platforms don’t support them
- Avoid switching to them without tracking down the problem first

Floating Point Wrap-up
- Floats ≠ Reals
- Understand the limits of FP
- Analyze your FP issues
- Don’t just jump to doubles at the first sign of trouble
- Be willing to rework your math functions to be FP-friendly