Floating-point as fractions
Like fixed-point, floating-point is a format which stores rational numbers as fractions.
This is part 6 of the onboarding floating-point series. The series is intended for programmers new to the team who need a basic understanding of fixed-point and floating-point number formats, and for programmers who would like to remove some of the mystery from formats they may use every day.
We introduced the s10e5 floating-point format in floating-point as compression. Now let’s look at the same format from a different perspective.
TERMS
-----------------------------------------------------------------------------
| sign | x15 | m |
|-------|---------------------|---------------------------------------------|
| 1 bit | 5 bits | 10 bits |
-----------------------------------------------------------------------------
| sign | 1 bit value exactly as stored (1=negative, 0=positive) |
| x15 | 5 bit value exactly as stored as offset-15 |
| mantissa | 10 bit value exactly as stored as unsigned integer |
| m | mantissa |
| x | exponent as signed integer (x15 - 15) |
| s | sign as signed integer (-1=negative, 1=positive) |
| magnitude | unsigned value as f(x15,m) |
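As a minimal sketch of the terms above, the three stored fields can be pulled out of a 16-bit word with shifts and masks. The bit order (sign in the top bit, then x15, then the mantissa in the low 10 bits) is an assumption taken from the left-to-right order of the table, and `split_s10e5` is a hypothetical helper name, not part of any library.

```python
def split_s10e5(bits):
    """Split a 16-bit s10e5 word into (sign, x15, m) exactly as stored.

    Assumed layout: bit 15 = sign, bits 14..10 = x15, bits 9..0 = mantissa.
    """
    sign = (bits >> 15) & 0x1   # 1 bit
    x15 = (bits >> 10) & 0x1F   # 5 bits, offset-15 exponent
    m = bits & 0x3FF            # 10 bits, unsigned mantissa
    return sign, x15, m


sign, x15, m = split_s10e5(0x3C00)
x = x15 - 15               # exponent as signed integer
s = -1 if sign else 1      # sign as signed integer
print(sign, x15, m, x, s)
```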
Power-of-Two Exponent
The exponent gives the power of two at or just below the magnitude, that is, the floor of its log2. Some fractional value (F) is then added to that power of two.
Power-of-Two Intervals
Since we’re working in powers of two, the distance between some power of two, 2^x, and the next, 2^(x+1), is 2^x. The fractional value (F) is some part (f) of that distance.
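A quick check of the interval width, written as a sketch with exact rational arithmetic. The helper name `interval` is hypothetical; it returns the start of the interval, 2^x, and the distance to the next power of two.

```python
from fractions import Fraction


def interval(x):
    """Return (start, width) of the power-of-two interval for signed exponent x."""
    start = Fraction(2) ** x
    end = Fraction(2) ** (x + 1)
    return start, end - start


for x in (-2, 0, 3):
    start, width = interval(x)
    # The width of each interval equals its starting value, 2^x.
    print(x, start, width)
```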
Mantissa as Fraction of Interval
The mantissa, occupying 10 bits, is interpreted as with fixed-point: it is the numerator of a fraction over 2^10 = 1024, so f = m/1024 and F = f * 2^x.
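The step can be sketched as follows, assuming the 10-bit mantissa is read as the fixed-point fraction m/1024 of the interval that starts at 2^x. The function name `magnitude_normal` is hypothetical, and the sketch covers only the ordinary case where the stored exponent is not zero.

```python
from fractions import Fraction


def magnitude_normal(x15, m):
    """Magnitude for a non-zero stored exponent: 2^x plus m/1024 of the interval."""
    x = x15 - 15               # exponent as signed integer
    base = Fraction(2) ** x    # start of the interval, also its width
    f = Fraction(m, 1024)      # mantissa as a fraction of the interval
    F = f * base               # fractional value added to the power of two
    return base + F


print(magnitude_normal(15, 512))   # halfway through the interval [1, 2)
print(magnitude_normal(16, 256))   # a quarter of the way through [2, 4)
```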
Special Case Zero
If the exponent as stored in offset-15 (x15) is zero, the leading 2^x term is dropped and the result is the fraction m/1024 of 2^-14 (that is, of 1/2^14).
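A small sketch of the zero-exponent case, assuming the mantissa is again read as m/1024 and scaled by 2^-14. The name `magnitude_zero_exponent` is hypothetical.

```python
from fractions import Fraction


def magnitude_zero_exponent(m):
    """Magnitude when x15 == 0: m/1024 of 2^-14, with no leading power of two."""
    return Fraction(m, 1024) * Fraction(1, 2 ** 14)


smallest_step = magnitude_zero_exponent(1)      # equals 1 / 2^24
largest = magnitude_zero_exponent(1023)
# One more step past the largest zero-exponent value lands on 2^-14.
print(smallest_step, largest, largest + smallest_step)
```

Under these assumptions the zero-exponent values run evenly from 0 up to just below 2^-14, so there is no gap between zero and the smallest value that has a non-zero stored exponent.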
Apply Sign
Apply the sign to create the final result.
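Putting the steps together, a decoder might look like the sketch below. It assumes the bit layout sign / x15 / mantissa from top to bottom, and it treats every non-zero stored exponent (including 31) as an ordinary exponent, which is an assumption: other definitions of this format may reserve some exponent values. `decode_s10e5` is a hypothetical name.

```python
from fractions import Fraction


def decode_s10e5(bits):
    """Decode a 16-bit s10e5 word into an exact Fraction (illustrative sketch)."""
    sign = (bits >> 15) & 0x1
    x15 = (bits >> 10) & 0x1F
    m = bits & 0x3FF

    s = -1 if sign else 1                 # sign as signed integer
    if x15 == 0:
        # Special case: a plain fraction of 2^-14.
        magnitude = Fraction(m, 1024) * Fraction(1, 2 ** 14)
    else:
        # Power of two plus the mantissa's share of the interval.
        magnitude = Fraction(1024 + m, 1024) * Fraction(2) ** (x15 - 15)
    return s * magnitude


for word in (0x3C00, 0x3E00, 0xC000, 0x0001):
    print(hex(word), decode_s10e5(word))
```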
Therefore, we can define s10e5 floating point as the fraction:
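Written out from the terms above, and assuming x = x15 - 15 with the 10-bit mantissa taken over 2^10 = 1024, the fraction takes the form:

```latex
\text{value} \;=\; s \cdot 2^{x} \cdot \left(1 + \frac{m}{2^{10}}\right)
             \;=\; s \cdot \frac{\left(2^{10} + m\right) \cdot 2^{x}}{2^{10}},
\qquad x = x_{15} - 15
```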
Except where the stored exponent (x15) is zero:
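Under the same assumptions (mantissa over 2^10, scaled by 2^-14 and with no leading power of two), the zero-exponent case can be written as:

```latex
\text{value} \;=\; s \cdot \frac{m}{2^{10}} \cdot 2^{-14}
             \;=\; s \cdot \frac{m}{2^{24}}
```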
EXERCISE 6-1: Create a table of fractions for the exponents of a s23e8 floating point format. Assume sign is top bit, exponent is stored as offset-127, and mantissa is unsigned.
Log-Linear Fractions
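As described above, in a floating-point format, the range is divided into power-of-two (logarithmic) segments. However, each segment is divided linearly by the mantissa: with 10 bits, every segment has 2^10 = 1024 equally spaced values, so the spacing doubles from one segment to the next.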
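To make the log-linear shape concrete, the sketch below computes the spacing between adjacent representable values in a given segment. It assumes a 10-bit mantissa and the zero-exponent rule described earlier (scale of 2^-14); `step` is a hypothetical helper name.

```python
from fractions import Fraction


def step(x15):
    """Spacing between adjacent values in the segment with stored exponent x15."""
    # The zero-exponent segment uses the same scale as x15 == 1 (2^-14).
    x = -14 if x15 == 0 else x15 - 15
    return Fraction(2) ** x / 1024


for x15 in (0, 1, 14, 15, 16):
    # Constant within a segment, doubling from each segment to the next.
    print(x15, step(x15))
```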
EXERCISE 6-2: The s10e5 floating-point format described here is in sign-magnitude form. Create a version of the format that uses two's-complement instead of sign-magnitude and functions which convert from one to the other. Note any differences in values which can be represented.
REFERENCE: Check out this implementation of s23e8 floating-point [6502.org] for the 6502 by Roy Rankin and Steve Wozniak.
Next: Part 7
Floating-point common names - Some floating-point formats have standard names (float, double, half)