#### Discover more from AltDevArts

# Floating-point Multiplication

### Floating-point values are fractions. Multiplying floating-point numbers is multiplying fractions.

*This is part 10 of the onboarding floating point series. This series is intended to be used for onboarding of programmers new to the team to review a basic understanding of fixed-point and floating-point number formats, or for programmers who would like to remove some of the mystery from formats they may use everyday.*

We’ll continue using the s10e5 floating-point format from floating point as fractions.

We can express s10e5 as a fraction:

Except where exp15 is zero:

```
TERMS
-----------------------------------------------------------------------------
| sign | x15 | m |
|-------|---------------------|---------------------------------------------|
| 1 bit | 5 bits | 10 bits |
-----------------------------------------------------------------------------
| sign | 1 bit value exactly as stored (1=negative, 0=positive) |
| x15 | 5 bit value exactly as stored as offset-15 |
| m | 10 bit value exactly as stored as unsigned integer |
| x | exponent as signed integer |
| s | sign as signed integer (-1=negative, 1=positive) |
| magnitude | unsigned value as f(x15,m) |
```

## Multiplication

If we rearrange the fractions, we can define s10e5 floating-point as:

Except for where exp-15 is zero:

Multiplying two values (c = a*b):

And if we replace:

We can re-write the result as:

We can extract the mantissa as with addition. And like addition, at this point the fraction may need to be normalized if significand_c is too large or too small for mantissa to fit in 10 bits unsigned.

Which gives us the result in the correct form:

Except in the case where x15 is zero:

Then normalize.

```
uint32_t
mul(uint16_t a, uint16_t b)
{
//
// Extract components of a,b where:
// s = negative=-1, positive=0
// m = 10 bit unsigned mantissa
// x15 = 5 bit exponent in offset-15
// x15nz = (x15 != 0)?-1:0
// x = if (x15 == 0) -14
// else x15 as two's-complement
//
uint16_t a_s = (int16_t)a >> 15;
uint16_t a_m = a & 0x03ff;
uint16_t a_x15 = (a & 0x7c00) >> 10;
uint16_t a_x15nz = (int16_t)(-a_x15)>>15;
int16_t a_x = (a_x15nz & (a_x15-15)) | ((~a_x15nz) & -14);
uint16_t b_s = (int16_t)b >> 15;
uint16_t b_m = b & 0x03ff;
uint16_t b_x15 = (b & 0x7c00) >> 10;
uint16_t b_x15nz = (int16_t)(-b_x15)>>15;
int16_t b_x = (b_x15nz & (b_x15-15)) | ((~b_x15nz) & -14);
//
// Create significand of a,b where:
// if (x15 != 0) significand = 1024 + m
// else significand = m
//
uint16_t a_significand = a_m + (a_x15nz&1024);
uint16_t b_significand = b_m + (b_x15nz&1024);
//
// Multiply
// - 10 bit fixed-point multiply significands as (a*b)>>10
// - sign(a) * sign(b)
// - half round up
// - defer shift-right 10 bits until normalization
// in case left-shift is needed, so we don't lose
// the values.
//
uint16_t c_s = a_s ^ b_s;
uint32_t c_significand = a_significand * b_significand;
int16_t c_x = a_x + b_x;
//
// Normalize
//
// clz(c_significand) = (0,32)
uint32_t signz = (int32_t)(-c_significand) >> 31;
int16_t sigclz31 = __builtin_clz( c_significand );
int16_t sigclz = (signz & sigclz31)|((~signz)&32);
// reduce by exponent until 10 bits of mantissa
int16_t hi_sa = sigclz-11;
int16_t norm_x = c_x - hi_sa;
uint16_t norm_x_underflow = (norm_x+14) >> 15;
uint16_t norm_x15 = ((~norm_x_underflow) & (norm_x+15));
int16_t norm_sa = ((~norm_x_underflow) & (hi_sa-10))
| (norm_x_underflow & (c_x+4));
// if norm_sa < 0, then right shift, else left shift.
uint16_t norm_sa_sign = norm_sa >> 15;
int16_t norm_rsa = norm_sa_sign & (-norm_sa);
int16_t norm_lsa = (~norm_sa_sign) & norm_sa;
uint16_t norm_significand = (c_significand << norm_lsa) >> norm_rsa;
//
// Store
//
uint16_t c_m = norm_significand & 0x3ff;
uint16_t c = (c_s << 15) | (norm_x15 << 10) | c_m;
return c;
}
```

`Exercise 10-1: Create a version of mul() for s23e8 floating-point. Compare results to using float instructions.`

`REFERENCE: See Berkely SoftFloat [github.com/ucb-bar] for implementations of common operations for different floating-point formats.`

## Normalization as compression

The multiplication of significands is a 10 bit fixed-point multiply. Consider the result as a 20 bit value. As covered in floating-point as compression, you can also think of the normalization process as a pattern to compress that 20 bit result into a 10 bit mantissa and change to the exponent.

```
| significand (20 bit) | m | change to x |
|-----------------------|------------|-------------|
| 1AAAAAAAAAAxxxxxxxxx | AAAAAAAAAA | x+9 |
| 01AAAAAAAAAAxxxxxxxx | AAAAAAAAAA | x+8 |
| 001AAAAAAAAAAxxxxxxx | AAAAAAAAAA | x+7 |
| 0001AAAAAAAAAAxxxxxx | AAAAAAAAAA | x+6 |
| 00001AAAAAAAAAAxxxxx | AAAAAAAAAA | x+5 |
| 000001AAAAAAAAAAxxxx | AAAAAAAAAA | x+4 |
| 0000001AAAAAAAAAAxxx | AAAAAAAAAA | x+3 |
| 00000001AAAAAAAAAAxx | AAAAAAAAAA | x+2 |
| 000000001AAAAAAAAAAx | AAAAAAAAAA | x+1 |
| 0000000001AAAAAAAAAA | AAAAAAAAAA | x |
| 00000000001AAAAAAAAA | AAAAAAAAA0 | x-1 |
| 000000000001AAAAAAAA | AAAAAAAA00 | x-2 |
| 0000000000001AAAAAAA | AAAAAAA000 | x-3 |
| 00000000000001AAAAAA | AAAAAA0000 | x-4 |
| 000000000000001AAAAA | AAAAA00000 | x-5 |
| 0000000000000001AAAA | AAAA000000 | x-6 |
| 00000000000000001AAA | AAA0000000 | x-7 |
| 000000000000000001AA | AA00000000 | x-8 |
| 0000000000000000001A | A000000000 | x-9 |
```

## Conversion to float

For a comparison with outputs of compiler-generated float conversions, see:

https://godbolt.org/z/E4aderGfG

## Next: Part 11

Floating-point Division - Floating-point values are fractions. Dividing floating-point numbers is dividing fractions.