Common Integer formats
The details of fixed-point and floating-point formats require an understanding of sign-magnitude, two’s complement, and offset-binary signed number representations.
This is part 1 of the onboarding floating point series. This series is intended to be used for onboarding of programmers new to the team to review a basic understanding of fixed-point and floating-point number formats, or for programmers who would like to remove some of the mystery from formats they may use everyday.
There are three common signed integer formats values to be familiar with in order to understand fixed-point and floating-point formats.
Sign-magnitude
In a sign-magnitude format, the sign of the value is stored independently of the unsigned magnitude of the value. For example, an 8-bit format:
|-------|-----------------------------------------|
| sign | magnitude |
|-------|-----------------------------------------|
| 1 bit | 7 bits |
|-------|-----------------------------------------|
In this format, magnitude can store values from 0-127 (7 bits). Conventionally, sign is 1 for negative values and 0 for positive values. Values stored in sign-magnitude have two representations for zero (+0/-0).
|-----------------------------------|
| sign-magnitude (s:1 m:7) |
|------|----------|---|-----|-------|
| hex | bin | s | m | value |
|------|----------|---|-----|-------|
| 0x00 | 00000000 | 0 | 0 | 0 |
| 0x80 | 10000000 | 1 | 0 | -0 |
| 0x01 | 00000001 | 0 | 1 | 1 |
| 0x81 | 10000001 | 1 | 1 | -1 |
| 0x7f | 01111111 | 0 | 127 | 127 |
| 0xff | 11111111 | 1 | 127 | -127 |
|------|----------|---|-----|-------|
Table of 8-bit sign-magnitude values
https://godbolt.org/z/scxTKPjdc
Two’s-complement
In two's-complement format, negative values are stored as the bitwise complement of one less than the magnitude. Like sign-magnitude, the top bit can be used to determine the sign. There is only one representation for zero.
|--------------------------------------------------|
| two's-complement (8 bit) |
| s = -1 (negative), 0 (positive) |
| m = magnitude |
|------|----------|-------|------|-----------------|
| hex | bin | value | s | (m+s)^s |
|------|----------|-------|------|-----------------|
| 0x00 | 00000000 | 0 | 0x00 | ( 0+0x00)^0x00 |
| 0x80 | 10000000 | -128 | 0xff | (128+0xff)^0xff |
| 0x01 | 00000001 | 1 | 0x00 | ( 1+0x00)^0x00 |
| 0x81 | 10000001 | -127 | 0xff | (127+0xff)^0xff |
| 0x7f | 01111111 | 127 | 0x00 | (127+0x00)^0x00 |
| 0xff | 11111111 | -1 | 0xff | ( 1+0xff)^0xff |
|------|----------|-------|------|-----------------|
Table of 8-bit two's-complement values
https://godbolt.org/z/PsW5KhzoG
Two’s-complement is the most common format for storing integers and is likely the native integer format for whatever CPU you are using (e.g. x64, ARM.) So if the format is unspecified, it’s probably two’s-complement.
Offset-binary
In offset-binary format, values are stored as unsigned by adding an offset (bias) to all values. The bias must be given as part of the format definition. For example, the following 8-bit offset-binary values with a bias of 128. (Which may be called offset-128.) There is only one representation for zero.
|-----------------------------------|
| offset-binary (8-bit, bias=128) |
| u = 8-bit unsigned value |
|------|----------|-------|---------|
| hex | bin | value | u-128 |
|------|----------|-------|---------|
| 0x00 | 00000000 | -128 | 0-128 |
| 0x80 | 10000000 | 0 | 128-128 |
| 0x01 | 00000001 | -127 | 1-128 |
| 0x81 | 10000001 | 1 | 129-128 |
| 0x7f | 01111111 | -1 | 127-128 |
| 0xff | 11111111 | 127 | 255-128 |
|------|----------|-------|---------|
Table of 8-bit offset-binary (bias=128) values
https://godbolt.org/z/czqh8q8xE
Note that offset-128 is the same as the two’s-complement format, except that the sign bit is inverted.
EXERCISE 1-1: Compare the difference in values if the bias is odd (e.g. 127) or even (e.g. 128).
Conversion
// int8 from sign-magnitude:7
int8_t
int8_from_sm7( uint8_t sm7 )
{
// extract sign and magnitude
int8_t s = (int8_t)sm7 >> 7;
int8_t m = sm7 & 0x7f;
// convert to two's-complement
return (m + s)^s;
}
// int8 from sign-magnitude:7
uint8_t
offset128_from_sm7( uint8_t sm7 )
{
// extract sign and magnitude
int8_t s = (int8_t)sm7 >> 7;
int8_t m = sm7 & 0x7f;
// convert to two's-complement, then offset-128
return ((m + s)^s)^0x80;
}
Table of conversions from 8-bit sign-magnitude
https://godbolt.org/z/1GdocdEx6
uint8_t
sm7_from_int8( int8_t i8 )
{
int8_t s = i8 >> 7;
uint8_t m = (i8 ^ s)-s;
return (s & 0x80)|m;
}
uint8_t
offset128_from_int8( int8_t i8 )
{
return i8 ^ 0x80;
}
Table of conversions from 8-bit two's-complement
https://godbolt.org/z/M6szns3es
// int8 from binary-offset with bias 128
int8_t
int8_from_offset128( uint8_t offset128 )
{
return offset128 ^ 0x80;
}
uint8_t
sm7_from_offset128( uint8_t offset128 )
{
int8_t s = (int8_t)(offset128^0x80) >> 7;
uint8_t m = (offset128 ^ s)-s;
return (s & 0x80)|m;
}
Table of conversions from 8-bit offset-128
https://godbolt.org/z/dzfPafoY3
EXERCISE 1-2: Build a table of all values of a 16 bit signed integer that compare sign-magnitude, two's-complement, and offset-binary.
Take-away
The details of fixed-point and floating-point formats require an understanding of sign-magnitude, two’s complement, and offset-binary signed number representations.
Next: Part 2
Fixed-point as fractions - Fixed-point formats describe fractions where both numerator and denominator are integers.