• Integers
    • Unsigned
    • Signed
• Fixed point
• Floating point

Binary representation of integers

Important

In case of overflow, the most significant bits of the result (those that do not fit in the available width) are lost
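A minimal sketch of this behaviour, assuming an unsigned register of n bits. The helper name `add_unsigned` and the default width of 8 bits are illustrative choices, not part of the notes. Python integers never overflow, so the truncation is simulated with a mask:

```python
def add_unsigned(a: int, b: int, n: int = 8) -> int:
    """Add two unsigned n-bit values, discarding any bits above bit n-1."""
    mask = (1 << n) - 1      # n ones, i.e. 2**n - 1
    return (a + b) & mask    # bits that do not fit in n bits are lost


# 200 + 100 = 300 = 1_0010_1100 in binary; the ninth bit is dropped.
print(add_unsigned(200, 100))  # 44
```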

Formulas

\begin{gathered}
1\underbrace{0\ldots0}_{n}{}_{(2)}=2^n\\
\underbrace{1\ldots1}_{n}=1\underbrace{0\ldots0}_{n}-1=2^n-1\\
0,\underbrace{0\ldots0}_{n-1}1=2^{-n}\\
0,\underbrace{1\ldots1}_{n}=1,\underbrace{0\ldots0}_{n}-0,\underbrace{0\ldots0}_{n-1}1=1-2^{-n}
\end{gathered}
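The four identities can be checked numerically for small n. This is only a sanity check, not a proof; the helper names `binary_fraction` and `check_identities` are made up for this sketch, and exact fractions are used to avoid rounding:

```python
from fractions import Fraction


def binary_fraction(bits: str) -> Fraction:
    """Exact value of 0.bits read in base 2."""
    return sum(
        (Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits)),
        Fraction(0),
    )


def check_identities(n: int) -> bool:
    """Return True if all four formulas hold for this n."""
    return (
        int("1" + "0" * n, 2) == 2 ** n
        and int("1" * n, 2) == 2 ** n - 1
        and binary_fraction("0" * (n - 1) + "1") == Fraction(1, 2 ** n)
        and binary_fraction("1" * n) == 1 - Fraction(1, 2 ** n)
    )


print(all(check_identities(n) for n in range(1, 17)))  # True
```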

Important

Direct code (sign-magnitude), inverse code (ones' complement) and complementary code (two's complement) are identical for positive numbers; they differ only for negative numbers

The bits in the leftmost square are the sign bits

Important

  • The formula for the inverse code of a negative number x on n bits is 2^n - 1 - |x|, or just flip all the bits of the direct code of |x|
  • The formula for the complementary code of a negative number x on n bits is 2^n - |x|, or add 1 to the inverse code
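A short sketch of the three encodings, assuming the usual definitions on n bits: sign bit plus magnitude for the direct code, 2^n - 1 - |x| for the inverse code of a negative x, and 2^n - |x| for its complementary code. The function names and the default width of 8 bits are illustrative, and the input is assumed to satisfy |x| <= 2^(n-1) - 1:

```python
def direct_code(x: int, n: int = 8) -> str:
    """Sign bit followed by the magnitude on n-1 bits."""
    sign = "1" if x < 0 else "0"
    return sign + format(abs(x), f"0{n - 1}b")


def inverse_code(x: int, n: int = 8) -> str:
    """Inverse (ones' complement) code: 2**n - 1 - |x| for negative x."""
    value = x if x >= 0 else (2 ** n - 1) - abs(x)
    return format(value, f"0{n}b")


def complementary_code(x: int, n: int = 8) -> str:
    """Complementary (two's complement) code: 2**n - |x| for negative x."""
    value = x if x >= 0 else 2 ** n - abs(x)
    return format(value, f"0{n}b")


for x in (5, -5):
    print(x, direct_code(x), inverse_code(x), complementary_code(x))
# 5 00000101 00000101 00000101
# -5 10000101 11111010 11111011
```

For +5 the three codes coincide; for -5 the inverse code is the bitwise flip of 00000101 and the complementary code is one more than the inverse code.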

Important

You cannot read the absolute value directly from the complementary code. To convert from complementary form to direct form, subtract 1 and flip all the bits except the sign bit (in effect, you convert to inverse form and then to direct form). Equivalently, keep everything from the least significant 1 bit to the right unchanged and flip all the bits to its left, again except the sign bit
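Both conversion routes can be sketched on bit strings. The function names are hypothetical, and the sign bit is kept as 1 in the result, which is an interpretation of "direct form" for a negative number. The most negative value (1 followed by zeros) has no direct code on the same width, so it is rejected:

```python
def complementary_to_direct(code: str) -> str:
    """Route 1: subtract 1 (inverse code), then flip the bits."""
    if code[0] == "0":
        return code                          # positive: codes coincide
    n = len(code)
    if code == "1" + "0" * (n - 1):
        raise ValueError("no direct code on n bits for -2**(n-1)")
    inverse = int(code, 2) - 1               # complementary -> inverse
    magnitude = inverse ^ ((1 << n) - 1)     # flip all n bits -> |x|
    return "1" + format(magnitude, f"0{n - 1}b")


def complementary_to_direct_shortcut(code: str) -> str:
    """Route 2: keep the lowest 1 bit and what follows, flip the rest."""
    if code[0] == "0":
        return code
    if code == "1" + "0" * (len(code) - 1):
        raise ValueError("no direct code on n bits for -2**(n-1)")
    last_one = code.rindex("1")              # least significant 1 bit
    flipped = "".join("1" if b == "0" else "0" for b in code[1:last_one])
    return "1" + flipped + code[last_one:]


print(complementary_to_direct("11111011"))           # 10000101
print(complementary_to_direct_shortcut("11111100"))  # 10000100
```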

Subunitary convention

(the same rules as for integers apply to the inverse and complementary forms)

Fixed point representation

I is the number of integer bits, F is the number of fractional bits, and there is also a sign bit. The minimum nonzero absolute value is 2^{-F} and the maximum absolute value is 2^I - 2^{-F}
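A minimal sketch of such a format, assuming a direct-code (sign-magnitude) layout: one sign bit, then I integer bits, then F fractional bits. The names `to_fixed` and `from_fixed` are illustrative, and values are rounded to the nearest representable step of 2^-F:

```python
def to_fixed(x: float, i_bits: int, f_bits: int) -> str:
    """Encode x as sign bit + I integer bits + F fractional bits."""
    scaled = round(abs(x) * 2 ** f_bits)      # magnitude in units of 2**-F
    if scaled >= 2 ** (i_bits + f_bits):
        raise OverflowError("magnitude does not fit in I + F bits")
    sign = "1" if x < 0 else "0"
    return sign + format(scaled, f"0{i_bits + f_bits}b")


def from_fixed(code: str, f_bits: int) -> float:
    """Decode a sign-magnitude fixed-point bit string."""
    magnitude = int(code[1:], 2) / 2 ** f_bits
    return -magnitude if code[0] == "1" else magnitude


print(to_fixed(-2.75, 3, 4))        # 10101100
print(from_fixed("10101100", 4))    # -2.75
print(from_fixed("00000001", 4))    # 0.0625  (smallest nonzero, 2**-4)
print(from_fixed("01111111", 4))    # 7.9375  (largest, 2**3 - 2**-4)
```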

Floating point representation of real numbers

Important

Any real number can be written as x = m · b^e, where m is the mantissa, b is the numeration base and e is the exponent
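For base 2, Python's standard `math.frexp` returns exactly such a decomposition, with a subunitary mantissa (0.5 <= |m| < 1 for nonzero input). The wrapper name `decompose` is just for this sketch:

```python
import math


def decompose(x: float) -> tuple:
    """Return (mantissa, exponent) such that x == mantissa * 2**exponent."""
    return math.frexp(x)


m, e = decompose(6.5)
print(m, e)          # 0.8125 3
print(m * 2 ** e)    # 6.5
```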

IEEE 754 standard

is the number of digits in the whole part (in binary)
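As an illustration, the fields of a single-precision value can be extracted with the standard `struct` module. This sketch assumes the 32-bit format (1 sign bit, 8 exponent bits with bias 127, 23 fraction bits); the function name `float32_fields` is hypothetical:

```python
import struct


def float32_fields(x: float) -> tuple:
    """Split x, rounded to single precision, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with bias 127
    fraction = bits & 0x7FFFFF        # 23 bits after the implicit leading 1
    return sign, exponent, fraction


sign, exponent, fraction = float32_fields(-6.5)
print(sign, exponent - 127, format(fraction, "023b"))
# 1 2 10100000000000000000000
```

Here 6.5 is 110.1 in binary: the whole part 110 has 3 digits, so the normalised form is 1.101 × 2^2, the unbiased exponent is 3 - 1 = 2, and the stored exponent is 2 + 127 = 129.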

Exercise