===============================
The Big Picture on Bit Patterns
===============================
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com
Bit patterns have no inherent meaning. They may represent signed integers,
unsigned integers, floating point numbers, characters, or even executable
program instructions. The instructions that operate on the bits give the
bits meaning. You write the programs that generate those instructions.
Example: The 32-bit pattern 00111111100000000000000000000000 (3F800000h)

If you interpret this bit pattern as:

1. unsigned integer  -> 1065353216 decimal
2. sign/magnitude    -> 1065353216 decimal
3. two's complement  -> 1065353216 decimal
4. IEEE 754 SP FP    -> 1.0 decimal
5. Four 7-bit ASCII characters (in 8-bit bytes):
   00111111 = 63 decimal  = '?' (question mark character)
   10000000 = 128 decimal = NOT ASCII (ASCII is only 0-127 decimal)
   00000000 = 0 decimal   = NUL (control character - not printable)
   00000000 = 0 decimal   = NUL (control character - not printable)
6. Four 8-bit, excess-127 integers:
   00111111 = 63  --> 63-127  = -64 decimal
   10000000 = 128 --> 128-127 = +1 decimal
   00000000 = 0   --> 0-127   = -127 decimal
   00000000 = 0   --> 0-127   = -127 decimal
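The interpretations above can be checked with Python's struct module, which
reinterprets the same four bytes under different format codes (a sketch
using big-endian byte order; the excess-127 line is computed by hand, since
it is not a native machine type):

```python
import struct

raw = (0x3F800000).to_bytes(4, "big")      # 00111111 10000000 00000000 00000000

as_unsigned = struct.unpack(">I", raw)[0]  # 1065353216
as_signed   = struct.unpack(">i", raw)[0]  # 1065353216 (sign bit is 0)
as_float    = struct.unpack(">f", raw)[0]  # 1.0

# Excess-127: each byte's unsigned value minus 127
as_excess = [b - 127 for b in raw]         # [-64, 1, -127, -127]

print(as_unsigned, as_signed, as_float, as_excess)
```

Note that when the sign bit is 0, unsigned, sign/magnitude, and two's
complement all yield the same value; they only disagree when the top bit
is 1, as the next example shows.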
Example: The 32-bit pattern 10111111100000000000000000000000 (BF800000h)

If you interpret this bit pattern as:

1. unsigned integer  -> 3212836864 decimal
2. sign/magnitude    -> -1065353216 decimal
3. two's complement  -> -1082130432 decimal
4. IEEE 754 SP FP    -> -1.0 decimal
5. Four 7-bit ASCII characters (in 8-bit bytes):
   10111111 = 191 decimal = NOT ASCII (ASCII is only 0-127 decimal)
   10000000 = 128 decimal = NOT ASCII (ASCII is only 0-127 decimal)
   00000000 = 0 decimal   = NUL (control character - not printable)
   00000000 = 0 decimal   = NUL (control character - not printable)
6. Four 8-bit, excess-127 integers:
   10111111 = 191 --> 191-127 = +64 decimal
   10000000 = 128 --> 128-127 = +1 decimal
   00000000 = 0   --> 0-127   = -127 decimal
   00000000 = 0   --> 0-127   = -127 decimal
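Because the top bit is now 1, the signed interpretations diverge. The same
struct sketch shows this; sign/magnitude has no native format code, so the
rule (sign bit means negative, remaining 31 bits are the magnitude) is
applied by hand:

```python
import struct

raw = (0xBF800000).to_bytes(4, "big")      # 10111111 10000000 00000000 00000000

as_unsigned = struct.unpack(">I", raw)[0]  # 3212836864
as_twos     = struct.unpack(">i", raw)[0]  # -1082130432
as_float    = struct.unpack(">f", raw)[0]  # -1.0

# Sign/magnitude by hand: negate the value of the low 31 bits
sign_mag = -(0xBF800000 & 0x7FFFFFFF)      # -1065353216

print(as_unsigned, as_twos, as_float, sign_mag)
```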
If your program works correctly, you read out of memory the same type
of data as you store into memory, and everything you store has its
own separate memory location so that you don't overwrite anything.
When programs go bad, they may write one type of data into memory and
read it out as a different data type, causing program misbehaviour.
The worst type of overwriting occurs when character or numeric data
overwrites executable instructions, possibly causing the program to
surrender control to an attacker.
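The store-one-type, read-another failure can be simulated with Python's
struct module (a sketch only; a real program would do this with raw memory
rather than a bytearray standing in for a memory location):

```python
import struct

memory = bytearray(4)                     # one 4-byte "memory location"
struct.pack_into(">f", memory, 0, 1.0)    # the program stores the float 1.0...

# ...but a bug elsewhere reads those same bytes as an unsigned integer:
value = struct.unpack_from(">I", memory, 0)[0]
print(value)    # 1065353216 - nothing like the 1.0 that was stored
```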
*** Numbers
Numbers represented in computers have a limited size (number of bits),
hence limited precision and limited range. Numbers can be stored in
many ways. Two common ways are as integers or as floating-point values.
For integers, precision and range are essentially the same thing, since
integers have no exponent field: every value within an integer's range is
represented exactly. Floating-point numbers have both a mantissa (for
precision) and an exponent field (for range); they usually trade away some
bits of precision in favour of greater range.
Though floating-point numbers almost always have a greater range than
integers, the range is not infinite. To store a value accurately in a
floating-point representation in a computer, two things must work out:
the number's value must lie within the *range* of the floating-point
representation (the exponent must fit), and the value must not lose any
*precision* (the mantissa must fit).
In practice, some precision is often lost when working with floating-point
numbers. As a simple example, the decimal value 0.1 (1/10 or one tenth)
cannot be accurately represented as a binary floating-point number.
No finite sum of powers-of-two will ever equal exactly 0.1, just as no
finite sum of powers-of-ten will ever exactly equal one-third (0.3333...).
If you write a computer program that adds together ten tenths, the result
will not equal 1.0, though it may come very close.
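The ten-tenths claim is easy to verify in Python:

```python
total = 0.0
for _ in range(10):
    total += 0.1          # 0.1 has no exact binary representation

print(total)              # 0.9999999999999999 - very close to 1.0, not equal
print(total == 1.0)       # False
```

This is why numeric code compares floating-point values against a small
tolerance rather than testing for exact equality.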
*** Characters
Characters may be represented in computers using any one of many different
encoding standards. Some encodings allocate only one byte per character
(e.g. ASCII or Latin-1); others always use multiple bytes per character
(e.g. UTF-16 or UTF-32); still others use single- or multi-byte sequences
depending on which character is being encoded (e.g. UTF-8).
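A short Python sketch (the sample string is arbitrary) showing that the
same text occupies different numbers of bytes under different encodings,
and that some encodings cannot represent some characters at all:

```python
text = "Aé€"   # 'A' is ASCII, 'é' is in Latin-1, '€' is in neither

print(len(text.encode("utf-8")))      # 6 bytes: 1 + 2 + 3 (variable-width)
print(len(text.encode("utf-32-be")))  # 12 bytes: always 4 per character

print("é".encode("latin-1"))          # b'\xe9' - a single byte in Latin-1
try:
    text.encode("ascii")              # ASCII covers only code points 0-127
except UnicodeEncodeError:
    print("not representable in 7-bit ASCII")
```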
*** Summary
Bit patterns have no inherent meaning. They may represent signed integers,
unsigned integers, floating point numbers, characters, or even executable
program instructions. The instructions that operate on the bits give the
bits meaning. You write the programs that generate those instructions.
--
| Ian! D. Allen - idallen@idallen.ca - Ottawa, Ontario, Canada
| Home Page: http://idallen.com/ Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom: http://eff.org/ and have fun: http://fools.ca/