Cpsc 231 Pascal - Copyright (C) 2001 Katrin Becker 1998-2001 Last Modified September 8, 2005 11:00 AM
Course Notes - Representation of Information

Word size in any given computer is fixed, so with a 16-bit word, EVERY word (memory location) holds a 16-bit pattern, where each bit can be a 0 or a 1. How many possible distinct patterns are there in a 16-bit word? Because each bit has 2 possible values, 1 bit has 2 distinct patterns. With 2 bits, each one has 2 possibilities, so we get a 0 with a 1 or a 0, and a 1 with a 1 or a 0 (4 possibilities: 2 bits times 2 possibilities for each). With three bits we can have a 0 with all the possibilities for 2 bits, or a 1 with all the possibilities for 2 bits --> 2 x 4 possibilities = 8 (or 2 possibilities for each of three bits = 2**3 possibilities). For the 16-bit question, the answer is 2 to the power of 16 (2**16), or 65,536.

There are 65,536 possible distinct bit patterns in a 16-bit word, but what those bit patterns mean depends entirely on the context in which each pattern is used. A bit pattern is only a bit pattern, and can be used to represent almost anything. Possibilities include: integer numbers, character strings, real (FLOATING POINT) numbers, logical strings, memory addresses, machine instructions, or ANYTHING ELSE WE WANT. A bit pattern 'becomes' one or another of these things when we direct the computer to interpret it as such. Sometimes this is done automatically because of the context in which the bit pattern is found (as with instructions); other times we have very direct control over a bit pattern's interpretation (as in a programming language's declaration statement, which indicates how to interpret the memory space associated with the named variable).
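To see that a bit pattern really is just a bit pattern, here is a small Python sketch (Python rather than Pascal, purely for illustration; the pattern itself is an arbitrary choice, not one from the notes) that interprets one 16-bit pattern first as an unsigned integer and then as two ASCII characters:

```python
import struct

pattern = 0b0100000101000010     # one 16-bit pattern: 01000001 01000010

print(2 ** 16)                   # 65536 distinct patterns in a 16-bit word

# Interpreted as an unsigned integer:
print(pattern)                   # 16706

# The very same bits interpreted as two ASCII character codes:
print(struct.pack(">H", pattern).decode("ascii"))   # AB
```

Same bits, two meanings; only the interpretation changed.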
 
Number Systems:
 
General Rules:
x**0 = 1;    x**1 = x;    x**2 = x * x;    x**(-1) = 1/x;    x**(-2) = 1/(x*x)

 

Decimal Defined (and conversions to decimal)
- represented by 10 distinct symbols: 0,1,2,3,4,5,6,7,8,9
- based on powers of 10
- each place to the left of a digit in a string increases by a power of 10; each place to the right of a digit in a string decreases by a power of 10

Example: 47692 (base 10) in expanded notation looks like:

= 4 * 10**4 + 7 * 10**3 + 6 * 10**2 + 9 * 10**1 + 2 * 10**0

= 4 * 10000 + 7 * 1000 + 6 * 100 + 9 * 10 + 2 * 1

To count in decimal: 0,1,2,...,9, then back to 0 and carry a one into the next (left) column { like on an odometer}. From 10, we go 11, 12, .. to 19, and then back to 0 again in the first column and carry another 1 into the next column, making it a 2 and so on.

 

Binary Defined (and conversions to decimal)
- represented by 2 distinct symbols: 0,1
- based on powers of 2
- each place to the left of a digit in a string increases by a power of 2; each place to the right of a digit in a string decreases by a power of 2

Example: 10111001 (base 2) in expanded notation looks like:

 

= 1 * 2**7 + 0 * 2**6 + 1 * 2**5 + 1 * 2**4 + 1 * 2**3 + 0 * 2**2 + 0 * 2**1 + 1 * 2**0
= 1 * 128 + 0 * 64 + 1 * 32 + 1 * 16 + 1 * 8 + 0 * 4 + 0 * 2 + 1 * 1
= 128 + 32 + 16 + 8 + 1
= 185
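The expanded-notation sum above can be written as a short loop. This is a sketch in Python (the helper name `to_decimal` is my own, not from the notes):

```python
def to_decimal(digits, base):
    """Expanded notation: add up digit * base**position, rightmost position = 0."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total

print(to_decimal([1, 0, 1, 1, 1, 0, 0, 1], 2))   # 185, matching the worked example
```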
 
Counting in binary looks like this: (look for the repeating patterns)
 
dec  binary     dec  binary
  0  00 000      16  10 000
  1  00 001      17  10 001
  2  00 010      18  10 010
  3  00 011      19  10 011
  4  00 100      20  10 100
  5  00 101      21  10 101
  6  00 110      22  10 110
  7  00 111      23  10 111
  8  01 000      24  11 000
  9  01 001      25  11 001
 10  01 010      26  11 010
 11  01 011      27  11 011
 12  01 100      28  11 100
 13  01 101      29  11 101
 14  01 110      30  11 110
 15  01 111      31  11 111
 
     
Octal Defined (and conversions to decimal)
- represented by 8 distinct symbols: 0,1,2,3,4,5,6,7
- based on powers of 8
- each place to the left of a digit in a string increases by a power of 8; each place to the right of a digit in a string decreases by a power of 8

Example: 1243 (base 8) in expanded notation looks like:

 

= 1 * 8**3 + 2 * 8**2 + 4 * 8**1 + 3 * 8**0
= 1 * 512 + 2 * 64 + 4 * 8 + 3 * 1
= 512 + 128 + 32 + 3
= 675
 
Counting in octal looks like this: (look for the repeating patterns & compare the octal digits with the equivalent binary digits)
 
dec  octal     dec  octal
  0   00        16   20
  1   01        17   21
  2   02        18   22
  3   03        19   23
  4   04        20   24
  5   05        21   25
  6   06        22   26
  7   07        23   27
  8   10        24   30
  9   11        25   31
 10   12        26   32
 11   13        27   33
 12   14        28   34
 13   15        29   35
 14   16        30   36
 15   17        31   37
 
     
Hexadecimal Defined (and conversions to decimal)
- represented by 16 distinct symbols: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
- based on powers of 16
- each place to the left of a digit in a string increases by a power of 16; each place to the right of a digit in a string decreases by a power of 16

Example: 1243 (base 16) in expanded notation looks like:

= 1 * 16**3 + 2 * 16**2 + 4 * 16**1 + 3 * 16**0
= 1 * 4096 + 2 * 256 + 4 * 16 + 3 * 1
= 4096 + 512 + 64 + 3
= 4675
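The octal and hexadecimal worked examples can be double-checked in Python, whose built-in int() accepts a base:

```python
# int(string, base) applies exactly the expanded-notation rule shown above:
print(int("1243", 8))    # 675
print(int("1243", 16))   # 4675
```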
 
Counting in hexadecimal looks like this: (look for the repeating patterns & compare the hex digits with the equivalent binary digits)

     

dec  hex     dec  hex
  0  00       16  10
  1  01       17  11
  2  02       18  12
  3  03       19  13
  4  04       20  14
  5  05       21  15
  6  06       22  16
  7  07       23  17
  8  08       24  18
  9  09       25  19
 10  0A       26  1A
 11  0B       27  1B
 12  0C       28  1C
 13  0D       29  1D
 14  0E       30  1E
 15  0F       31  1F
     
Conversions
bin <-> oct
bin <-> hex
 
Look again at the definition for octal numbers and compare the decimal, binary, octal, and hex values against each other.
 
dec  binary  octal  hex     dec  binary  octal  hex
  0  00 000    00   00       16  10 000    20   10
  1  00 001    01   01       17  10 001    21   11
  2  00 010    02   02       18  10 010    22   12
  3  00 011    03   03       19  10 011    23   13
  4  00 100    04   04       20  10 100    24   14
  5  00 101    05   05       21  10 101    25   15
  6  00 110    06   06       22  10 110    26   16
  7  00 111    07   07       23  10 111    27   17
  8  01 000    10   08       24  11 000    30   18
  9  01 001    11   09       25  11 001    31   19
 10  01 010    12   0A       26  11 010    32   1A
 11  01 011    13   0B       27  11 011    33   1B
 12  01 100    14   0C       28  11 100    34   1C
 13  01 101    15   0D       29  11 101    35   1D
 14  01 110    16   0E       30  11 110    36   1E
 15  01 111    17   0F       31  11 111    37   1F
 
When we look at the binary and octal numbers we may notice that each octal digit is represented completely and exactly using 3 binary bits. Thus, we can convert from binary directly to octal by grouping the binary digits into sets of 3 bits and converting each set into one octal digit. Conversely we can convert from octal to binary by writing down the appropriate 3 bit pattern for each octal digit. This can be verified by doing a direct conversion (binary-> octal or octal -> binary) and then converting each separately to decimal.
 
Now, if we look at the binary and hexadecimal numbers we can make a similar discovery, only this time we use 4 binary bits. The rest of the story is the same.
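The 3-bit grouping shortcut can be sketched in both directions in Python (the function names are my own, not from the notes):

```python
def binary_to_octal(bits):
    """Group the bits into threes from the right; each group is one octal digit."""
    bits = bits.zfill((len(bits) + 2) // 3 * 3)            # pad to a multiple of 3
    groups = [bits[i:i + 3] for i in range(0, len(bits), 3)]
    return "".join(str(int(group, 2)) for group in groups)

def octal_to_binary(digits):
    """Write down the 3-bit pattern for each octal digit."""
    return "".join(format(int(d), "03b") for d in digits)

print(binary_to_octal("10111001"))   # 271
print(octal_to_binary("271"))        # 010111001
```

The hexadecimal version is identical except for grouping into fours and allowing the digits A-F.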
 
Conversion from Decimal
 
Converting from decimal to any other base is simply a matter of repeated division by that base. We must use modulo division (keeping a remainder) because the answer will be found in the remainders.
 
Eg. convert 567 (base 10) to octal:
 
567 / 8 = 70 remainder 7
70 / 8 = 8 remainder 6
8 / 8 = 1 remainder 0
1 / 8 = 0 remainder 1 (since the quotient is 0, we are finished)
 
The answer is 1067 (base 8) (to check, convert back to decimal)
 
ALL conversions from decimal to another base are done the same way. To convert to a different base you simply divide by the desired base.
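The repeated division translates directly into a loop. A Python sketch (the helper name is my own), which also handles hexadecimal since the digit table includes A-F:

```python
def from_decimal(n, base):
    """Repeated division by the base; the remainders, read last-to-first, are the digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)            # quotient and remainder at once
        digits.append("0123456789ABCDEF"[remainder])
    return "".join(reversed(digits))

print(from_decimal(567, 8))    # 1067, matching the worked example
print(from_decimal(567, 16))   # 237
```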
 
Binary Arithmetic - Addition
 
Remember these rules:
    0 + 0 =  0
    0 + 1 =  1
    1 + 0 =  1
    1 + 1 = 10   (write 0, carry a 1)
1 + 1 + 1 = 11   (write 1, carry a 1)
 
Unsigned Numbers
 
All the numbers we have looked at so far have been unsigned numbers. That means non-negative integers: zero and the positive whole numbers.
 
What about negative values?
 
If we are using a 16-bit word to store an integer (whole number), what is the largest number we can represent? 2**16 - 1, which is 65,535, because one of the 65,536 bit patterns is needed to represent zero. If we need to be able to represent negative as well as positive numbers the answer must change, because clearly a -1 must look different from +1, and so some of the 65,536 bit patterns will be used to represent positive numbers, and some will be used for negative numbers.
 
In the design of computer systems, the primary goal is to have the arithmetic and logical circuits (add, shift, etc.) be as simple as possible (and therefore also fast), and to have the numbers be consistent so they can all be treated the same. In the design of integer representations that include negative as well as positive numbers, the HIGH ORDER BIT (left-most in the word) is set to 1 for a negative number and to 0 for a positive number.

Now things get a little more complicated, because just setting the high order bit isn't good enough. To illustrate, let's look at what happens when we try to do arithmetic on these numbers.

Given 2 binary numbers, say +1 = 00000001 and -1 = 10000001 (just the sign bit set): adding them we get 10000010, which would be -2, not the 0 we expect (since subtracting is the same as adding a negative).
 
Clearly this won't work. We must either design new circuits to handle this (not desirable) or represent negative numbers in a different way. Another goal is to have a smooth transition from positive to negative. 1 - 1 should be 0 no matter how we look at it (i.e. 00000001 - 1 = 00000000), and 0 - 1 should be -1 (i.e. 00000000 - 1 = 10000001?).
 
One's Complement
 
One way to solve the problem is to use what amounts to a 'photograph style' negative representation for negative numbers. We represent negative numbers by negating the positive representation of the number: wherever we see a 1 we put a 0, and wherever we see a 0 we put a 1. This works moderately well but there are a few problems:
 
1. -0 (there are two representations of zero: all 0's and all 1's)
2. end-around carry (a carry out of the high bit must be added back into the low bit)
 
Two's Complement
 
The problem can also be solved by using a system known as two's complement representation. Positive numbers are represented the way we expect, and negatives are created by negating the positive value and adding one (otherwise we still end up with a -0). So, using 2's complement, -17 (octal; -15 decimal) becomes: 001 111, negated = 110 000, and add 1 = 110 001. Now if we do the same operation as before:

  001 100  (14 octal, 12 decimal)
+ 110 001  (-17 octal, -15 decimal)
---------
  111 101  (to convert we negate -> 000 010, and add 1 -> 000 011, which is -3!)
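The negate-and-add-one step can be sketched in Python for any word size (the masking keeps only the requested number of bits; the function name is my own):

```python
def twos_complement(value, bits):
    """Two's complement negation in a fixed-width word: flip all bits, add one."""
    mask = (1 << bits) - 1
    return (~value + 1) & mask

# -15 decimal (-17 octal) in a 6-bit word, matching the 110 001 pattern above:
print(format(twos_complement(0b001111, 6), "06b"))   # 110001
# Negating again gets the positive value back:
print(format(twos_complement(0b110001, 6), "06b"))   # 001111
```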

 
This system does work, and it is the one most commonly used. How many values can be represented in a 16-bit word using 2's complement? Still 65,536, but now half the patterns are used for non-negative values (zero takes one of the 'positive' patterns) and half for negative values. So the range of signed integers in a 16-bit word is -32,768 through 0 to 32,767. Those are the lowest and highest values that can be represented.

 N   2**N         N   2**N
 0   1           16   65536
 1   2           17   131072
 2   4           18   262144
 3   8           19   524288
 4   16          20   1048576
 5   32          21   2097152
 6   64          22   4194304
 7   128         23   8388608
 8   256         24   16777216
 9   512         25   33554432
10   1024        26   67108864
11   2048        27   134217728
12   4096        28   268435456
13   8192        29   536870912
14   16384       30   1073741824
15   32768       31   2147483648

It must be accepted that values stored in computers cannot be infinitely large, so let's see what happens when we go beyond the boundaries. This is easier to see using smaller bit strings, so we'll pretend we have a word size of 5 bits. 5 bits give us 32 distinct bit patterns. If we use them to represent signed integers, we get a range of -16 to +15. The boundaries and midpoints look like this:
-16 = 10000,
-1 = 11111,
0 = 00000,
+1 = 00001,
+15 = 01111.
 
Adding two large numbers:
01100 (12)
00111 ( 7)
-----------
10011 (this is a negative number because the high bit is 1)

To convert 10011 to its decimal equivalent, we NEGATE: 01100, add 1 = 01101, to get -15 (base 8) or -13 (base 10).

Subtracting from a low number:
    10000 (-16)
add negative) 11111 ( -1)
  ------------
  (1)01111 (+15?)

Both of these conditions are referred to as overflow. In the first the result has a different sign from the operands, and this is not arithmetically possible in addition. In the second, as well as the result having a different sign from both operands, there is a carry that has no space to be saved. In other words, for the first example, if we didn't have the sign bit, the result simply wouldn't fit in the word. In the second example the result doesn't fit even with the sign bit.
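The overflow rule (the result's sign differs when both operands share a sign) can be checked with a Python sketch of a 5-bit adder (my own helper, not course code):

```python
def add_5bit(a, b):
    """Add two 5-bit two's complement values and flag arithmetic overflow."""
    mask = 0b11111
    sign = 0b10000
    result = (a + b) & mask   # keep 5 bits; any carry out is simply dropped
    overflow = (a & sign) == (b & sign) and (a & sign) != (result & sign)
    return result, overflow

r, ov = add_5bit(0b01100, 0b00111)   # 12 + 7
print(format(r, "05b"), ov)          # 10011 True
r, ov = add_5bit(0b10000, 0b11111)   # -16 + -1
print(format(r, "05b"), ov)          # 01111 True
```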
 
Binary Arithmetic Now
 
When using 2's complement, positive values always look the same as 'unsigned' values, but negative values are now represented differently. Remember that the word size must be sufficient to hold the operands as well as the result. If it is not, you will end up with either overflow or underflow.
in decimal   in octal   unsigned
    47          57       000 101 111
   -18         -22      -000 010 010
 -----       -----
    29          35

to convert -18:  complement 000 010 010 -> 111 101 101, and add 1 -> 111 101 110

now 'add':
      000 101 111
    + 111 101 110
    -------------
  [1] 000 011 101   (the 'extra' 1-bit doesn't cause a problem here)


in decimal   in octal   unsigned
    64         100       001 000 000
   -67        -103      -001 000 011
 -----       -----
    -3          -3

to convert -67:  complement 001 000 011 -> 110 111 100, and add 1 -> 110 111 101

now 'add':
      001 000 000
    + 110 111 101
    -------------
  [0] 111 111 101

convert back:  complement 111 111 101 -> 000 000 010, and add 1 -> 000 000 011  (= 3, so the answer is -3)


What About Fractions?
 
So far we have dealt only with integer values (whole numbers). We still don't have a way to represent really large or really small numbers. Even a 32-bit integer only has the range -2,147,483,648 to 2,147,483,647. What are we to do when we have to work with fractions? The answer this time is to use floating point.
 
 
Floating Point
 
Recall the general form of numbers expressed in scientific notation:
3.14159265358
6.02 x 10**23
 
These are real numbers. When these numbers are printed out by a computer they usually look like this:
3.14159e0
6.02e23
 
The numbers are represented this way by the computer to save space and reduce redundancy. Internally, the number is simply separated into two parts (mantissa and exponent), and the 'e' is left off too. Remember that these numbers are still represented internally as binary. The usual format for a 32-bit real (floating point) number is:
 
 exponent = 11 bits
 mantissa =  21 bits
 
Keep the limits of what values can be represented in 11 and 21 bits in mind as we look at the next part.
 
Both parts of this number are represented as distinct two's complement values.
 
This set-up works well enough that no one has successfully replaced it with another, but it has a few problems that are important to know:
 
1. some numbers don't convert exactly to binary
 
We already know how to convert a decimal integer to any other base. Here's how we convert fractions (by repeated multiplication). Let's say we have room in our number for 4 octal digits (it's easier to illustrate using octal than binary, but you already know how easy it is to convert between binary and octal so the jump to binary is almost trivial).
 
let's convert 0.5 (base 10) to octal:
 
0.5 X 8 = 4.0 (for the next step, take the fraction again)
0.0 X 8 = 0.0 (we're done. Obviously nothing will change after this step)
 
The answer comes from the whole parts on the left of the point, with the top-most value being the first digit after the point. So here the answer is 0.4 (base 8). You could have guessed this, as 5/10 = 4/8.
 
let's try another: 0.6825 (base 10) to octal:
 
0.6825 X 8 = 5.46
0.46 X 8 = 3.68
0.68 X 8 = 5.44
0.44 X 8 = 3.52
0.52 X 8 = 4.16
0.16 X 8 = 1.28
0.28 X 8 = 2.24
0.24 X 8 = 1.92
0.92 X 8 = 7.36
0.36 X 8 = 2.88
0.88 X 8 = 7.04
0.04 X 8 = 0.32
0.32 X 8 = 2.56
 
We could probably go on for much longer but we only have room for 4 digits anyway so our answer will have to be 0.5353. We have lost some of the original number (precision). In some applications, loss of precision is not a big problem, but if you were buying gold, would you be happy with this?
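The repeated-multiplication steps are easy to automate. A Python sketch (the helper name is mine) that keeps a fixed number of places, just as the 4-digit limit above does:

```python
def fraction_to_base(fraction, base, places):
    """Repeatedly multiply by the base; each whole part that appears is the next digit."""
    digits = []
    for _ in range(places):
        fraction *= base
        whole = int(fraction)
        digits.append(str(whole))
        fraction -= whole          # carry on with the fractional part only
    return "0." + "".join(digits)

print(fraction_to_base(0.5, 8, 4))      # 0.4000
print(fraction_to_base(0.6825, 8, 4))   # 0.5353 -- the remaining digits are lost
```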
 
Increasing the size of the number (double precision uses two words) is usually how we get around this problem. Notice it doesn't actually solve the problem; it just makes it smaller.
 
 
2. not all values can be represented
 
This point is best illustrated using a pretend decimal floating point number. Let's say we have a floating point value that looks like this:
 
 exponent = 1 decimal digit
 mantissa =  2 decimal digits
 
Here's a sample number: 74 x 10**2 (= 7400). Now, what's the very next higher value that we can represent?
Here's a hint: it's 75 x 10**2 (= 7500). OK, what happened to 7450? and 7401? and 7499? (answer: they can't be represented at all) What then if we are doing arithmetic with these values and our answer is 7429? Again, the short answer is we can represent the answer in one of two ways: as 7400 or as 7500 - you pick. In reality, we can again make this problem less noticeable by making the word size big enough, but we can't ever make it go away completely.
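The pretend 2-digit-mantissa format can be imitated in Python by rounding to two significant decimal digits (a sketch of the gap problem only, not of any real hardware format; the helper name is mine):

```python
def round_to_two_digit_mantissa(x):
    """Keep two significant decimal digits, like the pretend format above."""
    return float(f"{x:.1e}")   # e.g. 7429 -> '7.4e+03' -> 7400.0

print(round_to_two_digit_mantissa(7429))   # 7400.0 -- 7429 itself cannot be stored
print(round_to_two_digit_mantissa(7500))   # 7500.0 -- but 7500 can
```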
 
Other Things Bits Can Be
ASCII
 
Since all values (Everything. No exceptions.) stored in a computer must be in binary format, it follows that CHARACTERS must also be stored that way. When you type at your keyboard, you see characters appearing on your monitor. What is actually happening is that when you press down on a key, a 'signal' is sent to the computer. That 'signal' consists of a series of 8 bits (one byte) which REPRESENT the character you just typed. This same byte is then sent to your monitor, whose purpose is to receive these bytes representing characters and display them on the screen. The bit pattern of 8 bits (one byte) representing a character is called a character code. There exist several standards for these character codes (such as ASCII and EBCDIC), but the only one we will discuss is the set of ASCII codes.

ASCII defines a specific code for every key you might type at the terminal. The table at the end of this section shows the codes for all 128 characters. The numbers (values) are octal numbers. Thus the code for the letter 'a' is 141 (01100001 in binary, or 97 in decimal), and the character code for the letter 'A' is 101 (01000001 in binary, or 65 in decimal). Note that upper and lower case letters have different character codes. For convenience (as we will see), all upper case letters are listed first, in order, and all lower case letters are listed later, also continuously in order. Notice that the letter 'I' ('capital eye', code 111) is different from the letter 'l' ('el', code 154), and both of these are different from the digit '1' (code 061). Also, the digit 0 (zero) is not the same as the capital letter O (oh). The space bar on the keyboard also sends a code - space is not nothing to a computer - it has the character code 040. The space is one of a sub-class of characters that don't leave anything visible on the screen when you are typing. These characters are called non-printing characters (for obvious reasons, I hope). They include all the character codes from NULL (000) to the space (040), and the DELETE character (177). Some of these characters you will already be familiar with, such as the space, the delete, the newline (RETURN key), the backspace, and maybe the tab and formfeed characters. Most of the others will be new (and largely unnecessary for our purposes).

Since characters are stored as ASCII codes, it would be useful to have the values of those codes follow certain guidelines. One thing that is often done with groups of stored characters (character strings) is to SORT them into some specific order. One might have a long list of character strings that represent people's names. In that case we would likely want to be able to sort them alphabetically (a very common procedure performed on character strings). In order to make this easy it would be nice if we didn't have to worry about the actual characters, but instead were able to work directly with the corresponding character codes. And, in fact, this is made possible by the way the codes have been assigned to the letters. If we look at the upper case letters, we can see that the code for 'A' (101) is numerically less than the code for 'B' (102), and that the code for 'B' fits very nicely between the codes for 'A' and 'C' (103). The same holds true for the lower case letters. Notice that ALL of the lower case letters are numerically greater than ALL of the upper case letters. This way 'A' is less than 'a' and so on.

These numerical comparisons continue to hold true even when we start stringing characters together. 'AA' (101101) is numerically less than 'AD' (101104), and also less than 'Aa'(101141). What this means is that it is possible to sort these strings in the standard alphabetical manner by comparing the representative bit strings as though they were numbers. Numerical comparisons are quite easy to do both in high and low level languages on the computer (they're also pretty fast). So the problem of how to sort character data is now a fairly trivial one, at least it's no more difficult than sorting numbers by computer. For consistency, numbers have lower character codes than all letters, making 'A1' come before 'AA', and the space character is lower than all the printing characters, so 'A ' comes before 'AA', just as it should be.
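A one-line Python check of the ordering rules just described:

```python
# Plain string sorting compares the underlying character codes, so the
# space, digits, upper case, and lower case fall into exactly this order:
print(sorted(["AA", "A1", "Aa", "A ", "AD"]))   # ['A ', 'A1', 'AA', 'AD', 'Aa']
```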

Although it is possible to take the numerical equivalents a step farther and actually do arithmetic with them, it usually doesn't make sense to add letters together - with one exception. Notice that the digits are in proper numerical order, with the 0 starting at 060 (00110000) and the 9 ending at 071 (00111001). Two things are worth mentioning here. It is possible to get the actual numeric value from a given numeric character code simply by subtracting 60 (base 8), or 48 (base 10), from its ASCII code. The other is that if you look at the binary bit pattern, you will notice that the bottom (low) 4 bits of the byte are the actual binary equivalents of the numbers that the codes are representing. This makes converting numerical character strings to their actual numerical values much easier than it would otherwise be.
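Both digit tricks in one Python sketch:

```python
code = ord("7")           # the character code for '7' (067 octal, 55 decimal)

print(code - ord("0"))    # 7 -- subtract the code for '0' (060 octal, 48 decimal)
print(code & 0b1111)      # 7 -- the low 4 bits are already the binary value
```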

Out of all of this, there is one idea that cannot be stressed enough. A BIT PATTERN IS JUST A STRING OF 1's and 0's. Its meaning depends entirely on the context in which it is used. The same bit pattern (100011001010101111, for example) can be an integer, an instruction, a character string, or any number of other things. The bit pattern is the same, it is THE INTERPRETATION OF THAT BIT PATTERN THAT CHANGES.
 
The 6 Basic Bit Operations
There are 6 basic operations that can be done on bits. Three are arithmetic: add, subtract (which is actually adding the negative), and shift; the rest are logical operations: and, or, and not. NOT is the only unary operator, while the others are binary operators. Here 'binary' does not mean they can only work on binary bits: binary operators have two operands (as in A + B) while unary operators have only one (NOT A). We have already discussed ADD and SUBTRACT. The rest are fairly straightforward. The shift operation is as its name implies. Shifting a value by 1 (to the left) has the following effect: 00111 (7) --> 01110 (14). When working in binary, a shift to the left by one bit causes the value to be multiplied by 2 (just as shifting a decimal value to the left causes it to be multiplied by 10). On the other side, a shift to the right: 01010 (10) --> 00101 (5) causes the value to be divided by 2 (same as a decimal shift right divides the number by 10). If the bit pattern is meant to be a signed integer, there are a few rules that must be followed to maintain the integrity of the value.
1. A shift left that changes the sign results in an arithmetic overflow.
2. The new low bit that is 'added' in must always be '0'.
3. The sign of the result must be the same as the original, so for a shift right the value of the high bit is duplicated for the new high bit. So 10000 (-16) shift right --> 11000 (-8). This is referred to as sign extension.
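Python's integers are unbounded, so sign extension on a 5-bit word has to be done by hand; a sketch of rule 3 (the helper name is mine):

```python
def shift_right_5bit(value):
    """Arithmetic shift right in a pretend 5-bit word: duplicate the high bit."""
    sign = value & 0b10000
    return (value >> 1) | sign

print(format(shift_right_5bit(0b10000), "05b"))   # 11000  (-16 -> -8)
print(format(shift_right_5bit(0b01010), "05b"))   # 00101  ( 10 -> 5)
```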
The logical operations have very specific rules for each.
NOT is a simple negation operation, so each bit in any bit pattern is 'FLIPPED' - 1's become 0's, and 0's become 1's (NOT 11001 --> 00110). AND and OR both require two operands, and the rules are described in the following truth tables:

AND | 0  1
----+-----
 0  | 0  0
 1  | 0  1

OR  | 0  1
----+-----
 0  | 0  1
 1  | 1  1


All logical operations can be built using these two basics, but there are two others that are used so commonly that we will define them as well. They are:

XOR | 0  1
----+-----
 0  | 0  1
 1  | 1  0

NAND | 0  1
-----+-----
  0  | 1  1
  1  | 1  0


All of the logical operations will work with either single bit values or on entire bit patterns. If they are used with bit strings longer than one bit, the result is evaluated bit by bit (BITWISE EVALUATION).
EXAMPLE:

AND OPERATION:

11001010 01000111 11110000
00110101 01110010 10101010
-------- -------- --------
00000000 01000010 10100000

OR OPERATION:
11001010 01000111 11110000
00110101 01110010 10101010
-------- -------- --------
11111111 01110111 11111010

NAND OPERATION:
11001010 01000111 11110000
00110101 01110010 10101010
-------- -------- --------
11111111 10111101 01011111

XOR OPERATION (exclusive OR):
11001010 01000111 11110000
00110101 01110010 10101010
-------- -------- --------
11111111 00110101 01011010
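The middle column of each example can be verified with Python's bitwise operators (NAND has no operator of its own, so it is built from AND and NOT, with a mask to stay within 8 bits):

```python
a, b = 0b01000111, 0b01110010            # the middle column of the examples

print(format(a & b, "08b"))              # 01000010  (AND)
print(format(a | b, "08b"))              # 01110111  (OR)
print(format(a ^ b, "08b"))              # 00110101  (XOR)
print(format(~(a & b) & 0xFF, "08b"))    # 10111101  (NAND, masked to 8 bits)
```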
 
Binary Arithmetic Re-Visited
Multiplying
Dividing
 
