What we have called Coding Theory should more properly be called the Theory of Error-Correcting Codes, since there is another, older aspect of Coding Theory which deals with the creation and decoding of secret messages. That field is called Cryptography, and we will not be interested in it. Rather, the problem we wish to address concerns the difficulties inherent in the transmission of messages. More particularly, suppose that we wished to transmit a message and knew that in the process of transmission the message would be altered, due to weak signals, sporadic electrical bursts and other naturally occurring noise that creeps into the transmission medium. The problem is to ensure that the intended message is obtainable from whatever is actually received.

One simple approach to this problem is what is called a repeat code. For instance, if we wanted to send the message BAD NEWS, we could repeat each letter a certain number of times and send, say,

BBBBBAAAAADDDDD NNNNNEEEEEWWWWWSSSSS.

Even if a number of these letters got garbled in transmission, the intended message could be recovered from a received message that might look like

BBBEBFAAAADGDDD . MNNNTEEEEEWWWSWRRSSS,

by a process called *majority decoding*: in each block of repeated symbols, take the symbol that occurs most often.
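The encoding and majority decoding of a repeat code can be sketched in a few lines (an illustrative Python sketch, with hypothetical function names):

```python
from collections import Counter

def encode_repeat(message, r=5):
    """Replace each symbol of the message by r copies of itself."""
    return "".join(ch * r for ch in message)

def decode_repeat(received, r=5):
    """Decode each block of r symbols by taking the most frequent symbol."""
    blocks = (received[i:i + r] for i in range(0, len(received), r))
    return "".join(Counter(b).most_common(1)[0][0] for b in blocks)
```

For instance, `decode_repeat("BBBEBFAAAADGDDD")` recovers BAD even though three of the fifteen transmitted symbols were garbled.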

Before leaving the repeat codes to look at other coding schemes, let us introduce some
terminology. Each block of repeated symbols is called a *Code word*, i.e., a code word is what
is transmitted in place of one piece of information in the original message. The set of all code
words is called a *Code*. If all the code words in a code have the same length, then the code is
called a *Block code*. The repeat codes are block codes. One feature that a useful code must
have is the ability to detect errors. The repeat code with code words having length 5 can
always detect from 1 to 4 errors made in the transmission of a code word, since any 5 letter
word composed of more than one letter is not a code word. However, it is possible for 5
errors to go undetected (how?). We would say that this code is *4-error detecting*. Another
feature is the ability to correct errors, i.e., being able to decode the correct information from
the error riddled received words. The repeat code we are dealing with can always correct 1 or
2 errors, but may decode a word with 3 or more errors incorrectly, so it is a *2-error
correcting code*.

First of all we shall restrict our horizons and only consider block codes, so all code words will have the same length. Secondly, we will assume that the alphabet used to create our code words consists only of 0 and 1. This last restriction is not as limiting as it appears; after all, a computer's word-handling abilities rest ultimately on strings of 0's and 1's. We are concerned then with binary block codes. The words (code words and others) that we are dealing with are thus ordered n-tuples of 0's and 1's, where n is the length of the words. These can be viewed abstractly as elements of an n-dimensional vector space over GF(2).

The *Hamming distance* between two words is the number of places in which they differ.
So, for example, the words (0,0,1,1,1,0) and (1,0,1,1,0,0) would have a Hamming distance of
2. This Hamming distance is a metric on the vector space, i.e., if d(x,y) denotes the Hamming
distance between vectors x and y, then d satisfies:

- d(x,y) >= 0, with d(x,y) = 0 if and only if x = y,
- d(x,y) = d(y,x), and
- d(x,y) + d(y,z) >= d(x,z)
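The definition translates directly into code; here is a small Python illustration (the helper name is ours):

```python
def hamming_distance(x, y):
    """Number of places in which two words of the same length differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))

# The example from the text:
d = hamming_distance((0, 0, 1, 1, 1, 0), (1, 0, 1, 1, 0, 0))
```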

The *minimum distance* of a code C is the smallest distance between any pair of distinct
code words in C (assuming that C is finite). It is the minimum distance of a code that
measures a code's error correcting capabilities. If the minimum distance of a code C is 2e +
1, then C is an e-error correcting code, since if e or fewer errors are made in a code word,
the resulting word is closer to the original code word than it is to any other code word and so
can be correctly decoded.

The *weight* of a word is the number of non-zero components in the vector. Alternatively,
the weight is the distance of the word from the zero vector. Examining the weights of the
code words sometimes gives useful information about a particular code.

An important class of codes is that of the *linear codes*: these are the codes whose code
words form a vector subspace. If the vector space of all words is n-dimensional and the
subspace is k-dimensional, then we speak of the subspace as an (n,k)-linear code.

In general, finding the minimum distance of a code requires comparing every pair of distinct code words. For a linear code, however, this is not necessary.

**Proposition VI.1.1** - *In a linear code the minimum distance is equal to the minimal weight among all non-zero code words.*

*Proof:* Let x and y be distinct code words in the code C; then x - y is a non-zero code word of C, since C is linear. We then have d(x,y) = d(x-y,0), which is the weight of x-y. Thus every distance between distinct code words is the weight of some non-zero code word. Conversely, since 0 is in C, the weight of any non-zero code word x is d(x,0), a distance between distinct code words. The two minima therefore coincide.
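The proposition can be checked mechanically on a small example; the code below is a hypothetical (4,2)-linear code chosen only for illustration:

```python
from itertools import combinations

# A small (4,2)-linear code over GF(2).
code = [(0, 0, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)]

def add(x, y):
    """Component-wise sum over GF(2); note that over GF(2), x - y = x + y."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

# Linearity: the code is closed under addition.
is_linear = all(add(x, y) in code for x in code for y in code)

min_distance = min(sum(a != b for a, b in zip(x, y))
                   for x, y in combinations(code, 2))
min_weight = min(sum(w) for w in code if any(w))
```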

We shall now look at two ways of describing a linear code C. The first is given by a
*generator matrix* G which has as its rows a set of basis vectors of the linear subspace C.
Since the property we are most interested in is possible error-correction and this property is
not changed if in all code words we interchange two symbols (e.g. the first and second letter
of each code word) we shall call two codes equivalent if one can be obtained by applying a
fixed permutation of symbols to the words of the other code. With this in mind we see that
for every linear code there is an equivalent code which has a generator matrix of the form G
= [I_{k} P], where I_{k} is the k by k identity matrix and P is a k by n-k matrix. We call this the
*standard form of G*.

We now come to the second description of a linear code C. The orthogonal complement
of C, i.e. the set of all vectors which are orthogonal to every vector in C [orthogonal = inner
product is 0], is a subspace and thus another code called the *dual code* of C, and denoted by
C'. If H is a generator matrix for C', then H is called a *parity check* matrix for C. In general
a parity check for the code C is a vector x which is orthogonal to all code words of C and we
shall call any matrix H a parity check matrix if the rows of H generate the dual code of C.
Therefore, a code C is defined by such a parity check matrix H as follows:

C = { x | xH^{t} = 0 }.

Let us consider an example. Let C be the (7,4)-linear code generated by the rows of G:

    1 0 0 0 1 1 0
G = 0 1 0 0 0 1 1
    0 0 1 0 1 1 1
    0 0 0 1 1 0 1

We get the 16 code words by multiplying G on the left by the 16 different row vectors of length 4 over GF(2), i.e., by forming all the products xG. They are:

0 0 0 0 0 0 0   0 0 0 1 1 0 1   0 1 1 0 1 0 0   1 1 0 1 0 0 0
1 0 0 0 1 1 0   1 1 0 0 1 0 1   0 1 0 1 1 1 0   1 0 1 1 1 0 0
0 1 0 0 0 1 1   1 0 1 0 0 0 1   0 0 1 1 0 1 0   0 1 1 1 0 0 1
0 0 1 0 1 1 1   1 0 0 1 0 1 1   1 1 1 0 0 1 0   1 1 1 1 1 1 1

Notice that there are 7 code words of weight 3, 7 of weight 4, 1 of weight 7 and 1 of weight 0. Since this is a linear code, the minimum distance of this code is 3 and so it is a 1-error correcting code.
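The computation of the 16 code words, and of their weight distribution, can be reproduced with a short Python sketch (G is the generator matrix above):

```python
from itertools import product

G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 0, 1, 1),
     (0, 0, 1, 0, 1, 1, 1),
     (0, 0, 0, 1, 1, 0, 1)]

def encode(u, G):
    """The code word uG, computed over GF(2)."""
    return tuple(sum(u[i] * G[i][j] for i in range(len(G))) % 2
                 for j in range(len(G[0])))

codewords = [encode(u, G) for u in product((0, 1), repeat=4)]
weights = [sum(w) for w in codewords]
```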

A parity check matrix for this code is given by

    1 0 1 1 1 0 0
H = 1 1 1 0 0 1 0
    0 1 1 1 0 0 1

[Verify this].
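The bracketed verification can be carried out mechanically: every code word x of C must have zero syndrome xH^{t}, and any word failing this test is not in C. A Python sketch:

```python
from itertools import product

G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 0, 1, 1),
     (0, 0, 1, 0, 1, 1, 1),
     (0, 0, 0, 1, 1, 0, 1)]

H = [(1, 0, 1, 1, 1, 0, 0),
     (1, 1, 1, 0, 0, 1, 0),
     (0, 1, 1, 1, 0, 0, 1)]

def encode(u, G):
    """The code word uG over GF(2)."""
    return tuple(sum(u[i] * G[i][j] for i in range(len(G))) % 2
                 for j in range(len(G[0])))

def syndrome(x, H):
    """x H^t over GF(2); it is the zero vector exactly when x is in C."""
    return tuple(sum(a * b for a, b in zip(x, row)) % 2 for row in H)

codewords = [encode(u, G) for u in product((0, 1), repeat=4)]
```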

This code is generally known as the *(7,4)-Hamming Code* being one of a series of linear
codes due to Hamming and Golay.

To examine the process of using codes we shall look at a real application. The Mariner 9 was a space probe whose mission was to fly by Mars and transmit pictures back to Earth. The black and white camera aboard the Mariner 9 took the pictures, a fine grid was then placed over each picture, and for each square of the grid the degree of blackness was measured on a scale from 0 to 63. These numbers, expressed in binary, are the data that is transmitted to Earth (more precisely, to the Jet Propulsion Laboratory of the California Institute of Technology in Pasadena). On arrival the signal is very weak and must be amplified. Noise from space added to the signal, together with thermal noise from the amplifier, occasionally causes a signal transmitted as a 1 to be interpreted by the receiver as a 0, and vice versa. If the probability that this occurs is 0.05, then by the calculation done in the introduction, if no coding were done approximately 26% of the received picture would be incorrect. Thus, there is clearly a need to code this information with an error-correcting code.

Now the question is, what code should be used? Any code will increase the size of the data being sent, and this creates a problem. The Mariner 9 is a small vehicle and cannot carry a huge transmitter, so the transmitted signal had to be directional; but over the long distances involved a directional signal has alignment problems. So, there was a maximum size to how much data could be transmitted at one time (while the transmitter was aligned). This turned out to be about 5 times the size of the original data, so since the data consisted of 6 bits (0,1-vectors of length 6) the code words could be about 30 bits long. The 5-repeat code was a possibility, having the advantage that it is very easy to implement, but it is only 2-error correcting.
An Hadamard code based on an Hadamard matrix of order 32, on the other hand, would be 7-error correcting and so worth the added difficulty of implementing it. Using this code, the probability of error in the picture is reduced to only 0.01% (the 5-repeat code would have a probability of error of about 1%).

We now turn our attention to the problems of coding and decoding using an Hadamard code. At first glance, coding doesn't seem to be a problem, after all there are 64 data types and 64 code words - so any arbitrary assignment of data type to code word will work. The problem lies in the fact that the Mariner 9 is small, and this approach would require storing all 64 32-bit code words. It turns out to be more economical, in terms of space and weight, to design hardware that will actually calculate the code words rather than read them out of a stored array. By choosing the Hadamard matrix correctly, the Hadamard code will turn out to be a linear code and so this calculation is simply multiplying the data by the generator matrix of the code. The correct choice for the Hadamard matrix is the one obtained by repeatedly taking the direct product of the order 2 Hadamard matrix. [Prove that such an Hadamard code is linear by induction].
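The construction by repeated direct products can be sketched as follows (an illustrative Python sketch; the spacecraft, of course, did this with dedicated hardware):

```python
def direct_product(A, B):
    """Kronecker (direct) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

H2 = [[1, 1], [1, -1]]   # the Hadamard matrix of order 2

# Repeatedly take the direct product with H2: orders 4, 8, 16, 32.
H = H2
for _ in range(4):
    H = direct_product(H2, H)
```

One can then verify the defining Hadamard property: distinct rows of H are orthogonal, i.e. HH^{t} = 32I.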

Now consider the decoding problem. A simple scheme for decoding is as follows: A
received signal, i.e. a sequence of 32 zeros and ones, is first changed into its ±1 form (by
changing each 0 to -1). If the result is the vector x and if there are no errors, then xH^{t}, where
H is the original Hadamard matrix, will be a vector with 31 components equal to 0 and one
component equal to either ±32. In the presence of errors these numbers are changed, but if
the number of errors is at most 7, then the values 0 can increase in absolute value to at most 14 and the value 32 can decrease to no less than 18 (each transmission error changes every component of xH^{t} by exactly 2). Thus the maximal entry in xH^{t} will tell us which row of
H or -H (if it is negative) was transmitted. While this is the actual algorithm used to decode
the Mariner 9 signals, it is a bit slow from the computational point of view (requiring 32^{2} = 1024
multiplications and the corresponding additions for each code word), so a number of
computational tricks are employed to reduce the actual computation to less than 1/3 of what
the algorithm calls for.
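The simple scheme above can be sketched as follows (again illustrative Python, not the computational tricks actually employed):

```python
def direct_product(A, B):
    """Kronecker (direct) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

# Build the order-32 Hadamard matrix by repeated direct products.
H2 = [[1, 1], [1, -1]]
H = H2
for _ in range(4):
    H = direct_product(H2, H)

def decode(received, H):
    """Decode a 0/1 word of length 32 by correlating with the rows of H."""
    x = [1 if b else -1 for b in received]            # change each 0 to -1
    corr = [sum(a * b for a, b in zip(x, row)) for row in H]  # x H^t
    i = max(range(len(corr)), key=lambda k: abs(corr[k]))
    row = H[i] if corr[i] > 0 else [-v for v in H[i]]  # row of H or -H
    return [(v + 1) // 2 for v in row]                 # back to 0/1 form
```

With at most 7 errors the correct correlation is at least 18 in absolute value while all others are at most 14, so the maximal entry identifies the transmitted row.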

E.R. Berlekamp, *Algebraic Coding Theory*, McGraw-Hill, N.Y., 1968.

I.F. Blake and R.C. Mullin, *An Introduction to Algebraic and Combinatorial Coding Theory*, Academic Press, N.Y., 1976.

P.J. Cameron and J.H. Van Lint, *Graph Theory, Coding Theory and Block Designs*, Cambridge University Press, Cambridge, 1975.

F.J. MacWilliams and N.J.A. Sloane, *The Theory of Error-Correcting Codes*, North Holland, Amsterdam, 1977.

W.W. Peterson and E.J. Weldon, Jr., *Error-Correcting Codes*, MIT Press, Cambridge, 1972.

V. Pless, *Introduction to the Theory of Error-Correcting Codes*, Wiley, New York, 1982.

*The Mariner 9 mission and the coding theory used in that project are the subjects of:*

J.H. Van Lint, "Coding, decoding and Combinatorics", in *Applications of Combinatorics*, ed.
R.J. Wilson, Shiva, Cheshire, 1982.

E.C. Posner, "Combinatorial Structures in Planetary Reconnaissance" in *Error Correcting
Codes*, ed. H.B. Mann, Wiley, N.Y. 1968.