8.15 Error-Correcting Code

INTRODUCTION

In contrast to the last section, there is no connotation of secretiveness to the word code as used in this section. We are going to examine briefly the concept of digital communication, say, communication between a satellite and a computer. As a consequence, we will deal only with matrices whose entries are binary digits, namely, 0s and 1s. When adding and multiplying such matrices, we will use arithmetic modulo 2. This arithmetic is defined by the addition and multiplication tables

 + | 0 1        × | 0 1
---+-----      ---+-----
 0 | 0 1        0 | 0 0
 1 | 1 0        1 | 0 1

Fundamental properties such as the commutative and associative laws hold in this system. The only notable exception here is that 1 + 1 = 0.
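As a quick illustration (ours, not the text's), modulo 2 arithmetic is just ordinary integer arithmetic followed by reduction mod 2:

```python
# Modulo 2 ("binary") arithmetic: add and multiply as usual, then reduce mod 2.
def add2(a, b):
    return (a + b) % 2

def mul2(a, b):
    return (a * b) % 2

# Reproduce the addition and multiplication tables.
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} + {b} = {add2(a, b)}   {a} * {b} = {mul2(a, b)}")
# Note the single exception to ordinary arithmetic: 1 + 1 = 0.
```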

Binary Strings

In digital communication the messages or words are binary n-tuples; that is, n-tuples consisting only of 0s and 1s, called bits. An n-bit word is also said to be a binary string of length n.

EXAMPLE 1 Binary Strings

(a) The ordered 4-tuple (0, 1, 0, 1) is a 4-bit word or a string of length four.

(b) The binary (that is, base 2) representation of the number 39 is 1 0 0 1 1 1 or as a 6-tuple (1, 0, 0, 1, 1, 1).
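The expansion in part (b) can be checked mechanically by repeated division by 2; for instance, in Python:

```python
# Binary (base 2) representation of 39: divide by 2 repeatedly, collecting remainders.
n = 39
bits = []
while n > 0:
    bits.append(n % 2)
    n //= 2
bits.reverse()          # most significant bit first
print(bits)             # -> [1, 0, 0, 1, 1, 1], i.e., the 6-tuple (1, 0, 0, 1, 1, 1)
print(format(39, "b"))  # built-in check: '100111'
```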

(c) The ASCII* word for the letter Z is the string of length eight: (0, 1, 0, 1, 1, 0, 1, 0).

For convenience a word of length n will be written as a 1 × n matrix, that is, as a row vector. For example, the 4-bit word in Example 1 would be written as the 1 × 4 matrix W = (0 1 0 1).

Codes

By encoding a message, we mean a process whereby we transform a word W of length n into another word C of length n + m by augmenting W with m additional bits, called parity check bits. An encoded word is said to be a code word. By decoding a received message we mean another process that gives either the decoded message or an indication that an error has occurred during transmission. An encoding/decoding scheme is called a code.

One of the simplest codes is the parity check code. In this code a word is encoded according to the rule:

C = (w1 w2 … wn cn+1),  where  cn+1 = w1 + w2 + … + wn.        (1)

The word parity refers to whether the number of ones in a word is even or odd. The encoding rule given in (1) makes the parity of the code word always even.

EXAMPLE 2 Encoding Words

Use the parity check code to encode the words:

(a) W = (1 0 0 0 1 1)

(b) W = (1 1 1 0 0 1).

SOLUTION

(a) Since the number of ones in W is odd, we add the extra bit 1 to the end of the word W. The code word is then C = (1 0 0 0 1 1 1).

(b) In this case the number of ones is even, so the extra bit added to the word is 0. The encoded word is C = (1 1 1 0 0 1 0).
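Rule (1) — append a bit equal to the mod-2 sum of the word's bits — can be sketched in a few lines; the two words of this example serve as a check (an illustrative sketch, not part of the original text):

```python
def parity_encode(word):
    """Append a parity bit so the code word has an even number of ones."""
    parity = sum(word) % 2          # 1 if the number of ones is odd, else 0
    return word + [parity]

print(parity_encode([1, 0, 0, 0, 1, 1]))  # -> [1, 0, 0, 0, 1, 1, 1]
print(parity_encode([1, 1, 1, 0, 0, 1]))  # -> [1, 1, 1, 0, 0, 1, 0]
```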

In digital communication it is the encoded word C that is transmitted. But due to some kind of interference or noise in the transmission channel, one or more of the bits of C may be changed. Thus, the message transmitted is not always the message received. See FIGURE 8.15.1.

[Flow diagram: message → (encoding) → code word → (transmission, with noise) → received message → (decoding) → decoded message or error indication]

FIGURE 8.15.1 Bits of encoded word may be changed by interference

The parity check code enables the decoder to detect single errors. Suppose R is the message received. A single error in R means that one bit has been changed; either a zero has been changed to a one or a one has been changed to a zero. In either circumstance the parity of the word R is odd.

EXAMPLE 3 Decoding Words

Use the parity check code to decode the words:

(a) R = (1 1 0 0 1 0 1)

(b) R = (1 0 1 1 0 0 0).

SOLUTION

(a) The parity of R is even. We drop the last bit and take the decoded message to be (1 1 0 0 1 0).

(b) The parity of R is odd. The decoding is simple: a parity error.

For some types of digital communication, such as communication internal to a computer, the parity check code is adequate. But Example 3 clearly indicates one major pitfall of the code: If an error occurs, we do not know how to correct it, since we do not know which bit is incorrect. Moreover, multiple errors could occur in transmission. If, say, two ones were changed to zeros in transmission, the received message still has even parity and is decoded by dropping the last bit. In this case at least one of the bits in the decoded message is in error.
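The decoding rule of Example 3, together with the pitfall just described, can be sketched as follows (an illustration under our own naming, not the text's):

```python
def parity_decode(received):
    """Drop the check bit if parity is even; return None to signal a parity error."""
    if sum(received) % 2 != 0:       # odd parity: a single error was detected
        return None
    return received[:-1]             # even parity: assume no error, drop check bit

print(parity_decode([1, 1, 0, 0, 1, 0, 1]))  # -> [1, 1, 0, 0, 1, 0]
print(parity_decode([1, 0, 1, 1, 0, 0, 0]))  # -> None, a parity error

# The pitfall: flipping TWO bits leaves the parity even, so the error is missed.
corrupted = [0, 0, 0, 0, 1, 0, 1]    # first two bits of (1 1 0 0 1 0 1) flipped
print(parity_decode(corrupted))      # -> [0, 0, 0, 0, 1, 0]: wrong, yet undetected
```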

Hamming Codes

The parity check code is an example of an error-detecting, but not an error-correcting, code. For the remainder of this discussion we will consider an error-detecting/correcting code called the Hamming (7, 4) code. This code, one of a widely used set of codes invented by the American mathematician Richard W. Hamming (1915–1998) at Bell Laboratories in the 1950s, is an encoding/decoding scheme that can detect the presence of a single error in a received message and can tell which bit must be corrected. In the (7, 4) code the encoding process consists of transforming a 4-bit word W = (w1 w2 w3 w4) into a 7-bit code word

C = (c1 c2 w1 c3 w2 w3 w4),

where c1, c2, and c3 denote the parity check bits. (Words longer than four bits can be broken up into sequences of words of length four.)

Encoding

In the Hamming (7, 4) code, the parity check bits c1, c2, and c3 are defined in terms of the information bits w1, w2, w3, and w4:

c1 = w1 + w2 + w4
c2 = w1 + w3 + w4        (2)
c3 = w2 + w3 + w4

where the arithmetic is carried out modulo 2. Using matrices, we can write (2) as the product

[c1]   [1 1 0 1] [w1]
[c2] = [1 0 1 1] [w2]        (3)
[c3]   [0 1 1 1] [w3]
                 [w4]

EXAMPLE 4 Encoding a Word

Encode the word W = (1 0 1 1).

SOLUTION

From (3) we have, with w1 = 1, w2 = 0, w3 = 1, and w4 = 1,

[c1]   [1 1 0 1] [1]   [0]
[c2] = [1 0 1 1] [0] = [1]
[c3]   [0 1 1 1] [1]   [0]
                 [1]

From this product we see that c1 = 0, c2 = 1, c3 = 0, and so the corresponding code word is C = (0 1 1 0 0 1 1).
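The parity check bits of (2) give a four-line encoder; the word of this example is reproduced below (a sketch with our own function name, using the section's bit ordering):

```python
def hamming74_encode(w):
    """Encode a 4-bit word (w1 w2 w3 w4) as the 7-bit code word (c1 c2 w1 c3 w2 w3 w4)."""
    w1, w2, w3, w4 = w
    c1 = (w1 + w2 + w4) % 2   # parity check bits computed modulo 2
    c2 = (w1 + w3 + w4) % 2
    c3 = (w2 + w3 + w4) % 2
    return [c1, c2, w1, c3, w2, w3, w4]

print(hamming74_encode([1, 0, 1, 1]))  # -> [0, 1, 1, 0, 0, 1, 1], as in Example 4
```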

Before launching into the details of how to decode a message we need to introduce a special matrix. We first observe that in modulo 2 arithmetic there are no negative numbers; the additive inverse of 1 is 1, not −1. With this in mind, we can write the system (2) in the equivalent form

c1 + w1 + w2 + w4 = 0
c2 + w1 + w3 + w4 = 0        (4)
c3 + w2 + w3 + w4 = 0

These are called parity check equations. This means that each ci is a parity check on three of the digits in the original word. For instance, if the number of ones in the three digits w1, w2, and w4 is odd, then, as in the parity check code discussed previously, we would take c1 = 1, and so on. As a matrix product, (4) can be written

                [c1]
                [c2]
[0 0 0 1 1 1 1] [w1]   [0]
[0 1 1 0 0 1 1] [c3] = [0]        (5)
[1 0 1 0 1 0 1] [w2]   [0]
                [w3]
                [w4]

The 3 × 7 matrix in (5),

        [0 0 0 1 1 1 1]
    H = [0 1 1 0 0 1 1],
        [1 0 1 0 1 0 1]

is called the parity check matrix. We have shown in (5) that the binary digits of a code word C = (c1 c2 w1 c3 w2 w3 w4) satisfy the matrix equation

HC^T = 0.        (6)

A closer inspection of H reveals a surprising fact: The columns of H, left to right, are the numbers 1 through 7 written in binary. For example, by reading the sixth column from top to bottom as 1 1 0, we recognize the binary representation of the number 6.

Let R be a 1 × 7 matrix representing the received message. Since H is a 3 × 7 matrix and R^T is a 7 × 1 matrix, the product

S = HR^T        (7)

is a 3 × 1 matrix called the syndrome of R.

Decoding

If the syndrome of the received message R is

S = HR^T = 0,

then, in view of the result in (6), we conclude that R is a code word, and it is assumed that the transmission is correct and that R is the same as the original encoded message C. The decoding of the message is accomplished by simply dropping the three check bits in R.

EXAMPLE 5 Syndromes

Compute the syndrome of:

(a) R = (1 1 0 1 0 0 1)

(b) R = (1 0 0 1 0 1 0).

SOLUTION

(a) From (7) we have

S = HR^T = (0 0 0)^T = 0.

We conclude that R is a code word. Dropping the check bits, that is, the first, second, and fourth bits of (1 1 0 1 0 0 1), gives the decoded message (0 0 0 1).

(b) From (7) we have

S = HR^T = (0 1 1)^T.

Since S ≠ 0, the received message R is not a code word.
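The syndrome computation in (7) is a mod-2 matrix–vector product; the short sketch below (our illustration, with H the parity check matrix of this section) checks both received words of this example:

```python
# Parity check matrix H: column j is the binary representation of j (MSB on top).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def syndrome(r):
    """S = H R^T, computed modulo 2."""
    return [sum(h * b for h, b in zip(row, r)) % 2 for row in H]

print(syndrome([1, 1, 0, 1, 0, 0, 1]))  # -> [0, 0, 0]: a code word
print(syndrome([1, 0, 0, 1, 0, 1, 0]))  # -> [0, 1, 1]: not a code word
```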

As mentioned earlier, the Hamming (7, 4) code enables us to detect and also to correct a single error in the message R. Now let C be a code word and let E = (e1 e2 e3 e4 e5 e6 e7) be a single-error noise word added to C during its transmission. The entries of E are defined by

    ei = 1,  if the error occurs in the ith bit,
    ei = 0,  otherwise.

The received message is then R = C + E. From the property R^T = C^T + E^T and the distributive law, we see that the syndrome of R is the same as the syndrome of E:

HR^T = H(C^T + E^T) = HC^T + HE^T = 0 + HE^T = HE^T.

From the definition of matrix addition, it is a straightforward matter to verify that the syndrome of E,

S = HE^T,

can be written as the sum of the column vectors of H with coefficients the symbols that denote the bits where the error might occur:

HE^T = e1 (0 0 1)^T + e2 (0 1 0)^T + e3 (0 1 1)^T + e4 (1 0 0)^T + e5 (1 0 1)^T + e6 (1 1 0)^T + e7 (1 1 1)^T.        (8)

Now consider the set of 3 × 1 column vectors whose entries are binary digits. Since there are only two ways to select each of the three entries, there are 2^3 = 8 such vectors. The seven nonzero vectors are the columns of H, that is, the column vectors displayed in (8). The syndrome S of the received message R is a 3 × 1 column vector with binary entries; hence, if S ≠ 0, then S must be one of the columns of H. If R contains a single error, then S ≠ 0 and, since the entries of E are all zero except for a single one, we see from (8) that the syndrome itself indicates which bit is in error. In practice there is no need to write out (8); just compute the syndrome S of the received message R. S is a column of H and consequently is the binary number of the bit that is in error.

EXAMPLE 6 Decoding a Word

In part (b) of Example 5 we saw that the syndrome of the message R = (1 0 0 1 0 1 0) was S = (0 1 1)^T. This is the third column of H (or the binary representation of the number 3), and so we conclude that the third bit in R is incorrect. Changing that zero to a one gives the code word C = (1 0 1 1 0 1 0). Hence by dropping the first, second, and fourth bits from C we arrive at the decoded message (1 0 1 0).
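The full decoding procedure — compute the syndrome, flip the bit whose position the syndrome names, then drop the check bits — can be sketched as follows (again an illustration with our own names, with H as in (5)):

```python
# Parity check matrix H; column j is the binary representation of j (MSB on top).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def hamming74_decode(r):
    """Correct at most one error in r, then return the word (w1 w2 w3 w4)."""
    r = list(r)
    s = [sum(h * b for h, b in zip(row, r)) % 2 for row in H]
    pos = 4 * s[0] + 2 * s[1] + s[2]   # syndrome read as a binary number
    if pos != 0:                        # nonzero syndrome: bit `pos` is wrong
        r[pos - 1] ^= 1                 # flip it (bit positions are 1-based)
    return [r[2], r[4], r[5], r[6]]     # keep w1 w2 w3 w4, drop c1 c2 c3

print(hamming74_decode([1, 0, 0, 1, 0, 1, 0]))  # -> [1, 0, 1, 0], as in Example 6
print(hamming74_decode([1, 1, 0, 1, 0, 0, 1]))  # -> [0, 0, 0, 1], as in Example 5(a)
```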

In these brief descriptions of cryptography and coding theory we have not even begun to scratch the surface of these fascinating subjects. Our goal was a modest one: to show how matrix theory is a natural working tool in various areas of mathematics and computer science.

REMARKS

The Hamming (7, 4) code can detect but not correct any pair of errors. Students interested in how this is done or in further details of coding theory should check their library for more specialized texts.

8.15 Exercises Answers to selected odd-numbered problems begin on page ANS-21.

In Problems 1–6, encode the given word using the parity check code.

  1. (0 1 1)
  2. (1 1 1)
  3. (0 0 0 1)
  4. (1 0 1 0)
  5. (1 0 1 0 1 0 0)
  6. (0 1 1 0 1 0 1)

In Problems 7–12, decode the given message using the parity check code.

  7. (1 0 0 1)
  8. (0 0 1 1)
  9. (1 1 1 0 0)
  10. (1 0 1 0 0)
  11. (1 0 0 1 1 1)
  12. (1 0 0 1 0 1)

In Problems 13–18, encode the given word using the Hamming (7, 4) code.

  13. (1 1 1 0)
  14. (0 0 1 1)
  15. (0 1 0 1)
  16. (0 0 0 1)
  17. (0 1 1 0)
  18. (1 1 0 0)

In Problems 19–28, determine whether the given message is a code word in the Hamming (7, 4) code. If it is, decode it. If it is not, correct the single error and decode the corrected message.

  19. (0 0 0 0 0 0 0)
  20. (1 1 0 0 0 0 0)
  21. (1 1 0 1 1 0 1)
  22. (0 1 0 1 0 1 0)
  23. (1 1 1 1 1 1 1)
  24. (1 1 0 0 1 1 0)
  25. (0 1 1 1 0 0 1)
  26. (1 0 0 1 0 0 1)
  27. (1 0 1 1 0 1 1)
  28. (0 0 1 0 0 1 1)
  29. (a) Determine the total number of 7-tuples with binary entries.

    (b) How many 7-tuple code words are there in the Hamming (7, 4) code?

    (c) List all code words in the Hamming (7, 4) code.

  30. (a) In the Hamming (8, 4) code a word

      W = (w1 w2 w3 w4)

      of length four is transformed into a code word of length eight:

      C = (c1 c2 c3 w1 c4 w2 w3 w4),

      where the parity check equations are

      Encode the word (0 1 1 0).

    (b) From the system in part (a), determine the parity check matrix H.

    (c) Using the matrix H from part (b), compute the syndrome S of the received message

      R = (0 0 1 1 1 1 0 0).

 

*American Standard Code for Information Interchange.