8.10 Orthogonal Matrices

INTRODUCTION

In this section we are going to use some elementary properties of complex numbers. Suppose $z = a + ib$ denotes a complex number, where $a$ and $b$ are real and the symbol $i$ is defined by $i^2 = -1$. If $\bar{z} = a - ib$ is the conjugate of $z$, then the equality $z = \bar{z}$, or $a + ib = a - ib$, implies that $b = 0$. In other words, if $z = \bar{z}$, then $z$ is a real number. In addition, it is easily verified that the product of a complex number $z$ and its conjugate $\bar{z}$ is a real number: $z\bar{z} = a^2 + b^2$. The magnitude of $z$ is defined to be the real number $|z| = \sqrt{a^2 + b^2}$. The magnitude of $z$ can be expressed in terms of the product $z\bar{z}$: $|z| = \sqrt{z\bar{z}}$, or $|z|^2 = z\bar{z}$. A detailed discussion of complex numbers can be found in Section 17.1.
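These facts are easy to check numerically. A minimal sketch using Python's built-in complex type (the particular number z = 3 + 4i is an arbitrary illustration):

```python
# Verify that z * conjugate(z) = a^2 + b^2 = |z|^2 for a sample z = a + ib.
z = 3 + 4j                 # a = 3, b = 4
z_bar = z.conjugate()      # the conjugate a - ib

print(z * z_bar)           # (25+0j): a real number, a^2 + b^2 = 9 + 16
print(abs(z) ** 2)         # 25.0: |z|^2 agrees with z * z_bar
```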

There are many types of special matrices, but two types occur again and again in applications: symmetric matrices (page 380) and orthogonal matrices (page 420). In this section we examine both of these types of matrices in further detail.

Symmetric Matrices

We begin by recalling, in formal terms, the definition of a symmetric matrix.

DEFINITION 8.10.1 Symmetric Matrix

An n × n matrix A is symmetric if A = AT, where AT is the transpose of A.

The proof of the next theorem depends on the properties of complex numbers discussed in the review at the start of this section.

THEOREM 8.10.1 Real Eigenvalues

Let A be a symmetric matrix with real entries. Then the eigenvalues of A are real.

PROOF:

If K is an eigenvector corresponding to an eigenvalue λ of A, then AK = λK. The conjugate of the last equation is

$\overline{AK} = \overline{\lambda K} \quad \text{or} \quad \bar{A}\bar{K} = \bar{\lambda}\bar{K}.$  (1)

Since the entries of A are real, we have $\bar{A} = A$, and so (1) becomes

$A\bar{K} = \bar{\lambda}\bar{K}.$  (2)

We now take the transpose of (2), use the fact that A is symmetric, and multiply the resulting equation on the right by K:

$\bar{K}^T A K = \bar{\lambda}\,\bar{K}^T K.$  (3)

But if we multiply AK = λK on the left by $\bar{K}^T$, we obtain

$\bar{K}^T A K = \lambda\,\bar{K}^T K.$  (4)

Subtracting (4) from (3) then gives

$0 = (\bar{\lambda} - \lambda)\,\bar{K}^T K.$  (5)

Now $\bar{K}^T$ is a 1 × n matrix and K is an n × 1 matrix, so the product $\bar{K}^T K$ is the 1 × 1 matrix $\bar{K}^T K = \left(|k_1|^2 + |k_2|^2 + \cdots + |k_n|^2\right)$. Since by definition K ≠ 0, the last expression is a positive quantity. Therefore we conclude from (5) that $\bar{\lambda} - \lambda = 0$, or $\bar{\lambda} = \lambda$. This implies that λ is a real number.
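Theorem 8.10.1 is easy to confirm numerically. The following sketch assumes NumPy is available; the symmetric matrix shown is an arbitrary illustration, not one taken from this section:

```python
import numpy as np

# An arbitrary real symmetric matrix (A equals its transpose).
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# The general eigenvalue routine makes no use of symmetry ...
w = np.linalg.eigvals(A)
print(np.allclose(np.imag(w), 0.0))   # True: the eigenvalues are real

# ... while eigvalsh exploits symmetry and returns real eigenvalues directly.
print(np.linalg.eigvalsh(A))          # approximately [0.586, 2.0, 3.414]
```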

Inner Product

In Rn the inner product or dot product of two vectors $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ is given by

$\mathbf{x} \cdot \mathbf{y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n.$  (6)

Now if X and Y are n × 1 column vectors, $X = (x_1, x_2, \ldots, x_n)^T$ and $Y = (y_1, y_2, \ldots, y_n)^T$, then the matrix analogue of (6) is

$X \cdot Y = X^TY = \left(x_1y_1 + x_2y_2 + \cdots + x_ny_n\right).$*  (7)

Of course, for the column vectors given, $Y^TX = X^TY$. The norm of a column vector X is given by

$\|X\| = \sqrt{X \cdot X} = \sqrt{X^TX} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$
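In code, (6), (7), and the norm are one-liners. A brief sketch with NumPy (the vectors are arbitrary illustrations):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([3.0, 0.0, -1.0])

# Inner product (6): x . y = x1*y1 + x2*y2 + x3*y3
print(x @ y)                 # 1.0

# Matrix analogue (7): treat x, y as 3 x 1 column vectors and form X^T Y
X, Y = x.reshape(3, 1), y.reshape(3, 1)
print(X.T @ Y)               # [[1.]] -- a 1 x 1 matrix, i.e., a scalar

# Norm: ||x|| = sqrt(x . x)
print(np.sqrt(x @ x), np.linalg.norm(x))   # 3.0 3.0
```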

THEOREM 8.10.2 Orthogonal Eigenvectors

Let A be an n × n symmetric matrix. Then eigenvectors corresponding to distinct (different) eigenvalues are orthogonal.

PROOF:

Let λ1 and λ2 be two distinct eigenvalues of A corresponding to the eigenvectors K1 and K2, respectively. We wish to show that $K_1 \cdot K_2 = 0$.

Now by definition we must have

AK1 = λ1K1 and AK2 = λ2K2. (8)

We form the transpose of the first of these equations, use AT = A, and then multiply the result on the right by K2:

$K_1^T A K_2 = \lambda_1 K_1^T K_2.$  (9)

The second equation in (8) is multiplied on the left by $K_1^T$:

$K_1^T A K_2 = \lambda_2 K_1^T K_2.$  (10)

Subtracting (10) from (9) yields

$0 = (\lambda_1 - \lambda_2)\, K_1^T K_2.$

Since λ1 ≠ λ2, it follows that $K_1^T K_2 = K_1 \cdot K_2 = 0$.

EXAMPLE 1 Orthogonal Eigenvectors

The eigenvalues of the given symmetric matrix A are λ1 = 0, λ2 = 1, and λ3 = −2, with corresponding eigenvectors K1, K2, and K3.

Since all the eigenvalues are distinct, it follows from Theorem 8.10.2 that the eigenvectors are mutually orthogonal; that is,

$K_1 \cdot K_2 = K_1 \cdot K_3 = K_2 \cdot K_3 = 0.$

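As a numerical illustration of Theorem 8.10.2, the following sketch (assuming NumPy; the symmetric matrix is an arbitrary stand-in, not the matrix of Example 1) confirms that eigenvectors belonging to distinct eigenvalues come out mutually orthogonal:

```python
import numpy as np

# Stand-in symmetric matrix with three distinct eigenvalues.
A = np.array([[1.0, 2.0, 0.0],
              [2.0, 1.0, 0.0],
              [0.0, 0.0, 5.0]])

w, V = np.linalg.eigh(A)          # eigh is the symmetric eigensolver
print(w)                          # [-1.  3.  5.]: distinct eigenvalues

# Pairwise dot products of the eigenvector columns are (numerically) zero.
K1, K2, K3 = V[:, 0], V[:, 1], V[:, 2]
print(K1 @ K2, K1 @ K3, K2 @ K3)  # 0.0 0.0 0.0
```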
We saw in Example 3 of Section 8.8 that it may not be possible to find n linearly independent eigenvectors for an n × n matrix A when some of the eigenvalues are repeated. But a symmetric matrix is an exception. It can be proved that a set of n linearly independent eigenvectors can always be found for an n × n symmetric matrix A even when there is some repetition of the eigenvalues. (See Example 4 of Section 8.8.)

See (2) of Section 7.6 for the definition of the inner product in Rn.

A set of vectors x1, x2, . . . , xn in Rn is called orthonormal if every pair of distinct vectors is orthogonal and each vector in the set is a unit vector. In terms of the inner product of vectors, the set is orthonormal if

$\mathbf{x}_i \cdot \mathbf{x}_j = 0, \quad i \neq j, \quad i, j = 1, 2, \ldots, n \qquad \text{and} \qquad \mathbf{x}_i \cdot \mathbf{x}_i = 1, \quad i = 1, 2, \ldots, n.$

The last condition simply states that each vector has unit norm: $\|\mathbf{x}_i\| = 1$, $i = 1, 2, \ldots, n$.

Orthogonal Matrix

The concept of an orthonormal set of vectors plays an important role in the consideration of the next type of matrix.

DEFINITION 8.10.2 Orthogonal Matrix

An n × n nonsingular matrix A is orthogonal if A−1 = AT.

In other words, A is orthogonal if ATA = I.
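In code this condition is a one-line check. A minimal sketch with NumPy (the permutation matrix below is simply a convenient known-orthogonal example):

```python
import numpy as np

# A permutation matrix is a simple example of an orthogonal matrix.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])

print(np.allclose(A.T @ A, np.eye(3)))     # True: A^T A = I
print(np.allclose(np.linalg.inv(A), A.T))  # True: A^{-1} = A^T
```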

EXAMPLE 2 Orthogonal Matrices

(a) The n × n identity matrix I is an orthogonal matrix, since IT = I and consequently ITI = II = I.

(b) The matrix

is orthogonal. To see this, we need only verify that ATA = I.

THEOREM 8.10.3 Criterion for an Orthogonal Matrix

An n × n matrix A is orthogonal if and only if its columns X1, X2, . . ., Xn form an orthonormal set.

PARTIAL PROOF:

Let us suppose that A is an n × n orthogonal matrix with columns X1, X2, . . ., Xn. Hence the rows of AT are $X_1^T, X_2^T, \ldots, X_n^T$. But since A is orthogonal, ATA = I; that is,

$A^TA = \begin{pmatrix} X_1^TX_1 & X_1^TX_2 & \cdots & X_1^TX_n \\ X_2^TX_1 & X_2^TX_2 & \cdots & X_2^TX_n \\ \vdots & \vdots & & \vdots \\ X_n^TX_1 & X_n^TX_2 & \cdots & X_n^TX_n \end{pmatrix} = I.$

It follows from the definition of equality of matrices that

$X_i^TX_j = X_i \cdot X_j = 0, \quad i \neq j, \qquad \text{and} \qquad X_i^TX_i = X_i \cdot X_i = 1, \quad i = 1, 2, \ldots, n.$

This means that the columns of the orthogonal matrix form an orthonormal set of n vectors.

If we write the columns of the matrix in part (b) of Example 2 as the vectors X1, X2, and X3, then, consistent with Theorem 8.10.3, the vectors are orthogonal,

$X_1 \cdot X_2 = X_1 \cdot X_3 = X_2 \cdot X_3 = 0,$

and are unit vectors,

$\|X_1\| = \|X_2\| = \|X_3\| = 1.$
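The partial proof of Theorem 8.10.3 translates directly into code: the (i, j) entry of ATA is the dot product Xi · Xj, so ATA = I says exactly that the columns are orthonormal. A sketch (the matrix is an arbitrary orthogonal stand-in, not the matrix of Example 2):

```python
import numpy as np

# Stand-in orthogonal matrix; its columns form an orthonormal set in R^3.
s = 1.0 / np.sqrt(2.0)
A = np.array([[s,    s,   0.0],
              [s,   -s,   0.0],
              [0.0,  0.0, 1.0]])

G = A.T @ A                # Gram matrix: G[i, j] = X_i . X_j
print(np.round(G, 12))     # identity: off-diagonals 0, diagonals 1
```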

Constructing an Orthogonal Matrix

If an n × n real symmetric matrix A possesses n distinct eigenvalues λ1, λ2, . . . , λn, it follows from Theorem 8.10.2 that the corresponding eigenvectors K1, K2, . . . , Kn are mutually orthogonal. By multiplying each vector by the reciprocal of its norm, we obtain a set of mutually orthogonal unit vectors—that is, an orthonormal set. We can then construct an orthogonal matrix by forming an n × n matrix P whose columns are these normalized eigenvectors of A.
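The construction just described takes only a few lines of NumPy. A minimal sketch (the three mutually orthogonal eigenvectors below are arbitrary illustrations, deliberately left unnormalized):

```python
import numpy as np

# Mutually orthogonal (but not unit) eigenvectors of some symmetric matrix.
K1 = np.array([1.0, -1.0, 0.0])
K2 = np.array([1.0,  1.0, 0.0])
K3 = np.array([0.0,  0.0, 2.0])

# Multiply each by the reciprocal of its norm; use the results as columns.
P = np.column_stack([K / np.linalg.norm(K) for K in (K1, K2, K3)])

print(np.allclose(P.T @ P, np.eye(3)))  # True: P is orthogonal
```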

EXAMPLE 3 Constructing an Orthogonal Matrix

In Example 1, we verified that the eigenvectors K1, K2, and K3 of the given symmetric matrix A are orthogonal. Dividing each eigenvector by its norm yields the orthonormal set of vectors

$\frac{K_1}{\|K_1\|}, \quad \frac{K_2}{\|K_2\|}, \quad \frac{K_3}{\|K_3\|}.$

Using these vectors as columns, we obtain the orthogonal matrix

$P = \begin{pmatrix} \dfrac{K_1}{\|K_1\|} & \dfrac{K_2}{\|K_2\|} & \dfrac{K_3}{\|K_3\|} \end{pmatrix}.$

You should verify that PT = P−1.

We will use the technique of constructing an orthogonal matrix from the eigenvectors of a symmetric matrix in the next section.

Do not misinterpret Theorem 8.10.2. We can always find n linearly independent eigenvectors for an n × n real symmetric matrix A; however, the theorem does not state that all of these eigenvectors are mutually orthogonal. Eigenvectors corresponding to distinct eigenvalues are orthogonal, but eigenvectors corresponding to a repeated eigenvalue need not be orthogonal to one another. Consider the symmetric matrix in the next example.

EXAMPLE 4 Using the Gram–Schmidt Process

For the symmetric matrix

$A = \begin{pmatrix} 7 & 4 & -4 \\ 4 & -8 & -1 \\ -4 & -1 & -8 \end{pmatrix}$

we find that the eigenvalues are λ1 = λ2 = −9 and λ3 = 9. Proceeding as in Section 8.8, for λ1 = λ2 = −9 Gauss–Jordan elimination gives

$(A + 9I \mid \mathbf{0}) = \begin{pmatrix} 16 & 4 & -4 & 0 \\ 4 & 1 & -1 & 0 \\ -4 & -1 & 1 & 0 \end{pmatrix} \Longrightarrow \begin{pmatrix} 1 & \frac{1}{4} & -\frac{1}{4} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$

From the last matrix we see that $k_1 = -\frac{1}{4}k_2 + \frac{1}{4}k_3$. The choices k2 = 1, k3 = 1 followed by k2 = −4, k3 = 0 yield, in turn, the distinct eigenvectors

$K_1 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \quad \text{and} \quad K_2 = \begin{pmatrix} 1 \\ -4 \\ 0 \end{pmatrix}.$

Now for λ3 = 9, Gauss–Jordan elimination

$(A - 9I \mid \mathbf{0}) = \begin{pmatrix} -2 & 4 & -4 & 0 \\ 4 & -17 & -1 & 0 \\ -4 & -1 & -17 & 0 \end{pmatrix} \Longrightarrow \begin{pmatrix} 1 & 0 & 4 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$

indicates that $K_3 = \begin{pmatrix} 4 \\ 1 \\ -1 \end{pmatrix}$ is a third eigenvector.

Observe that, in accordance with Theorem 8.10.2, the eigenvector K3 is orthogonal to both K1 and K2; but K1 and K2, eigenvectors corresponding to the repeated eigenvalue λ1 = λ2 = −9, are not orthogonal to each other, since K1 · K2 = −4 ≠ 0.

We use the Gram–Schmidt orthogonalization process (see pages 366–369) to transform the set {K1, K2} into an orthogonal set. Let V1 = K1 and then

$V_2 = K_2 - \left(\frac{K_2 \cdot V_1}{V_1 \cdot V_1}\right)V_1 = \begin{pmatrix} 1 \\ -4 \\ 0 \end{pmatrix} - \left(\frac{-4}{2}\right)\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix}.$

The set {V1, V2} is an orthogonal set of vectors (verify). Moreover, the set {V1, V2, K3} is an orthogonal set of eigenvectors. Using the norms $\|V_1\| = \sqrt{2}$, $\|V_2\| = 3$, and $\|K_3\| = 3\sqrt{2}$, we obtain the orthonormal set of vectors

$\frac{V_1}{\|V_1\|} = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}, \quad \frac{V_2}{\|V_2\|} = \begin{pmatrix} \frac{1}{3} \\ -\frac{2}{3} \\ \frac{2}{3} \end{pmatrix}, \quad \frac{K_3}{\|K_3\|} = \begin{pmatrix} \frac{4}{3\sqrt{2}} \\ \frac{1}{3\sqrt{2}} \\ -\frac{1}{3\sqrt{2}} \end{pmatrix},$

and so the matrix

$P = \begin{pmatrix} 0 & \frac{1}{3} & \frac{4}{3\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{2}{3} & \frac{1}{3\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{2}{3} & -\frac{1}{3\sqrt{2}} \end{pmatrix}$

is orthogonal.
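The computation of Example 4 can be checked in a few lines of NumPy; this sketch reproduces V2 and verifies that P is orthogonal:

```python
import numpy as np

# The eigenvectors found in Example 4.
K1 = np.array([0.0,  1.0,  1.0])
K2 = np.array([1.0, -4.0,  0.0])
K3 = np.array([4.0,  1.0, -1.0])

# Gram-Schmidt on {K1, K2}: V1 = K1, then remove from K2 its projection on V1.
V1 = K1
V2 = K2 - ((K2 @ V1) / (V1 @ V1)) * V1
print(V2)                                # [ 1. -2.  2.]

# Normalize the orthogonal set {V1, V2, K3} to form the columns of P.
P = np.column_stack([V / np.linalg.norm(V) for V in (V1, V2, K3)])
print(np.allclose(P.T @ P, np.eye(3)))   # True: P^T = P^{-1}
```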

REMARKS

For a real n × n symmetric matrix with repeated eigenvalues it is always possible to find, rather than construct, a set of n mutually orthogonal eigenvectors. In other words, the Gram–Schmidt process does not have to be used. See Problem 23 in Exercises 8.10.

8.10 Exercises Answers to selected odd-numbered problems begin on page ANS-20.

In Problems 1– 4, (a) verify that the indicated column vectors are eigenvectors of the given symmetric matrix, (b) identify the corresponding eigenvalues, and (c) verify that the column vectors are orthogonal.

In Problems 5–10, determine whether the given matrix is orthogonal.

In Problems 11–18, proceed as in Example 3 to construct an orthogonal matrix from the eigenvectors of the given symmetric matrix. (The answers are not unique.)

In Problems 19 and 20, use Theorem 8.10.3 to find values of a and b so that the given matrix is orthogonal.

In Problems 21 and 22, (a) verify that the indicated column vectors are eigenvectors of the given symmetric matrix and (b) identify the corresponding eigenvalues. (c) Proceed as in Example 4 and use the Gram–Schmidt process to construct an orthogonal matrix P from the eigenvectors.

  23. In Example 4, use the equation k1 = −¼k2 + ¼k3 and choose two different sets of values for k2 and k3 so that the corresponding eigenvectors K1 and K2 are orthogonal.
  24. Construct an orthogonal matrix from the eigenvectors of

  25. Suppose A and B are orthogonal matrices. Then show that AB is orthogonal.
  26. Suppose A is an orthogonal matrix. Is orthogonal?
  27. Suppose A is an orthogonal matrix. Then show that is orthogonal.
  28. Suppose A is an orthogonal matrix. Then show that
  29. Suppose A is an orthogonal matrix such that . Then show that .
  30. Show that the rotation matrix

$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$

    is orthogonal.

 

*Since a 1 × 1 matrix is simply a scalar, we will hereafter drop the parentheses and write $X^TY = x_1y_1 + x_2y_2 + \cdots + x_ny_n$.