8.12 Diagonalization

INTRODUCTION

In Chapter 10 we shall see that eigenvalues, eigenvectors, orthogonal matrices, and the topic of this section, diagonalization, are important tools in the solution of systems of linear first-order differential equations. The basic question that we shall consider in this section is

For an n × n matrix A, can we find an n × n nonsingular matrix P such that P−1AP = D is a diagonal matrix?

A Special Notation

We begin by introducing a shorthand notation for the product of two n × n matrices. This notation will be useful in proving the principal theorem of this section. To illustrate, suppose A and B are 2 × 2 matrices. Then

AB = [ a11  a12 ] [ b11  b12 ]   [ a11b11 + a12b21   a11b12 + a12b22 ]
     [ a21  a22 ] [ b21  b22 ] = [ a21b11 + a22b21   a21b12 + a22b22 ].    (1)

If we write the columns of the matrix B as vectors

X1 = [ b11 ]   and   X2 = [ b12 ],
     [ b21 ]              [ b22 ]

then column 1 and column 2 in the product (1) can be expressed as the products AX1 and AX2. That is,

AB = A(X1  X2) = (AX1  AX2).

In general, for two n × n matrices

AB = A(X1 X2 . . . Xn) = (AX1 AX2 . . . AXn), (2)

where X1, X2, . . . , Xn are the columns of B.
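The column-by-column description of a matrix product in (2) is easy to check numerically. The matrices below are arbitrary illustrative values, not ones taken from the text.

```python
import numpy as np

# Arbitrary 2 x 2 matrices (illustrative values only).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# Left side of (2): the usual matrix product AB.
AB = A @ B

# Right side of (2): assemble (AX1 AX2 ... AXn) column by column,
# where Xj is the j-th column of B.
AB_by_columns = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

assert np.allclose(AB, AB_by_columns)
```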

Diagonalizable Matrix

If an n × n nonsingular matrix P can be found so that P−1AP = D is a diagonal matrix, then we say that the n × n matrix A can be diagonalized, or is diagonalizable, and that P diagonalizes A.

To discover how to diagonalize a matrix, let us assume for the sake of discussion that A is a 3 × 3 diagonalizable matrix. Then there exists a 3 × 3 nonsingular matrix P such that P−1AP = D or AP = PD, where D is the diagonal matrix

D = [ d11   0    0  ]
    [  0   d22   0  ]
    [  0    0   d33 ].

If P1, P2, and P3 denote the columns of P, then it follows from (2) that the equation AP = PD is the same as

(AP1 AP2 AP3) = (d11P1 d22P2 d33P3)

or AP1 = d11P1, AP2 = d22P2, AP3 = d33P3.

But by Definition 8.8.1 we see that d11, d22, and d33 are eigenvalues of A associated with the eigenvectors P1, P2, and P3. These eigenvectors are linearly independent, since P was assumed to be nonsingular.

We have just discovered, in a particular case, that if A is diagonalizable, then the columns of the diagonalizing matrix P consist of linearly independent eigenvectors of A. Since we wish to diagonalize a matrix, we are really concerned with the validity of the converse of the last sentence. In other words, if we can find n linearly independent eigenvectors of an n × n matrix A and form an n × n matrix P whose columns consist of these eigenvectors, then does P diagonalize A? The answer is yes and will be proved in the next theorem.

THEOREM 8.12.1 Sufficient Condition for Diagonalizability

If an n × n matrix A has n linearly independent eigenvectors K1, K2, . . . , Kn, then A is diagonalizable.

PROOF:

We shall prove the theorem in the case when A is a 3 × 3 matrix. Let K1, K2, and K3 be linearly independent eigenvectors corresponding to eigenvalues λ1, λ2, and λ3; that is,

AK1 = λ1K1, AK2 = λ2K2, and AK3 = λ3K3. (3)

Next form the 3 × 3 matrix P with column vectors K1, K2, and K3: P = (K1 K2 K3). P is nonsingular since, by hypothesis, the eigenvectors are linearly independent. Now using (2) and (3), we can write the product AP as

AP = (AK1  AK2  AK3) = (λ1K1  λ2K2  λ3K3) = (K1  K2  K3) [ λ1   0    0 ]
                                                         [  0   λ2   0 ] = PD.
                                                         [  0    0  λ3 ]

Multiplying the last equation on the left by P−1 then gives P−1AP = D.

Note carefully in the proof of Theorem 8.12.1 that the entries in the diagonalized matrix are the eigenvalues of A and the order in which these numbers appear on the diagonal of D corresponds to the order in which the eigenvectors are used as columns in the matrix P.
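The whole procedure can be sketched numerically. The matrix below is a hypothetical example, not one from the text; `np.linalg.eig` returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors, and that matrix plays the role of P.

```python
import numpy as np

# Hypothetical matrix with three distinct eigenvalues (2, 3, and 6).
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])

# The columns of P are eigenvectors; the i-th eigenvalue pairs with column i.
eigenvalues, P = np.linalg.eig(A)

# P^(-1) A P is diagonal, with the eigenvalues appearing on the diagonal
# in the same order as the eigenvector columns of P.
D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag(eigenvalues))
```

Reordering the columns of P permutes the diagonal entries of D in exactly the same way.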

In view of the motivational discussion preceding Theorem 8.12.1, we can state the general result:

THEOREM 8.12.2 Criterion for Diagonalizability

An n × n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors.

We saw in Section 8.8 that an n × n matrix A has n linearly independent eigenvectors whenever it possesses n distinct eigenvalues.

THEOREM 8.12.3 Sufficient Condition for Diagonalizability

If an n × n matrix A has n distinct eigenvalues, it is diagonalizable.

EXAMPLE 1 Diagonalizing a Matrix

Diagonalize A = if possible.

SOLUTION

First we find the eigenvalues of A. The characteristic equation is det(A − λI) = λ2 − 5λ + 4 = (λ − 1)(λ − 4) = 0. The eigenvalues are λ1 = 1 and λ2 = 4. Since the eigenvalues are distinct, we know from Theorem 8.12.3 that A is diagonalizable.

Next the eigenvectors of A corresponding to λ1 = 1 and λ2 = 4 are, respectively,

K1 = and K2 = .

Using these vectors as columns, we find that the nonsingular matrix P that diagonalizes A is

P = (K1 K2) = .

Now P−1 = ,

and so carrying out the multiplication gives

P−1AP = [ 1  0 ] = D.
        [ 0  4 ]

In Example 1, had we reversed the columns in P, that is, P = (K2 K1), then the diagonal matrix would have been

D = [ 4  0 ]
    [ 0  1 ].
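Since Example 1's matrix entries are not reproduced above, here is a sketch with a hypothetical 2 × 2 matrix chosen only so that its characteristic polynomial is λ2 − 5λ + 4, the same as in the example; its eigenvectors were found by hand in the usual way.

```python
import numpy as np

# Hypothetical matrix with characteristic polynomial
# λ² − 5λ + 4 = (λ − 1)(λ − 4); NOT the matrix of Example 1.
A = np.array([[2.0, 2.0], [1.0, 3.0]])

# Eigenvectors for λ1 = 1 and λ2 = 4 (valid for this particular A).
K1 = np.array([-2.0, 1.0])
K2 = np.array([1.0, 1.0])

P = np.column_stack([K1, K2])
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag([1.0, 4.0]))

# Reversing the columns of P reverses the order of the diagonal entries.
P_rev = np.column_stack([K2, K1])
assert np.allclose(np.linalg.inv(P_rev) @ A @ P_rev, np.diag([4.0, 1.0]))
```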

EXAMPLE 2 Diagonalizing a Matrix

Consider the matrix A = . We saw in Example 2 of Section 8.8 that the eigenvalues and corresponding eigenvectors are

λ1 = 0, λ2 = −4, λ3 = 3, K1 = , K2 = , K3 = .

Since the eigenvalues are distinct, A is diagonalizable. We form the matrix

P = (K1 K2 K3) = .

Matching the eigenvalues with the order in which the eigenvectors appear in P, we know that the diagonal matrix will be

D = [ 0    0   0 ]
    [ 0   −4   0 ]
    [ 0    0   3 ].

Now from either of the methods of Section 8.6 we find

P−1 = ,

and so

P−1AP = [ 0    0   0 ]
        [ 0   −4   0 ] = D.
        [ 0    0   3 ]

The condition that an n × n matrix A have n distinct eigenvalues is sufficient—that is, a guarantee—that A is diagonalizable. The condition that there be n distinct eigenvalues is not a necessary condition for the diagonalization of A. In other words, if the matrix A does not have n distinct eigenvalues, then it may or may not be diagonalizable.

A matrix with repeated eigenvalues could be diagonalizable.

EXAMPLE 3 A Matrix That Is Not Diagonalizable

In Example 3 of Section 8.8 we saw that the matrix A = has a repeated eigenvalue λ1 = λ2 = 5. Correspondingly, we were able to find only a single linearly independent eigenvector K1 = . We conclude from Theorem 8.12.2 that A is not diagonalizable.
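A defective matrix of this kind can be detected numerically. The matrix below is a hypothetical example with the same repeated eigenvalue λ = 5 and a one-dimensional eigenspace; it is not necessarily the matrix of Example 3.

```python
import numpy as np

# Hypothetical matrix with repeated eigenvalue λ = 5 and only a
# one-dimensional eigenspace (a Jordan block).
A = np.array([[5.0, 1.0], [0.0, 5.0]])

eigenvalues, V = np.linalg.eig(A)
assert np.allclose(eigenvalues, [5.0, 5.0])

# The columns of V are (numerically) dependent, so no nonsingular
# matrix of eigenvectors exists: A is not diagonalizable.
assert abs(np.linalg.det(V)) < 1e-8
```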

EXAMPLE 4 Repeated Eigenvalues Yet Diagonalizable

The eigenvalues of the matrix A = are λ1 = −1 and λ2 = λ3 = 1.

For λ1 = −1 we find K1 = . For the repeated eigenvalue λ2 = λ3 = 1, Gauss–Jordan elimination gives

From the last matrix we see that k1 − k2 = 0. Since k3 is not determined by the last matrix, we can choose its value arbitrarily. The choice k2 = 1 gives k1 = 1. If we then pick k3 = 0, we get the eigenvector

K2 = [ 1 ]
     [ 1 ]
     [ 0 ].

The alternative choice k2 = 0 gives k1 = 0. If k3 = 1, we get another eigenvector corresponding to λ2 = λ3 = 1:

K3 = [ 0 ]
     [ 0 ]
     [ 1 ].

Since the eigenvectors K1, K2, and K3 are linearly independent, a matrix that diagonalizes A is

P = .

Matching the eigenvalues with the eigenvectors in P, we have P−1AP = D, where

D = [ −1  0  0 ]
    [  0  1  0 ]
    [  0  0  1 ].

Symmetric Matrices

An n × n symmetric matrix A with real entries can always be diagonalized. This is a consequence of the fact that we can always find n linearly independent eigenvectors for such a matrix. Moreover, since we can always find n mutually orthogonal eigenvectors, we can use an orthogonal matrix P to diagonalize A. A matrix A that can be diagonalized by an orthogonal matrix is said to be orthogonally diagonalizable.

THEOREM 8.12.4 Criterion for Orthogonal Diagonalizability

An n × n matrix A can be orthogonally diagonalized if and only if A is symmetric.

PARTIAL PROOF:

We shall prove the necessity part (that is, the “only if” part) of the theorem. Assume an n × n matrix A is orthogonally diagonalizable. Then there exists an orthogonal matrix P such that P−1AP = D or A = PDP−1. Since P is orthogonal, P−1 = PT and consequently A = PDPT. But from (i) and (iii) of Theorem 8.1.2 and the fact that a diagonal matrix is symmetric, we have

AT = (PDPT)T = (PT)TDTPT = PDPT = A.

Thus, A is symmetric.
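For real symmetric matrices, the orthogonal diagonalization guaranteed by Theorem 8.12.4 can be computed with `np.linalg.eigh`, which returns orthonormal eigenvectors. The matrix below is an arbitrary symmetric example.

```python
import numpy as np

# Arbitrary real symmetric matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(A, A.T)

# eigh is intended for symmetric (Hermitian) matrices and returns
# orthonormal eigenvectors, so P is orthogonal: P^T P = I.
eigenvalues, P = np.linalg.eigh(A)
assert np.allclose(P.T @ P, np.eye(3))

# Orthogonal diagonalization: P^T A P = D.
assert np.allclose(P.T @ A @ P, np.diag(eigenvalues))
```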

EXAMPLE 5 Diagonalizing a Symmetric Matrix

Consider the symmetric matrix

A = [ 9  1  1 ]
    [ 1  9  1 ]
    [ 1  1  9 ].

We saw in Example 4 of Section 8.8 that the eigenvalues and corresponding eigenvectors are

λ1 = 11,   λ2 = λ3 = 8,   K1 = [ 1 ],   K2 = [ −1 ],   K3 = [ −1 ]
                               [ 1 ]         [  1 ]         [  0 ]
                               [ 1 ]         [  0 ]         [  1 ].

See the Remarks at the end of this section.

The eigenvectors K1, K2, and K3 are linearly independent, but note that they are not mutually orthogonal, since K2 and K3, the eigenvectors corresponding to the repeated eigenvalue λ2 = λ3 = 8, are not orthogonal. For λ2 = λ3 = 8, we found the eigenvectors from Gauss–Jordan elimination:

(A − 8I | 0)   row operations ⇒   [ 1  1  1 | 0 ]
                                  [ 0  0  0 | 0 ]
                                  [ 0  0  0 | 0 ],

which implies that k1 + k2 + k3 = 0. Since two of the variables are arbitrary, we selected k2 = 1, k3 = 0 to obtain K2, and k2 = 0, k3 = 1 to obtain K3. Now if instead we choose k2 = 1, k3 = 1 and then k2 = 1, k3 = −1, we obtain, respectively, two entirely different but orthogonal eigenvectors:

K2 = [ −2 ]   and   K3 = [  0 ]
     [  1 ]              [  1 ]
     [  1 ]              [ −1 ].

Thus, a new set of mutually orthogonal eigenvectors is

K1 = [ 1 ],   K2 = [ −2 ],   K3 = [  0 ]
     [ 1 ]         [  1 ]         [  1 ]
     [ 1 ]         [  1 ]         [ −1 ].

Multiplying these vectors, in turn, by the reciprocals of the norms ||K1|| = √3, ||K2|| = √6, and ||K3|| = √2, we obtain the orthonormal set

[ 1/√3 ]   [ −2/√6 ]   [   0   ]
[ 1/√3 ],  [  1/√6 ],  [  1/√2 ].
[ 1/√3 ]   [  1/√6 ]   [ −1/√2 ]
We then use these vectors as columns to construct an orthogonal matrix that diagonalizes A:

P = [ 1/√3  −2/√6    0   ]
    [ 1/√3   1/√6   1/√2 ]
    [ 1/√3   1/√6  −1/√2 ].
The diagonal matrix whose entries are the eigenvalues of A corresponding to the order in which the eigenvectors appear in P is then

D = [ 11  0  0 ]
    [  0  8  0 ]
    [  0  0  8 ].

This is verified from

PTAP = P−1AP = [ 11  0  0 ]
               [  0  8  0 ] = D.
               [  0  0  8 ]
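This verification can also be carried out numerically. The entries of A below are reconstructed from the eigendata stated in the example: A is the unique symmetric matrix with eigenvalue 11 on span{(1, 1, 1)} and eigenvalue 8 on the plane k1 + k2 + k3 = 0.

```python
import numpy as np

# A reconstructed from the eigendata of Example 5.
A = np.array([[9.0, 1.0, 1.0],
              [1.0, 9.0, 1.0],
              [1.0, 1.0, 9.0]])

# Orthonormal eigenvectors used as the columns of P.
P = np.column_stack([
    np.array([1.0, 1.0, 1.0]) / np.sqrt(3),
    np.array([-2.0, 1.0, 1.0]) / np.sqrt(6),
    np.array([0.0, 1.0, -1.0]) / np.sqrt(2),
])

assert np.allclose(P.T @ P, np.eye(3))                       # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag([11.0, 8.0, 8.0]))   # P^T A P = D
```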

Quadratic Forms

An algebraic expression of the form

ax2 + bxy + cy2 (4)

is said to be a quadratic form. If we let

X = [ x ]
    [ y ],

then (4) can be written as the matrix product

XTAX = (x  y) [ a    b/2 ] [ x ].    (5)
              [ b/2   c  ] [ y ]

Observe that the matrix

A = [ a    b/2 ]
    [ b/2   c  ]

is symmetric.
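The identification in (5) is easy to check numerically: with A = [[a, b/2], [b/2, c]], the product XTAX reproduces the quadratic form at any point. The evaluation point below is chosen arbitrarily.

```python
import numpy as np

def quadratic_form_matrix(a, b, c):
    """Symmetric matrix of the quadratic form ax^2 + bxy + cy^2, as in (5)."""
    return np.array([[a, b / 2.0], [b / 2.0, c]])

# Spot check with a = 2, b = 4, c = -1 (the form appearing in Example 6)
# at the arbitrarily chosen point (x, y) = (3, -2).
A = quadratic_form_matrix(2.0, 4.0, -1.0)
x, y = 3.0, -2.0
X = np.array([x, y])

assert np.isclose(X @ A @ X, 2.0*x**2 + 4.0*x*y - y**2)
```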

In calculus you may have seen that an appropriate rotation of axes enables us to eliminate the xy-term in an equation:

ax2 + bxy + cy2 + dx + ey + f = 0.

As the next example will illustrate, we can eliminate the xy-term by means of an orthogonal matrix and diagonalization rather than by using trigonometry.

EXAMPLE 6 Identifying a Conic Section

Identify the conic section whose equation is 2x2 + 4xyy2 = 1.

SOLUTION

From (5) we can write the given equation as

(x  y) [ 2   2 ] [ x ] = 1   or   XTAX = 1,    (6)
       [ 2  −1 ] [ y ]

where

A = [ 2   2 ]   and   X = [ x ]
    [ 2  −1 ]             [ y ].

Now the eigenvalues and corresponding eigenvectors of A are found to be

λ1 = −2,   λ2 = 3,   K1 = [  1 ],   K2 = [ 2 ]
                          [ −2 ]         [ 1 ].

Observe that K1 and K2 are orthogonal. Moreover, ||K1|| = ||K2|| = √5, and so the vectors

[  1/√5 ]   and   [ 2/√5 ]
[ −2/√5 ]         [ 1/√5 ]

are orthonormal. Hence, the matrix

P = [  1/√5   2/√5 ]
    [ −2/√5   1/√5 ]

is orthogonal. If we define the change of variables X = PX′, where

X′ = [ X ]
     [ Y ],

then the quadratic form 2x2 + 4xy − y2 can be written

XTAX = (X′)T PTAPX′ = (X′)T(PTAP)X′.

Since P orthogonally diagonalizes the symmetric matrix A, the last equation is the same as

XTAX = (X′)T DX′. (7)

Using (7), we see that (6) becomes

(X  Y) [ −2  0 ] [ X ] = 1   or   −2X2 + 3Y2 = 1.
       [  0  3 ] [ Y ]

This last equation is recognized as the standard form of a hyperbola. The xy-coordinates of the eigenvectors are (1, −2) and (2, 1). Using the substitution X = PX′ in the form X′ = P−1X = PTX, we find that the XY-coordinates of these two points are (√5, 0) and (0, √5), respectively. From this we conclude that the X-axis and Y-axis are as shown in FIGURE 8.12.1. The eigenvectors, shown in red in the figure, lie along the new axes. The X- and Y-axes are called the principal axes of the conic.
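The rotation in Example 6 can be verified numerically. Here A = [[2, 2], [2, −1]] follows from (5) with a = 2, b = 4, c = −1, and the eigenvector coordinates are those stated in the example.

```python
import numpy as np

# Matrix of 2x² + 4xy − y² via (5).
A = np.array([[2.0, 2.0], [2.0, -1.0]])

# Orthonormal eigenvectors along (1, -2) and (2, 1).
P = np.column_stack([np.array([1.0, -2.0]) / np.sqrt(5),
                     np.array([2.0, 1.0]) / np.sqrt(5)])

# P^T A P = D removes the cross term: the form becomes -2X² + 3Y².
assert np.allclose(P.T @ A @ P, np.diag([-2.0, 3.0]))

# Spot check: (X, Y) = (1, 1) satisfies -2X² + 3Y² = 1, so the
# corresponding xy-point X = PX′ satisfies the original equation.
point = P @ np.array([1.0, 1.0])
assert np.isclose(point @ A @ point, 1.0)
```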

[Figure: the hyperbola −2X2 + 3Y2 = 1 graphed in the xy-plane, with the rotated X-axis passing through (1, −2) and the Y-axis passing through (2, 1); the eigenvectors, drawn in red from the origin, lie along these axes.]

FIGURE 8.12.1 X- and Y-axes in Example 6

REMARKS

The matrix A in Example 5 is symmetric, and as such, eigenvectors corresponding to distinct eigenvalues are orthogonal. In the third line of the example, note that K1, an eigenvector for λ1 = 11, is orthogonal to both K2 and K3. The eigenvectors K2 = (−1, 1, 0)T and K3 = (−1, 0, 1)T corresponding to λ2 = λ3 = 8 are not orthogonal, however. As an alternative to searching for orthogonal eigenvectors for this repeated eigenvalue by performing Gauss–Jordan elimination a second time, we could simply apply the Gram–Schmidt orthogonalization process to transform the set {K2, K3} into an orthogonal set. See Section 7.7 and Example 4 in Section 8.10.
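A minimal sketch of the Gram–Schmidt alternative mentioned above, applied to the eigenvectors K2 = (−1, 1, 0)T and K3 = (−1, 0, 1)T obtained in Example 5 for the repeated eigenvalue λ = 8:

```python
import numpy as np

# Eigenvectors for the repeated eigenvalue λ = 8 in Example 5;
# they span the plane k1 + k2 + k3 = 0 but are not orthogonal.
K2 = np.array([-1.0, 1.0, 0.0])
K3 = np.array([-1.0, 0.0, 1.0])

# Gram-Schmidt: keep K2, subtract from K3 its component along K2.
V2 = K2
V3 = K3 - (K3 @ V2) / (V2 @ V2) * V2

assert np.isclose(V2 @ V3, 0.0)   # now orthogonal
# V3 is still an eigenvector for λ = 8, since the eigenspace is a
# subspace: linear combinations of K2 and K3 remain in it.
assert np.isclose(V3.sum(), 0.0)  # k1 + k2 + k3 = 0 still holds
```

Note that Gram–Schmidt produces a different orthogonal pair than the one found in Example 5; any orthogonal basis of the eigenspace works equally well.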

8.12 Exercises Answers to selected odd-numbered problems begin on page ANS-20.

In Problems 1–20, determine whether the given matrix A is diagonalizable. If so, find the matrix P that diagonalizes A and the diagonal matrix D such that D = P−1AP.

In Problems 21–30, the given matrix A is symmetric. Find an orthogonal matrix P that diagonalizes A and the diagonal matrix D such that D = PTAP.

In Problems 31–34, use the procedure illustrated in Example 6 to identify the given conic section. Graph.

  31. 5x2 − 2xy + 5y2 = 24
  32. 13x2 − 10xy + 13y2 = 288
  33. −3x2 + 8xy + 3y2 = 20
  34. 16x2 + 24xy + 9y2 − 3x + 4y = 0
  35. Find a 2 × 2 matrix A that has eigenvalues λ1 = 2 and λ2 = 3 and corresponding eigenvectors

    K1 = and K2 = .

  36. Find a 3 × 3 symmetric matrix that has eigenvalues λ1 = 1, λ2 = 3, and λ3 = 5 and corresponding eigenvectors

    K1 = , K2 = , and K3 = .

  37. If A is an n × n diagonalizable matrix, then D = P−1AP, where D is a diagonal matrix. Show that if m is a positive integer, then Am = PDmP−1.
  38. The mth power of the diagonal matrix

    D = [ d11   0   . . .    0  ]
        [  0   d22  . . .    0  ]
        [  .    .            .  ]
        [  0    0   . . .   dnn ]

    is the diagonal matrix Dm obtained by raising each diagonal entry of D to the mth power.

    Use this result to compute

In Problems 39 and 40, use the results of Problems 37 and 38 to find the indicated power of the given matrix.

  39. A = , A5
  40. A = , A10
  41. Suppose A is a nonsingular diagonalizable matrix. Show that A−1 is also diagonalizable.
  42. Suppose A is a diagonalizable matrix. Is the matrix P unique?