8.12 Diagonalization

INTRODUCTION

In Chapter 10 we shall see that eigenvalues, eigenvectors, orthogonal matrices, and the topic of this section, diagonalization, are important tools in the solution of systems of linear first-order differential equations. The basic question that we shall consider in this section is

For an n × n matrix A, can we find an n × n nonsingular matrix P such that P−1AP = D is a diagonal matrix?

A Special Notation

We begin by introducing a shorthand notation for the product of two n × n matrices. This notation will be useful in proving the principal theorem of this section. To illustrate, suppose A and B are 2 × 2 matrices. Then

AB = [ a11  a12 ] [ b11  b12 ]   [ a11b11 + a12b21   a11b12 + a12b22 ]
     [ a21  a22 ] [ b21  b22 ] = [ a21b11 + a22b21   a21b12 + a22b22 ].    (1)

If we write the columns of the matrix B as vectors

X1 = [ b11 ]   and   X2 = [ b12 ],
     [ b21 ]              [ b22 ]

then column 1 and column 2 in the product (1) can be expressed as the products AX1 and AX2. That is,

AB = A(X1  X2) = (AX1  AX2).

In general, for two n × n matrices

AB = A(X1 X2 . . . Xn) = (AX1 AX2 . . . AXn), (2)

where X1, X2, . . . , Xn are the columns of B.
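The column-by-column description of a matrix product in (2) is easy to check numerically. The matrices below are arbitrary illustrative values, not ones taken from the text.

```python
import numpy as np

# Arbitrary 2 x 2 matrices (illustrative values only).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# Left side of (2): the usual matrix product AB.
AB = A @ B

# Right side of (2): assemble (AX1 AX2 ... AXn) column by column,
# where Xj is the j-th column of B.
AB_by_columns = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

assert np.allclose(AB, AB_by_columns)
```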

Diagonalizable Matrix

If an n × n nonsingular matrix P can be found so that P−1AP = D is a diagonal matrix, then we say that the n × n matrix A can be diagonalized, or is diagonalizable, and that P diagonalizes A.

To discover how to diagonalize a matrix, let us assume for the sake of discussion that A is a 3 × 3 diagonalizable matrix. Then there exists a 3 × 3 nonsingular matrix P such that P−1AP = D or AP = PD, where D is the diagonal matrix

D = [ d11   0    0  ]
    [  0   d22   0  ]
    [  0    0   d33 ].

If P1, P2, and P3 denote the columns of P, then it follows from (2) that the equation AP = PD is the same as

(AP1 AP2 AP3) = (d11P1 d22P2 d33P3)

or AP1 = d11P1, AP2 = d22P2, AP3 = d33P3.

But by Definition 8.8.1 we see that d11, d22, and d33 are eigenvalues of A associated with the eigenvectors P1, P2, and P3. These eigenvectors are linearly independent, since P was assumed to be nonsingular.

We have just discovered, in a particular case, that if A is diagonalizable, then the columns of the diagonalizing matrix P consist of linearly independent eigenvectors of A. Since we wish to diagonalize a matrix, we are really concerned with the validity of the converse of the last sentence. In other words, if we can find n linearly independent eigenvectors of an n × n matrix A and form an n × n matrix P whose columns consist of these eigenvectors, then does P diagonalize A? The answer is yes and will be proved in the next theorem.

THEOREM 8.12.1 Sufficient Condition for Diagonalizability

If an n × n matrix A has n linearly independent eigenvectors K1, K2, . . . , Kn, then A is diagonalizable.

PROOF:

We shall prove the theorem in the case when A is a 3 × 3 matrix. Let K1, K2, and K3 be linearly independent eigenvectors corresponding to eigenvalues λ1, λ2, and λ3; that is,

AK1 = λ1K1, AK2 = λ2K2, and AK3 = λ3K3. (3)

Next form the 3 × 3 matrix P with column vectors K1, K2, and K3: P = (K1 K2 K3). P is nonsingular since, by hypothesis, the eigenvectors are linearly independent. Now using (2) and (3), we can write the product AP as

AP = (AK1  AK2  AK3) = (λ1K1  λ2K2  λ3K3) = (K1  K2  K3) [ λ1   0    0 ]
                                                         [  0   λ2   0 ] = PD.
                                                         [  0    0  λ3 ]

Multiplying the last equation on the left by P−1 then gives P−1AP = D.

Note carefully in the proof of Theorem 8.12.1 that the entries in the diagonalized matrix are the eigenvalues of A and the order in which these numbers appear on the diagonal of D corresponds to the order in which the eigenvectors are used as columns in the matrix P.
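The whole procedure can be sketched numerically. The matrix below is a hypothetical example, not one from the text; `np.linalg.eig` returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors, and that matrix plays the role of P.

```python
import numpy as np

# Hypothetical matrix with three distinct eigenvalues (2, 3, and 6).
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])

# The columns of P are eigenvectors; the i-th eigenvalue pairs with column i.
eigenvalues, P = np.linalg.eig(A)

# P^(-1) A P is diagonal, with the eigenvalues appearing on the diagonal
# in the same order as the eigenvector columns of P.
D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag(eigenvalues))
```

Reordering the columns of P permutes the diagonal entries of D in exactly the same way.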

In view of the motivational discussion preceding Theorem 8.12.1, we can state the general result:

THEOREM 8.12.2 Criterion for Diagonalizability

An n × n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors.

We saw in Section 8.8 that an n × n matrix A has n linearly independent eigenvectors whenever it possesses n distinct eigenvalues.

THEOREM 8.12.3 Sufficient Condition for Diagonalizability

If an n × n matrix A has n distinct eigenvalues, it is diagonalizable.

EXAMPLE 1 Diagonalizing a Matrix

Diagonalize A = if possible.

SOLUTION

First we find the eigenvalues of A. The characteristic equation is det(A − λI) = λ2 − 5λ + 4 = (λ − 1)(λ − 4) = 0. The eigenvalues are λ1 = 1 and λ2 = 4. Since the eigenvalues are distinct, we know from Theorem 8.12.3 that A is diagonalizable.

Next the eigenvectors of A corresponding to λ1 = 1 and λ2 = 4 are, respectively,

K1 = and K2 = .

Using these vectors as columns, we find that the nonsingular matrix P that diagonalizes A is

P = (K1 K2) = .

Now P−1 = ,

and so carrying out the multiplication gives

P−1AP = [ 1  0 ] = D.
        [ 0  4 ]

In Example 1, had we reversed the columns in P, that is, P = (K2 K1), then the diagonal matrix would have been

D = [ 4  0 ]
    [ 0  1 ].
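Since Example 1's matrix entries are not reproduced above, here is a sketch with a hypothetical 2 × 2 matrix chosen only so that its characteristic polynomial is λ2 − 5λ + 4, the same as in the example; its eigenvectors were found by hand in the usual way.

```python
import numpy as np

# Hypothetical matrix with characteristic polynomial
# λ² − 5λ + 4 = (λ − 1)(λ − 4); NOT the matrix of Example 1.
A = np.array([[2.0, 2.0], [1.0, 3.0]])

# Eigenvectors for λ1 = 1 and λ2 = 4 (valid for this particular A).
K1 = np.array([-2.0, 1.0])
K2 = np.array([1.0, 1.0])

P = np.column_stack([K1, K2])
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag([1.0, 4.0]))

# Reversing the columns of P reverses the order of the diagonal entries.
P_rev = np.column_stack([K2, K1])
assert np.allclose(np.linalg.inv(P_rev) @ A @ P_rev, np.diag([4.0, 1.0]))
```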

EXAMPLE 2 Diagonalizing a Matrix

Consider the matrix A = . We saw in Example 2 of Section 8.8 that the eigenvalues and corresponding eigenvectors are

λ1 = 0, λ2 = −4, λ3 = 3, K1 = , K2 = , K3 = .

Since the eigenvalues are distinct, A is diagonalizable. We form the matrix

P = (K1 K2 K3) = .

Matching the eigenvalues with the order in which the eigenvectors appear in P, we know that the diagonal matrix will be

D = [ 0    0   0 ]
    [ 0   −4   0 ]
    [ 0    0   3 ].

Now from either of the methods of Section 8.6 we find

P−1 = ,

and so

P−1AP = [ 0    0   0 ]
        [ 0   −4   0 ] = D.
        [ 0    0   3 ]

The condition that an n × n matrix A have n distinct eigenvalues is sufficient—that is, a guarantee—that A is diagonalizable. The condition that there be n distinct eigenvalues is not a necessary condition for the diagonalization of A. In other words, if the matrix A does not have n distinct eigenvalues, then it may or may not be diagonalizable.

A matrix with repeated eigenvalues could be diagonalizable.

EXAMPLE 3 A Matrix That Is Not Diagonalizable

In Example 3 of Section 8.8 we saw that the matrix A = has a repeated eigenvalue λ1 = λ2 = 5. Correspondingly, we were able to find only a single linearly independent eigenvector K1 = . We conclude from Theorem 8.12.2 that A is not diagonalizable.
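A defective matrix of this kind can be detected numerically. The matrix below is a hypothetical example with the same repeated eigenvalue λ = 5 and a one-dimensional eigenspace; it is not necessarily the matrix of Example 3.

```python
import numpy as np

# Hypothetical matrix with repeated eigenvalue λ = 5 and only a
# one-dimensional eigenspace (a Jordan block).
A = np.array([[5.0, 1.0], [0.0, 5.0]])

eigenvalues, V = np.linalg.eig(A)
assert np.allclose(eigenvalues, [5.0, 5.0])

# The columns of V are (numerically) dependent, so no nonsingular
# matrix of eigenvectors exists: A is not diagonalizable.
assert abs(np.linalg.det(V)) < 1e-8
```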

EXAMPLE 4 Repeated Eigenvalues Yet Diagonalizable

The eigenvalues of the matrix A = are λ1 = −1 and λ2 = λ3 = 1.

For λ1 = −1 we find K1 = . For the repeated eigenvalue λ2 = λ3 = 1, Gauss–Jordan elimination gives

From the last matrix we see that k1 − k2 = 0. Since k3 is not determined by the last matrix, we can choose its value arbitrarily. The choice k2 = 1 gives k1 = 1. If we then pick k3 = 0, we get the eigenvector

K2 = [ 1 ]
     [ 1 ]
     [ 0 ].

The alternative choice k2 = 0 gives k1 = 0. If k3 = 1, we get another eigenvector corresponding to λ2 = λ3 = 1:

K3 = [ 0 ]
     [ 0 ]
     [ 1 ].

Since the eigenvectors K1, K2, and K3 are linearly independent, a matrix that diagonalizes A is

P = .

Matching the eigenvalues with the eigenvectors in P, we have P−1AP = D, where

D = [ −1  0  0 ]
    [  0  1  0 ]
    [  0  0  1 ].

Symmetric Matrices

An n × n symmetric matrix A with real entries can always be diagonalized. This is a consequence of the fact that we can always find n linearly independent eigenvectors for such a matrix. Moreover, since we can always find n mutually orthogonal eigenvectors, we can use an orthogonal matrix P to diagonalize A. A matrix A that can be diagonalized by an orthogonal matrix is said to be orthogonally diagonalizable.

THEOREM 8.12.4 Criterion for Orthogonal Diagonalizability

An n × n matrix A can be orthogonally diagonalized if and only if A is symmetric.

PARTIAL PROOF:

We shall prove the necessity part (that is, the “only if” part) of the theorem. Assume an n × n matrix A is orthogonally diagonalizable. Then there exists an orthogonal matrix P such that P−1AP = D or A = PDP−1. Since P is orthogonal, P−1 = PT and consequently A = PDPT. But from (i) and (iii) of Theorem 8.1.2 and the fact that a diagonal matrix is symmetric, we have

AT = (PDPT)T = (PT)TDTPT = PDPT = A.

Thus, A is symmetric.
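For real symmetric matrices, the orthogonal diagonalization guaranteed by Theorem 8.12.4 can be computed with `np.linalg.eigh`, which returns orthonormal eigenvectors. The matrix below is an arbitrary symmetric example.

```python
import numpy as np

# Arbitrary real symmetric matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(A, A.T)

# eigh is intended for symmetric (Hermitian) matrices and returns
# orthonormal eigenvectors, so P is orthogonal: P^T P = I.
eigenvalues, P = np.linalg.eigh(A)
assert np.allclose(P.T @ P, np.eye(3))

# Orthogonal diagonalization: P^T A P = D.
assert np.allclose(P.T @ A @ P, np.diag(eigenvalues))
```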

EXAMPLE 5 Diagonalizing a Symmetric Matrix

Consider the symmetric matrix

A = [ 9  1  1 ]
    [ 1  9  1 ]
    [ 1  1  9 ].

We saw in Example 4 of Section 8.8 that the eigenvalues and corresponding eigenvectors are

λ1 = 11,   λ2 = λ3 = 8,   K1 = [ 1 ],   K2 = [ −1 ],   K3 = [ −1 ]
                               [ 1 ]         [  1 ]         [  0 ]
                               [ 1 ]         [  0 ]         [  1 ].

See the Remarks at the end of this section.

The eigenvectors K1, K2, and K3 are linearly independent, but note that they are not mutually orthogonal, since K2 and K3, the eigenvectors corresponding to the repeated eigenvalue λ2 = λ3 = 8, are not orthogonal. For λ2 = λ3 = 8, we found the eigenvectors from Gauss–Jordan elimination:

(A − 8I | 0)   row operations ⇒   [ 1  1  1 | 0 ]
                                  [ 0  0  0 | 0 ]
                                  [ 0  0  0 | 0 ],

which implies that k1 + k2 + k3 = 0. Since two of the variables are arbitrary, we selected k2 = 1, k3 = 0 to obtain K2, and k2 = 0, k3 = 1 to obtain K3. Now if instead we choose k2 = 1, k3 = 1 and then k2 = 1, k3 = −1, we obtain, respectively, two entirely different but orthogonal eigenvectors:

K2 = [ −2 ]   and   K3 = [  0 ]
     [  1 ]              [  1 ]
     [  1 ]              [ −1 ].

Thus, a new set of mutually orthogonal eigenvectors is

K1 = [ 1 ],   K2 = [ −2 ],   K3 = [  0 ]
     [ 1 ]         [  1 ]         [  1 ]
     [ 1 ]         [  1 ]         [ −1 ].

Multiplying these vectors, in turn, by the reciprocals of the norms ||K1|| = √3, ||K2|| = √6, and ||K3|| = √2, we obtain the orthonormal set

[ 1/√3 ]   [ −2/√6 ]   [   0   ]
[ 1/√3 ],  [  1/√6 ],  [  1/√2 ].
[ 1/√3 ]   [  1/√6 ]   [ −1/√2 ]
We then use these vectors as columns to construct an orthogonal matrix that diagonalizes A:

P = [ 1/√3  −2/√6    0   ]
    [ 1/√3   1/√6   1/√2 ]
    [ 1/√3   1/√6  −1/√2 ].
The diagonal matrix whose entries are the eigenvalues of A corresponding to the order in which the eigenvectors appear in P is then

D = [ 11  0  0 ]
    [  0  8  0 ]
    [  0  0  8 ].

This is verified from

PTAP = P−1AP = [ 11  0  0 ]
               [  0  8  0 ] = D.
               [  0  0  8 ]
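This verification can also be carried out numerically. The entries of A below are reconstructed from the eigendata stated in the example: A is the unique symmetric matrix with eigenvalue 11 on span{(1, 1, 1)} and eigenvalue 8 on the plane k1 + k2 + k3 = 0.

```python
import numpy as np

# A reconstructed from the eigendata of Example 5.
A = np.array([[9.0, 1.0, 1.0],
              [1.0, 9.0, 1.0],
              [1.0, 1.0, 9.0]])

# Orthonormal eigenvectors used as the columns of P.
P = np.column_stack([
    np.array([1.0, 1.0, 1.0]) / np.sqrt(3),
    np.array([-2.0, 1.0, 1.0]) / np.sqrt(6),
    np.array([0.0, 1.0, -1.0]) / np.sqrt(2),
])

assert np.allclose(P.T @ P, np.eye(3))                       # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag([11.0, 8.0, 8.0]))   # P^T A P = D
```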

Quadratic Forms

An algebraic expression of the form

ax2 + bxy + cy2 (4)

is said to be a quadratic form. If we let

X = [ x ]
    [ y ],

then (4) can be written as the matrix product

XTAX = (x  y) [ a    b/2 ] [ x ].    (5)
              [ b/2   c  ] [ y ]

Observe that the matrix

A = [ a    b/2 ]
    [ b/2   c  ]

is symmetric.
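The identification in (5) is easy to check numerically: with A = [[a, b/2], [b/2, c]], the product XTAX reproduces the quadratic form at any point. The evaluation point below is chosen arbitrarily.

```python
import numpy as np

def quadratic_form_matrix(a, b, c):
    """Symmetric matrix of the quadratic form ax^2 + bxy + cy^2, as in (5)."""
    return np.array([[a, b / 2.0], [b / 2.0, c]])

# Spot check with a = 2, b = 4, c = -1 (the form appearing in Example 6)
# at the arbitrarily chosen point (x, y) = (3, -2).
A = quadratic_form_matrix(2.0, 4.0, -1.0)
x, y = 3.0, -2.0
X = np.array([x, y])

assert np.isclose(X @ A @ X, 2.0*x**2 + 4.0*x*y - y**2)
```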

In calculus you may have seen that an appropriate rotation of axes enables us to eliminate the xy-term in an equation:

ax2 + bxy + cy2 + dx + ey + f = 0.

As the next example will illustrate, we can eliminate the xy-term by means of an orthogonal matrix and diagonalization rather than by using trigonometry.

EXAMPLE 6 Identifying a Conic Section

Identify the conic section whose equation is 2x2 + 4xyy2 = 1.

SOLUTION

From (5) we can write the given equation as

(x  y) [ 2   2 ] [ x ] = 1   or   XTAX = 1,    (6)
       [ 2  −1 ] [ y ]

where

A = [ 2   2 ]   and   X = [ x ]
    [ 2  −1 ]             [ y ].

Now the eigenvalues and corresponding eigenvectors of A are found to be

λ1 = −2,   λ2 = 3,   K1 = [  1 ],   K2 = [ 2 ]
                          [ −2 ]         [ 1 ].

Observe that K1 and K2 are orthogonal. Moreover, ||K1|| = ||K2|| = √5, and so the vectors

[  1/√5 ]   and   [ 2/√5 ]
[ −2/√5 ]         [ 1/√5 ]

are orthonormal. Hence, the matrix

P = [  1/√5   2/√5 ]
    [ −2/√5   1/√5 ]

is orthogonal. If we define the change of variables X = PX′, where

X′ = [ X ]
     [ Y ],

then the quadratic form 2x2 + 4xy − y2 can be written

XTAX = (X′)T PTAPX′ = (X′)T(PTAP)X′.

Since P orthogonally diagonalizes the symmetric matrix A, the last equation is the same as

XTAX = (X′)T DX′. (7)

Using (7), we see that (6) becomes

(X  Y) [ −2  0 ] [ X ] = 1   or   −2X2 + 3Y2 = 1.
       [  0  3 ] [ Y ]

This last equation is recognized as the standard form of a hyperbola. The xy-coordinates of the eigenvectors are (1, −2) and (2, 1). Using the substitution X = PX′ in the form X′ = P−1X = PTX, we find that the XY-coordinates of these two points are (√5, 0) and (0, √5), respectively. From this we conclude that the X-axis and Y-axis are as shown in FIGURE 8.12.1. The eigenvectors, shown in red in the figure, lie along the new axes. The X- and Y-axes are called the principal axes of the conic.
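The rotation in Example 6 can be verified numerically. Here A = [[2, 2], [2, −1]] follows from (5) with a = 2, b = 4, c = −1, and the eigenvector coordinates are those stated in the example.

```python
import numpy as np

# Matrix of 2x² + 4xy − y² via (5).
A = np.array([[2.0, 2.0], [2.0, -1.0]])

# Orthonormal eigenvectors along (1, -2) and (2, 1).
P = np.column_stack([np.array([1.0, -2.0]) / np.sqrt(5),
                     np.array([2.0, 1.0]) / np.sqrt(5)])

# P^T A P = D removes the cross term: the form becomes -2X² + 3Y².
assert np.allclose(P.T @ A @ P, np.diag([-2.0, 3.0]))

# Spot check: (X, Y) = (1, 1) satisfies -2X² + 3Y² = 1, so the
# corresponding xy-point X = PX′ satisfies the original equation.
point = P @ np.array([1.0, 1.0])
assert np.isclose(point @ A @ point, 1.0)
```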

[Figure: the hyperbola −2X2 + 3Y2 = 1 graphed in the xy-plane, with the rotated X-axis passing through (1, −2) and the Y-axis passing through (2, 1); the eigenvectors, drawn in red from the origin, lie along these axes.]

FIGURE 8.12.1 X- and Y-axes in Example 6

REMARKS

The matrix A in Example 5 is symmetric, and as such, eigenvectors corresponding to distinct eigenvalues are orthogonal. In the third line of the example, note that K1, an eigenvector for λ1 = 11, is orthogonal to both K2 and K3. The eigenvectors K2 = (−1, 1, 0)T and K3 = (−1, 0, 1)T corresponding to λ2 = λ3 = 8 are not orthogonal, however. As an alternative to searching for orthogonal eigenvectors for this repeated eigenvalue by performing Gauss–Jordan elimination a second time, we could simply apply the Gram–Schmidt orthogonalization process to transform the set {K2, K3} into an orthogonal set. See Section 7.7 and Example 4 in Section 8.10.
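A minimal sketch of the Gram–Schmidt alternative mentioned above, applied to the eigenvectors K2 = (−1, 1, 0)T and K3 = (−1, 0, 1)T obtained in Example 5 for the repeated eigenvalue λ = 8:

```python
import numpy as np

# Eigenvectors for the repeated eigenvalue λ = 8 in Example 5;
# they span the plane k1 + k2 + k3 = 0 but are not orthogonal.
K2 = np.array([-1.0, 1.0, 0.0])
K3 = np.array([-1.0, 0.0, 1.0])

# Gram-Schmidt: keep K2, subtract from K3 its component along K2.
V2 = K2
V3 = K3 - (K3 @ V2) / (V2 @ V2) * V2

assert np.isclose(V2 @ V3, 0.0)   # now orthogonal
# V3 is still an eigenvector for λ = 8, since the eigenspace is a
# subspace: linear combinations of K2 and K3 remain in it.
assert np.isclose(V3.sum(), 0.0)  # k1 + k2 + k3 = 0 still holds
```

Note that Gram–Schmidt produces a different orthogonal pair than the one found in Example 5; any orthogonal basis of the eigenspace works equally well.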

8.12 Exercises Answers to selected odd-numbered problems begin on page ANS-20.

In Problems 1–20, determine whether the given matrix A is diagonalizable. If so, find the matrix P that diagonalizes A and the diagonal matrix D such that D = P−1AP.

In Problems 21–30, the given matrix A is symmetric. Find an orthogonal matrix P that diagonalizes A and the diagonal matrix D such that D = PTAP.

In Problems 31–34, use the procedure illustrated in Example 6 to identify the given conic section. Graph.

  31. 5x2 − 2xy + 5y2 = 24
  32. 13x2 − 10xy + 13y2 = 288
  33. −3x2 + 8xy + 3y2 = 20
  34. 16x2 + 24xy + 9y2 − 3x + 4y = 0
  35. Find a 2 × 2 matrix A that has eigenvalues λ1 = 2 and λ2 = 3 and corresponding eigenvectors

    K1 = and K2 = .

  36. Find a 3 × 3 symmetric matrix that has eigenvalues λ1 = 1, λ2 = 3, and λ3 = 5 and corresponding eigenvectors

    K1 = , K2 = , and K3 = .

  37. If A is an n × n diagonalizable matrix, then D = P−1AP, where D is a diagonal matrix. Show that if m is a positive integer, then Am = PDmP−1.
  38. The mth power of the diagonal matrix

    D = [ d11   0   . . .    0  ]
        [  0   d22  . . .    0  ]
        [  .    .            .  ]
        [  0    0   . . .   dnn ]

    is the diagonal matrix Dm obtained by raising each diagonal entry of D to the mth power.

    Use this result to compute

In Problems 39 and 40, use the results of Problems 37 and 38 to find the indicated power of the given matrix.

  39. A = , A5
  40. A = , A10
  41. Suppose A is a nonsingular diagonalizable matrix. Show that A−1 is also diagonalizable.
  42. Suppose A is a diagonalizable matrix. Is the matrix P unique?