8.11 Approximation of Eigenvalues

INTRODUCTION

Recall that to find the eigenvalues of a matrix A we must find the roots of the polynomial equation p(λ) = det(A − λI) = 0. If A is a large matrix, the computations involved in obtaining this characteristic equation can be overwhelming. Moreover, even if we could find the exact characteristic equation, we would likely have to use a numerical procedure to approximate its roots. There are alternative numerical procedures for approximating eigenvalues and the corresponding eigenvectors. The procedure that we consider in this section applies to matrices that possess a dominant eigenvalue.

A Definition

A dominant eigenvalue of a square matrix A is one whose absolute value is greater than the absolute value of each of the remaining eigenvalues. We formalize the last sentence in the next definition.

DEFINITION 8.11.1 Dominant Eigenvalue

Let λ1, λ2, . . . , λk, . . . , λn denote the eigenvalues of an n × n matrix A. The eigenvalue λk is said to be the dominant eigenvalue of A if

|λk| > |λi|,  i = 1, 2, . . . , n,  i ≠ k.

An eigenvector corresponding to λk is called a dominant eigenvector of A.

In Example 2 of Section 8.8, we saw that the eigenvalues of the matrix

A = [  1    2    1 ]
    [  6   −1    0 ]
    [ −1   −2   −1 ]

are λ1 = 0, λ2 = −4, and λ3 = 3. Since |−4| > |0| and |−4| > |3|, we see that λ2 = −4 is the dominant eigenvalue of A.

EXAMPLE 1 Matrices with No Dominant Eigenvalue

(a) A 2 × 2 matrix A with eigenvalues λ1 = −2 and λ2 = 2 has no dominant eigenvalue, since |λ1| = |λ2| = 2.

(b) A matrix with eigenvalues λ1 = 2, λ2 = λ3 = 5 likewise has no dominant eigenvalue: although |5| > |2|, the eigenvalue 5 is repeated, so the strict inequality |λk| > |λi| of Definition 8.11.1 cannot hold for every i ≠ k.
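
Because Definition 8.11.1 requires a strict inequality, a repeated eigenvalue of maximum modulus rules out dominance, as in part (b). The following minimal sketch in Python/NumPy (the function name and tolerance are our own choices) tests a small matrix for a dominant eigenvalue; it calls np.linalg.eigvals, which computes all the eigenvalues, so it is only a sanity check for small examples, not a substitute for the power method developed below.

    import numpy as np

    def has_dominant_eigenvalue(A, tol=1e-12):
        # Sort the eigenvalue magnitudes in decreasing order and test
        # the strict inequality of Definition 8.11.1.
        mags = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
        return mags[0] > mags[1] + tol

    # Diagonal stand-ins with the eigenvalues of Example 1 and of the
    # 3 x 3 matrix above (the original matrices are not repeated here):
    print(has_dominant_eigenvalue(np.diag([-2.0, 2.0])))       # False
    print(has_dominant_eigenvalue(np.diag([2.0, 5.0, 5.0])))   # False
    print(has_dominant_eigenvalue(np.diag([0.0, -4.0, 3.0])))  # True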

Power Method

Suppose the n × n matrix A has a dominant eigenvalue λ1. The iterative technique for approximating a corresponding dominant eigenvector is due to the Austrian-born mathematician Richard von Mises (1883–1953) and is called the power method. The basic idea of this procedure is first to compute an approximation to a dominant eigenvector by means of the sequence

X1 = AX0,  X2 = AX1,  X3 = AX2,  . . . ,  Xm = AXm−1,    (1)

where X0 represents a nonzero n × 1 vector that is an initial guess or approximation for the eigenvector we seek. Iterating (1) gives

Xm = AXm−1 = A(AXm−2) = A^2Xm−2 = . . . = A^mX0.    (2)

Under certain circumstances, for large values of m the vector Xm = A^mX0 is an approximation to a dominant eigenvector. To see this, we make some further assumptions about the matrix A. Assume the eigenvalues of A are such that

|λ1| > |λ2| ≥ |λ3| ≥ . . . ≥ |λn|

and that the corresponding n eigenvectors K1, K2, . . . , Kn are linearly independent. Because of this last assumption, K1, K2, . . . , Kn can serve as a basis for R^n (see Section 7.6). Thus, for any nonzero n × 1 vector X0, constants c1, c2, . . . , cn can be found such that

X0 = c1K1 + c2K2 + . . . + cnKn. (3)

We shall also assume X0 is chosen so that c1 ≠ 0. Multiplying (3) by A then gives

AX0 = c1AK1 + c2AK2 + . . . + cnAKn.

Since AK1 = λ1K1, AK2 = λ2K2, . . . , AKn = λnKn, the last line becomes

AX0 = c1λ1K1 + c2λ2K2 + . . . + cnλnKn. (4)

Multiplying (4) by A gives

A^2X0 = c1λ1AK1 + c2λ2AK2 + . . . + cnλnAKn = c1λ1^2K1 + c2λ2^2K2 + . . . + cnλn^2Kn.

Continuing in this manner, we find that

A^mX0 = c1λ1^mK1 + c2λ2^mK2 + . . . + cnλn^mKn    (5)

and, after factoring λ1^m from each term,

A^mX0 = λ1^m [ c1K1 + c2(λ2/λ1)^mK2 + . . . + cn(λn/λ1)^mKn ].    (6)

Since |λ1| > |λi| for i = 2, 3, . . . , n, we have |λi/λ1| < 1 and consequently (λi/λ1)^m → 0 as m → ∞. Therefore, as m → ∞, we see from (6) that

A^mX0 ≈ λ1^m c1K1.    (7)

Since a nonzero constant multiple of an eigenvector is an eigenvector, it follows from (7) that for large values of m, and under all the assumptions that were made, the n × 1 matrix Xm = A^mX0 is an approximation to a dominant eigenvector associated with the dominant eigenvalue λ1. The rate at which this method converges depends on the quotient λ2/λ1: if |λ2/λ1| is very small, then convergence is fast, whereas if |λ2/λ1| is close to one, convergence is slow. Of course, this information is not as useful as it seems, because we generally do not know the eigenvalues in advance.

It remains, then, to approximate the dominant eigenvalue itself. This can be done by means of the inner product. If K is an eigenvector of a matrix A corresponding to the eigenvalue λ, then AK = λK, and so AK · K = λK · K = λ(K · K). Since AK · K and K · K are scalars, we can solve this last equation for λ:

λ = (AK · K)/(K · K).

Hence, if Xm = A^mX0 is an approximation to a dominant eigenvector obtained by the iteration of (1), then the dominant eigenvalue λ1 can be approximated by the quotient

λ1 ≈ (AXm · Xm)/(Xm · Xm).    (8)

The quotient in (8) is sometimes referred to as the Rayleigh quotient.
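
The iteration (1) and the Rayleigh quotient (8) translate directly into a few lines of code. The sketch below is a minimal Python/NumPy implementation; the function name power_method and the choice of a fixed number of steps m (rather than a convergence test on successive iterates) are our own simplifications.

    import numpy as np

    def power_method(A, x0, m):
        # Compute the sequence X1 = AX0, X2 = AX1, ..., Xm = AXm-1 of (1).
        x = np.asarray(x0, dtype=float)
        for _ in range(m):
            x = A @ x
        # Estimate the dominant eigenvalue with the Rayleigh quotient (8).
        lam = (A @ x) @ x / (x @ x)
        return lam, x

    # For the 2 x 2 matrix of Example 2 below, seven steps already give
    # lam = 5.0006 and x = X7 = [89304, 44588]:
    lam, x = power_method(np.array([[4., 2.], [3., -1.]]), [1., 1.], 7)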

EXAMPLE 2 Using the Power Method

Use the power method to approximate the dominant eigenvalue and a corresponding dominant eigenvector of

A = [ 4    2 ]
    [ 3   −1 ].

SOLUTION

Since we do not know the eigenvalues and eigenvectors, we may as well take X0 = [1, 1]^T (the superscript T denoting transpose). The first two terms of the sequence of vectors defined by (1) are then

X1 = AX0 = [6, 2]^T  and  X2 = AX1 = [28, 16]^T.

The next five vectors obtained in this manner are given in the following table:

    m     3             4              5                6                  7
    Xm    [144, 68]^T   [712, 364]^T   [3576, 1772]^T   [17848, 8956]^T    [89304, 44588]^T

At this point it may not appear that we are getting anywhere, since the entries of the vectors in the table appear to be growing without bound. But bear in mind that (7) indicates we are obtaining a constant multiple of a vector. If the power method converges, then by factoring the entry with the largest absolute value from Xm (for a large value of m), we should obtain a reasonable approximation to a dominant eigenvector. From the table,

X7 = [89304, 44588]^T = 89,304 [1, 0.4993]^T.    (9)

It appears, then, that the vectors are approaching scalar multiples of [1, 0.5]^T.

We now use (8) to approximate the dominant eigenvalue λ1. First we have

AX7 · X7 = [446392, 223324]^T · [89304, 44588]^T = 49,822,161,680
X7 · X7 = [89304, 44588]^T · [89304, 44588]^T = 9,963,294,160.

Finally, we have

λ1 ≈ (AX7 · X7)/(X7 · X7) = 5.0006.

The reader should use the procedure of Section 8.7 to verify that the eigenvalues and corresponding eigenvectors of A are λ1 = 5, λ2 = −2, K1 = [2, 1]^T, and K2 = [1, −3]^T.
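
As a check on Example 2, the eigenvalues and eigenvectors can also be computed directly; the snippet below is only a verification with NumPy's general-purpose routine np.linalg.eig, not part of the power method itself.

    import numpy as np

    A = np.array([[4., 2.], [3., -1.]])
    vals, vecs = np.linalg.eig(A)         # eigenvalues 5 and -2 (order may vary)
    i = int(np.argmax(np.abs(vals)))      # position of the dominant eigenvalue
    v = vecs[:, i]
    print(vals[i])                        # 5.0
    print(v / v[np.argmax(np.abs(v))])    # [1. , 0.5], a multiple of K1 = [2, 1]^T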

Scaling

As we have just seen, iteration of (1) often results in vectors whose entries become very large in absolute value. Large numbers can, of course, cause a problem if a computer is used to carry out a great number of iterations. The result in (9) suggests that one way around this difficulty is to use a scaled-down vector at each step of the iteration. To do this, we simply multiply the vector AX0 by the reciprocal of its entry of largest absolute value; the resulting vector, whose entries now have absolute value at most one, we label X1. We repeat the process with the vector AX1 to obtain the scaled-down vector X2, and so on.
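
In code, scaling amounts to one extra division per step. The sketch below modifies the earlier power_method accordingly; dividing by the (signed) entry of largest absolute value makes the largest entry of each iterate exactly 1, as in the text. The function name and fixed step count are again our own choices.

    import numpy as np

    def power_method_scaled(A, x0, m):
        x = np.asarray(x0, dtype=float)
        for _ in range(m):
            y = A @ x
            x = y / y[np.argmax(np.abs(y))]   # scale: largest entry becomes 1
        lam = (A @ x) @ x / (x @ x)           # Rayleigh quotient (8)
        return lam, x

    # Reproduces Example 3: x approaches [1, 0.5] and lam = 5.0006.
    lam, x = power_method_scaled(np.array([[4., 2.], [3., -1.]]), [1., 1.], 7)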

EXAMPLE 3 Example 2 Revisited

Repeat the iterations of Example 2 using scaled-down vectors.

SOLUTION

From AX0 = [6, 2]^T we define

X1 = (1/6) [6, 2]^T = [1, 0.3333]^T.

From AX1 = [4.6667, 2.6667]^T we define

X2 = (1/4.6667) [4.6667, 2.6667]^T = [1, 0.5714]^T.

We continue in this manner to construct the following table:

    m     3               4               5               6               7
    Xm    [1, 0.4722]^T   [1, 0.5112]^T   [1, 0.4955]^T   [1, 0.5018]^T   [1, 0.4993]^T

In contrast to the table in Example 2, it is apparent from this table that the vectors are approaching [1, 0.5]^T.

Method of Deflation

After we have found the dominant eigenvalue λ1 of a matrix A it may still be necessary to find nondominant eigenvalues. The procedure we shall consider next is a modification of the power method and is called the method of deflation. We will limit the discussion to the case where A is a symmetric matrix.

Suppose λ1 and K1 are, respectively, the dominant eigenvalue and a corresponding normalized eigenvector* (that is, ||K1|| = 1) of a symmetric matrix A. Furthermore, suppose the eigenvalues of A are such that

|λ1| > |λ2| > |λ3| ≥ . . . ≥ |λn|.

It can be proved that the matrix

B = A − λ1K1K1^T    (10)

has eigenvalues 0, λ2, λ3, . . . , λn and that eigenvectors of B are also eigenvectors of A. Note that λ2 is now the dominant eigenvalue of B. We apply the power method to B to approximate λ2 and a corresponding eigenvector.
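
A sketch of the deflation step: the function below forms B = A − λ1K1K1^T as in (10). Because the matrix of Example 4 is not reproduced above, the demonstration uses a hypothetical symmetric matrix with eigenvalues 4, 2, 1 and normalized dominant eigenvector (1, 2, 1)/√6; after deflation the computed eigenvalues of B should be approximately 0, 1, and 2, and the power method with scaling can then be applied to B to approximate λ2.

    import numpy as np

    def deflate(A, lam1, k1):
        # B = A - lam1 * K1 K1^T, equation (10); A symmetric, ||k1|| = 1.
        k1 = np.asarray(k1, dtype=float)
        return A - lam1 * np.outer(k1, k1)

    # Hypothetical symmetric test matrix (not the matrix of Example 4):
    A = np.array([[2., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]])
    k1 = np.array([1., 2., 1.]) / np.sqrt(6.0)
    B = deflate(A, 4.0, k1)
    print(np.linalg.eigvalsh(B))   # approximately [0., 1., 2.]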

EXAMPLE 4 Using the Method of Deflation

Use the method of deflation to approximate the eigenvalues of a given 3 × 3 symmetric matrix A.

SOLUTION

We begin by using the power method with scaling to find the dominant eigenvalue and a corresponding eigenvector of A. Choosing X0 = , we see that

The scaled vectors X3 to X10 are given in the following table:

Utilizing X10 and (8), we find

λ1 = 2.9997.

It appears that the dominant eigenvalue and a corresponding eigenvector are, respectively, λ1 = 3 and K = .

Our next task is to construct the matrix B defined by (10). With ||K|| = , the normalized eigenvector is K1 = . Thus,

We now use the power method with scaling to find the dominant eigenvalue of B. With X0 = again, the results are summarized in the following table:

Utilizing X7 and (8), we find

λ2 = −1.9996.

From these computations it seems apparent that the dominant eigenvalue of B and a corresponding eigenvector are λ2 = −2 and K = .

To find the last eigenvalue of A, we repeat the deflation process to find the dominant eigenvalue and a corresponding eigenvector of the matrix

C = B − λ2K2K2^T,

where K2 is a normalized dominant eigenvector of B. The student is encouraged to verify that λ3 = 1.

Example 4 is somewhat artificial since eigenvalues of a matrix need not be “nice” numbers such as 3, −2, and 1. Moreover, we used the exact values of the dominant eigenvalues λ1 and λ2 in the formation of the matrices B and C. In practice, of course, we may have to be content with working with approximations to the dominant eigenvalue λ1 and a corresponding normalized dominant eigenvector K1 of A. If these approximations are used in (10), an error is introduced in the computation of the matrix B, and so more errors may be introduced in the computation of its dominant eigenvalue λ2 and dominant eigenvector K2. If λ2 and K2 are now used to construct the matrix C, it seems reasonable to conclude that errors are being compounded. In other words, the method of deflation can become increasingly inaccurate as more eigenvalues are computed.

Inverse Power Method

In some applied problems we are more interested in approximating the eigenvalue of a matrix A of smallest absolute value than the dominant eigenvalue. If A is nonsingular, then the eigenvalues of A are nonzero, and if λ1, λ2, . . . , λn are the eigenvalues of A, then 1/λ1, 1/λ2, . . . , 1/λn are the eigenvalues of A^−1. Now if the eigenvalues of A can be arranged in the order

|λ1| ≥ |λ2| ≥ |λ3| ≥ . . . ≥ |λn−1| > |λn|,

then we see that 1/λn is the dominant eigenvalue of A^−1. By applying the power method to A^−1, we approximate its eigenvalue of largest magnitude; taking the reciprocal then gives the eigenvalue of A of least magnitude. This is called the inverse power method. See Problems 11–13 in Exercises 8.11.
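
The sketch below applies the power method with scaling to A^−1. Rather than forming the inverse explicitly, each iteration solves the linear system Ay = x, a standard and mathematically equivalent refinement; the function name and fixed step count are our own choices. Taking the reciprocal at the end converts the dominant eigenvalue 1/λn of A^−1 back into λn.

    import numpy as np

    def inverse_power_method(A, x0, m):
        x = np.asarray(x0, dtype=float)
        for _ in range(m):
            y = np.linalg.solve(A, x)          # y = A^{-1} x without forming A^{-1}
            x = y / y[np.argmax(np.abs(y))]    # scale as before
        # Rayleigh quotient (8) applied to A^{-1}; its reciprocal is the
        # eigenvalue of A of least magnitude.
        mu = np.linalg.solve(A, x) @ x / (x @ x)
        return 1.0 / mu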

Review (6) and Theorems 8.8.2 and 8.8.3 on pages 430–431.

8.11 Exercises Answers to selected odd-numbered problems begin on page ANS-20.

To the Instructor/Student: A calculator with matrix capabilities or a CAS would be useful in the following problems.

Each matrix in Problems 1–10 has a dominant eigenvalue.

In Problems 1 and 2, use the power method as illustrated in Example 2 to find the dominant eigenvalue and a corresponding dominant eigenvector of the given matrix.

In Problems 3–6, use the power method with scaling to find the dominant eigenvalue and a corresponding eigenvector of the given matrix.

In Problems 7–10, use the method of deflation to find the eigenvalues of the given matrix.

In Problems 11 and 12, use the inverse power method to find the eigenvalue of least magnitude for the given matrix.

  13. In Example 4 of Section 3.9 we saw that the deflection curve of a thin column under an applied load P was defined by the boundary-value problem

    EI (d^2y/dx^2) + Py = 0,  y(0) = 0,  y(L) = 0.

    In this problem we show how to apply matrix techniques to compute the smallest critical load. (A Python sketch carrying out parts (b)–(e) appears after Problem 14.)

    Let the interval [0, L] be divided into n subintervals of length h = L/n, and let xi = ih, i = 0, 1, . . . , n. For small values of h it follows from (6) of Section 6.5 that

      d^2y/dx^2 ≈ (yi+1 − 2yi + yi−1)/h^2,

    where yi = y(xi).

    (a) Show that the differential equation can be replaced by the difference equation

      EI(yi+1 − 2yi + yi−1) + Ph^2 yi = 0,  i = 1, 2, . . . , n − 1.

    (b) Show that for n = 4 the difference equation in part (a) yields the system of linear equations

        2y1 − y2       = λy1
        −y1 + 2y2 − y3 = λy2
        −y2 + 2y3      = λy3.

      Note that this system has the form of the eigenvalue problem AY = λY, where λ = PL^2/16EI and Y = [y1, y2, y3]^T.

    (c) Find A^−1.
    (d) Use the inverse power method to find, to two decimal places, the eigenvalue of A of least magnitude.
    (e) Use the result of part (d) to compute the approximate smallest critical load. Compare your answer with that given in Section 3.9.
  14. Suppose the column in Problem 13 is tapered so that the moment of inertia of a cross-section I varies linearly from I(0) = I0 = 0.002 to I(L) = IL = 0.001.
    (a) Use the difference equation in part (a) of Problem 13 with n = 4 to set up a system of equations analogous to that given in part (b).
    (b) Proceed as in Problem 13 to find an approximation to the smallest critical load.
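
As promised in Problem 13, here is a Python sketch of parts (b)–(e), assuming the tridiagonal system derived in part (b); it is meant as a check on hand computation, and the printed critical load should be compared with the exact value P = π²EI/L² of Section 3.9.

    import numpy as np

    # The n = 4 system of Problem 13(b): AY = (P L^2 / 16 EI) Y with
    A = np.array([[ 2., -1.,  0.],
                  [-1.,  2., -1.],
                  [ 0., -1.,  2.]])

    # Inverse power method (see the sketch above), solving A y = x each step.
    x = np.ones(3)
    for _ in range(50):
        y = np.linalg.solve(A, x)
        x = y / y[np.argmax(np.abs(y))]
    lam = (x @ x) / (np.linalg.solve(A, x) @ x)  # eigenvalue of A of least magnitude
    print(lam)                                   # about 0.59 (exactly 2 - sqrt(2))

    # lam = P L^2 / (16 EI), so the smallest critical load is
    print(16 * lam)                              # about 9.37 EI/L^2 (exact: pi^2 = 9.87)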

Computer Lab Assignment

  15. In Section 8.9 we saw how to compute a power A^m for an n × n matrix A. Consult the documentation for the CAS you have on hand for the command to compute the power A^m. (In Mathematica the command is MatrixPower[A, m].) The matrix

    possesses a dominant eigenvalue.

    (a) Use a CAS to compute A^10.
    (b) Now use (2), Xm = A^mX0, with m = 10 and the given X0 to compute X10. In the same manner compute X12. Then proceed as in (9) to find the approximate dominant eigenvector K.
    (c) If K is an eigenvector of A, then AK = λK. Use this definition and the result in part (b) to find the dominant eigenvalue.
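
The NumPy counterpart of MatrixPower[A, m] is np.linalg.matrix_power. Because the matrix of this assignment is not reproduced above, the sketch below substitutes a hypothetical symmetric 3 × 3 matrix with dominant eigenvalue 4 and uses X0 = [1, 1, 1]^T; with the actual A and X0 of the assignment the same steps apply.

    import numpy as np

    # Stand-in for the assignment's matrix (eigenvalues 1, 2, and 4):
    A = np.array([[2., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]])
    X0 = np.ones(3)

    X10 = np.linalg.matrix_power(A, 10) @ X0   # part (a)/(b): A^10 X0
    X12 = np.linalg.matrix_power(A, 12) @ X0
    K = X12 / X12[np.argmax(np.abs(X12))]      # factor out the largest entry, as in (9)
    print(K)                                   # approximately [0.5, 1. , 0.5]
    print((A @ K) / K)                         # part (c): each ratio is near lambda = 4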

 

*See Example 3 in Section 8.10.