8.2 Systems of Linear Equations

INTRODUCTION

Recall, any equation of the form ax + by = c, where a, b, and c are real numbers, is said to be a linear equation in the variables x and y. The graph of a linear equation in two variables is a straight line. For real numbers a, b, c, and d, ax + by + cz = d is a linear equation in the variables x, y, and z and is the equation of a plane in 3-space. In general, an equation of the form

a₁x₁ + a₂x₂ + . . . + a_nx_n = b_n,

where a₁, a₂, . . ., a_n, and b_n are real numbers, is a linear equation in the n variables x₁, x₂, . . ., x_n .

In this section we will study systems of linear equations. Systems of linear equations are also called linear systems.

General Form

A system of m linear equations in n variables, or unknowns, has the general form

(1)

The coefficients of the variables in the linear system (1) can be abbreviated as a_ij, where i denotes the row and j denotes the column in which the coefficient appears. For example, a₂₃ is the coefficient of the unknown in the second row and third column (that is, x₃). Thus, i = 1, 2, 3, . . ., m and j = 1, 2, 3, . . ., n. The numbers b₁, b₂, . . ., b_m are called the constants of the system. If all the constants are zero, the system (1) is said to be homogeneous; otherwise it is nonhomogeneous. For example,

Solution

A solution of a linear system (1) is a set of n numbers x₁, x₂, . . ., x_n that satisfies each equation in the system. For example, x₁ = 3, x₂ = −1 is a solution of the system

To see this we replace x₁ by 3 and x₂ by −1 in each equation:

3(3) + 6(−1) = 9 − 6 = 3 and 3 − 4(−1) = 3 + 4 = 7.

Solutions of linear systems are also written as an ordered n-tuple (x₁, x₂, . . ., x_n). The solution for the above system is then the ordered pair (3, −1).

A linear system of equations is said to be consistent if it has at least one solution and inconsistent if it has no solutions. If a linear system is consistent, it has either

a unique solution (that is, precisely one solution) or
infinitely many solutions.

Thus, a system of linear equations cannot have, say, exactly three solutions.

For a linear system with two equations and two variables, the lines in the plane, or 2-space, intersect at one point as in FIGURE 8.2.1(a) (unique solution), are identical as in Figure 8.2.1(b) (infinitely many solutions), or are parallel as in Figure 8.2.1(c) (no solutions).

Three graphs. Part (a) labeled consistent, has the intersection of two lines graphed in an x y coordinate plane. The first line enters the bottom right in the fourth quadrant goes up and right and exits the top right in the first quadrant. The second line enters the top left in the second quadrant goes down and right through the first quadrant and exits the bottom right in the fourth quadrant. Both lines intersect at a point in the first quadrant. Part (b) labeled consistent, has one line graphed in an x y coordinate plane. The line enters bottom left in the fourth quadrant goes up and right through the fourth quadrant and exits the top right in the first quadrant. Part (c) labeled inconsistent, has two lines graphed in an x y coordinate plane. The first line enters the bottom left in the third quadrant goes up and right through the second quadrant and exits the top right in the first quadrant. The second line enters the top left in the second quadrant goes up and right and exits the top right in the first quadrant. Both lines are parallel. — FIGURE 8.2.1 A linear system of two equations in two variables interpreted as lines in 2-space

For a linear system with three equations and three variables, each equation in the system represents a plane in 3-space. FIGURE 8.2.2 shows some of the many ways that this kind of linear system can be interpreted.

Six graphs. Part (a) labeled consistent has the intersection of three planes all perpendicular to one another in one point. The point is indicated by an arrow and labeled point of intersection. Part (b) labeled consistent has the intersection on two planes perpendicular to one another. The intersection forms a line that is indicated by an arrow and labeled line of intersection. Part (c) labeled consistent has three planes having one edge in common. The line forming the common edge is indicated by an arrow and labeled line of intersection. Part (d) labeled inconsistent has three parallel planes with no points in common. Part (e) labeled inconsistent has the intersection of three planes. The first plane is horizontal. The second plane is vertical leaning to the left. The third plane is vertical leaning to the right. The two vertical planes share a common edge at the top. The intersection of each plane with the other forms a line and is indicated by arrows that highlight that there is no single line of intersection. Part (f) labeled inconsistent has the intersection of two parallel planes with one. The first plane is vertical, the second and third planes are horizontal and intersect the first plane at two different places forming lines of intersection. The lines are indicated by arrows and highlight that there is no single line of intersection. — FIGURE 8.2.2 A linear system of three equations in three variables interpreted as planes in 3-space

EXAMPLE 1 Verification of a Solution

Verify that x₁ = 14 + 7t, x₂ = 9 + 6t, x₃ = t, where t is any real number, is a solution of the system

SOLUTION

Replacing x₁, x₂, and x₃ in turn by 14 + 7t, 9 + 6t, and t, we have

For each real number t we obtain a different solution of the system; in other words, the system has an infinite number of solutions. For instance, t = 0, t = 4, and t = −2 give the three solutions

and

respectively. Geometrically, each equation in the system represents a plane in R³. In this case, the planes intersect in a line as shown in Figure 8.2.2(b). Parametric equations of the line are x₁ = 14 + 7t, x₂ = 9 + 6t, x₃ = t. The solution can also be written as the ordered triple (x₁, x₂, x₃) or (14 + 7t, 9 + 6t, t). ≡

Solving Systems

We can transform a system of linear equations into an equivalent system (that is, one having the same solutions) using the following elementary operations:

(i) Multiply an equation by a nonzero constant.

(ii) Interchange the positions of equations in the system.

(iii) Add a nonzero multiple of one equation to any other equation.

As the next example will show, these elementary operations enable us to systematically eliminate variables from the equations of the system.

EXAMPLE 2 Solving a Linear System

Solve

SOLUTION

We begin by interchanging the first and second rows:

Our goal now is to eliminate x₁ from the second and third equations. If we add to the second equation −2 times the first equation, we obtain the equivalent system

By adding to the third equation −5 times the first equation, we get a new equivalent system:

We are now going to use the second equation to eliminate the variable x₂ from the first and third equations. To simplify matters, let us multiply the second equation by :

Adding to the first equation −2 times the second equation yields

Next, by adding 3 times the second equation to the third equation we get

We shall use the last equation to eliminate the variable x₃ from the first and second equations. To this end, we multiply the third equation by :

At this point we could use back-substitution; that is, substitute the value x₃ = 5 back into the remaining equations to determine x₁ and x₂. However, by continuing with our systematic elimination, we add to the second equation − times the third equation:

Finally, by adding to the first equation 4 times the third equation, we obtain

It is now apparent that x₁ = 10, x₂ = −3, x₃ = 5 is the solution of the original system. The answer written as the ordered triple (10, −3, 5) means that the planes represented by the three equations in the system intersect at a point as in Figure 8.2.2(a). ≡

Augmented Matrix

Reflecting on the solution of the linear system in Example 2 should convince you that the solution of the system does not depend on what symbols are used as variables. Thus, the systems

have the same solution as the system in Example 2. In other words, in the solution of a linear system, the symbols used to denote the variables are immaterial; it is the coefficients of the variables and the constants that determine the solution of the system. In fact, we can solve a system of form (1) by dropping the variables entirely and performing operations on the rows of the array of coefficients and constants:

(2)

This array is called the augmented matrix of the system or simply the matrix of the system (1).

EXAMPLE 3 Augmented Matrices

(a) The augmented matrix represents the linear system

(b) The linear system

Thus the matrix of the system is

Elementary Row Operations

Since the rows of an augmented matrix represent the equations in a linear system, the three elementary operations on a linear system listed previously are equivalent to the following elementary row operations on a matrix:

Multiply a row by a nonzero constant.
Interchange any two rows.
Add a nonzero multiple of one row to any other row.

Of course, when we add a multiple of one row to another, we add the corresponding entries in the rows. We say that two matrices are row equivalent if one can be obtained from the other through a sequence of elementary row operations. The procedure of carrying out elementary row operations on a matrix to obtain a row-equivalent matrix is called row reduction.

Elimination Methods

To solve a system such as (1) using an augmented matrix, we shall use either Gaussian elimination or the Gauss–Jordan elimination method. In the former method, we row-reduce the augmented matrix of the system until we arrive at a row-equivalent augmented matrix in row-echelon form:

(i) The first nonzero entry in a nonzero row is a 1.

(ii) In consecutive nonzero rows, the first entry 1 in the lower row appears to the right of the 1 in the higher row.

(iii) Rows consisting of all zeros are at the bottom of the matrix.

In the Gauss–Jordan method, the row operations are continued until we obtain an augmented matrix that is in reduced row-echelon form. A reduced row-echelon matrix has the same three properties listed previously, but in addition:

(iv) A column containing a first entry 1 has zeros everywhere else.

EXAMPLE 4 Echelon Forms

(a) The augmented matrices

are in row-echelon form. The reader should verify that the three criteria for this form are satisfied.

(b) The augmented matrices

are in reduced row-echelon form. Note that the remaining entries in the columns that contain a leading entry 1 are all zeros. ≡

It should be noted that in Gaussian elimination, we stop when we have obtained an augmented matrix in row-echelon form. In other words, by using different sequences of row operations, we may arrive at different row-echelon forms. This method then requires the use of back-substitution. In Gauss–Jordan elimination, we stop when we have obtained the augmented matrix in reduced row-echelon form. Any sequence of row operations will lead to the same augmented matrix in reduced row-echelon form. This method does not require back-substitution; the solution of the system will be apparent by inspection of the final matrix. In terms of the equations of the original system, our goal in both methods is simply to make the coefficient of x₁ in the first equation^* equal to one and then use multiples of that equation to eliminate x₁ from other equations. The process is repeated for the other variables.

Note: Row operations can lead to different row-echelon forms.

To keep track of the row operations used on an augmented matrix, we shall utilize the following notation:

EXAMPLE 5 Elimination Methods and Augmented Matrices

Solve the linear system in Example 2 using (a) Gaussian elimination and (b) Gauss–Jordan elimination.

SOLUTION

(a) Using row operations on the augmented matrix of the system, we obtain:

The last matrix is in row-echelon form and represents the system

Substituting x₃ = 5 into the second equation gives x₂ = −3. Substituting both these values back into the first equation finally yields x₁ = 10.

(b) We start with the last matrix above. Since the first entries in the second and third rows are ones, we must, in turn, make the remaining entries in the second and third columns zeros:

The last matrix is in reduced row-echelon form. Bearing in mind what the matrix means in terms of equations, we see that the solution of the system is x₁ = 10, x₂ = −3, x₃ = 5. ≡

EXAMPLE 6 Gauss–Jordan Elimination

Use Gauss–Jordan elimination to solve

SOLUTION

Row operations give

In this case, the last matrix in reduced row-echelon form implies that the original system of three equations in three variables is really equivalent to two equations in the variables. Since only x₃ is common to both equations (the nonzero rows), we can assign its values arbitrarily. If we let x₃ = t, where t represents any real number, then we see that the system has infinitely many solutions: x₁ = 2 − t, x₂ = −3 + t, x₃ = t. Geometrically, these equations are the parametric equations for the line of intersection of the planes x₁ + 0x₂ + x₃ = 2 and 0x₁ + x₂ − x₃ = −3. ≡

EXAMPLE 7 Inconsistent System

Solve

SOLUTION

In the process of applying Gauss–Jordan elimination to the matrix of the system, we stop at

The third row of the last matrix means 0x₁ + 0x₂ = 16 (or 0 = 16). Since no numbers x₁ and x₂ can satisfy this equation, we conclude that the system has no solution. ≡

Inconsistent systems of m linear equations in n variables will always yield the situation illustrated in Example 7; that is, there will be a row in the reduced row-echelon form of the augmented matrix in which the first n entries are zero and the (n + 1)st entry is nonzero.

Worth remembering.

Networks

The currents in the branches of an electrical network can be determined by using Kirchhoff’s point and loop rules:

Point rule: The algebraic sum of the currents toward any branch point is 0.

Loop rule: The algebraic sum of the potential differences in any loop is 0.

When a loop is traversed in a chosen direction (clockwise or counterclockwise), an emf is taken to be positive when it is traversed from − to + and negative when traversed from + to −. An iR product is taken to be positive if the chosen direction through the resistor is opposite that of the assumed current, and negative if the chosen direction is in the same direction as the assumed current.

In FIGURE 8.2.3, the branch points of the network are labeled A and B, the loops are labeled L₁ and L₂, and the chosen direction in each loop is clockwise. Now, applying the foregoing rules to the network yields the nonhomogeneous system of linear equations

(3)

An electric network consists of the following components: a battery E, a resistor R subscript 1, and 2 resistors R subscript 2 and R subscript 3 that are in parallel. The flow of current is indicated by 2 loops and is in the clockwise direction. The current coming out of the battery E is indicated by a small arrow and labeled i subscript 1. The current passes through the resistor R subscript 1 and reaches a branch point in the circuit labeled A. At that point, the current splits into two: i subscript 2 that passes through the R subscript 2 resistor, and i subscript 3 that passes through the R subscript 3 resistor. Both i subscript 2 and i subscript 3 are indicated by a small arrow. After passing through the resistors R subscript 2 and R subscript 3 respectively the current i subscript 2 and i subscript 3 join back at a branch point labeled B and return into the battery. — FIGURE 8.2.3 Electrical network

EXAMPLE 8 Currents in a Network

Use Gauss–Jordan elimination to solve the system (3) when R₁ = 10 ohms, R₂ = 20 ohms, R₃ = 10 ohms, and E = 12 volts.

SOLUTION

The system to be solved is

In this case, Gauss–Jordan elimination yields

Hence, we see that the currents in the three branches are , , and . ≡

Homogeneous Systems

All the systems in the preceding examples are nonhomogeneous systems. As we have seen, a nonhomogeneous system can be consistent or inconsistent. By contrast, a homogeneous system of linear equations

(4)

is always consistent, since x₁ = 0, x₂ = 0, . . ., x_n = 0 will satisfy each equation in the system. The solution consisting of all zeros is called the trivial solution. But naturally we are interested in whether a system of form (4) has any solutions for which some of the x_i, i = 1, 2, . . ., n, are not zero. Such a solution is called a nontrivial solution. A homogeneous system either possesses only the trivial solution or possesses the trivial solution along with infinitely many nontrivial solutions. The next theorem, presented without proof, will give us a sufficient condition for the existence of nontrivial solutions.

THEOREM 8.2.1 Existence of Nontrivial Solutions

A homogeneous system of form (4) possesses nontrivial solutions if the number m of equations is less than the number n of variables (m < n).

EXAMPLE 9 Solving a Homogeneous System

Solve

SOLUTION

Since the number of equations is less than the number of variables, we know from Theorem 8.2.1 that the given system has nontrivial solutions. Using Gauss–Jordan elimination, we find

As in Example 6, if x₃ = t, then the solution of the system is . Note we obtain the trivial solution x₁ = 0, x₂ = 0, x₃ = 0 for this system by choosing t = 0. For t ≠ 0 we get nontrivial solutions. For example, the solutions corresponding to t = 6, t = −12, and t = 3 are, in turn, x₁ = 5, x₂ = 7, x₃ = 6; x₁ = −10, x₂ = −14, x₃ = −12; and x₁ = , x₂ = , x₃ = 3. ≡

Chemical Equations

The next example will give an application of homogeneous systems in chemistry.

EXAMPLE 10 Balancing a Chemical Equation

Balance the chemical equation C₂H₆ + O₂ → CO₂ + H₂O.

SOLUTION

We want to find positive integers x₁, x₂, x₃, and x₄ so that

x₁C₂H₆ + x₂O₂ → x₃CO₂ + x₄H₂O.

Because the number of atoms of each element must be the same on each side of the last equation, we have

Gauss–Jordan elimination gives

and so x₁ = t, x₂ = t, x₃ = t, x₄ = t. In this case t must be a positive integer chosen in such a manner so that x₁, x₂, and x₃ are positive integers. To accomplish this we pick t = 6. This gives x₁ = 2, x₂ = 7, x₃ = 4, x₄ = 6. The balanced chemical equation is then

2C₂H₆ + 7O₂ → 4CO₂ + 6H₂O. ≡

Notation

In view of matrix multiplication and equality of matrices defined in Section 8.1, if we define matrices A, X, and B as

then observe that the linear system (1) can be written as the matrix equation

or more compactly as

AX = B,

where the matrix A is, naturally, called the coefficient matrix. The augmented matrix of a system AX = B is often denoted by (A | B). A homogeneous system of equations is written AX = 0. This notation makes it easier to discuss linear systems and to prove theorems for such systems.

THEOREM 8.2.2 Two Properties of Homogeneous Systems

Let AX = 0 denote a homogeneous system of linear equations.

(i) If X₁ is a solution of AX = 0, then so is cX₁ for any constant c.

(ii) If X₁ and X₂ are solutions of AX = 0, then so is X₁ + X₂.

PROOF:

(i) Because X₁ is a solution, then AX₁ = 0. Now

A(cX₁) = c(AX₁) = c0 = 0.

This shows that a constant multiple of a solution of a homogeneous linear system is also a solution.

(ii) Because X₁ and X₂ are solutions, then AX₁ = 0 and AX₂ = 0. Now

A(X₁ + X₂) = AX₁ + AX₂ = 0 + 0 = 0.

This shows that X₁ + X₂ is a solution. ≡

By combining parts (i) and (ii) of Theorem 8.2.2 we can say that if X₁ and X₂ are solutions of AX = 0, then so is the linear combination

c₁X₁ + c₂X₂,

where c₁ and c₂ are constants. Moreover, this superposition principle extends to three or more solutions of AX = 0.

EXAMPLE 11 Example 9 Revisited

At the end of Example 9, we obtained three distinct solutions

of the given homogeneous linear system. It follows from the preceding discussion that

is also a solution of the system. Verify this. ≡

Terminology

Suppose a linear system has m equations and n variables. If there are more equations than variables, that is, m > n, then the system is said to be overdetermined. If the system has fewer equations than variables, that is, m < n, then the system is called underdetermined. An overdetermined linear system may put too many constraints on the variables and so is usually—not always—inconsistent. The system in Example 7 is overdetermined and is inconsistent. On the other hand, an underdetermined system is usually—but not always—consistent. The systems in Examples 9 and 10 are underdetermined and are consistent. It should be noted that it is impossible for a consistent underdetermined system to possess a single or unique solution. To see this, suppose that m < n. If Gaussian elimination is used to solve the system, then the row-echelon form that is row equivalent to the matrix of the system will contain r nonzero rows where r ≤ m < n. Thus we can solve for r of the variables in terms of n − r > 0 variables or parameters. If the underdetermined system is consistent, then those remaining n − r variables can be chosen arbitrarily and so the system has an infinite number of solutions.

REMARKS

(i) To solve large systems of linear equations, we obviously need the help of a computer. Since the two methods in this section are so systematic, they can be programmed easily. However, the requirement that each nonzero row start with a one may at times demand division by a very small number. Problems can then occur. Large systems are often solved indirectly—that is, by an approximation technique such as Gauss–Seidel iteration. See Section 16.1.

(ii) Since Gauss–Jordan elimination avoids the necessity of back-substitution, it would appear that it is the more efficient of the two methods we have considered. Actually, this is not the case. It has been shown that for large systems, Gauss–Jordan elimination can require about 50% more operations than Gaussian elimination.

(iii) A consistent nonhomogeneous linear system AX = B, B ≠ 0, shares a property with nonhomogeneous linear differential equations. If X_h is a solution of the associated homogeneous system AX = 0, and X_p is a particular solution of the nonhomogeneous system AX = B, then the superposition X_h + X_p is also a solution of the nonhomogeneous system. This is readily verified by the distributive law of matrix multiplication:

A(X_h + X_p) = AX_h + AX_p = 0 + B = B.

(iv) The set S of all solution vectors X of a homogeneous linear system with m equations and n variables,

S = {X| AX = 0}

is called the nullspace of the matrix A. If you have studied Section 7.6, then S is a subspace of the vector space Rⁿ. See pages 361 and 362. This fact follows immediately from Theorems 7.6.1 and 8.2.2.

8.2 Exercises Answers to selected odd-numbered problems begin on page ANS-18.

In Problems 1–20, use either Gaussian elimination or Gauss–Jordan elimination to solve the given system or show that no solution exists.

In Problems 21 and 22, use a calculator to solve the given system.

In Problems 23–28, use the procedures illustrated in Example 10 to balance the given chemical equation.

Na + H₂O → NaOH + H₂
KClO₃ → KCl + O₂
Fe₃O₄ + C → Fe + CO
C₅H₈ + O₂ → CO₂ + H₂O
Cu + HNO₃ → Cu(NO₃)₂ + H₂O + NO
Ca₃(PO₄)₂ + H₃PO₄ → Ca(H₂PO₄)₂

In Problems 29 and 30, set up and solve the system of equations for the currents in the branches of the given network.

FIGURE 8.2.4 Network in Problem 29
FIGURE 8.2.5 Network in Problem 30

In Problems 31 and 32, write the homogeneous system of linear equations in the form . Then verify by matrix multiplication that the given matrix is a solution of the system for any real number .

In Problems 33 and 34, write the nonhomogeneous system of linear equations in the form . Then verify by matrix multiplication that the given matrix is a solution of the system for any real numbers and .

An elementary matrix E is one obtained by performing a single row operation on the identity matrix I. In Problems 35−38, verify that the given matrix is an elementary matrix.

If a matrix A is premultiplied by an elementary matrix E, the product EA will be that matrix obtained from A by performing the elementary row operation symbolized by E. In Problems 39−42, compute the given product for an arbitrary 3 × 3 matrix A.

Computer Lab Assignments

In Problems 43–46, use a CAS to solve the given system.

*We can always interchange equations so that the first equation contains the variable x₁.