# 2.6. Row pivoting

As mentioned in Section 2.4, the \(\mathbf{A}=\mathbf{L}\mathbf{U}\) factorization is not stable for every nonsingular \(\mathbf{A}\). Indeed, the factorization does not always even exist.

Here is a previously encountered matrix, which factors well.

```
A = [2 0 4 3 ; -4 5 -7 -10 ; 1 15 2 -4.5 ; -2 0 2 -13];
L,U = FNC.lufact(A)
L
```

```
4×4 LowerTriangular{Float64, Matrix{Float64}}:
1.0 ⋅ ⋅ ⋅
-2.0 1.0 ⋅ ⋅
0.5 3.0 1.0 ⋅
-1.0 0.0 -2.0 1.0
```

If we swap the second and fourth rows of \(\mathbf{A}\), the result is still nonsingular. However, the factorization now fails.

```
A[[2,4],:] = A[[4,2],:]
L,U = FNC.lufact(A)
L
```

```
4×4 LowerTriangular{Float64, Matrix{Float64}}:
1.0 ⋅ ⋅ ⋅
-1.0 NaN ⋅ ⋅
0.5 Inf NaN ⋅
-2.0 Inf NaN 1.0
```

The presence of `NaN` in the result indicates that some impossible operation was required. The source of the problem is easy to locate. We can find the first outer product in the factorization just fine:

```
U[1,:] = A[1,:]
L[:,1] = A[:,1]/U[1,1]
A -= L[:,1]*U[1,:]'
```

```
4×4 Matrix{Float64}:
0.0 0.0 0.0 0.0
0.0 0.0 6.0 -10.0
0.0 15.0 0.0 -6.0
0.0 5.0 1.0 -4.0
```

The next step is `U[2,:] = A[2,:]`, which is also OK. But then we are supposed to divide by `U[2,2]`, which is zero. The algorithm cannot continue.

In Section 2.4 we remarked that LU factorization is equivalent to Gaussian elimination with no row swaps. However, those swaps are necessary in situations like those encountered in Demo 2.6.1, in order to avoid division by zero. We will find a modification of the outer product procedure that incorporates row swaps.

## Choosing a pivot

The diagonal element of \(\mathbf{U}\) that appears in the denominator of line 17 of Function 2.4.6 is called the **pivot element** of its column. In order to avoid a zero pivot, we will use the largest available element in the column we are working on as the pivot. This technique is known as **row pivoting**.

When performing elimination in column \(j\), choose as the pivot the element in column \(j\) that is largest in absolute value. (In case of ties, choose the lowest row index.)

Here is the trouble-making matrix from Demo 2.6.1.

```
A₁ = [2 0 4 3 ; -2 0 2 -13; 1 15 2 -4.5 ; -4 5 -7 -10]
```

```
4×4 Matrix{Float64}:
2.0 0.0 4.0 3.0
-2.0 0.0 2.0 -13.0
1.0 15.0 2.0 -4.5
-4.0 5.0 -7.0 -10.0
```

We now find the largest candidate pivot in the first column. We don’t care about sign, so we take absolute values before finding the max.

```
i = argmax( abs.(A₁[:,1]) )
```

```
4
```

This is the row of the matrix that we extract to put into \(\mathbf{U}\). That guarantees that the division used to find \(\boldsymbol{\ell}_1\) will be valid.

```
L,U = zeros(4,4),zeros(4,4)
U[1,:] = A₁[i,:]
L[:,1] = A₁[:,1]/U[1,1]
A₂ = A₁ - L[:,1]*U[1,:]'
```

```
4×4 Matrix{Float64}:
0.0 2.5 0.5 -2.0
0.0 -2.5 5.5 -8.0
0.0 16.25 0.25 -7.0
0.0 0.0 0.0 0.0
```

Observe that \(\mathbf{A}_2\) has a new zero row and zero column, but the zero row is the fourth rather than the first. However, we forge on by using the largest possible pivot in column 2 for the next outer product.

```
@show i = argmax( abs.(A₂[:,2]) )
U[2,:] = A₂[i,:]
L[:,2] = A₂[:,2]/U[2,2]
A₃ = A₂ - L[:,2]*U[2,:]'
```

```
i = argmax(abs.(A₂[:, 2])) = 3
```

```
4×4 Matrix{Float64}:
0.0 0.0 0.461538 -0.923077
0.0 0.0 5.53846 -9.07692
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
```

Now we have zeroed out the third row as well as the second column. We can finish out the procedure.

```
@show i = argmax( abs.(A₃[:,3]) )
U[3,:] = A₃[i,:]
L[:,3] = A₃[:,3]/U[3,3]
A₄ = A₃ - L[:,3]*U[3,:]'
```

```
i = argmax(abs.(A₃[:, 3])) = 2
```

```
4×4 Matrix{Float64}:
0.0 0.0 0.0 -0.166667
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
```

```
@show i = argmax( abs.(A₄[:,4]) )
U[4,:] = A₄[i,:]
L[:,4] = A₄[:,4]/U[4,4];
```

```
i = argmax(abs.(A₄[:, 4])) = 1
```

We do have a factorization of the original matrix:

```
A₁ - L*U
```

```
4×4 Matrix{Float64}:
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
```

And \(\mathbf{U}\) has the required structure:

```
U
```

```
4×4 Matrix{Float64}:
-4.0 5.0 -7.0 -10.0
0.0 16.25 0.25 -7.0
0.0 0.0 5.53846 -9.07692
0.0 0.0 0.0 -0.166667
```

However, the triangularity of \(\mathbf{L}\) has been broken.

```
L
```

```
4×4 Matrix{Float64}:
-0.5 0.153846 0.0833333 1.0
0.5 -0.153846 1.0 -0.0
-0.25 1.0 0.0 -0.0
1.0 0.0 0.0 -0.0
```

We will return to the loss of triangularity in \(\mathbf{L}\) momentarily. First, though, there is a question left to answer: what if at some stage, all the elements of the targeted column are zero, i.e., there are no available pivots? Fortunately that loose end ties up nicely, although a proof is a bit beyond our scope here.

The row-pivoted LU factorization runs to completion if and only if the original matrix is invertible.

A linear system with a singular matrix has either no solution or infinitely many solutions. Either way, a technique other than LU factorization is needed to handle it.
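The breakdown on a singular matrix can be seen concretely. Here is a small sketch (the 3×3 matrix is invented for illustration, and this is not one of the book's functions) that performs one pivoted outer-product step in the style of the demos above; afterward, no nonzero candidate pivot remains in the next column.

```julia
using LinearAlgebra   # for the adjoint used in the outer product

A = [1.0 1 0; 2 2 1; 3 3 5]    # singular: its first two columns are identical
i = argmax(abs.(A[:,1]))       # pivot row for column 1 (row 3 here)
u₁ = A[i,:]
ℓ₁ = A[:,1] / u₁[1]
A₂ = A - ℓ₁*u₁'                # first outer-product reduction
maximum(abs.(A₂[:,2]))         # → 0.0: no usable pivot is left in column 2
```

Since every entry of column 2 of the reduced matrix is zero, the next division would be by zero, and elimination must halt, consistent with the statement above.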

## Permutations

Even though the resulting \(\mathbf{L}\) in Demo 2.6.3 is no longer of unit lower triangular form, it is close. In fact, all that is needed is to reverse the order of its rows.

Here again is the matrix from Demo 2.6.3.

```
A = [2 0 4 3 ; -2 0 2 -13; 1 15 2 -4.5 ; -4 5 -7 -10]
```

```
4×4 Matrix{Float64}:
2.0 0.0 4.0 3.0
-2.0 0.0 2.0 -13.0
1.0 15.0 2.0 -4.5
-4.0 5.0 -7.0 -10.0
```

As the factorization proceeded, the pivots were selected from rows 4, 3, 2, and finally 1. If we were to put the rows of \(\mathbf{A}\) into that order, then the algorithm would run exactly like the plain LU factorization from Section 2.4.

```
B = A[ [4,3,2,1], : ]
L,U = FNC.lufact(B);
```

We obtain the same \(\mathbf{U}\) as before:

```
U
```

```
4×4 UpperTriangular{Float64, Matrix{Float64}}:
-4.0 5.0 -7.0 -10.0
⋅ 16.25 0.25 -7.0
⋅ ⋅ 5.53846 -9.07692
⋅ ⋅ ⋅ -0.166667
```

And \(\mathbf{L}\) has the same rows as before, but arranged into triangular order:

```
L
```

```
4×4 LowerTriangular{Float64, Matrix{Float64}}:
1.0 ⋅ ⋅ ⋅
-0.25 1.0 ⋅ ⋅
0.5 -0.153846 1.0 ⋅
-0.5 0.153846 0.0833333 1.0
```

In principle, if the permutation of rows implied by the pivot locations is applied all at once to the original \(\mathbf{A}\), no further pivoting is needed. In practice, this permutation cannot be determined immediately from the original \(\mathbf{A}\); the only way to find it is to run the algorithm. Having obtained it at the end, though, we can use it to state a simple relationship.

Given an \(n\times n\) matrix \(\mathbf{A}\), the **PLU factorization** is a unit lower triangular \(\mathbf{L}\), an upper triangular \(\mathbf{U}\), and a permutation \(i_1,\ldots,i_n\) of the integers \(1,\ldots,n\), such that

\[\tilde{\mathbf{A}} = \mathbf{L}\mathbf{U},\]

where rows \(1,\ldots,n\) of \(\tilde{\mathbf{A}}\) are rows \(i_1,\ldots,i_n\) of \(\mathbf{A}\).

Function 2.6.7 shows our implementation of PLU factorization.¹

**Row-pivoted LU factorization**

```
"""
    plufact(A)

Compute the PLU factorization of square matrix `A`, returning the
triangular factors and a row permutation vector.
"""
function plufact(A)
    n = size(A,1)
    L = zeros(n,n)
    U = zeros(n,n)
    p = fill(0,n)
    Aₖ = float(copy(A))

    # Reduction by outer products
    for k in 1:n-1
        p[k] = argmax(abs.(Aₖ[:,k]))
        U[k,:] = Aₖ[p[k],:]
        L[:,k] = Aₖ[:,k]/U[k,k]
        Aₖ -= L[:,k]*U[k,:]'
    end
    p[n] = argmax(abs.(Aₖ[:,n]))
    U[n,n] = Aₖ[p[n],n]
    L[:,n] = Aₖ[:,n]/U[n,n]
    return LowerTriangular(L[p,:]),U,p
end
```

Ideally, the PLU factorization takes \(\sim \frac{2}{3}n^3\) flops asymptotically, just like LU without pivoting. The implementation in Function 2.6.7 does not achieve this optimal flop count, however. Like Function 2.4.6, it does unnecessary operations on structurally known zeros for the sake of being easier to understand.
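One way to recover the optimal operation count is to pivot in place and update only the active lower-right block, skipping arithmetic on entries already known to be zero. The following sketch (our own illustration with the made-up name `plufact_compact`; it is not one of the book's functions) shows the idea.

```julia
using LinearAlgebra

# Compact row-pivoted LU: swap rows of the working array, store multipliers
# below the diagonal, and update only the (k+1:n, k+1:n) block at step k.
function plufact_compact(A)
    n = size(A,1)
    B = float(copy(A))
    p = collect(1:n)
    for k in 1:n-1
        i = k - 1 + argmax(abs.(B[k:n,k]))   # largest pivot in the active column
        B[[k,i],:] = B[[i,k],:]              # swap rows (stored multipliers too)
        p[[k,i]] = p[[i,k]]
        B[k+1:n,k] ./= B[k,k]                # multipliers go below the diagonal
        B[k+1:n,k+1:n] .-= B[k+1:n,k]*B[k,k+1:n]'   # update active block only
    end
    return UnitLowerTriangular(B), UpperTriangular(B), p
end

A = [2 0 4 3 ; -2 0 2 -13 ; 1 15 2 -4.5 ; -4 5 -7 -10]
L,U,p = plufact_compact(A)
maximum(abs.(A[p,:] - L*U))    # should be at roundoff level
```

The final line checks the defining property of the factorization on the matrix from Demo 2.6.3.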

## Linear systems

The output of Function 2.6.7 is a factorization of a row-permuted \(\mathbf{A}\). Therefore, given a linear system \(\mathbf{A}\mathbf{x}=\mathbf{b}\), we have to permute \(\mathbf{b}\) the same way before applying forward and backward substitution. This is equivalent to changing the order of the equations in a linear system, which does not affect its solution.

The third output of `plufact` is the permutation vector we need to apply to \(\mathbf{A}\).

```
A = rand(1:20,4,4)
L,U,p = FNC.plufact(A)
A[p,:] - L*U # should be ≈ 0
```

```
4×4 Matrix{Float64}:
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
```

Given a vector \(\mathbf{b}\), we solve \(\mathbf{A}\mathbf{x}=\mathbf{b}\) by first permuting the entries of \(\mathbf{b}\) and then proceeding as before.

```
b = rand(4)
z = FNC.forwardsub(L,b[p])
x = FNC.backsub(U,z)
```

```
4-element Vector{Float64}:
-0.7851426918685646
-0.45594039945095327
0.4835518542903485
0.37418228257406316
```

A residual check is successful:

```
b - A*x
```

```
4-element Vector{Float64}:
2.220446049250313e-16
5.551115123125783e-16
-7.771561172376096e-16
-5.551115123125783e-16
```

The `lu` function from the built-in package `LinearAlgebra` returns the same three outputs as Function 2.6.7. If you request only one output, it will be a factorization object that can be used with a backslash. This is useful when you want to solve with multiple versions of \(\mathbf{b}\) but do the factorization only once.

With the syntax `A\b`, the matrix `A` is PLU-factored, followed by two triangular solves.

```
A = randn(500,500) # 500x500 with normal random entries
A\rand(500) # force compilation
@elapsed for k=1:50; A\rand(500); end
```

```
2.868909833
```

In Section 2.5 we showed that the factorization is by far the most costly part of the solution process. A factorization object allows us to do that costly step only once per unique matrix.

```
factored = lu(A) # store factorization result
factored\rand(500) # force compilation
@elapsed for k=1:50; factored\rand(500); end
```

```
0.013718708
```

## Stability

There is one detail of the row pivoting algorithm that might seem arbitrary: why choose the pivot of largest magnitude in a column, rather than, say, the uppermost nonzero in the column? The answer is numerical stability.

Let

\[\mathbf{A} = \begin{bmatrix} -\epsilon & 1 \\ 1 & -1 \end{bmatrix}.\]

If \(\epsilon=0\), LU factorization without pivoting fails for \(\mathbf{A}\). But if \(\epsilon\neq 0\), we can proceed without pivoting, at least in principle.

We construct a linear system for this matrix with \(\epsilon=10^{-12}\) and exact solution \([1,1]\):

```
ϵ = 1e-12
A = [-ϵ 1;1 -1]
b = A*[1,1]
```

```
2-element Vector{Float64}:
0.999999999999
0.0
```

We can factor the matrix without pivoting and solve for \(\mathbf{x}\).

```
L,U = FNC.lufact(A)
x = FNC.backsub( U, FNC.forwardsub(L,b) )
```

```
2-element Vector{Float64}:
0.9999778782798785
1.0
```

Note that we have obtained only about 5 accurate digits for \(x_1\). We could make the result even more inaccurate by making \(\epsilon\) even smaller:

```
ϵ = 1e-20; A = [-ϵ 1;1 -1]
b = A*[1,1]
L,U = FNC.lufact(A)
x = FNC.backsub( U, FNC.forwardsub(L,b) )
```

```
2-element Vector{Float64}:
-0.0
1.0
```

This effect is not due to ill conditioning of the problem; a solution with PLU factorization works perfectly:

```
A\b
```

```
2-element Vector{Float64}:
1.0
1.0
```

The factors of this \(\mathbf{A}\) without pivoting are found to be

\[\mathbf{L} = \begin{bmatrix} 1 & 0 \\ -\epsilon^{-1} & 1 \end{bmatrix}, \qquad \mathbf{U} = \begin{bmatrix} -\epsilon & 1 \\ 0 & \epsilon^{-1}-1 \end{bmatrix}.\]

For reasons we will quantify in Section 2.8, the solution of \(\mathbf{A}\mathbf{x}=\mathbf{b}\) is well-conditioned, but the problems of solving \(\mathbf{L}\mathbf{z}=\mathbf{b}\) and \(\mathbf{U}\mathbf{x}=\mathbf{z}\) have condition numbers essentially \(1/\epsilon^2\) each. Thus, for small \(\epsilon\), solution of the original linear system by unpivoted LU factorization is highly unstable.
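This claim is easy to probe numerically. In the sketch below we build the unpivoted factors for \(\epsilon=10^{-6}\) (a value chosen only for illustration) and ask for their condition numbers.

```julia
using LinearAlgebra

ϵ = 1e-6
L = [1 0; -1/ϵ 1]      # unpivoted factors of A = [-ϵ 1; 1 -1]
U = [-ϵ 1; 0 1/ϵ - 1]
cond(L), cond(U)       # each is on the order of 1/ϵ² = 1e12
```

Even though the condition number of \(\mathbf{A}\) itself is modest, both triangular solves pass through these terribly conditioned factors, which is the mechanism of the instability.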

Somewhat surprisingly, solving \(\mathbf{A}\mathbf{x}=\mathbf{b}\) via PLU factorization is technically also unstable. In fact, examples of unstable solutions are well known, but they are essentially nonexistent in practice. While there is a lot of evidence and some reasoning about why this is the case, the situation is not completely understood. Yet PLU factorization remains the algorithm of choice for general linear systems.

## Exercises

✍ Perform by hand the pivoted LU factorization of each matrix.

**(a)** \(\quad \displaystyle \begin{bmatrix} 2 & 3 & 4 \\ 4 & 5 & 10 \\ 4 & 8 & 2 \end{bmatrix},\qquad\) **(b)** \(\quad \displaystyle \begin{bmatrix} 1 & 4 & 5 & -5 \\ -1 & 0 & -1 & -5 \\ 1 & 3 & -1 & 2 \\ 1 & -1 & 5 & -1 \end{bmatrix}\).

✍ Let \(\mathbf{A}\) be a square matrix and \(\mathbf{b}\) be a column vector of compatible length. Here is correct Julia code to solve \(\mathbf{A}\mathbf{x}=\mathbf{b}\):

```
L,U,p = lu(A)
x = U \ (L\b[p])
```

Suppose instead you replace the last line above with

```
x = U \ L \ b[p]
```

Mathematically, in terms of \(\mathbf{L}\), \(\mathbf{U}\), \(\mathbf{p}\), and \(\mathbf{b}\), what vector is found?

✍ Suppose that `A` is a \(4\times 6\) matrix in Julia and you define

```
B = A[end:-1:1,end:-1:1]
```

Show that \(\mathbf{B} = \mathbf{P} \mathbf{A} \mathbf{Q}\) for certain matrices \(\mathbf{P}\) and \(\mathbf{Q}\).

✍ An \(n\times n\) *permutation matrix* \(\mathbf{P}\) is a reordering of the rows of an identity matrix such that \(\mathbf{P} \mathbf{A}\) has the effect of moving rows \(1,2,\ldots,n\) of \(\mathbf{A}\) to new positions \(i_1,i_2,\ldots,i_n\). Then \(\mathbf{P}\) can be expressed as

\[\mathbf{P} = \mathbf{e}_{i_1}\mathbf{e}_1^T + \mathbf{e}_{i_2}\mathbf{e}_2^T + \cdots + \mathbf{e}_{i_n}\mathbf{e}_n^T.\]

**(a)** For the case \(n=4\) and \(i_1=3\), \(i_2=2\), \(i_3=4\), \(i_4=1\), write out separately, as matrices, all four of the terms in the sum. Then add them together to find \(\mathbf{P}\).

**(b)** Use the formula in the general case to show that \(\mathbf{P}^{-1}=\mathbf{P}^T\).

¹ Because unpivoted LU factorization is not useful, in practice the term *LU factorization* mostly refers to pivoted LU.

The `argmax` function returns the location of the largest element of a vector or matrix.
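A one-line illustration of that note, with values chosen arbitrarily:

```julia
v = [2.0, -2.0, 1.0, -4.0]
argmax(abs.(v))    # → 4, the index of the entry of largest magnitude
```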