{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "05214b8a", "metadata": { "tags": [ "remove-cell" ] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "┌ Info: verify download of index files...\n", "└ @ MatrixDepot /Users/driscoll/.julia/packages/MatrixDepot/GEDc3/src/MatrixDepot.jl:139\n", "┌ Info: reading database\n", "└ @ MatrixDepot /Users/driscoll/.julia/packages/MatrixDepot/GEDc3/src/download.jl:23\n", "┌ Info: adding metadata...\n", "└ @ MatrixDepot /Users/driscoll/.julia/packages/MatrixDepot/GEDc3/src/download.jl:67\n", "┌ Info: adding svd data...\n", "└ @ MatrixDepot /Users/driscoll/.julia/packages/MatrixDepot/GEDc3/src/download.jl:69\n", "┌ Info: writing database\n", "└ @ MatrixDepot /Users/driscoll/.julia/packages/MatrixDepot/GEDc3/src/download.jl:74\n", "┌ Info: used remote sites are sparse.tamu.edu with MAT index and math.nist.gov with HTML index\n", "└ @ MatrixDepot /Users/driscoll/.julia/packages/MatrixDepot/GEDc3/src/MatrixDepot.jl:141\n" ] } ], "source": [ "using FundamentalsNumericalComputation\n", "FNC.init_format()" ] }, { "cell_type": "markdown", "id": "1a1c5d73", "metadata": {}, "source": [ "(section-krylov-minrescg)=\n", "# MINRES and conjugate gradients\n", "\n", "```{index} symmetric matrix\n", "```\n", "\n", "We have seen before that certain matrix properties enhance solutions to linear algebra problems. One of the most important of these is when $\\mathbf{A}^*=\\mathbf{A}$; i.e., $\\mathbf{A}$ is hermitian. The Arnoldi iteration has a particularly useful specialization to this case. While in this section we describe the resulting algorithms, we do not present them in detail or show implementations.\n", "\n", "## Lanczos iteration\n", "\n", "Starting from {eq}`arnoldimat`, we left-multiply by $\\mathbf{Q}_m^*$ to get\n", "\n", ":::{math}\n", "\\mathbf{Q}_m^* \\mathbf{A} \\mathbf{Q}_m = \\mathbf{Q}_m^* \\mathbf{Q}_{m+1} \\mathbf{H}_m = \\tilde{\\mathbf{H}}_m,\n", ":::\n", "\n", "where $\\tilde{\\mathbf{H}}_m$ is rows 1 through $m$ of $\\mathbf{H}_m$. If $\\mathbf{A}$ is hermitian, then so is the left side of this equation, hence $\\tilde{\\mathbf{H}}_m$ is hermitian too. But it is also upper Hessenberg, meaning that the $(i,j)$ element is zero if $i > j+1$. By symmetry, this means that elements are zero when $j > i+1$ as well.\n", "\n", "::::{proof:observation}\n", "For a hermitian (or real symmetric) matrix, the upper Hessenberg matrix $\\mathbf{H}_m$ produced by the Arnoldi iteration is tridiagonal.\n", "::::\n", "\n", "```{index} Arnoldi iteration\n", "```\n", "Equation {eq}`arnoldivec` of the Arnoldi iteration now simplifies to a much shorter expression:\n", "\n", ":::{math}\n", ":label: lanczos\n", "\\mathbf{A} \\mathbf{q}_m = H_{m-1,m} \\,\\mathbf{q}_{m-1} + H_{mm} \\,\\mathbf{q}_m + H_{m+1,m}\\,\\mathbf{q}_{m+1}.\n", ":::\n", "\n", "```{index} ! Lanczos iteration\n", "```\n", "\n", "As before in deriving the Arnoldi iteration, when given the first $m$ vectors we can solve for the entries in column $m$ of $\\mathbf{H}$ and then for $\\mathbf{q}_{m+1}$. The resulting process is known as the **Lanczos iteration**. Its most important practical advantage is that while Arnoldi needs $O(m)$ steps to get $\\mathbf{q}_{m+1}$ from the previous vectors, Lanczos needs only $O(1)$ steps, so restarting isn't required for symmetric matrices.[^lanczos]\n", "\n", "[^lanczos]: In principle, the implementation of Lanczos iteration is minor change from Arnoldi, but numerical stability requires some extra analysis and effort. We do not present the details.\n", "\n", "## MINRES\n", "\n", "```{index} GMRES; relationship to MINRES, ! MINRES\n", "```\n", "\n", "When $\\mathbf{A}$ is hermitian and the Arnoldi iteration is reduced to Lanczos, the analog of GMRES is known as **MINRES**. Like GMRES, MINRES minimizes the residual $\\|\\mathbf{b}-\\mathbf{A}\\mathbf{x}\\|$ over increasingly larger Krylov spaces.\n", "\n", "MINRES is also more theoretically tractable than GMRES. The following result relies on some advanced approximation theory. Recall that the eigenvalues of a hermitian matrix are real. \n", "\n", "(theorem-minrescg-indefinite)=\n", "::::{proof:theorem} Convergence of MINRES (indefinite case)\n", "Suppose $\\mathbf{A}$ is hermitian, invertible, and indefinite. Divide its eigenvalues into positive and negative sets $\\Lambda_+$ and $\\Lambda_-$, and define\n", "\n", "$$\n", "\\kappa_+ = \\frac{ \\max_{\\lambda \\in \\Lambda_+} |\\lambda| }{ \\min_{\\lambda \\in \\Lambda_+} |\\lambda| }, \\qquad\n", "\\kappa_- = \\frac{ \\max_{\\lambda \\in \\Lambda_-} |\\lambda| }{ \\min_{\\lambda \\in \\Lambda_-} |\\lambda| }.\n", "$$\n", "\n", "Then $\\mathbf{x}_m$, the $m$th solution estimate of MINRES, satisfies\n", "\n", ":::{math}\n", ":label: minres-conv\n", "\\frac{\\|\\mathbf{r}_m\\|_2}{\\|\\mathbf{b}\\|_2} \\le \\left( \\frac{\\sqrt{\\kappa_+\\kappa_-} - 1}{\\sqrt{\\kappa_+\\kappa_-} + 1} \\right)^{\\lfloor m/2\\rfloor},\n", ":::\n", "\n", "where $\\lfloor m/2\\rfloor$ means to round $m/2$ down to the nearest integer.\n", "::::\n", "\n", "```{index} convergence rate; linear\n", "```\n", "\n", "The bound for a definite matrix is better, as the next theorem shows. The upper bound {eq}`minres-conv` on the residual obeys a linear convergence rate. As the product $\\kappa_+\\kappa_-$ grows, the rate of this convergence approaches 1. Hence the presence of eigenvalues close to the origin (relative to the max eigenvalues) is expected to force a slower convergence. \n", "\n", "::::{proof:example}\n", "Suppose $\\mathbf{A}$ has $\\kappa_+=60$ and $\\kappa_-=15$. Then to achieve a guaranteed reduction in the relative residual of $10^{-3}$, we require\n", "\n", "$$\n", "\\left( \\frac{\\sqrt{900} - 1}{\\sqrt{900} + 1} \\right)^{\\lfloor m/2\\rfloor} \\le 10^{-3},\n", "$$\n", "\n", "$$\n", " {\\lfloor m/2\\rfloor} \\log_{10} \\left( \\frac{29}{31} \\right) \\le -3,\n", "$$\n", "\n", "$$\n", " m \\ge 2 \\lceil \\frac{3}{\\log_{10}(29/31)} \\rceil = 208.\n", "$$\n", "\n", "Because the theorem gives an upper bound, MINRES may converge faster. All we can say is that 208 is certain to be enough iterations.\n", "::::\n", "\n", "(demo-minrescg-indefinite)=\n", ":::{proof:demo}\n", ":::\n", "\n", "```{raw} html\n", "