This post is inspired by a Twitter conversation I had recently.
The Jacobi iteration method
The Jacobi iteration method (here I will describe it more generally) is a way to leverage perturbation theory to solve (numerically) (finite-dimensional) linear systems of equations. Suppose we wish to solve \begin{equation}\label{eq:lineq} \tilde{A}x = b \end{equation} where $\tilde{A}$ is some given square matrix (which we will assume to be invertible for the time being), $b$ is the given source vector, and $x$ is the unknown vector. The computation of $\tilde{A}^{-1}$ from $\tilde{A}$ is tedious (and difficult, even for computers[^1]). If one
- is happy enough with approximate solutions, and
- knows how to invert a "nearby" matrix $A$,
then one can apply an iterative approach. The idea is pretty simple:
- Let $\alpha := A^{-1} (\tilde{A} - A)$.
- The equation \eqref{eq:lineq} is then equivalent to $(I + \alpha)x = A^{-1}b$.
- This can be rewritten as requiring $x$ to be the fixed point of the mapping $x \mapsto f(x) = A^{-1}b - \alpha x$.
By Banach's fixed point theorem, if we can establish that the function $f$ is a contraction mapping, then by iterating \[ x_0 = 0 , \qquad x_{i+1} = f(x_i), \] the sequence must converge to the solution of \eqref{eq:lineq}.
Since $f$ is affine, whether it is a contraction or not depends only on the metric used and on $\alpha$: in particular, if we measure vectors using a norm $\|\cdot\|_X$, then what matters is whether the operator norm $\|\alpha\|_{X\to X}$ is less than $1$.
In the typical formulation of Jacobi's iteration method, we take $A$ to be the diagonal part of $\tilde{A}$. The "strictly diagonally dominant" condition is equivalent to $\|\alpha\|_{X\to X} < 1$ after setting $X = \ell_\infty$.
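To make this concrete, here is a minimal sketch of the iteration in Python/NumPy (my choice of language here; the example matrix, tolerance, and iteration cap are arbitrary and only for illustration), taking $A$ to be the diagonal part of $\tilde{A}$:

```python
import numpy as np

def jacobi(A_tilde, b, num_iters=100, tol=1e-12):
    """Solve A_tilde @ x = b by Jacobi iteration, taking A = diag(A_tilde)."""
    A_inv = 1.0 / np.diag(A_tilde)                      # inverse of the diagonal part A
    alpha = A_inv[:, None] * A_tilde - np.eye(len(b))   # alpha = A^{-1} (A_tilde - A)
    beta = A_inv * b                                     # beta = A^{-1} b
    x = np.zeros_like(beta)                              # x_0 = 0
    for _ in range(num_iters):
        x_new = beta - alpha @ x                         # x_{i+1} = f(x_i) = A^{-1} b - alpha x_i
        if np.linalg.norm(x_new - x, ord=np.inf) < tol:  # stop once the iterates stabilize
            return x_new
        x = x_new
    return x

# A strictly diagonally dominant example, so ||alpha||_{inf -> inf} < 1.
A_tilde = np.array([[4.0, 1.0, 1.0],
                    [1.0, 5.0, 2.0],
                    [0.0, 1.0, 3.0]])
b = np.array([6.0, 8.0, 4.0])
print(jacobi(A_tilde, b))   # agrees with np.linalg.solve(A_tilde, b) to high accuracy
```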
Note that strict diagonal dominance is not required for the convergence. Suppose, for example, that $\alpha$ is nilpotent. Then the sequence generated is \[ x_0 = 0 \mapsto \beta:= A^{-1} b \mapsto \beta - \alpha \beta \mapsto \beta - \alpha \beta + \alpha^2 \beta \mapsto \cdots \] which will converge after finitely many steps to $(I + \alpha)^{-1} \beta$.
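For instance, a quick toy check (the particular $\alpha$ below is made up): with a strictly upper triangular $\alpha$ we have $\alpha^3 = 0$, and even though $\|\alpha\|_{\ell_\infty\to\ell_\infty} > 1$ the iterates stabilize after three steps.

```python
import numpy as np

# A strictly upper triangular (hence nilpotent) alpha: alpha^3 = 0, so the
# partial sums of the Neumann series stop changing after three iterations.
alpha = np.array([[0.0, 2.0, 5.0],
                  [0.0, 0.0, 3.0],
                  [0.0, 0.0, 0.0]])
beta = np.array([1.0, 1.0, 1.0])

x = np.zeros(3)
for i in range(5):
    x = beta - alpha @ x                            # x_{i+1} = beta - alpha x_i
    print(i + 1, x)
print(np.linalg.solve(np.eye(3) + alpha, beta))     # matches the stabilized iterate
```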
Almost-contraction mappings
What happens when $\|\alpha\|_{X\to X} = 1$? In this setting, Banach's theorem cannot be used, and the sequence we construct is not guaranteed to converge. But lo, observe that the sequence we construct is actually the sequence of partial sums of the geometric series \[ \left(\sum_{j = 0}^\infty (-1)^j \alpha^j \right) \beta. \] And herein lies a hope: we can try to approach this using Abel summation. Choosing some $\epsilon > 0$ (hopefully very small), we can compute instead the quantity \begin{equation}\label{eq:xeps} x_{\epsilon} = \left( \sum_{j = 0}^\infty (\epsilon - 1)^j \alpha^j \right) \beta. \end{equation} Now we have an actual geometric series with $\| (\epsilon - 1) \alpha\|_{X\to X} = (1-\epsilon)\|\alpha\|_{X\to X} = 1-\epsilon < 1$, which converges. The next task is to take the limit $\epsilon \to 0^+$. In the scalar case (consider the power series for $\frac{1}{1+(1-\epsilon)e^{i\theta}}$), the limit $\epsilon \to 0^+$ exists and equals $\frac{1}{1 + e^{i\theta}}$, except when $\theta = \pi$. In the matrix case we can carry out the analogous argument using the Jordan normal form.
First note that when $\epsilon > 0$ the expression \eqref{eq:xeps} for $x_\epsilon$ is absolutely convergent. Supposing that $\alpha$ has a Jordan normal form $\alpha_1 \oplus \alpha_2 \oplus \cdots \oplus \alpha_k$, with corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$, the series $ \sum (\epsilon - 1)^j \alpha^j$ converges to \begin{equation} \sum_{j = 0}^\infty (\epsilon -1)^j \alpha^j = F((1-\epsilon)\alpha_1) \oplus F((1-\epsilon)\alpha_2) \oplus \cdots \oplus F((1-\epsilon)\alpha_k) \end{equation} where the function $F(A) = (I + A)^{-1}$, evaluated on a Jordan block, can be expressed as \begin{equation} F((1-\epsilon)\alpha_\ell) = \begin{pmatrix} \frac{1}{1 + (1-\epsilon)\lambda_\ell} & \frac{-(1-\epsilon)}{[1 + (1-\epsilon)\lambda_\ell]^2} & \frac{(1-\epsilon)^2}{[1 + (1-\epsilon)\lambda_\ell]^3} & \frac{-(1-\epsilon)^3}{[1 + (1-\epsilon)\lambda_\ell]^4} & \cdots \newline 0 & \frac{1}{1 + (1-\epsilon)\lambda_\ell} & \frac{-(1-\epsilon)}{[1 + (1-\epsilon)\lambda_\ell]^2} & \frac{(1-\epsilon)^2}{[1 + (1-\epsilon)\lambda_\ell]^3} \newline 0 & 0 & \ddots \end{pmatrix} \end{equation} We observe that, provided $\lambda_\ell \neq -1$, these matrices remain bounded as $\epsilon \to 0^+$ and converge to $(I + \alpha_\ell)^{-1}$.
Hence, if we are interested in an approximate solution to $(I + \alpha)x = \beta$, we can obtain one by choosing $\epsilon$ small and $N$ large and computing $\sum_{j =0}^N (\epsilon- 1)^j \alpha^j \beta$, provided that $-1$ is not an eigenvalue of $\alpha$.
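As a sketch of this damped computation (the specific $\alpha$, $\epsilon$, and iteration count below are arbitrary illustrative choices): the partial sums $\sum_{j=0}^N (\epsilon-1)^j \alpha^j \beta$ are exactly what the iteration $x_{i+1} = \beta + (\epsilon-1)\alpha x_i$, $x_0 = 0$, produces. The toy $\alpha$ here is a rotation by $90^\circ$, so $\|\alpha\| = 1$ with eigenvalues $\pm i$: the undamped iteration cycles forever, while the damped one converges.

```python
import numpy as np

def abel_jacobi(alpha, beta, eps=1e-3, num_iters=10_000):
    """Accumulate the partial sums of sum_j (eps - 1)^j alpha^j beta via the
    damped iteration x_{i+1} = beta + (eps - 1) * alpha @ x_i, with x_0 = 0."""
    x = np.zeros_like(beta)
    for _ in range(num_iters):
        x = beta + (eps - 1.0) * (alpha @ x)
    return x

alpha = np.array([[0.0, -1.0],        # rotation by 90 degrees: eigenvalues +i and -i,
                  [1.0,  0.0]])       # so -1 is not an eigenvalue and the limit exists
beta = np.array([1.0, 2.0])
print(abel_jacobi(alpha, beta))
print(np.linalg.solve(np.eye(2) + alpha, beta))   # exact solution of (I + alpha) x = beta,
                                                  # approximated by the damped sum above
```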
Singular case
Suppose now we are in a situation where $(I + \alpha)$ is not invertible. What can we say about an attempt to solve the linear system using Jacobi's method? Suppose $\beta$ is in the kernel of $I + \alpha$; then we have $\alpha \beta = - \beta$. This means our iteration scheme gives \[ 0 \mapsto \beta \mapsto 2\beta \mapsto 3\beta \mapsto \cdots \] and the sequence never converges. However, when $(I+\alpha)$ has non-trivial kernel we also know that the solution is guaranteed to be non-unique. And so we can try to regularize by picking representatives, adding or subtracting elements in the kernel of $(I + \alpha)$.
One way to do so is to define a projection operator (non-unique, by the way) $\pi$ such that
- $\pi^2 = \pi$.
- $\pi x = 0 \iff (I + \alpha)^kx = 0$ for some $k > 0$.
A convenient way is to choose $\pi$ again using the Jordan normal form, but this is not necessary.
Then perform the iteration \[ x_{i+1} = \pi \beta + (\epsilon - 1) \pi \alpha x_i. \] The limiting object (if it exists) will belong to $\{ x : x = \pi x\}$ and solve \[ \pi (I + (1-\epsilon)\alpha) x = \pi \beta.\]
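Here is one possible sketch of this projected iteration, under the simplifying assumption that $\alpha$ is symmetric (so that the generalized kernel of $I + \alpha$ coincides with its kernel, and $\pi$ can be taken to be the orthogonal projection off the eigenvalue $-1$ eigenspace); the function name and parameters below are mine:

```python
import numpy as np

def projected_iteration(alpha, beta, eps=1e-3, num_iters=10_000, cutoff=1e-10):
    """Iterate x_{i+1} = pi(beta) + (eps - 1) * pi(alpha @ x_i), where pi is the
    orthogonal projection off the eigenvalue -1 eigenspace of (symmetric) alpha."""
    w, V = np.linalg.eigh(alpha)                 # alpha = V @ diag(w) @ V.T
    K = V[:, np.abs(w + 1.0) < cutoff]           # orthonormal basis of the -1 eigenspace
    pi = lambda v: v - K @ (K.T @ v)             # subtract the component along that space
    x = np.zeros_like(beta)
    for _ in range(num_iters):
        x = pi(beta) + (eps - 1.0) * pi(alpha @ x)
    return x            # approximately solves pi (I + (1-eps) alpha) x = pi(beta)
```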
Example
Consider trying to solve the discrete Poisson equation $ \Delta f = g$ on a rectangular periodic domain. In matrix form (assuming one spatial dimension for now, but the argument easily generalizes) the Laplace operator is given, up to an overall constant factor, by the symmetric matrix \begin{equation} \Delta = I + \alpha = \begin{pmatrix} 1 & -1/2 & 0 & 0 &\ldots & 0 & -1/2 \newline -1/2 & 1 & -1/2 & 0 & \ldots & 0 & 0 \newline 0 & -1/2 & 1 & -1/2 \newline 0 & & & \ddots \newline \vdots & & & & \ddots \newline 0 & 0 & \ldots & 0 & -1/2 & 1 & -1/2 \newline -1/2 & 0 & \ldots & 0 & 0 & -1/2 & 1 \end{pmatrix} \end{equation}
Here $\alpha$ is symmetric and real, and so is diagonalizable. The eigenvalues of $\alpha$ turn out to always lie between $-1$ and $1$. The eigenspace with eigenvalue $-1$ is one dimensional and is spanned by the vector of all $1$s.
Applying the method described in the previous section, given an arbitrary function $g$, its projection $\pi g$ is nothing more than $g$ minus its mean, and the method solves for the zero-mean function $f$ whose Laplacian is the mean-free part of $g$.
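A toy one-dimensional version of this (the grid size, $\epsilon$, and iteration count below are arbitrary): for the periodic matrix $\Delta = I + \alpha$ written out above, $\pi$ is literally "subtract the mean", and the fixed point of the projected iteration is mean-free and satisfies $\pi(I + (1-\epsilon)\alpha) f = \pi g$.

```python
import numpy as np

n = 64
# alpha for the periodic matrix Delta = I + alpha written out above.
alpha = np.zeros((n, n))
for i in range(n):
    alpha[i, (i - 1) % n] = -0.5
    alpha[i, (i + 1) % n] = -0.5

pi = lambda v: v - v.mean()       # projection off the constants: subtract the mean

g = np.random.default_rng(0).normal(size=n)
eps = 1e-3

f = np.zeros(n)
for _ in range(20_000):
    f = pi(g) + (eps - 1.0) * pi(alpha @ f)      # x_{i+1} = pi(beta) + (eps - 1) pi(alpha x_i)

print(abs(f.mean()))                                                 # ~ 0: f is mean-free
print(np.linalg.norm((np.eye(n) + (1 - eps) * alpha) @ f - pi(g)))   # small residual
```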
The regularization using $\epsilon$ can be interpreted as essentially solving, instead of $\Delta f = g$, the equation $(\Delta - \epsilon)f_\epsilon = g$. Note that $\Delta-\epsilon$ is in fact invertible on our periodic domain, with compact inverse. (Though the approach using a Neumann series would not work there, because $\Delta$ is not a bounded operator.)
[^1]: Numerical computation of the inverse matrix is very unstable; pretty much any other method (Gaussian elimination, etc.) is better.