In preparing some lecture notes for the implicit function theorem, I took a look at Schechter's delightfully comprehensive Handbook of Analysis and its Foundations1, and I learned something new about the Banach fixed point theorem. To quote Schechter:
... although Banach's theorem is quite easy to prove, a longer proof cannot yield stronger results.
I will write a little bit here about a "converse" to the Banach theorem due to Bessaga, which uses a little bit of help from the Axiom of Choice.
Banach's Theorem
Let us start by recalling the statement and proof of Banach's fixed point theorem.
Note that one immediate corollary is that: if $f$ is a contraction mapping, then each iterate $f^{(k)} = \underbrace{f\circ f\circ \cdots \circ f}_{k \text{ times}}$ possesses exactly one (the same) fixed point $\xi$.
Bessaga's converse
First we define an equivalence relation on $X$. We say $x\sim y$ if there exists positive integers $p,q$ such that $ f^{(p)}(x) = f^{(q)}(y)$. If $x\sim y$ we define \[ \rho(x,y) = \min \{p+q\mid f^{(p)}(x) = f^{(q)}(y)\} \] and \[ x\wedge y = f^{(p)}(x) \] where $p$ is the value that attains $\rho(x,y)$.
We also define \[ \sigma(x,y) = \rho(x,x\wedge y) - \rho(y,x\wedge y).\] Observe that $\rho$ is symmetric in its arguments and $\sigma$ antisymmetric.
(The idea so far: we can make $X$ a directed graph by making edges go from $x$ to $f(x)$. That $\xi$ is the only fixed point means that $X$ has no cycles, so $X$ now becomes a direct forest (like a tree, but not necessarily connected). The equivalence relation $\sim$ expresses whether two nodes are on the same connected component. When they are, $\rho$ measures the distance between them on the underlying tree, and $\sigma$ measures the difference between their distances to their most recent common ancestor.)
Now, by the axiom of choice, there exists a choice function that chooses for each equivalence class of $X/\sim$ a representative, this extends to a function $c:X\to X$. From this we can define a function $\lambda: X \to \mathbb{Z}$ by \[ \lambda(x) = \sigma(c(x),x). \]
(Life would be extremely simple if each of the connected component in our forest is rooted; unfortunately this is generally not the case. So we have to appeal to the axiom of choice to select a node. What we do below is compute the distance between two nodes on the same tree by using the selected node as a reference.)
Now we are in a position to define our distance function $d$. We do it in several steps.
- If $x\sim y$, we define \[ d(x,y) = 2^{-\lambda(x)} + 2^{-\lambda(y)} - 2\cdot 2^{-\lambda(x\wedge y)}\] where the empty sum yields 0.
- If $x\not\sim \xi$, we define \[ d(x,\xi) = 2^{- \lambda(x)}. \]
- If $x\not\sim y$ and neither $x,y$ is $\xi$, we define \[ d(x,y) = d(x,\xi) + d(y,\xi). \]
This definition is obviously symmetric and non-negative. And it is easy to check that $d(x,y) = 0 \implies x\sim y$ and $x = y = x\wedge y$. Triangle inequality involves a little bit more work, but most cases are immediately obvious except when $x\sim y\sim z$. Here we need to check that \[ 2\cdot 2^{-\lambda(y)} - 2 \cdot 2^{-\lambda(x\wedge y)} - 2\cdot 2^{-\lambda(y\wedge z)} \geq - 2 \cdot 2^{-\lambda(x\wedge z)}\] which we do right now. Suppose $f^{(p)}(x) = f^{(q_1)}(y)$ and $f^{(q_2)}(y) = f^{(r)}(z)$. WLOG we can take $q_1 \geq q_2$. Then we have that $f^{(p)}(x) = f^{(r + q_1 - q_2)}(z)$. This shows that $\lambda(x\wedge z) \leq \min( \lambda(x\wedge y), \lambda(y\wedge z))$. And this proves the inequality above.
It is easy to see that any Cauchy sequence either is eventually constant, or must converge to $\xi$: if $x\neq y$ we have that \[ d(x,y) \geq 2^{- \min(\lambda(x),\lambda(y)) - 1} \geq \frac14 (d(x,\xi) + d(y,\xi))\] and this shows that $d$ is a complete metric. Now, it remains to verify that $f$ is a contraction. Noting that $ \lambda(f(x)) = \lambda(x) + 1$ and $ f(x) \wedge f(y) = x\wedge y$ we see easily that $ f$ is a contraction with Lipschitz constant $\frac12$.
- A copy of Schechter's book can be found on his website ^