Bessaga's converse to the contraction mapping theorem

2014-05-07 Last updated on 2025-04-21 5 min read expository

In preparing some lecture notes for the implicit function theorem, I took a look at Schechter's delightfully comprehensive Handbook of Analysis and its Foundations¹, and I learned something new about the Banach fixed point theorem. To quote Schechter:

... although Banach's theorem is quite easy to prove, a longer proof cannot yield stronger results.

I will write a little bit here about a "converse" to the Banach theorem due to Bessaga, which uses a little bit of help from the Axiom of Choice.

Banach's Theorem

Let us start by recalling the statement and proof of Banach's fixed point theorem.

Theorem [Banach fixed point]

Let $(X,d)$ be a complete non-empty metric space. Let $f:X \to X$ be a strict contraction: that is to say, suppose there exists some $\alpha \in [0,1)$ such that $ d(f(x_1),f(x_2)) \leq \alpha d(x_1,x_2)$ always hold. Then $ f$ has a unique fixed point $\xi$.

Proof:

Let $x_0\in X$ be arbitrary and fixed. Consider the sequence generated by $x_n = f(x_{n-1})$ for $n \geq 1$. Define $d_0 = d(x_0, x_1)$. The sequence $x_0, x_1, \ldots$ is Cauchy, as, by triangle inequality, for any $m, n > 0$ we have \[ d(x_m, x_{m+n}) \leq d(x_m, x_{m+1}) + d(x_{m+1}, x_{m+2}) + \cdots + d(x_{m+n-1}, x_{m+n}) \] which by the contraction mapping assumption can be bounded by \[ \leq d_0 ( \alpha^m + \alpha^{m+1} + \cdots + \alpha^{m+n-1} \leq d_0 \cdot \frac{\alpha^m}{1 - \alpha}.\] The completeness of $X$ implies then the sequence converges to some element $\xi$. As \[ d(\xi, f(\xi)) \leq d(\xi, x_m) + d(x_m, f(\xi)) \leq d(\xi, x_m) + \alpha d(x_{m-1}, \xi) \] the convergence of the sequence implies $\xi$ is a fixed point. This fixed point is unique as the presence of two distinct fixed points $\xi, \zeta$ would give $d(f(\xi), f(\zeta)) = d(\xi, \zeta)$ contradicting the strict contraction property.

Note that one immediate corollary is that: if $f$ is a contraction mapping, then each iterate $f^{(k)} = \underbrace{f\circ f\circ \cdots \circ f}_{k \text{ times}}$ possesses exactly one (the same) fixed point $\xi$.

Bessaga's converse

Theorem [Bessaga]

Let $X$ be a set and $\xi\in X$. Suppose $f:X\to X$ is a function such that for any $k$ we have \[ f^{(k)}(x) = x \iff x = \xi \] then there exists a complete metric $d$ on $X$ such that $f$ is a strict contraction.

Proof [sketch]:

First we define an equivalence relation on $X$. We say $x\sim y$ if there exists positive integers $p,q$ such that $ f^{(p)}(x) = f^{(q)}(y)$. If $x\sim y$ we define \[ \rho(x,y) = \min \{p+q\mid f^{(p)}(x) = f^{(q)}(y)\} \] and \[ x\wedge y = f^{(p)}(x) \] where $p$ is the value that attains $\rho(x,y)$.

We also define \[ \sigma(x,y) = \rho(x,x\wedge y) - \rho(y,x\wedge y).\] Observe that $\rho$ is symmetric in its arguments and $\sigma$ antisymmetric.

(The idea so far: we can make $X$ a directed graph by making edges go from $x$ to $f(x)$. That $\xi$ is the only fixed point means that $X$ has no cycles, so $X$ now becomes a direct forest (like a tree, but not necessarily connected). The equivalence relation $\sim$ expresses whether two nodes are on the same connected component. When they are, $\rho$ measures the distance between them on the underlying tree, and $\sigma$ measures the difference between their distances to their most recent common ancestor.)

Now, by the axiom of choice, there exists a choice function that chooses for each equivalence class of $X/\sim$ a representative, this extends to a function $c:X\to X$. From this we can define a function $\lambda: X \to \mathbb{Z}$ by \[ \lambda(x) = \sigma(c(x),x). \]

(Life would be extremely simple if each of the connected component in our forest is rooted; unfortunately this is generally not the case. So we have to appeal to the axiom of choice to select a node. What we do below is compute the distance between two nodes on the same tree by using the selected node as a reference.)

Now we are in a position to define our distance function $d$. We do it in several steps.

If $x\sim y$, we define \[ d(x,y) = 2^{-\lambda(x)} + 2^{-\lambda(y)} - 2\cdot 2^{-\lambda(x\wedge y)}\] where the empty sum yields 0.
If $x\not\sim \xi$, we define \[ d(x,\xi) = 2^{- \lambda(x)}. \]
If $x\not\sim y$ and neither $x,y$ is $\xi$, we define \[ d(x,y) = d(x,\xi) + d(y,\xi). \]

This definition is obviously symmetric and non-negative. And it is easy to check that $d(x,y) = 0 \implies x\sim y$ and $x = y = x\wedge y$. Triangle inequality involves a little bit more work, but most cases are immediately obvious except when $x\sim y\sim z$. Here we need to check that \[ 2\cdot 2^{-\lambda(y)} - 2 \cdot 2^{-\lambda(x\wedge y)} - 2\cdot 2^{-\lambda(y\wedge z)} \geq - 2 \cdot 2^{-\lambda(x\wedge z)}\] which we do right now. Suppose $f^{(p)}(x) = f^{(q_1)}(y)$ and $f^{(q_2)}(y) = f^{(r)}(z)$. WLOG we can take $q_1 \geq q_2$. Then we have that $f^{(p)}(x) = f^{(r + q_1 - q_2)}(z)$. This shows that $\lambda(x\wedge z) \leq \min( \lambda(x\wedge y), \lambda(y\wedge z))$. And this proves the inequality above.

It is easy to see that any Cauchy sequence either is eventually constant, or must converge to $\xi$: if $x\neq y$ we have that \[ d(x,y) \geq 2^{- \min(\lambda(x),\lambda(y)) - 1} \geq \frac14 (d(x,\xi) + d(y,\xi))\] and this shows that $d$ is a complete metric. Now, it remains to verify that $f$ is a contraction. Noting that $ \lambda(f(x)) = \lambda(x) + 1$ and $ f(x) \wedge f(y) = x\wedge y$ we see easily that $ f$ is a contraction with Lipschitz constant $\frac12$.

A copy of Schechter's book can be found on his website ^{^}

real analysis math.CA

Willie WY Wong

Associate Professor

My research interests include partial differential equations, geometric analysis, fluid dynamics, and general relativity.

Bessaga's converse to the contraction mapping theorem

Banach's Theorem

Bessaga's converse

Willie WY Wong

Associate Professor

Related