An Elementary Proof of Young's Inequality

Typically Young's inequality is stated for convolutions. We can however regard it as a special case of the following inequality concerning integral kernels.

Theorem    [Young's inequality for kernels]
Let $k(x,y)$ be a measurable function on $\mathbb{R}^{m} \times \mathbb{R}^n$, and suppose that, for some $r \in [1,\infty]$, there exist constants $C_x, C_y$ such that \begin{align} \sup_{x\in \mathbb{R}^m} \| k(x,\cdot) \|_{L^r(\mathbb{R}^n)} & \leq C_x, \quad \text{and} \newline \sup_{y\in \mathbb{R}^n} \| k(\cdot, y) \|_{L^r(\mathbb{R}^m)} & \leq C_y. \end{align} Then for every $p,q\in [1,\infty]$ satisfying \begin{equation}\label{eq:3} 1 + \frac1q = \frac1p + \frac1r \end{equation} we have that \begin{equation} \Bigl\| \int_{\mathbb{R}^{n}} k(\cdot,y) f(y) ~\mathrm{d}y \Bigr\|_{L^q(\mathbb{R}^m)} \leq C_y^{\frac{r}q} C_x^{1 - \frac{r}q} \|f\|_{L^p(\mathbb{R}^n)} \end{equation} for every $f \in L^p(\mathbb{R}^n)$.

Setting $k(x,y) = g(x-y)$ immediately yields the classical convolution inequality:

Corollary    [Young's inequality for convolution]
If $f\in L^p(\mathbb{R}^d)$ and $g\in L^r(\mathbb{R}^d)$, then $f\star g \in L^q(\mathbb{R}^d)$, with $\|f\star g\|_{L^q} \leq \|f\|_{L^p} \|g\|_{L^r}$, whenever $p,q,r\in [1,\infty]$ satisfy the relation \eqref{eq:3}.
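As a quick sanity check of the corollary, here is a minimal sketch that tests the convolution inequality numerically. It assumes we replace $\mathbb{R}^d$ by $\mathbb{Z}$ with counting measure (finitely supported sequences), where the same inequality holds with constant one; the helper names `lp_norm`, `convolve`, and `young_gap` are made up for this illustration.

```python
import random

def lp_norm(v, p):
    """l^p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def convolve(f, g):
    """Full discrete convolution of two finitely supported sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def young_gap(f, g, p, r):
    """Return (||f*g||_q, ||f||_p ||g||_r), where 1 + 1/q = 1/p + 1/r."""
    inv_q = 1.0 / p + 1.0 / r - 1.0          # requires 1/p + 1/r >= 1
    q = float("inf") if inv_q <= 1e-12 else 1.0 / inv_q
    return lp_norm(convolve(f, g), q), lp_norm(f, p) * lp_norm(g, r)

random.seed(0)
f = [random.uniform(-1, 1) for _ in range(40)]
g = [random.uniform(-1, 1) for _ in range(25)]
lhs, rhs = young_gap(f, g, p=1.5, r=1.2)     # here q = 2
print(lhs, rhs, lhs <= rhs)
```

A random test like this proves nothing, of course, but it is a cheap way to catch a mistyped exponent relation.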

Proof by interpolation

The way that I usually teach Young's inequality is via complex interpolation. First note that for \eqref{eq:3} to hold with $p,q,r\in [1,\infty]$, necessarily we have $p\in [1, r']$ and $q\in [r,\infty]$, where $r'$ is the Hölder conjugate \[ \frac{1}{r'} = 1 - \frac{1}{r} .\] For convenience, denote by $T$ the operator $f \mapsto \int_{\mathbb{R}^n} k(\cdot,y) f(y) ~\mathrm{d}y$. The endpoints of the estimates are easy to obtain:

  1. When $p = r'$, then we must have $q = \infty$. In this case we want to estimate $\sup |Tf|$. This follows directly from the triangle inequality for integrals and Hölder's inequality: \[ \begin{aligned} |Tf(x)| & \leq \int_{\mathbb{R}^n} |k(x,y)| |f(y)| ~\mathrm{d}y \newline & \leq \| k(x,\cdot) \|_{L^{r}(\mathbb{R}^n)} \|f\|_{L^{r'}(\mathbb{R}^n)}. \end{aligned} \] Taking the supremum over $x$ on both sides we have the operator norm bound \[ \|T \|_{L^{r'} \to L^\infty} \leq C_x.\]
  2. When $p = 1$, then we must have $q = r$. By Minkowski's integral inequality we have \[ \begin{aligned} \| Tf \|_{L^r(\mathbb{R}^m)} & \leq \int_{\mathbb{R}^n} \|k(\cdot,y) \|_{L^r(\mathbb{R}^m)} |f(y)| ~\mathrm{d}y \newline & \leq C_y \|f\|_{L^1(\mathbb{R}^n)}. \end{aligned} \] And so we have the operator norm bound \[ \|T \|_{L^{1} \to L^r} \leq C_y.\]

So by the Riesz-Thorin(-Stein) theorem (which can be proven using complex interpolation), for any $p,q,r$ satisfying \eqref{eq:3}, we have

\[ \| T\|_{L^p \to L^q} \leq C_x^{1 - \frac{r}{q}} C_y^{\frac{r}{q}} .\]

(Here we used $\theta \in [0,1]$ with $p^{-1} = 1 - \theta r^{-1}$, and $q^{-1} = (1-\theta) r^{-1}$.)
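The exponent bookkeeping in the parenthetical remark is easy to get wrong, so here is a small sketch that verifies it in exact rational arithmetic. The function name `interpolated_exponents` is hypothetical; it simply encodes $p^{-1} = 1 - \theta r^{-1}$ and $q^{-1} = (1-\theta) r^{-1}$, then checks that \eqref{eq:3} holds and that $1 - \theta = r/q$, which is where the exponents on $C_x$ and $C_y$ come from.

```python
from fractions import Fraction

def interpolated_exponents(r, theta):
    """Return (1/p, 1/q) for the interpolation parameter theta in [0, 1]."""
    inv_r = 1 / Fraction(r)
    inv_p = 1 - theta * inv_r        # moves from 1/p = 1 to 1/p = 1/r'
    inv_q = (1 - theta) * inv_r      # moves from 1/q = 1/r to 1/q = 0
    return inv_p, inv_q

r = Fraction(3, 2)
for theta in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    inv_p, inv_q = interpolated_exponents(r, theta)
    assert 1 + inv_q == inv_p + 1 / r    # the relation 1 + 1/q = 1/p + 1/r
    assert 1 - theta == r * inv_q        # so C_y^{1-theta} = C_y^{r/q}
    print(theta, inv_p, inv_q)
```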

Elementary proof

I learned about the following proof through a post by Daniele Tampieri on MathOverflow; the original proof concerns the convolution operator and is due to Besov, Il’in, and Nikol’skiĭ. The proof however easily generalizes (as shown below) to the integral kernel case.

Throughout we may assume $f \geq 0$ and $k \geq 0$: replacing them by $|f|$ and $|k|$ does not change the right-hand side of the claimed inequality and can only increase the left-hand side. The proof is based on the following decomposition \begin{equation} f(y) k(x,y) = \left[ f(y)^p k(x,y)^r\right]^{1/q} f(y)^{1 - p/q} k(x,y)^{1 - r/q}. \end{equation} We integrate both sides in $y$, and apply Hölder's inequality (with three exponents) to the right. Here we use that as a consequence of \eqref{eq:3}, the following identity holds for the exponents \begin{equation} q^{-1} + \left( \frac{p}{1 - p/q} \right)^{-1} + \left( \frac{r}{1 - r/q} \right)^{-1} = \frac{1}{q} + \frac{1}{p} - \frac{1}{q} + \frac{1}{r} - \frac{1}{q} = 1. \end{equation} Therefore we obtain \begin{equation} \int_{\mathbb{R}^n} k(x,y) f(y) ~\mathrm{d}y \leq \Bigl( \int_{\mathbb{R}^n} k(x,y)^r f(y)^p ~\mathrm{d}y \Bigr)^{\frac{1}{q}} \cdot \| f\|_{L^p(\mathbb{R}^n)}^{1 - \frac{p}{q}} \| k(x, \cdot) \|_{L^r(\mathbb{R}^n)}^{1 - \frac{r}{q}}. \end{equation}

Now take the $L^q(\mathbb{R}^m)$ norm in $x$ of both sides. On the right, the middle factor is independent of $x$ and comes straight out. We put the final factor in $L^\infty(\mathbb{R}^m)$ and the first factor in $L^q(\mathbb{R}^m)$. This gives \begin{multline}\label{eq:almost} \Bigl \| \int_{\mathbb{R}^n} k(x,y) f(y) ~\mathrm{d}y \Bigr\|_{L^q(\mathbb{R}^m)} \leq \newline \Bigl( \int_{\mathbb{R}^m} \int_{\mathbb{R}^n} k(x,y)^r f(y)^p ~\mathrm{d}y ~\mathrm{d}x \Bigr)^{\frac{1}{q}} \cdot \| f\|_{L^p(\mathbb{R}^n)}^{1 - \frac{p}{q}} \sup_{x\in \mathbb{R}^m} \| k(x, \cdot) \|_{L^r(\mathbb{R}^n)}^{1 - \frac{r}{q}}. \qquad \end{multline}

For the double integral on the right we apply Fubini's theorem (the integrand is nonnegative) and swap the order of integration. Bounding the inner integral in $x$ by its supremum over $y$ then gives \[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^m} k(x,y)^r f(y)^p ~\mathrm{d}x ~\mathrm{d}y = \int_{\mathbb{R}^n} f(y)^p \| k(\cdot,y)\|_{L^r(\mathbb{R}^m)}^r ~\mathrm{d}y \leq \sup_{y\in \mathbb{R}^n} \| k(\cdot,y)\|_{L^r(\mathbb{R}^m)}^r \cdot \|f\|_{L^p(\mathbb{R}^n)}^p. \] Plugging this into \eqref{eq:almost} we conclude \[ \Bigl \| \int_{\mathbb{R}^n} k(x,y) f(y) ~\mathrm{d}y \Bigr\|_{L^q(\mathbb{R}^m)} \leq \| f\|_{L^p(\mathbb{R}^n)} C_x^{1-r/q} C_y^{r/q} \] exactly as claimed.
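To see the kernel inequality in action, the following sketch tests it for a random matrix kernel. It assumes the two measure spaces are finite sets with counting measure rather than $\mathbb{R}^m$ and $\mathbb{R}^n$ (the argument only uses Hölder, Minkowski, and Fubini, so nothing changes), and the helper names `norm` and `kernel_young` are invented for the example.

```python
import random

def norm(v, p):
    """l^p norm of a finite list (counting measure); p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def kernel_young(k, f, p, r):
    """Return (||Tf||_q, C_y^{r/q} C_x^{1-r/q} ||f||_p) for the matrix k[x][y]."""
    m, n = len(k), len(k[0])
    inv_q = 1.0 / p + 1.0 / r - 1.0                  # from 1 + 1/q = 1/p + 1/r
    q = float("inf") if inv_q <= 1e-12 else 1.0 / inv_q
    c_x = max(norm(row, r) for row in k)             # sup_x ||k(x, .)||_r
    c_y = max(norm(col, r) for col in zip(*k))       # sup_y ||k(., y)||_r
    tf = [sum(k[x][y] * f[y] for y in range(n)) for x in range(m)]
    rhs = c_y ** (r * inv_q) * c_x ** (1.0 - r * inv_q) * norm(f, p)
    return norm(tf, q), rhs

random.seed(1)
k = [[random.uniform(-1, 1) for _ in range(12)] for _ in range(8)]
f = [random.uniform(-1, 1) for _ in range(12)]
lhs, rhs = kernel_young(k, f, p=1.25, r=2.0)         # here q = 10/3
print(lhs, rhs, lhs <= rhs)
```

Taking a non-square matrix on purpose makes $C_x$ and $C_y$ differ, so the check is sensitive to which constant carries which exponent.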

Some final remarks

Theorem 1 can be generalized to

Theorem    [Generalized Young's inequality for integral kernels]
Let $k(x,y)$ be a measurable function on $\mathbb{R}^m \times \mathbb{R}^n$, and suppose that, for some $1 \leq r \leq s \leq \infty$, the kernel $k$ belongs to $L^s_x L^r_y \cap L^s_y L^r_x$. Here $\|k\|_{L^s_x L^r_y}$ denotes the $L^s$ norm in $x$ of the function $x \mapsto \|k(x,\cdot)\|_{L^r_y}$, and similarly for $L^s_y L^r_x$. Then for $p\in [s',r']$ and $q\in [r,s]$ satisfying \begin{equation}\label{eq:cond} \frac{1}{p} + \frac{1}{s} + \frac{1}{r} = 1 + \frac{1}{q} \end{equation} we have that \begin{equation}\label{eq:operator} \Bigl\| \int_{\mathbb{R}^n} k(x,y) f(y) ~\mathrm{d}y \Bigr\|_{L^q(\mathbb{R}^m)} \leq \| k\|_{L^s_x L^r_y}^{\theta} \|k\|_{L^s_y L^r_x}^{1-\theta} \|f\|_{L^p} \end{equation} where \begin{equation} \theta = \frac{\frac1r - \frac1q}{\frac1r - \frac1s}. \end{equation} (When $s = \infty$ we have $\theta = 1 - \frac{r}{q}$, $\|k\|_{L^\infty_x L^r_y} = C_x$ and $\|k\|_{L^\infty_y L^r_x} = C_y$, so this reduces to Theorem 1.)
Examining the proof in detail, we see that it only uses Hölder's inequality, Minkowski's inequality, and Fubini's theorem. All three of these hold for general measure spaces: we can replace $\mathbb{R}^m$ and $\mathbb{R}^n$ by two $\sigma$-finite measure spaces $(\Omega, \mu)$ and $(\Omega',\mu')$. The only thing we lose in this generality is (obviously) the specialization $k(x,y) = g(x-y)$, and with it the interpretation as an inequality concerning convolutions.
The assumption that $r \leq s$ is convenient but not essential. Notice that as a consequence of Minkowski's inequality, $L^r_x L^s_y$ embeds into $L^s_y L^r_x$, and so an analogous result holds also with $r$ and $s$ swapped.

Both the interpolation and elementary proofs can be modified easily to cover these cases.

Now, the inequality \eqref{eq:operator} and the condition \eqref{eq:cond} can be written in dual/bilinear formulation:

Fix $1 \leq r \leq s \leq \infty$, and $\theta \in [0,1]$. Let \begin{gather} \frac{1}{p} = \frac{1-\theta}{s'} + \frac{\theta}{r'}, \newline \frac{1}{q} = \frac{\theta}{s'} + \frac{1-\theta}{r'}. \end{gather} Then if $k(x,y) \in L^s_x L^r_y \cap L^s_y L^r_x$, and $f(y)\in L^p_y$ and $g(x) \in L^q_x$, we have \begin{equation}\label{eq:bilinear} \int k(x,y) f(y) g(x) ~\mathrm{d}x ~\mathrm{d}y \leq \| k\|_{L^s_x L^r_y}^{\theta} \|k\|_{L^s_y L^r_x}^{1-\theta} \|f\|_{L^p_y} \|g\|_{L^q_x} . \end{equation} (Note that $q$ here plays the role of the Hölder conjugate of the exponent $q$ in \eqref{eq:operator}.)

In this formulation, the result can be proven even more simply than the arguments given before.


First, by doing a rescaling in the $y$ variables, we can assume that \[ \|k\|_{L^s_x L^r_y} = \|k\|_{L^s_y L^r_x} = \|k \|_{\cap} \] where the $\cap$ norm denotes the intersection norm (max) of the two mixed norms. Without loss of generality we can further assume that $\|f\|_{L^p_y} = 1 = \|g\|_{L^q_x}$.

Therefore it suffices to show that in this setting, the product $f(y) g(x)$ belongs to the dual space of the intersection space, or (sufficiently) that relative to the sum space norm \begin{equation}\label{eq:desiredsum} \| f(y) g(x) \|_{L^{s'}_x L^{r'}_y + L^{s'}_y L^{r'}_x} \leq 1. \end{equation}

Now observe that by assumption \begin{equation} 1 = (1-\theta) \frac{p}{s'} + \theta \frac{p}{r'} = (1-\theta) \frac{q}{r'} + \theta \frac{q}{s'}. \end{equation} So, assuming as we may that $f, g \geq 0$, we can write \[ f(y) g(x) = \left[ f(y)^{\frac{p}{s'}} g(x)^{\frac{q}{r'}} \right]^{1-\theta} \left[ f(y)^{\frac{p}{r'}} g(x)^{\frac{q}{s'}} \right]^{\theta}. \] By Young's inequality for products, this means \begin{equation} f(y) g(x) \leq (1-\theta) \underbrace{\left[ f(y)^{\frac{p}{s'}} g(x)^{\frac{q}{r'}} \right]}_{\in L^{s'}_y L^{r'}_x} + \theta \underbrace{\left[ f(y)^{\frac{p}{r'}} g(x)^{\frac{q}{s'}} \right]}_{\in L^{s'}_x L^{r'}_{y}}. \end{equation} Each bracketed term has norm one in the indicated space, since for instance $\| f^{p/s'} \|_{L^{s'}_y} = \|f\|_{L^p_y}^{p/s'} = 1$. Hence the sum space norm is at most $(1-\theta) + \theta = 1$, and the desired bound \eqref{eq:desiredsum} follows.
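The reduction above says that, in the end, the pairing of $k$ against $f(y) g(x)$ is controlled by the intersection (max) norm of $k$ times the norms of $f$ and $g$. Here is a rough numerical sketch of that statement, again assuming finite sets with counting measure and $1 < r \leq s$; the function names are invented for the illustration.

```python
import random

def norm(v, p):
    """l^p norm of a finite list (counting measure); p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def mixed_norms(k, r, s):
    """Return (||k||_{L^s_x L^r_y}, ||k||_{L^s_y L^r_x}) for the matrix k[x][y]."""
    rows = [norm(row, r) for row in k]          # x -> ||k(x, .)||_{L^r_y}
    cols = [norm(col, r) for col in zip(*k)]    # y -> ||k(., y)||_{L^r_x}
    return norm(rows, s), norm(cols, s)

def bilinear_check(k, f, g, r, s, theta):
    """Return (pairing, bound), the bound using the max of the two mixed norms."""
    inv_sp, inv_rp = 1.0 - 1.0 / s, 1.0 - 1.0 / r   # 1/s' and 1/r'
    p = 1.0 / ((1 - theta) * inv_sp + theta * inv_rp)
    q = 1.0 / (theta * inv_sp + (1 - theta) * inv_rp)
    pairing = sum(k[x][y] * f[y] * g[x]
                  for x in range(len(g)) for y in range(len(f)))
    return pairing, max(mixed_norms(k, r, s)) * norm(f, p) * norm(g, q)

random.seed(2)
k = [[random.uniform(0, 1) for _ in range(7)] for _ in range(5)]
f = [random.uniform(0, 1) for _ in range(7)]    # function of y
g = [random.uniform(0, 1) for _ in range(5)]    # function of x
pairing, bound = bilinear_check(k, f, g, r=1.5, s=3.0, theta=0.3)
print(pairing, bound, pairing <= bound)
```

Using the max norm here is deliberate: it is exactly what the normalization step buys us, and it sidesteps having to track which mixed norm carries which power of $\theta$.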

In some ways, this last proof is achieved by "doing real interpolation (as opposed to the complex interpolation of Riesz-Thorin) by hand."

The form of \eqref{eq:bilinear} leads to the natural question: does $k$ in fact live within the space $L^{p'}_y L^{q'}_x$? The answer is easily seen to be no.

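Let $h\in L^1(\mathbb{R})$ be nonzero. Define $k(x,y) = h(x-y)$ on $\mathbb{R}\times\mathbb{R}$. Then obviously we have $k \in L^\infty_x L^1_y \cap L^\infty_y L^1_x$, which is the case $r = 1$, $s = \infty$; taking $\theta = \frac12$ gives $p = q = 2$. But $k \not\in L^2_y L^2_x$: the inner integral $\int |h(x-y)|^2 ~\mathrm{d}x = \|h\|_{L^2}^2 > 0$ does not depend on $y$, so the integral over $y$ diverges.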
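The failure can be watched numerically. The sketch below assumes a discrete stand-in for the example: $h$ is a finitely supported sequence and $k(x,y) = h(x-y)$ is truncated to the square $\{0,\dots,N-1\}^2$ (the name `truncated_norms` is made up). The two mixed $L^\infty L^1$ norms stay bounded by $\|h\|_{\ell^1}$, while the $L^2$ norm of the truncation grows without bound as $N$ increases.

```python
def truncated_norms(h, size):
    """Norms of k(x, y) = h[x - y] restricted to {0, ..., size-1}^2.

    h is a list supported on offsets 0, ..., len(h)-1. Returns
    (sup_x sum_y |k|, sup_y sum_x |k|, (sum_{x,y} |k|^2)^(1/2)).
    """
    def k(x, y):
        d = x - y
        return h[d] if 0 <= d < len(h) else 0.0

    idx = range(size)
    sup_x = max(sum(abs(k(x, y)) for y in idx) for x in idx)
    sup_y = max(sum(abs(k(x, y)) for x in idx) for y in idx)
    l2 = sum(k(x, y) ** 2 for x in idx for y in idx) ** 0.5
    return sup_x, sup_y, l2

h = [1.0, 0.5, 0.25]
for size in (10, 40, 160):
    print(size, truncated_norms(h, size))
```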
Food for thought  
But we know that any function $F \in L^2(\mathbb{R}^2)$ can be approximated by simple functions, and hence also by finite linear combinations of functions of the form $f(x) g(y)$ with $f,g\in L^2$. So what gives? (Left as exercise for the reader.)
Willie WY Wong