Colloquium Notes: Craig Evans

L. Craig Evans is a professor at UC Berkeley. For many students he is best known for his textbook on partial differential equations. I won't try to list all his research interests and accomplishments since there's no way I can do him justice.

Note: I've augmented Evans' talk with some additional details.

Unlinearizing PDEs

Just as an appetizer, start with the linear elliptic equation \begin{equation} \tag{L}\label{eq:L1} \begin{cases} -\triangle u + c(x) u = 0 \qquad \text{on } U \subseteq \mathbb{R}^n \newline u > 0 \end{cases} \end{equation} We can unlinearize it by making a logarithmic change of variables: \[ v := \ln u \implies \begin{cases} Dv = \frac{Du}{u} \newline \triangle v = \frac{\triangle u}{u} - \frac{|Du|^2}{u^2} \end{cases} \] from which we derive the nonlinear elliptic equation \begin{equation} \tag{N}\label{eq:N1} \triangle v = - |Dv|^2 + c(x) \end{equation}
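As a quick sanity check, the change of variables can be verified symbolically. The snippet below is a minimal sketch (with sympy; the restriction to $n = 2$ and the generic positive $u$ are choices made purely for illustration).

```python
# Symbolic check of the unlinearization: if -Δu + c u = 0 then Δu/u = c,
# so v = ln u should satisfy Δv = -|Dv|^2 + c.  (Sketch with n = 2.)
import sympy as sp

x, y = sp.symbols('x y', real=True)
u = sp.Function('u', positive=True)(x, y)
v = sp.log(u)

lap = lambda f: sp.diff(f, x, 2) + sp.diff(f, y, 2)
grad_sq = lambda f: sp.diff(f, x)**2 + sp.diff(f, y)**2

# Δv + |Dv|^2 equals Δu/u, which is c(x) whenever u solves (L).
print(sp.simplify(lap(v) + grad_sq(v) - lap(u) / u))  # prints 0
```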

Remark: notice that the change of variables is only possible because we assumed the solution $u$ is positive, i.e. semi-bounded.

Claim: \eqref{eq:N1} is better than \eqref{eq:L1}.

From the analyst's perspective, this means that the equation \eqref{eq:N1} allows us to prove good a priori estimates. A key one is the following: \[ \forall V\Subset U \quad \exists C= C(V, c, Dc) \quad \sup_V |D v|^2 \leq C. \] Sketch of proof: Let $w = |Dv|^2$; observe that \eqref{eq:N1} also gives $w = - \triangle v + c(x)$. A direct computation shows \[ \triangle w = 2 |D^2 v|^2 + 2 Dv \cdot D \triangle v = 2 |D^2 v|^2 - 2 Dv \cdot D w + 2 Dv \cdot D c. \] Suppose first that $w$ attains a maximum at an interior point $p$; then $Dw(p) = 0$ and $\triangle w(p) \leq 0$, so \[ 2 |D^2 v|^2(p) \leq -2 Dv(p) \cdot Dc(p) \leq 2 \sqrt{w(p)}\, |Dc(p)|. \] Observing that \[ |w - c| = |\triangle v| \lesssim |D^2 v|, \] we obtain \[ (w(p) - c(p))^2 \lesssim \sqrt{w(p)}\, |Dc(p)|. \] This algebraic inequality forces $w(p)$ to be bounded by a constant depending only on $c(p)$ and $|Dc(p)|$: indeed, if $w(p) \geq 2|c(p)|$ then the left-hand side is at least $w(p)^2/4$, which gives $w(p) \lesssim |Dc(p)|^{2/3}$. If $w$ does not attain an interior maximum, take a cut-off function $\zeta$ that equals $1$ on $V$ and is compactly supported in $U$, and run the same argument with $w$ replaced by $\zeta^4 w$. (See section 6.4.3 in the second edition of Evans' PDE book for details on the cut-off argument.)

This estimate is a building block of the proof of Harnack's inequality, and the argument also has useful applications to large deviation estimates and to problems involving pattern formation.

Riccati equations

Let's change tack. Consider now the linear second order ODE \begin{equation} \tag{L}\label{eq:L2} - \ddot{u} + c(t) u = 0 \end{equation} on an interval $I \subseteq \mathbb{R}$, and assume we are dealing with a positive solution. Taking $v:= \ln u$ as before, we find \[ \ddot{v} = - (\dot{v})^2 + c(t). \] Now define the variable $q = \dot{v}$; then \[ \dot{q} = - q^2 + c(t), \] which is a Riccati ODE. Again we will want to use the existence of a semi-bounded solution on $I$ as an input from which to extract a priori estimates.
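As a quick numerical illustration, here is a sketch (the coefficient $c(t) = 1 + \tfrac12 \sin t$ and the initial data are my own choices) checking that $q = \dot{u}/u$ computed from the linear ODE agrees with a direct integration of the Riccati equation.

```python
# Sketch: integrate -u'' + c(t) u = 0 from positive data and compare
# q = u'/u with a direct integration of q' = -q^2 + c(t).
import numpy as np
from scipy.integrate import solve_ivp

c = lambda t: 1.0 + 0.5 * np.sin(t)

# Linear ODE as a first-order system for (u, u'); u stays positive here
# since u(0) = u'(0) = 1 and u'' = c u > 0 while u > 0.
lin = solve_ivp(lambda t, z: [z[1], c(t) * z[0]], (0, 5), [1.0, 1.0],
                dense_output=True, rtol=1e-10, atol=1e-12)
# Riccati equation with matching initial value q(0) = u'(0)/u(0) = 1.
ric = solve_ivp(lambda t, q: [-q[0]**2 + c(t)], (0, 5), [1.0],
                dense_output=True, rtol=1e-10, atol=1e-12)

ts = np.linspace(0, 5, 11)
u, du = lin.sol(ts)
print(np.max(np.abs(du / u - ric.sol(ts)[0])))  # tiny, ~1e-8 or below
```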

Might as well generalize immediately. Now let's look at matrix valued Riccati systems. Let $I\subseteq \mathbb{R}$ still be an interval, and let $U$ be a function on $I$ taking values in invertible square matrices. Suppose further that $U$ satisfies \begin{equation} \tag{L} \label{eq:L3} - \ddot{U} + C(t) U = 0 \end{equation} where $C(t)$ is some given matrix valued function. We can perform the same trick as before, but now we must define \[ Q(t) := \dot{U}(t) U^{-1}(t) \] and we arrive at the Riccati system \begin{equation} \tag{R} \label{eq:R1} \dot{Q} = - Q^2 + C(t). \end{equation} For much of what we will discuss later we can largely forget about the underlying $U$ and focus on $Q$, and we shall assume that $Q$ and $C$ take values in symmetric matrices.
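The same numerical check works in the matrix case. Below is a sketch (the symmetric, positive definite $C(t)$ and the initial data $U(0) = \dot{U}(0) = I$ are my own choices) verifying that $Q = \dot{U} U^{-1}$ matches a direct integration of \eqref{eq:R1} and stays symmetric.

```python
# Matrix case (sketch): integrate -U'' + C(t) U = 0 and compare
# Q = U' U^{-1} with a direct integration of Q' = -Q^2 + C(t).
import numpy as np
from scipy.integrate import solve_ivp

n = 2
C = lambda t: np.array([[2.0 + np.cos(t), 0.3], [0.3, 1.5]])  # symmetric

def lin_rhs(t, z):
    U, dU = z[:n*n].reshape(n, n), z[n*n:].reshape(n, n)
    return np.concatenate([dU.ravel(), (C(t) @ U).ravel()])

def ric_rhs(t, q):
    Q = q.reshape(n, n)
    return (-Q @ Q + C(t)).ravel()

z0 = np.concatenate([np.eye(n).ravel()] * 2)   # U(0) = I, U'(0) = I
lin = solve_ivp(lin_rhs, (0, 4), z0, rtol=1e-10, atol=1e-12)
ric = solve_ivp(ric_rhs, (0, 4), np.eye(n).ravel(), rtol=1e-10, atol=1e-12)

U, dU = lin.y[:n*n, -1].reshape(n, n), lin.y[n*n:, -1].reshape(n, n)
Q = dU @ np.linalg.inv(U)
print(np.linalg.norm(Q - Q.T))                         # ~0: Q symmetric
print(np.linalg.norm(Q - ric.y[:, -1].reshape(n, n)))  # ~0: same solution
```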

Remark: a lot of the theory here was developed by Reid. A very good presentation of the material here can be found in Hartman's ODE book, Chapter XI, especially the Appendix on "disconjugate systems".

Let $C(t)$ be as in \eqref{eq:L3} or \eqref{eq:R1}. Consider now a vector valued solution $\vec{u}$ on an interval $[a,b] \subseteq I$ to the boundary value problem \[ \begin{cases} - \ddot{\vec{u}} + C(t) \vec{u} = 0\newline \vec{u}(a) = \vec{u}(b) = \vec{0} \end{cases} \] We say that the equation \eqref{eq:L3} is disconjugate on $I$ if for any subinterval $[a,b]\subseteq I$ the only solution to the above boundary value problem is $\vec{u} \equiv 0$.

Remark: one can interpret the boundary value problem as saying something akin to having a geodesic flow with fixed end points; nonuniqueness in this setting corresponds to conjugate points. Hence the terminology.

Notice that the equation $-\ddot{\vec{u}} + C(t) \vec{u} = 0$ is linear, so the disconjugacy condition is the same as saying that solutions (should they exist) to the (Dirichlet) boundary value problem for this equation are unique. Fix $\vec{u}_a$ and $\vec{v}_a$, and consider the initial value problem for the linear ODE with data $\vec{u}(a) = \vec{u}_a$ and $\dot{\vec{u}}(a) = \vec{v}_a$. Provided $C(t)$ is suitably regular, the solution exists for all time. Consider now the mapping \[ \vec{v}_a \mapsto \vec{u}(b). \] For fixed $\vec{u}_a$ this is an affine mapping between vector spaces of the same finite dimension; its linear part is the same map computed with $\vec{u}_a = \vec{0}$. The uniqueness of solutions to the Dirichlet problem implies that this linear part is injective, and hence it must also be surjective. (Sort of a finite dimensional Fredholm alternative.) Therefore the Dirichlet problem is in fact uniquely solvable for this equation.
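This argument is constructive: it is exactly the shooting method. Here is a sketch implementing it (the constant symmetric $C$, the interval, and the boundary data are arbitrary choices of mine; any disconjugate example would do).

```python
# Dirichlet problem by shooting (sketch).  By linearity,
#   u(b) = u(b; u_a, 0) + S v_a,
# where column i of S is u(b; 0, e_i); disconjugacy makes S invertible.
import numpy as np
from scipy.integrate import solve_ivp

n, a, b = 2, 0.0, 1.0
C = lambda t: np.array([[1.0, 0.2], [0.2, 2.0]])   # constant symmetric C(t)

def shoot(u_a, v_a):
    """Solve the IVP for -u'' + C(t) u = 0 and return u(b)."""
    rhs = lambda t, z: np.concatenate([z[n:], C(t) @ z[:n]])
    sol = solve_ivp(rhs, (a, b), np.concatenate([u_a, v_a]),
                    rtol=1e-10, atol=1e-12)
    return sol.y[:n, -1]

u_a, u_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # boundary data

S = np.column_stack([shoot(np.zeros(n), e) for e in np.eye(n)])
v_a = np.linalg.solve(S, u_b - shoot(u_a, np.zeros(n)))
print(shoot(u_a, v_a))   # ≈ u_b
```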

Now, suppose that \eqref{eq:L3} is disconjugate on $\mathbb{R}$. Consider the initial value problem for \eqref{eq:L3} with initial data $U(0) = 0$ and $\dot{U}(0) = I$, the identity matrix. The disconjugacy implies that $\det U(t) \neq 0$ for all $t > 0$: for if not, there exist some nonzero vector $\vec{v}$ and some $t_0 > 0$ such that $U(t_0) \vec{v} = 0$. But then $U(t) \vec{v}$ is a solution to $- \ddot{\vec{u}} + C(t) \vec{u} = 0$ that vanishes both at $t = 0$ and $t = t_0$, contradicting the disconjugacy.

Observe also that for this $U$, the function $U^T\dot{U}$ is self-adjoint: provided that $C(t)$ is symmetric, \[ \frac{d}{dt} (U^T \dot{U} - \dot{U}^T U) = U^T \ddot{U} - \ddot{U}^T U = U^T C U - U^T C^T U = 0, \] and $U^T \dot{U} - \dot{U}^T U = 0$ at $t = 0$ since $U(0) = 0$.

Define next using the variation of parameters argument for $t \geq 1$ \[ \hat{U}(t) := U(t) + \int_1^t U(t) U^{-1}(s) (U^T)^{-1}(s) ~\mathrm{d} s. \] Taking its second derivative we get \[ \ddot{\hat{U}}(t) = \ddot{U}(t) + \int_1^t \ddot{U}(t) U^{-1}(s) (U^T)^{-1}(s) ~\mathrm{d}s + \dot{U}(t) U^{-1}(t) (U^T)^{-1}(t) - (U^T)^{-1}(t) \dot{U}^T(t) (U^T)^{-1}(t). \] The final two terms cancel each other by virtue of $U^T \dot{U}$ being self-adjoint. And so we have that $\hat{U}$ also solves \eqref{eq:L3}, and $\hat{U}^T \dot{\hat{U}}$ is also self-adjoint.

Since $\int_1^t U^{-1}(s) (U^T)^{-1}(s) ~\mathrm{d}s$ is positive semidefinite and $\hat{U}(t) = U(t) \left( I + \int_1^t U^{-1}(s) (U^T)^{-1}(s) ~\mathrm{d}s \right)$, we have that $\hat{U}$ is also invertible. One checks by a straightforward computation that \[ \left( I + \int_1^t U^{-1}(s) (U^T)^{-1}(s) ~\mathrm{d}s \right) \left( I - \int_1^t \hat{U}^{-1}(s) (\hat{U}^T)^{-1}(s) ~\mathrm{d}s\right) = I.\]

So this implies that $\int_1^t \hat{U}^{-1}(s) (\hat{U}^T)^{-1}(s) ~\mathrm{d}s$ is, as $t$ increases, an increasing family of positive semi-definite matrices bounded above by the identity. Therefore, by a lemma of F. Riesz, it converges to a limit, which we denote by $M_{\infty}$.

Therefore we can define for $t \geq 1$ the new solution to \eqref{eq:L3} given by \[ \breve{U}_+(t) = \int_t^{\infty} \hat{U}(t) \hat{U}^{-1}(s) (\hat{U}^T)^{-1}(s) ~\mathrm{d}s. \] This we can rewrite as \[ \breve{U}_+(t) = \hat{U}(t) M_\infty - \int_1^t \hat{U}(t) \hat{U}^{-1}(s) (\hat{U}^T)^{-1}(s) ~\mathrm{d}s, \] and we can compute that \[ \left(M_\infty - \int_1^t \hat{U}^{-1}(s) (\hat{U}^T)^{-1}(s) ~\mathrm{d}s \right) \left( M_\infty^{-1} + \int_1^t \breve{U}_+^{-1}(s) (\breve{U}_+^T)^{-1}(s) ~\mathrm{d}s \right) = I. \] The factor in the first parentheses converges to zero as $t \to \infty$ (by the definition of $M_\infty$), therefore necessarily \[ \int_1^t \breve{U}_+^{-1}(s) (\breve{U}_+^T)^{-1}(s) ~\mathrm{d}s \] blows up as $t \to \infty$, in the sense that its inverse converges to $0$. We call solutions with this blow-up property principal.

It turns out that all principal solutions are the same up to right-multiplication by an invertible matrix.

Remark: It may be instructive to think about what these solutions look like in the scalar case. When $c(t) \equiv 0$, a principal solution is $u(t) \equiv 1$; when $c(t) \equiv 1$, a principal solution is $u(t) = e^{-t}$. When $c(t) \equiv -1$, the equation is not disconjugate (solutions oscillate like $\sin t$).
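The blow-up characterization can be seen numerically in this scalar example; below is a sketch (the comparison with the non-principal solution $\cosh t$ is my own addition).

```python
# Scalar case c(t) = 1 (sketch): for the principal solution u = e^{-t} the
# integral of u^{-2} over [1, T] diverges as T grows, while for the
# non-principal solution u = cosh(t) it stays bounded.
import numpy as np
from scipy.integrate import quad

for u, name in [(lambda s: np.exp(-s), "principal e^{-t}"),
                (np.cosh, "non-principal cosh(t)")]:
    vals = [quad(lambda s: u(s) ** -2, 1, T)[0] for T in (5, 10, 20)]
    print(name, ["%.4g" % v for v in vals])
# First row blows up (like e^{2T}/2); second converges (to 1 - tanh(1)).
```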

Because $\breve{U}_+$ is well-defined up to right-multiplication by an invertible matrix, we see that the object \[ Q_+ = \dot{\breve{U}}_+ \breve{U}_+^{-1} \] is well-defined for the system (and is symmetric by the fact that $\breve{U}_+^T \dot{\breve{U}}_+$ is self-adjoint).

We can similarly define $Q_-$ based on the principal solutions toward $t = -\infty$.

The following is the central result of this theory:

Theorem Assume \eqref{eq:L3} is disconjugate on all of $\mathbb{R}$, and suppose $Q$ is a symmetric-matrix valued global solution of \eqref{eq:R1}. Then \[ Q_+ \leq Q \leq Q_-; \] here $\leq$ is the partial ordering of symmetric matrices.
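In the scalar example with $c(t) \equiv 1$, the principal solutions $e^{\mp t}$ give $Q_+ = -1$ and $Q_- = +1$, and the global solutions of $\dot{q} = -q^2 + 1$ are exactly $q = \pm 1$ and $q = \tanh(t + t_0)$, all trapped in $[-1, 1]$, as the theorem predicts. A numerical sketch (the blow-up threshold and the sample initial data are my choices):

```python
# Sketch: q' = -q^2 + 1 has a global solution iff q(0) lies in [-1, 1];
# data outside [-1, 1] blows up in finite time in one direction.
import numpy as np
from scipy.integrate import solve_ivp

def blow(t, q):                     # event: |q| huge = finite-time blow-up
    return abs(q[0]) - 1e6
blow.terminal = True

for q0 in [-0.9, 0.0, 0.5, 1.5, -1.5]:
    is_global = True
    for t_end in (20.0, -20.0):     # integrate forward and backward in time
        sol = solve_ivp(lambda t, q: [-q[0]**2 + 1], (0.0, t_end), [q0],
                        events=blow, rtol=1e-9)
        is_global = is_global and sol.t_events[0].size == 0
    print(f"q(0) = {q0:5.2f}  global solution: {is_global}")
```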

Hamilton-Jacobi PDE

(Remark: this part I understood less and so the notes are a lot sketchier.)

Let $H: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a Hamiltonian $H = H(p,x)$.

The Hamiltonian system is \begin{equation} \tag{H} \begin{cases} \dot{x} = D_p H \newline \dot{p} = - D_x H \end{cases}\end{equation} Consider from now on the special case \[ H = \frac12 |p|^2 + W(x) \] where $W$ is periodic. In this special case the Hamiltonian evolution reduces to \[ \ddot{x} = - DW(x), \] since $\dot{x} = p$ and $\dot{p} = -DW(x)$.

We wish to make a connection to weak KAM theory.

We can form the Lagrangian \[ L(v,x) = \frac12 |v|^2 - W(x) \] and define the action \[ A_{a,b} = \int_a^b \frac12 |\dot{x}|^2 - W(x) ~\mathrm{d}t, \] which is a functional on paths $x(t)$ over $[a,b]$. The general action principle looks for stationary points of the action $A_{a,b}$, whose Euler-Lagrange equation is exactly the Hamiltonian evolution above.

Idea: if we look specifically for global minimizers of the action, then there should be extra structure that we can exploit.

Mather's Variational Principle Since finding a path minimizer is too difficult, let's relax from paths to Radon measures. Look for a Radon measure $\nu$ on $\mathbb{R}^n \times \mathbb{T}^n$ (velocity-position space) that minimizes \[ A[\nu] = \iint_{\mathbb{R}^n \times \mathbb{T}^n} \frac12 |v|^2 - W(x) ~\mathrm{d}\nu. \] Add the constraints that

  • $\iint v \cdot D\phi(x) ~\mathrm{d}\nu = 0$ for all $\phi \in C^1(\mathbb{T}^n)$ (stationarity)
  • $\iint v ~\mathrm{d}\nu = V$ where $V$ is a fixed, given vector (the "average velocity")
  • $\iint ~\mathrm{d}\nu = 1$ (normalization)

This problem is a linear programming problem, so a solution exists. (This illustrates the general idea of convexifying a problem to make existence easier.)
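To make the linear programming structure concrete, here is a crude finite-dimensional discretization (entirely my own toy setup: the 1-d torus, $W(x) = \cos(2\pi x)$, the grids, and the truncation of the stationarity constraint to finitely many trigonometric test functions are all choices for illustration).

```python
# Sketch: discretize Mather's problem on the 1-d torus and solve it as a
# finite LP.  Minimize sum nu_ij * (v_j^2/2 - W(x_i)) over nonnegative
# weights nu, subject to the three (discretized) linear constraints.
import numpy as np
from scipy.optimize import linprog

V = 0.4                                     # prescribed average velocity
xs = np.linspace(0, 1, 40, endpoint=False)  # grid on the torus
vs = np.linspace(-2, 2, 41)                 # grid of velocities
Xg, Vg = np.meshgrid(xs, vs)
X, Vv = Xg.ravel(), Vg.ravel()

W = lambda x: np.cos(2 * np.pi * x)
cost = 0.5 * Vv**2 - W(X)                   # Lagrangian at each node

rows = []
for k in range(1, 5):                       # stationarity for phi = sin, cos
    rows.append(Vv * 2 * np.pi * k * np.cos(2 * np.pi * k * X))
    rows.append(-Vv * 2 * np.pi * k * np.sin(2 * np.pi * k * X))
rows.append(Vv)                             # mean velocity constraint
rows.append(np.ones_like(Vv))               # normalization
A_eq, b_eq = np.array(rows), np.array([0.0] * 8 + [V, 1.0])

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.status, res.fun)                  # res.fun approximates Lbar(V)
```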

Now consider the mapping $V \mapsto A[\nu] =: \bar{L}(V)$ sending the given average velocity to the action of the minimizing measure $\nu$. This we call the effective Lagrangian.

From this effective Lagrangian we can construct an effective Hamiltonian by taking its convex dual $\bar{H} = \bar{H}(P)$. Notice that both the effective Lagrangian and the effective Hamiltonian are independent of $x$, and so we are really constructing action-angle coordinates in some generalized sense.

Next we can introduce the Hamilton-Jacobi equation for $u = u(x,P)$ \begin{equation} \tag{HJ} \label{eq:HJ} \frac{|D_x u|^2}{2} + W = \bar{H}(P) \end{equation}

Now, suppose we try to solve (for fixed $P$) the equation \[ \dot{x} = D_x u(x,P) \] where $u$ solves \eqref{eq:HJ}. We want this to connect to the solutions of the original Hamiltonian system.

Now consider the matrix given by the Hessian of $u$ (symmetric by definition), evaluated along the curve $x$: \[ Q(t) := D^2_{xx} u(x(t), P). \] It turns out we can formally show that \[ \dot{Q} = - Q^2 - D^2 W(x(t)), \] which is a Riccati system \eqref{eq:R1} with $C(t) = -D^2 W(x(t))$; and it turns out that if the initial data for $x$ sits in the projection of the measure $\nu$ (the action minimizer) to the $x$ variables, then the system is disconjugate.
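Here is the formal computation behind the Riccati equation (my own filling-in of the details, assuming $u$ is smooth enough): differentiating \eqref{eq:HJ} twice, in $x_k$ and then $x_l$, gives \[ \sum_i u_{x_i x_k} u_{x_i x_l} + \sum_i u_{x_i} u_{x_i x_k x_l} + W_{x_k x_l} = 0, \] and so along the flow $\dot{x} = D_x u$, \[ \dot{Q}_{kl} = \sum_i u_{x_k x_l x_i} \dot{x}_i = \sum_i u_{x_k x_l x_i} u_{x_i} = - (Q^2)_{kl} - (D^2 W)_{kl}. \]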

Question: can we figure out what $Q_+$ and $Q_-$ are, and how to use them to analyze the dynamical system?

It turns out that a lot is already known about this construction from the dynamical systems side (cf. the notion of the Green bundle and the work of M.-C. Arnaud).
