Newton-Cartan Gravity

Author's Note (July 2019)

The following post was first written in December of 2009 in three separate installments. I decided to clean it up a bit and consolidate the discussion into one post during the migration to my new website. The editing is significant, and the presentation below differs noticeably from the original 2009 version, though I hope the mathematical content remains largely the same.

In my previous post about the Cartan staircase, I mentioned the concept of a Riemann-Cartan (or Einstein-Cartan) geometry. There the motivation came from introducing torsion into the connection and using it to model physical phenomena, relaxing the usual torsion-free condition on the Levi-Civita connection. In this post, we will discuss Newton-Cartan gravity, which is a geometric theory of gravitation that instead relaxes the metric compatibility of the connection. (Recall that the Levi-Civita connection is the unique metric-compatible, torsion-free connection associated to a pseudo-Riemannian metric.)

Newtonian gravity and the naive formulation

(Note: the material in this section is re-hashed from Section 12.1 of Misner, Thorne, and Wheeler's "Gravitation".)

Consider first the Newtonian theory of gravity. The space-time is $\mathbb{R}^1\times\mathbb{R}^3$ with Galilean symmetry, and gravitational interaction is represented by the gravitational potential $\Phi(t,x)$. In Newtonian theory, the gravitational field is given by minus the gradient of the potential, $\vec{F}_G = - \vec{\nabla} \Phi$ (I will put arrows over symbols to denote the fact that they are three-dimensional vectors, and the derivative symbols should be interpreted in the sense of three-dimensional vector calculus). The force on a particle is given by the product of the gravitational field and the gravitational charge of the particle, $m_G\vec{F}_G$. By Newton's second law, the force is also equal to the product of the inertial mass and the acceleration of the particle, $m_I \vec{a}$. Now, by the principle of equivalence (or the observation that the gravitational charge is equal to the inertial mass), we have that the gravitational field is equal to the acceleration of the particle: $\vec{a} = \vec{F}_G = -\vec{\nabla}\Phi$.

Now consider a particle traveling in the gravitational field in free fall. Write its trajectory in $\mathbb{R}^3$ as $\vec{\xi}(t) = (\xi_1(t), \xi_2(t), \xi_3(t))$. Lifting to the space-time, the world line of the particle is given by $(t,\vec{\xi}(t))$. (For people familiar with General Relativity already: in GR the world-line is usually given as a geodesic with unit speed. Under the 3+1 split in Newtonian theory, "proper time" is not defined, so the natural parametrization is by the global/invariant time.) The velocity vector in the space-time is $(1,\dot{\vec{\xi}}(t))$ and the acceleration vector is $(0,\ddot{\vec{\xi}}(t))$. The Newtonian equation of motion is then described by \begin{equation} \frac{d^2}{dt^2} (t, \vec{\xi}(t)) + (0, \vec{\nabla}\Phi(t, \vec{\xi}(t))) = 0. \end{equation} Now perform an affine change of variables $t \to t(s)$ (affine here means $d^2t/ds^2 = 0$). The chain rule gives $d/ds = (dt/ds) \cdot d/dt$, and hence $d^2/ds^2 = (dt/ds)^2 \cdot d^2/dt^2$ because $dt/ds$ is constant. Writing $\vec{\xi}(s) = \vec{\xi}(t(s))$ by abuse of notation, we obtain \[ \tag{1'} \label{eq:oneprime} \frac{d^2}{ds^2}\left(t(s), \vec{\xi}(s)\right) + \left(0, \vec{\nabla}\Phi(t(s),\vec{\xi}(s))\right) \left(\frac{dt}{ds}\right)^2 = 0. \]

Suppose we now want to "geometrize" \eqref{eq:oneprime}. What is the right equation to compare it to? The parameter $s$ in the equation is an arbitrary affine (re-)parametrization (relative to global time $t$) of the world-line of a particle. The physical situation, a particle in free fall, should under the "geometrization" procedure of the principle of equivalence be identified with a geodesic in space-time. Now let $\nabla$ (N.b. without the arrow on top) denote a torsion-free affine connection on $\mathbb{R}^1\times\mathbb{R}^3$, and $\Gamma^a_{bc}$ its corresponding Christoffel symbols in the natural coordinates. The geodesic equation in affine parametrization should read \begin{equation}\label{eq:two} \frac{d^2 \chi^a}{d\tau^2} + \Gamma^a_{bc}\frac{d \chi^b}{d\tau}\frac{d \chi^c}{d\tau} = 0. \end{equation}

By visual inspection, to identify \eqref{eq:oneprime} and \eqref{eq:two}, we need (here we take the indexing convention $0\leq a,b\leq 3, 1\leq i,j\leq 3$) \begin{equation}\label{eq:three} \Gamma^0_{ab} = \Gamma^a_{bi} = 0 \quad \text{and} \quad \Gamma^i_{00} = \frac{\partial}{\partial x^i} \Phi. \end{equation} (Recall that as the connection is assumed to be torsion free, $\Gamma^a_{bc}$ is symmetric in the $b,c$ indices.) If we impose the further compatibility condition \[\tag{3'}\label{eq:threeprime} \frac{\partial}{\partial x^j}\Gamma^i_{00} = \frac{\partial}{\partial x^i}\Gamma^j_{00} \] then \eqref{eq:three} and \eqref{eq:threeprime} together give a one-to-one (up to a global addition of a constant) identification of Newtonian gravitational potentials with a class of non-trivial affine connections.

Before we proceed, let us first calculate the curvature of this affine connection. Recall the formula for the curvature operator of a connection: \[ R(v,w)u = \nabla_v\nabla_w u - \nabla_w\nabla_v u - \nabla_{[v,w]}u \] Inserting the natural coordinate vector fields (whose commutators vanish), we have \[ R_{abc}{}^d = \partial_a\Gamma^d_{bc} - \partial_b\Gamma^d_{ac} + \Gamma^f_{bc}\Gamma_{af}^d - \Gamma^f_{ac}\Gamma_{bf}^d \] where repeated indices are assumed to be summed. By \eqref{eq:three} and \eqref{eq:threeprime}, the quadratic terms all vanish, and the only non-zero components are \begin{equation}\label{eq:four} R_{i00}{}^j = \partial^2_{ij}\Phi = - R_{0i0}{}^j \end{equation} which implies that the only non-zero contracted curvature is \[ \tag{4'} \label{eq:fourprime} R_{00} = \triangle \Phi =: 4\pi \rho \] where $\rho$ is the mass-density. I paraphrase from Misner-Thorne-Wheeler: The equations \eqref{eq:three}, \eqref{eq:threeprime}, \eqref{eq:four}, \eqref{eq:fourprime}, together with the equation for geodesic motion are "the full content of Newtonian gravity, rewritten in geometric language".
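The curvature computation above can be checked numerically. The sketch below is an illustration under my own assumptions: the sample potential $\Phi = x^2 + 2y^2 + 3z^2 + xy$ (whose Laplacian is the constant $12$), central finite differences in place of exact derivatives, and hypothetical helper names such as `gamma` and `riemann`. It builds the Christoffel symbols from the identification $\Gamma^i_{00} = \partial_i\Phi$ and evaluates the coordinate formula for $R_{abc}{}^d$ directly.

```python
H = 1e-3  # finite-difference step

def phi(p):
    # Sample potential (assumed for illustration); p = [t, x, y, z].
    t, x, y, z = p
    return x * x + 2 * y * y + 3 * z * z + x * y

def partial(f, p, a):
    # Central difference of f along coordinate a at the point p.
    up, dn = list(p), list(p)
    up[a] += H
    dn[a] -= H
    return (f(up) - f(dn)) / (2 * H)

def gamma(p, d, b, c):
    # Gamma^d_{bc}: the only non-zero symbols are Gamma^i_{00} = d_i Phi.
    if b == 0 and c == 0 and d >= 1:
        return partial(phi, p, d)
    return 0.0

def riemann(p, a, b, c, d):
    # R_{abc}^d = d_a Gamma^d_bc - d_b Gamma^d_ac
    #             + Gamma^f_bc Gamma^d_af - Gamma^f_ac Gamma^d_bf
    val = partial(lambda q: gamma(q, d, b, c), p, a)
    val -= partial(lambda q: gamma(q, d, a, c), p, b)
    for f in range(4):
        val += gamma(p, f, b, c) * gamma(p, d, a, f)
        val -= gamma(p, f, a, c) * gamma(p, d, b, f)
    return val

def ricci(p, b, c):
    # Contraction R_{bc} = R_{abc}^a.
    return sum(riemann(p, a, b, c, a) for a in range(4))

point = [0.3, 1.0, -2.0, 0.5]
r00 = ricci(point, 0, 0)           # should be close to the Laplacian of Phi
r_x00_y = riemann(point, 1, 0, 0, 2)  # should be close to d_x d_y Phi
print(r00, r_x00_y)
```

For this potential the time-time Ricci component should come out close to $12$ and the mixed component close to $1$, while the purely spatial Ricci components vanish. A quadratic potential is convenient here because central differences are exact for it up to rounding.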

Galilean change of coordinates

Newtonian gravity should be invariant under the Galilean transformations. These transformations consist of changes of coordinates $(t,\vec{x})\to(t',\vec{x}')$ generated by

  • Temporal translations: $t' = t + t_0$; temporal reflections $t' = - t$.
  • Spatial translations: $\vec{x}' = \vec{x} + \vec{x}_0$; spatial reflections $\vec{x}' = - \vec{x}$.
  • Spatial rotations: $\vec{x}' = A\vec{x}$ where $ A$ is a $3\times 3$ matrix satisfying the condition $AA^T = I$ (in other words $A$ is an orthogonal matrix).
  • Shear transformation: $\vec{x}' = \vec{x} + \vec{v}_0 t$.

The middle two of the above generate the group of symmetries (rigid motions) of Euclidean 3-space, and the first generates the symmetry group of Euclidean 1-space. The last is the one that characterizes the "Galilean aspect".

Let us stop here and think a bit about what it means to have these coordinate changes. Consider a bunch of observers just floating around in space. Each of them carries a stopwatch (since we are in the Newtonian picture with global time, all the stopwatches run at the same rate), and three rulers held mutually perpendicular (we'll also assume that all the rulers are marked the same way). Now put a bunch of particles also in space, and have the observers look at the particles. Now, the observers cannot see the particles from afar. All they can sense is when a particle or another observer gets very close to them (say they are conscious of everything within a foot of them). Every time a particle comes close to an observer, he notes where (relative to his rulers) and when (relative to his stopwatch) the particle enters and leaves his sensory field. And he jots down in his notebook something like

At time 12543.3, a particle appears at coordinates (0,10,0); at time 12548.3, it leaves from coordinates (-10,0,0). Hence it is traveling with velocity vector (-2,-2,0).

Now imagine observers A and B drift close to each other, and a particle C shoots through their region. Afterward, they compare their entries. Okay, the times are different, but that can be attributed to their starting their stopwatches at different times. The duration of the particle's presence is different, but that's probably because the particle stayed closer to observer A, and hence was in his sensory field a smidge longer. The coordinate measurements are different, and that is again due in part to A and B becoming aware of C at different times, and also due in part to the fact that A and B are not holding their rulers in the same directions. But finally, the velocities they computed are different, even after factoring in how the rulers are held: ah, they are moving relative to each other.

Now looking at the picture in space-time, when the observers A, B, and the particle C get close, we can take it to mean that they are physically at the same event $p$ in our space-time. The velocity vector of C is then a vector in the tangent space $T_p(\mathbb{R}^1\times\mathbb{R}^3)$. The fact that A and B get different measurements corresponds to them reading off the coordinates of the velocity vector of C in two different bases of the tangent space: each observer has three vectors that correspond to unit spatial directions, and one vector that corresponds to the unit temporal direction.

The Galilean symmetry puts constraints on how these bases can differ. That there is a global time implies that the span of the spatial directions is always the same, and that the difference between the temporal directions must be in the span of the spatial directions (this is the relative velocity of the two observers). That the observers agree on what constitutes a unit length implies that the spatial directions are just rotations of each other (let's ignore reflections for now; they correspond to discrete, not continuous, symmetries and in the language of gauge theory should be discarded as they are not in the connected component containing the identity).

Another way of arriving at this conclusion would be to look at the infinitesimal transformations in the tangent and co-tangent spaces induced by the Galilean coordinate changes above at a fixed-point of the coordinate change.

We will call a basis of the tangent space that corresponds to an observer an "admissible frame". Each admissible frame has 1 time-like direction and 3 space-like directions. Given two admissible frames $\{ e_t, e_1, e_2, e_3\}$ and $\{e'_t, e'_1,e'_2,e'_3 \}$, the rules of Galilean symmetry imply that there exists a 3-by-3 orthogonal matrix $A_{ij}$ and a 3-vector $v_i$ such that \begin{equation}\label{eq:five} e'_t = e_t + \sum_{i = 1}^3 v_ie_i~,\qquad e'_i = \sum_{j = 1}^3 A_{ij}e_j. \end{equation} A simple calculation shows that if the change of frames $e_\star \to e'_\star$ uses the transformation $(A_{ij}, v_i)$, and if the change $e'_\star \to e''_\star$ uses $(B_{ij}, w_i)$, then the change $e_\star \to e''_\star$ goes through the transformation $((BA)_{ij}, (v + wA)_i)$ where products are matrix products: $(BA)_{ij} = \sum_k B_{ik}A_{kj}$ and $(wA)_i = \sum_k w_kA_{ki}$.

In other words, the set of all "admissible frames" is a $\mathfrak{G}$-torsor for the group $\mathfrak{G} = \mathbb{R}^3\rtimes_\phi SO(3)$, the semi-direct product given by $\phi_A(v) = vA$.
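The composition rule for changes of admissible frames is easy to verify numerically. The following is a minimal sketch under my own assumptions: frames are stored as lists of four vectors in $\mathbb{R}^4$, the starting quartet and the rotation angles are arbitrary sample values, and `transform`, `matmul`, and `row_times` are hypothetical helper names. It applies two frame changes in succession and compares the result with the single change $((BA), v + wA)$.

```python
import math

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(B, A):
    # (BA)_ij = sum_k B_ik A_kj
    return [[sum(B[i][k] * A[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def row_times(w, A):
    # (wA)_i = sum_k w_k A_ki
    return [sum(w[k] * A[k][i] for k in range(3)) for i in range(3)]

def transform(frame, A, v):
    # e'_t = e_t + sum_i v_i e_i,   e'_i = sum_j A_ij e_j
    e_t, e = frame[0], frame[1:]
    new_t = [e_t[m] + sum(v[i] * e[i][m] for i in range(3)) for m in range(4)]
    new_e = [[sum(A[i][j] * e[j][m] for j in range(3)) for m in range(4)]
             for i in range(3)]
    return [new_t] + new_e

# An arbitrary starting quartet (sample values) and two frame changes.
frame = [[1.0, 0.2, -0.1, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.5, 2.0, 0.0],
         [0.0, 0.0, 1.0, 1.5]]
A, v = rot_z(0.7), [1.0, -2.0, 0.5]
B, w = rot_x(-1.2), [0.3, 0.0, 4.0]

two_steps = transform(transform(frame, A, v), B, w)
combined_v = [x + y for x, y in zip(v, row_times(w, A))]
one_step = transform(frame, matmul(B, A), combined_v)

max_diff = max(abs(x - y)
               for p, q in zip(two_steps, one_step)
               for x, y in zip(p, q))
print(max_diff)
```

The two results agree up to rounding. Replacing `row_times(w, A)` by `w` itself breaks the agreement, which shows that the product is genuinely semi-direct rather than direct.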

Structure invariants

The question I want to ask now is this: consider the set of all possible quartets of non-zero tangent vectors. This is a 16 dimensional space. Given an arbitrary quartet, consider the $\mathfrak{G}$-torsor generated by it: this is a 6 dimensional smooth submanifold. We should be able to find a 10-dimensional set of invariants whose joint level surface is precisely our torsor. (A good exercise is to try to work this out for Riemannian geometry, i.e. for the manifold of Euclidean frames in a fixed vector space.)

The requirement that in Galilean symmetry a global time is preserved implies that one can find a co-vector $\epsilon_t$ such that for every admissible frame, $\epsilon_t(e_t) = 1$. A short computation shows that, given an initial basis $\{e_t,e_1,e_2,e_3\}$, the $\epsilon_t$ for the $\mathfrak{G}$-torsor generated is uniquely given by the co-vector satisfying two properties: (a) $\epsilon_t(e_t) = 1$ and (b) $\epsilon_t(e_i) = 0$ if $i = 1,2,3$. Since $\epsilon_t$ is a co-vector, this uses up 4 degrees of freedom. The other invariant is simply the one corresponding to the $SO(3)$ portion of the structure group. Take the symmetric tensor given by $\sigma = e_1\otimes e_1 + e_2\otimes e_2 + e_3\otimes e_3$. Consider the action of an element of $\mathfrak{G}$ on this tensor: \[ \sigma' = \sum_{i = 1}^3 e'_i\otimes e'_i = \sum_{i,j,k = 1}^3 A_{ij}e_j\otimes A_{ik}e_k = \sum_{i,j,k = 1}^3 A^T_{ki}A_{ij}e_j\otimes e_k = \sum_{j,k = 1}^3 \delta_{kj} e_j\otimes e_k = \sigma \] where we used that $A_{ij}$ is an orthogonal matrix. For an arbitrary quartet, the corresponding $\sigma$ is by construction a symmetric tensor with 6 degrees of freedom. Together with the 4 from $\epsilon_t$, this accounts for all 10 degrees of freedom.

Indeed, one can check that

A set of admissible frames is uniquely specified by a co-vector $\epsilon_t$ and a symmetric two-tensor $\sigma$ such that

  • $ \sigma$, as a quadratic form on the space of co-vectors, is positive semi-definite.
  • $ \epsilon_t$ spans the kernel of $ \sigma$.
Proof [outline]:
Since $\sigma$ is positive semi-definite and symmetric, we can construct a basis of eigen-co-vectors $\{ \epsilon_t, \epsilon_1, \epsilon_2,\epsilon_3\}$ where $\epsilon_t$ is as given and $\sigma(\epsilon_i,\epsilon_j) = \delta_{ij}$ if $i,j = 1,2,3$. Let $\{e_t,e_1,e_2,e_3\}$ be the dual basis to $\epsilon_\star$ in the tangent space. It is simple to check that the $\mathfrak{G}$-torsor generated by this basis consists of admissible frames. Next it suffices to check that, with $\epsilon_t$ fixed, any other basis of eigen-co-vectors $\epsilon'_\star$ can be related to the original $\epsilon_\star$ by the dual action of $\mathfrak{G}$, which implies that the associated vector basis $e'_\star$ is related to $e_\star$ by the action of $\mathfrak{G}$.
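The invariance of $\epsilon_t$ and $\sigma$ under the action of $\mathfrak{G}$ can also be checked numerically. This is a small sketch with assumed sample data (the standard basis of $\mathbb{R}^4$ as the starting frame, one rotation about the third axis, and an arbitrary boost vector); the helper names `transform`, `sigma`, and `pair` are hypothetical.

```python
import math

def transform(frame, A, v):
    # e'_t = e_t + sum_i v_i e_i,   e'_i = sum_j A_ij e_j
    e_t, e = frame[0], frame[1:]
    new_t = [e_t[m] + sum(v[i] * e[i][m] for i in range(3)) for m in range(4)]
    new_e = [[sum(A[i][j] * e[j][m] for j in range(3)) for m in range(4)]
             for i in range(3)]
    return [new_t] + new_e

def sigma(frame):
    # Components of sigma = e_1 (x) e_1 + e_2 (x) e_2 + e_3 (x) e_3.
    e = frame[1:]
    return [[sum(e[i][m] * e[i][n] for i in range(3)) for n in range(4)]
            for m in range(4)]

def pair(covector, vector):
    # Natural pairing of a co-vector with a vector.
    return sum(c * x for c, x in zip(covector, vector))

c, s = math.cos(0.9), math.sin(0.9)
A = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]  # sample rotation
v = [0.4, -1.5, 2.0]                               # sample relative velocity

frame = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
eps_t = [1.0, 0.0, 0.0, 0.0]  # dual to e_t, annihilates e_1, e_2, e_3

new_frame = transform(frame, A, v)

# sigma should be unchanged, and eps_t should still read (1, 0, 0, 0)
# on the new frame, even though e'_t itself has changed.
sigma_diff = max(abs(x - y)
                 for r0, r1 in zip(sigma(frame), sigma(new_frame))
                 for x, y in zip(r0, r1))
eps_on_new = [pair(eps_t, e) for e in new_frame]
print(sigma_diff, eps_on_new)
```

Note that the first row of the component array of $\sigma$ is zero, reflecting the fact that $\epsilon_t$ lies in the kernel of $\sigma$.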

Galilean geometry

The problem with the naive formulation given in the first section is that it depends on the fixed 1+3 splitting of the space-time. In particular, the gravitational potential $\Phi$ is obtained by solving the Poisson equation on the three dimensional slice, and is not, a priori speaking, geometric. Furthermore, the statements in \eqref{eq:three} and \eqref{eq:threeprime} relate to a coordinate definition of the connection coefficients, which may or may not change nicely under changes of coordinates. Ideally, we want a fully covariant way of writing the assumptions of the theory.

What we will do now is model our construction on the Cartan way (method of moving frames) of formulating Riemannian geometry. (See the Wikipedia article on Frame Bundles for more information.) Consider an arbitrary 4-dimensional smooth manifold $M$. Associated to the tangent bundle $TM$ is the frame bundle $F_{GL}(M)$ of bases of $TM$. The subscript $GL$ denotes the fact that the most general structure group allows arbitrary changes of bases, which is given by the Lie group $GL(4)$. Any affine connection $\nabla$ can then be described as a $GL(4)$ connection on the principal $GL(4)$-bundle $F_{GL}(M)$. In the case of Riemannian geometry, however, we can restrict to the ortho-normal frame bundle $F_O(M)$ with the structure group given by the orthogonal group $O(4)$ associated to the Riemannian metric. Now, the Riemannian metric is precisely the algebraic invariant for the structure group. A compatible connection must preserve the algebraic invariant across fibers, and so we have that the compatible connection is the Levi-Civita connection under which the Riemannian metric is parallel.

Similarly, to use the local algebraic descriptions developed in the above two sections to extend to an arbitrary smooth manifold, the geometry for a Newtonian gravitational theory should be described by having a connection which preserves the algebraic invariants given in section 3.

A Galilean manifold is the ordered quartet $(M, \epsilon_t, \sigma, \nabla)$ where $M$ is a smooth $(n+1)$-dimensional manifold, $\epsilon_t\in\Gamma T^\star M$ is a one-form, $\sigma\in\Gamma T^{2,0}M$ is a symmetric two-tensor, and $\nabla$ is a torsion-free affine connection such that

  • $\sigma$ is positive semi-definite as a quadratic form on $T^\star M$
  • $\epsilon_t$ spans the kernel of $\sigma$
  • The connection is compatible with the structure invariants, i.e. $\nabla \epsilon_t = 0$ and $\nabla \sigma = 0$.

(N.b. If you compare with the definition given in the paper of Brauer, Rendall, and Reula, whose formalism the authors attribute to Ehlers, what they define as the spatial metric $h$ is my $\sigma$, and their time metric $g$ is equivalent to $\epsilon_t\otimes\epsilon_t$. At this level the two definitions are equivalent.)

So far we have only described the general geometry of space-time. (Compared to general relativity, what we have done so far is define the analogue of Lorentzian manifolds. We have not specified an equation of motion that couples in matter, e.g. an analogue of Einstein's equation.) Let us consider some direct geometric consequences that we have gotten so far. First, the fact that $\epsilon_t$ is parallel, and that $\nabla$ is torsion free, implies that $d\epsilon_t = 0$. But rather than assuming that we have a suitable topology which allows lifting to a global time function and constructing a time-foliation thereof, we'll instead make the observation that $\epsilon_t\wedge d\epsilon_t = 0$ trivially. Therefore, applying Frobenius' theorem, the three-dimensional distribution defined as the kernel of $\epsilon_t$ is integrable. Therefore locally there exists a hypersurface $\Sigma \subset M$ such that $\sigma \in \Gamma T^{2,0}\Sigma$ and defines a Riemannian metric on $\Sigma$. In other words, the underlying geometry already contains an intrinsic 1+3 splitting of the manifold into temporal and spatial directions. Furthermore, we can compute the "extrinsic curvature" of $\Sigma$ in $M$ (this is not exactly the second fundamental form as usually seen in Riemannian geometry per se, but we have some similar notions here).

Recall that the second fundamental form in Riemannian geometry is a measure of how much the ambient parallel transport tends to twist a tangent vector out of its hypersurface. Given that $\Sigma$ is defined by orthogonality to $\epsilon_t$, we can do something similar. Let $V,W\in T_p\Sigma$, and consider the object $\epsilon_t(\nabla_VW)$. If it is non-zero, this means that the parallel transport given by the connection $\nabla$, when acting on $W$ in the $V$ direction, will tend to twist $W$ out of the hypersurface $\Sigma$. And as it happens, for $W$ a vector field tangent to $\Sigma$ and $V$ an arbitrary vector, we have \begin{equation}\label{eq:eight} \epsilon_t(\nabla_VW) = V( \epsilon_t(W) ) - (\nabla_V\epsilon_t)(W) = V(0) - 0 = 0. \end{equation} This in particular implies that $\Sigma$ is always totally geodesic in $M$, and that $\nabla$ restricts to a connection on $T\Sigma$ (without projection!). (This captures the statement in \eqref{eq:three} that $\Gamma^0_{aj} = 0$.) Now, restricting $\nabla, \sigma$ to $\Sigma$, we see that we have an intrinsic connection on a manifold that is torsion free and preserves the Riemannian metric. Therefore the restriction of $\nabla$ to $\Sigma$ coincides with the induced Levi-Civita connection. We summarize our computations in the following

If $(M, \epsilon_t, \sigma, \nabla)$ is a Galilean manifold, then $M$ is foliated by hypersurfaces $\Sigma_\tau$ satisfying

  • $ \epsilon_t(T\Sigma_\tau) = 0$ for any $ \tau$,
  • $ \Sigma_\tau$ is totally geodesic with respect to the connection $ \nabla$, and
  • $ (\Sigma_\tau, \sigma)$ is a Riemannian manifold, whose Levi-Civita connection coincides with the connection induced from $\nabla$.

Curvature and Newton's equations

The remainder of \eqref{eq:three}, as well as equations \eqref{eq:four} and \eqref{eq:fourprime} (it turns out that \eqref{eq:threeprime} is subsumed in the others), as we shall see, will be captured in Newton's equations.

First we recall that the Riemann curvature tensor $R_{abc}{}^d$ is still well-defined once we are given an affine connection. Furthermore, tensor contraction affords us also the Ricci curvature tensor. By total geodesy, $\epsilon_t(R(V,W)U) = 0$ if $V,W,U\in T\Sigma$, and the restriction of the curvature tensor to $\Sigma$ agrees with the intrinsic curvature given by the Riemannian metric $\sigma$. (Essentially a version of the Gauss and Codazzi equations.)

Also, it is important to note that as $\nabla$ is assumed to be torsion free, the first Bianchi identity still holds via the Jacobi identity. That the second Bianchi identity still holds is a general property of affine connections.

A Newton-Cartan theory consists of a Galilean manifold $(M, \epsilon_t, \sigma, \nabla)$ and a symmetric (2,0)-tensor $T[\Psi]$ corresponding to the stress-energy for some matter fields with the requirement that

  • The matter equation $\nabla_aT^{ab} = 0$ is satisfied.
  • The Newtonian gravity coupling (with cosmological constant) $\mathrm{Ric} = [4\pi T(\epsilon_t,\epsilon_t)- \Lambda ] \epsilon_t\otimes \epsilon_t$ holds.
  • The symmetry property holds: $R(\sigma(\cdot),V)W$ is a symmetric (2,0)-tensor for any vectors $V,W$.

The first condition gives the minimally coupled evolution equation for the matter fields. The second condition is a generalization of \eqref{eq:fourprime}, where the "temporal" component of Ricci is taken to be equal to the matter energy density minus cosmological constant. The third condition is a generalization of \eqref{eq:four}.

Let us consider the consequences of the definition: the second condition implies that $\mathrm{Ric}(V,W) = 0$ for any $V,W$ tangent to $\Sigma$. This directly implies that $(\Sigma,\sigma)$ is Ricci-flat. The third condition, by total geodesy, implies the same condition holds for the curvature operator of $\Sigma$. Using the induced metric on $\Sigma$ to raise and lower indices, the condition becomes $R^{(\sigma)}_{ijkl} = R^{(\sigma)}_{ilkj}$. Applying the first Bianchi identity we see that

In a Newton-Cartan theory, the spatial hypersurfaces $\Sigma_\tau$ with the Riemannian structure induced by $\sigma$ are flat.

Symmetries of space-time

Now let's dig into the geometry a bit more. We start by making precise what we mean by a symmetry.

A symmetry of a Galilean manifold $(M, \epsilon_t,\sigma,\nabla)$ is a diffeomorphism $f:M\to M$ that preserves the geometric structures: the pullback $f^\star\epsilon_t = \epsilon_t$, the pushforward $f_\star\sigma = \sigma$, and the pullback connection $f^\star\nabla = \nabla$.

If we have a one-parameter family of symmetries continuously connected to the identity, we can define an infinitesimal symmetry.

An infinitesimal symmetry of a Galilean manifold is a vector field $X$ on $M$ such that $\mathcal{L}_X\epsilon_t = 0$, $\mathcal{L}_X\sigma = 0$, and $[ \mathcal{L}_X, \nabla ] = 0$, where $\mathcal{L}_X$ denotes the Lie derivative by $X$ and square brackets denote the commutator of operators.

Now, by the Cartan relation, $ \mathcal{L}_X\epsilon_t = i_X d\epsilon_t + d(i_X\epsilon_t)$. By assumption $\epsilon_t$ is closed, and this shows that \begin{equation} \label{eq:2:3} \mathcal{L}_X\epsilon_t = 0 \iff \epsilon_t(X) \equiv \text{ constant}. \end{equation} On the other hand, using the definition \[ (\mathcal{L}_X K)^{ab} = (\nabla_XK)^{ab} - K^{cb}\nabla_cX^a - K^{ac}\nabla_cX^b \] and that $\sigma$ is parallel and symmetric, this implies \begin{equation}\label{eq:2:4} \mathcal{L}_X\sigma = 0 \iff \sigma \circ \nabla X \text{ is an anti-symmetric (2,0)-tensor (i.e. a bivector)} \end{equation} Equation \eqref{eq:2:4} is an analogue of Killing's equation.
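To see the analogue of Killing's equation in action, here is a small numerical sketch on flat Galilean space. The setting is an assumption of mine for illustration: coordinates $(t,x,y,z)$ on $\mathbb{R}^4$, $\sigma = \mathrm{diag}(0,1,1,1)$, and the trivial connection, so that covariant derivatives reduce to partial derivatives. The function names are hypothetical. The code measures how far $\sigma\circ\nabla X$ is from being anti-symmetric for a boost, a rotation, and (as a non-example) a spatial dilation.

```python
# sigma^{ab} for flat Galilean space in coordinates (t, x, y, z).
SIGMA = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]

def grad(X, p, h=1e-6):
    # D[c][a] = d_c X^a by central differences (flat connection assumed,
    # so the covariant derivative is just the partial derivative).
    D = []
    for c in range(4):
        up, dn = list(p), list(p)
        up[c] += h
        dn[c] -= h
        Xu, Xd = X(up), X(dn)
        D.append([(Xu[a] - Xd[a]) / (2 * h) for a in range(4)])
    return D

def killing_defect(X, p):
    # K^{ab} = sigma^{cb} d_c X^a; the Lie derivative of sigma vanishes
    # exactly when K is anti-symmetric, i.e. K^{ab} + K^{ba} = 0.
    D = grad(X, p)
    K = [[sum(SIGMA[c][b] * D[c][a] for c in range(4)) for b in range(4)]
         for a in range(4)]
    return max(abs(K[a][b] + K[b][a]) for a in range(4) for b in range(4))

def boost(p):
    # X = t d/dx: generator of x -> x + v t.
    return [0.0, p[0], 0.0, 0.0]

def rotation(p):
    # X = -y d/dx + x d/dy: generator of rotations about the z axis.
    return [0.0, -p[2], p[1], 0.0]

def dilation(p):
    # X = x d/dx + y d/dy + z d/dz: not a symmetry.
    return [0.0, p[1], p[2], p[3]]

point = [0.5, 1.0, -2.0, 3.0]
defects = {name: killing_defect(field, point)
           for name, field in [("boost", boost), ("rotation", rotation),
                               ("dilation", dilation)]}
print(defects)
```

The boost and the rotation have vanishing defect (up to rounding), while the dilation does not. All three fields have $\epsilon_t(X) = 0$, so the condition on $\epsilon_t$ does not distinguish them; it is the condition on $\sigma$ that rules out the dilation.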

Equations \eqref{eq:2:3} and \eqref{eq:2:4} together imply that if $X$ is a symmetry and it is tangent to a spatial hypersurface $\Sigma_\tau$ at one point, then $X$ is always tangent to spatial hypersurfaces and that $X |_{\Sigma_\tau}$ is a Killing vector field. This is also evident from Definition 6, where one immediately sees that by the construction of a Galilean manifold, a symmetry map $f$ will carry hypersurface $ \Sigma_\tau$ to another hypersurface $\Sigma_{\tau'}$.

We can also prove a version of the familiar equation in pseudo-Riemannian geometry relating the Riemann curvature tensor to the second derivative of a Killing vector field (see, for example, Appendix C.3 of Robert Wald's General Relativity). By definition, for arbitrary vector fields $ X,U,V$ \[ [\mathcal{L}_X,\nabla]V(U) = [X,\nabla_UV] - \nabla_{[X,U]}V - \nabla_U[X,V] \] where on the right hand side the square-brackets now denote the Lie bracket. Using that $\nabla$ is torsion free, we re-group the terms and find that \begin{equation}\label{eq:2:5} [\mathcal{L}_X,\nabla] = 0 \iff R(X,U)V = \nabla_{\nabla_UV}X - \nabla_U\nabla_VX =: -\nabla^2_{U,V}X \end{equation} and thus, as in the pseudo-Riemannian case, a symmetry vector field is uniquely determined by its value and the value of its covariant derivatives at one point.

Volume form and Stokes theorem

First recall Stokes' theorem from differential topology: given an orientable smooth $k$-dimensional manifold $N$ with boundary $\partial N$, and given a $(k-1)$-form $\eta$, Stokes' theorem tells us that \[ \int_N d\eta = \int_{\partial N} \eta \] where on the right hand side we use the induced orientation on $\partial N$. From this, we can derive a dual version: the divergence theorem. Now let $\eta$ denote, instead, a volume form on $N$ (remember that the set of volume forms is a torsor over the multiplicative group of positive smooth functions; there are infinitely many to choose from). Consider a vector field $X$ on $N$; then the interior product $i_X\eta$ is a $(k-1)$-form. Apply Stokes' theorem to this form, and we have \[ \int_{\partial N} i_X\eta = \int_N d(i_X\eta) = \int_N(\mathcal{L}_X\eta - i_X d\eta)\] where in the last equality we used the Cartan relation again. Since $\eta$ is a top form, it is automatically closed, so the integrand on the far right simplifies to $\mathcal{L}_X\eta$. Now, since $\Lambda^k(T^\star N)$ is one-dimensional over the ring of smooth functions and $\eta$ is nowhere vanishing, we have that $\mathcal{L}_X\eta = F\eta$ for some smooth function $F$. This way we define the divergence of the vector field $X$:

Let $ N$ be a smooth manifold and $ \eta$ a volume form. For a given vector field $ X$, the divergence of $ X$ relative to $ \eta$, written $ (\mathop{div}_\eta X)$ is the unique function given by $ \mathcal{L}_X\eta = (\mathop{div}_\eta X)\eta$.

Notice that the definition does not depend on a connection! It does, however, depend on the choice of volume form. Let $\eta' = e^u\eta$ be another volume form. Then a computation shows that \[ \mathcal{L}_X\eta' = X(e^u)\eta + e^u\mathcal{L}_X\eta = (Xu + \mathop{div}_\eta X)\eta'. \]
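The dependence of the divergence on the volume form can be illustrated numerically. In the sketch below, everything concrete is an assumption chosen for illustration: the manifold is $\mathbb{R}^2$ with $\eta = dx\wedge dy$, the function $u$ and the vector field $X$ are arbitrary samples, and derivatives are approximated by central differences. For $\eta' = e^u\eta$ the divergence is computed from $d(i_X\eta')$, which in these coordinates reads $e^{-u}\partial_i(e^uX^i)$, and is then compared with $Xu + \mathop{div}_\eta X$.

```python
import math

H = 1e-5  # finite-difference step

def u(x, y):
    # Sample conformal factor exponent (assumed for illustration).
    return math.sin(x) + x * y

def X(x, y):
    # Sample vector field (X^x, X^y).
    return (x * x - y, math.cos(y) + x)

def d_dx(f, x, y):
    return (f(x + H, y) - f(x - H, y)) / (2 * H)

def d_dy(f, x, y):
    return (f(x, y + H) - f(x, y - H)) / (2 * H)

def div_eta(x, y):
    # Divergence relative to eta = dx ^ dy.
    return (d_dx(lambda a, b: X(a, b)[0], x, y)
            + d_dy(lambda a, b: X(a, b)[1], x, y))

def div_eta_prime(x, y):
    # Divergence relative to eta' = e^u eta, read off from d(i_X eta').
    def flux_x(a, b):
        return math.exp(u(a, b)) * X(a, b)[0]
    def flux_y(a, b):
        return math.exp(u(a, b)) * X(a, b)[1]
    return (d_dx(flux_x, x, y) + d_dy(flux_y, x, y)) / math.exp(u(x, y))

def X_applied_to_u(x, y):
    # The directional derivative Xu.
    Xx, Xy = X(x, y)
    return Xx * d_dx(u, x, y) + Xy * d_dy(u, x, y)

x0, y0 = 0.3, -0.7
gap = abs(div_eta_prime(x0, y0) - (X_applied_to_u(x0, y0) + div_eta(x0, y0)))
print(gap)
```

The printed gap should be at the level of finite-difference error. The two divergences themselves differ by $Xu$, which is generally not small.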

On the other hand, we can also define the object $\nabla_a X^a = c(\nabla X)$, the tensor contraction of the covariant derivative of $X$. This function, for a fixed vector field, only depends on the choice of connection. Now, in 3D vector calculus, the two definitions are equal. In general, we have that

Let a torsion-free connection $\nabla$ and a volume form $\eta$ be given. Then the following two conditions are equivalent

  1. $ \nabla\eta = 0$
  2. $ c(\nabla X) = \mathop{div}_\eta X$ for all vector fields $ X$.
Let $e_1,\ldots,e_k \in TN$. A direct calculation shows that \[ \mathcal{L}_X\eta(e_1,\ldots,e_k) = (\nabla_X\eta)(e_1,\ldots,e_k) + \sum_{j = 1}^k \eta(e_1,\ldots, e_{j-1},\nabla_{e_j}X, e_{j+1},\ldots e_k) \] Now if $\{e_j\}$ are linearly dependent, then both sides evaluate simply to zero. So assume $\{e_j\}$ form a basis. Now $\nabla X\in T^{1,1}N$ can be regarded as a linear map from vector fields to vector fields. In the basis we have, it can be represented by some matrix $A_{ij}$ so that $\nabla_{e_j}X = \sum_m A_{jm}e_m$. Plugging this expression into the above equation, we notice that if $j\neq m$, then by linearity ($A_{jm}$ is just a real number) and anti-symmetry, the form evaluates to zero. So we can simplify \[ \mathcal{L}_X\eta(e_1,\ldots,e_k) = (\nabla_X\eta)(e_1,\ldots,e_k) + (\mathop{Tr} A)\eta(e_1,\ldots,e_k). \] Lastly, observing that the trace of a linear operator is basis independent, and in fact is, by definition, the tensor contraction, we have that \[ (\mathop{div}_\eta X - c(\nabla X))\eta = \nabla_X\eta \] and the lemma follows.

Observe that this Lemma is why the divergence theorem is stated so simply in pseudo-Riemannian geometry: the metric-induced volume form is by definition parallel relative to the Levi-Civita connection.

Now we specialise to Galilean geometry.

Let $(M,\epsilon_t,\sigma,\nabla)$ be a Galilean $(n+1)$-dimensional manifold, and assume that $M$ is orientable. Then there exists a preferred volume form $\mathrm{Vol}$ with the property that $\nabla \mathrm{Vol} = 0$.

The form $ \mathrm{Vol}$ will be, morally speaking, given by $\epsilon_t\wedge \mu_\sigma$, where $ \mu_\sigma$ is any $ n$-form such that its restriction to the spatial hypersurfaces $ \Sigma_\tau$ agrees with the volume form given by the induced Riemannian metric. We show such a volume form exists. Essential uniqueness follows from the fact that $\nabla (f\mathrm{Vol}) = \nabla f \mathrm{Vol} + f \nabla\mathrm{Vol}$, so that any other parallel volume form must be a constant multiple of the one we construct.

It suffices to show that this volume form exists locally. Suppose we defined such forms on two overlapping neighborhoods $U,V$. That the two forms both restrict to $\Sigma_\tau\cap U\cap V$ as the induced volume form by the Riemannian metric means that we have a preferred normalisation by construction, and hence on $U\cap V$ the two forms agree. A partition of unity argument then allows us to patch the local definitions to a global one. Now, fix a local neighborhood $U$ and pick a co-frame $\{\epsilon_0,\epsilon_1,\ldots, \epsilon_n\}$ where $\epsilon_0 = \epsilon_t$ and $\sigma(\epsilon_i,\epsilon_j) = \delta_{ij}$ if $1\leq i,j\leq n$. The connection can be expressed in the method of moving frames \[ \begin{gathered} \nabla \epsilon_0 = 0 \newline \nabla \epsilon_i = \sum_j \omega^j_i\otimes\epsilon_j + \omega^0_i\otimes\epsilon_0 \end{gathered} \] where the $\omega^\star_\star$ are real-valued one-forms called the rotation coefficients. Now, we use the fact that $\sigma$ is parallel \[ 0 = \nabla[\sigma(\epsilon_i,\epsilon_j)] = \sigma(\epsilon_i,\nabla\epsilon_j) + \sigma(\nabla\epsilon_i,\epsilon_j) = \omega^i_j + \omega^j_i\] (the $\omega^0_\star$ terms drop out because $\epsilon_0 = \epsilon_t$ lies in the kernel of $\sigma$), so the $\omega^\star_\star$ are anti-symmetric in the spatial indices.

Now consider the top form $\epsilon_0\wedge \cdots \wedge\epsilon_n$. Its covariant derivative is, after substituting in the rotation coefficients, \begin{multline} \nabla (\epsilon_0\wedge \cdots \wedge\epsilon_n) = \newline \sum_{j = 1}^n (-1)^j \Bigl[ \sum_{k = 1}^n \omega^k_j\otimes (\epsilon_k\wedge \epsilon_0 \wedge \cdots \hat{\epsilon}_j \cdots \wedge \epsilon_n) \newline + \omega^0_j\otimes (\epsilon_0\wedge \epsilon_0 \wedge \cdots \hat{\epsilon}_j \cdots \wedge \epsilon_n) \Bigr] \qquad \end{multline} where the hat above a co-frame component means that the component is omitted. It is simple to see that many of the terms drop out by anti-symmetry, and in the end we are left with \[ \nabla(\epsilon_0\wedge \cdots \wedge\epsilon_n) = \sum_{j=1}^{n} \omega^j_j \otimes (\epsilon_0\wedge \cdots \wedge\epsilon_n). \] Now, using the anti-symmetry of the indices of $\omega^\star_\star$, we see that the right hand side vanishes identically. Therefore $\mathrm{Vol} = \epsilon_0\wedge \cdots \wedge\epsilon_n$ is parallel. To finish the proof we need to show that $\mu_\sigma = \epsilon_1\wedge \cdots \wedge \epsilon_n$ restricts to the volume form on $\Sigma_\tau$ (without a fixed normalisation, we cannot use the partition of unity argument unless $M$ is simply connected). But this follows from the definition: by construction $\sum \epsilon_i\otimes\epsilon_i |\Sigma_\tau$ is precisely the induced Riemannian metric.

Now, combining Proposition 10, Lemma 9, and the Stokes theorem, we see the following fact (assuming all the integrals converge):

Let $X$ be a vector field on a Galilean manifold $(M,\epsilon_t,\sigma,\nabla)$, and let $\mathrm{Vol}$ be as defined in Proposition 10. Let $U$ be a region in $M$ such that its boundary is the disjoint union of $\Sigma_\tau$ and $\Sigma_{\tau'}$ with the latter "to the future" of the former. Then the divergence theorem takes on the form \[ \int_U c(\nabla X) \mathrm{Vol} = \int_{\Sigma_{\tau'}} \epsilon_t(X) \mu_\sigma - \int_{\Sigma_{\tau}} \epsilon_t(X) \mu_\sigma \] where $\mu_\sigma$ is the volume form associated to the induced Riemannian metric on spatial slices.

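The right hand side of the above equation follows because the restriction $\epsilon_t | \Sigma_\tau = 0$: writing $\mathrm{Vol} = \epsilon_t\wedge\mu_\sigma$, the interior product $\iota_X\mathrm{Vol} = \epsilon_t(X)\mu_\sigma - \epsilon_t\wedge \iota_X\mu_\sigma$ restricts to $\epsilon_t(X)\mu_\sigma$ on each $\Sigma_\tau$.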

In particular, the above proposition implies

Corollary    [Conservation of energy]
Given a solution of the Newton-Cartan system, the total matter energy $E = \int_\Sigma T(\epsilon_t,\epsilon_t) \mu_\sigma$ is conserved.
Apply Proposition 11 to $X = T(\epsilon_t, \cdot)$ and note that $ c(\nabla X) = 0$ using that $ \epsilon_t$ is parallel and that the stress-energy tensor is divergence free.
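To spell out the computation behind this proof, here is a sketch in abstract index notation. The index placement is an assumption made for illustration (the text above only uses the contraction $c$), and $T$ is taken to be symmetric. Writing $X^\beta = T^{\alpha\beta}(\epsilon_t)_\alpha$, the Leibniz rule gives \[ c(\nabla X) = \nabla_\beta X^\beta = (\nabla_\beta T^{\alpha\beta})(\epsilon_t)_\alpha + T^{\alpha\beta}\nabla_\beta(\epsilon_t)_\alpha = 0, \] where the first term vanishes because the stress-energy tensor is divergence free and the second because $\epsilon_t$ is parallel. Since $\epsilon_t(X) = T(\epsilon_t,\epsilon_t)$, the divergence theorem above then yields \[ \int_{\Sigma_{\tau'}} T(\epsilon_t,\epsilon_t)\mu_\sigma = \int_{\Sigma_{\tau}} T(\epsilon_t,\epsilon_t)\mu_\sigma. \]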

Matter Model: non-interacting particles

As a simple example of a physical theory on a Galilean manifold, let us consider the physics of a collection of massive particles that do not interact except for their gravitational interaction. In other words, let us consider a collisionless kinetic theory coupled to Newtonian gravity.

The Vlasov model

The Vlasov system is a transport equation describing the free flow of collisionless particles. Let $(M,\nabla)$ be a manifold with an affine connection that represents the space-time. We postulate Newton's first law:

Physical Assumption    [Newton's First Law]
The motion of a free particle is geodesic.

Therefore the motion of a free particle is described by the following system of equations. Let $\tau$ denote proper time as experienced by the particle, and $\gamma(\tau)$ the world-line of the particle (its spacetime trajectory) parametrized by $\tau$; then we have the hyperbolic system of equations

\begin{equation}\label{eq:3:2a} (\frac{d}{d\tau}\gamma)(\tau) = V\circ\gamma(\tau) \in T\gamma \subset TM \end{equation} and \begin{equation}\label{eq:3:2b} \frac{d}{d\tau}(V\circ\gamma) (= \frac{d^2}{d\tau^2}\gamma) = \nabla_VV = 0. \end{equation}

Now let us try to generalize this to a collection of particles. In fact, we'll take the continuum limit. Instead of looking at a single particle as a single trajectory in phase space (which is equal to $TM$), we'll consider an entire collection as a distribution on phase space. First, recall that the tangent bundle $TM$ of any manifold is automatically orientable; this allows us to choose a volume form $\varpi$ on $TM$. Relative to this volume form, we will describe the particle distribution by a function $f: TM\to [0,\infty)$, representing the phase-space density of the particles. Given a point $(p,V)\in TM$ and a neighborhood $N$ thereof, the number of particles with phase-space parameters in $N$ is given by $\int_N f \varpi$. So we interpret $f(p,V)$ as the relative density of particles sitting at point $p$ with velocity $V$.

The evolution of the function $f$ can be described in the language of a geodesic spray. The geodesic equations \eqref{eq:3:2a} and \eqref{eq:3:2b} allow us to define a mapping $\Psi_t: TM \to TM$ (see footnote 2) sending $(p,V)$ to the solution of the geodesic equation evaluated at time $t$, with $\Psi_0$ being the identity map. The generator of this family of diffeomorphisms at $t = 0$ is some vector field $\mathscr{X}$ over $TM$, which we call the geodesic spray associated to the connection $\nabla$.

Physically, if all particles are to move in accordance with Newton's first law, then we expect the particle distribution to be invariant under the geodesic flow. In other words, we expect the function $f$ to satisfy \begin{equation}\label{eq:3:4} \mathcal{L}_{\mathscr{X}} (f \varpi) = 0 \end{equation} Equation \eqref{eq:3:4} is the free Vlasov equation on the fixed spacetime $(M,\nabla)$.

Let's quickly return to the issue of the volume form $\varpi$ on $TM$. By Theorem 9 in our companion post, we see that when $(M,\nabla, \omega)$ is equi-affine, there is a canonical choice of a volume form $\varpi$ on $TM$ such that $\mathcal{L}_{\mathscr{X}} \varpi = 0$. In this setting \eqref{eq:3:4} reduces to the statement that $\mathscr{X}(f) = 0$.
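In local coordinates the transport equation can be written out explicitly. As a sketch, assume induced coordinates $(x^\alpha, V^\alpha)$ on $TM$ (this notation is introduced only for this illustration) and the Christoffel symbol convention $\Gamma_{\alpha\beta}^\gamma \partial_\gamma = \nabla_{\partial_\alpha} \partial_\beta$ used later in this post. The geodesic equations \eqref{eq:3:2a} and \eqref{eq:3:2b} then read $\frac{d}{d\tau}x^\alpha = V^\alpha$ and $\frac{d}{d\tau}V^\gamma = -\Gamma^\gamma_{\alpha\beta}V^\alpha V^\beta$, so that the geodesic spray is \[ \mathscr{X} = V^\alpha \frac{\partial}{\partial x^\alpha} - \Gamma^\gamma_{\alpha\beta}V^\alpha V^\beta \frac{\partial}{\partial V^\gamma} \] and the statement $\mathscr{X}(f) = 0$ becomes the first-order linear equation \[ V^\alpha \frac{\partial f}{\partial x^\alpha} - \Gamma^\gamma_{\alpha\beta}V^\alpha V^\beta \frac{\partial f}{\partial V^\gamma} = 0. \]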

Coupling to Galilean geometry

Now we make another physical assumption, assuming we are now working with a Galilean manifold $(M,\epsilon_t,\sigma,\nabla)$.

Physical Assumption  
Time flows equally for all particles. In other words, the only allowed velocities are those for which $ \epsilon_t(V) = +1$.

(N.b. In Einstein-Vlasov theory, where the equations are coupled to general relativity, the assumption is that particle velocity is described by a future-pointing, time-like vector with unit norm, which says that in the rest frame time flows the same way.)

The set of valid velocities we will denote by $\mathcal{V}\subset TM$, which we call the mass shell. Then the Vlasov equation on a Galilean manifold with velocities restricted to the mass shell is hyperbolic relative to the spatial hypersurfaces $\Sigma_\tau$. Let $\pi: TM\to M$ be the canonical projection map, and $\iota: \mathcal{V}\hookrightarrow TM$ the canonical inclusion, and take $\tilde\Sigma_\tau := (\pi\circ\iota)^{-1}\Sigma_\tau$. Now, given an arbitrary admissible velocity $V$ at some point $p$, we develop the geodesic $\gamma$ associated to it. But now observe that since $\nabla\epsilon_t = 0$, we have that $\epsilon_t(\dot\gamma) = \epsilon_t(V) = 1$ along $\gamma$. Therefore the geodesic flow carries $\mathcal{V}$ to itself, and this implies that $\mathscr{X}$ is tangent to $\mathcal{V}$. (N.b. in the Einstein-Vlasov case, the analogous mass shell is also preserved under geodesic flow.) So by an abuse of notation we will use $\mathscr{X}$ to also denote the vector field on $\mathcal{V}$ that generates the Vlasov evolution. Now, it is clear from the definition of $\mathcal{V}$ that $\mathscr{X}$ cannot be tangent to $\tilde\Sigma_\tau$ (this is somewhat related to Lemma 3 above); therefore the first-order transport equation $\mathscr{X}f = 0$ is locally well-posed, and the Vlasov system on the mass shell is thus a hyperbolic system.

In fact, the Vlasov system on $ \mathcal{V}$ is integrable, since it merely states that $f$ is constant along the integral curves of $\mathscr{X}$.
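Both facts (that the geodesic flow preserves the mass shell, and that $f$ is constant along the integral curves of $\mathscr{X}$) can be illustrated numerically. The following is only a minimal sketch and not part of the argument above. It assumes adapted coordinates in which the only non-vanishing Christoffel symbols are $\Gamma^i_{00} = \partial_i\Phi$ (the convention of Section 12.1 of Misner, Thorne, and Wheeler), and a hypothetical potential $\Phi = |x|^2/2$ chosen purely because its geodesics are known in closed form. All function names are made up for the illustration.

```python
import math


def grad_phi(x):
    # Hypothetical potential Phi(x) = |x|^2 / 2 (an assumption), so grad Phi = x.
    return list(x)


def spray(state, n):
    """Geodesic spray at a point of TM, in coordinates (x^0..x^n, V^0..V^n)."""
    pos, vel = state[:n + 1], state[n + 1:]
    g = grad_phi(pos[1:])
    # Assumed connection: only Gamma^i_{00} = d_i Phi is non-zero, hence
    # dV^0/ds = 0 and dV^i/ds = -(d_i Phi) (V^0)^2.
    acc = [0.0] + [-gi * vel[0] ** 2 for gi in g]
    return vel + acc  # list concatenation: (dx/ds, dV/ds)


def rk4_step(state, h, n):
    """One classical Runge-Kutta step of size h along the spray."""
    def shifted(k, c):
        return [s + c * ki for s, ki in zip(state, k)]
    k1 = spray(state, n)
    k2 = spray(shifted(k1, h / 2), n)
    k3 = spray(shifted(k2, h / 2), n)
    k4 = spray(shifted(k3, h), n)
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def flow(state, s_total, steps=1000):
    """Approximate the geodesic flow Psi_s applied to a point of TM."""
    n = len(state) // 2 - 1
    h = s_total / steps
    for _ in range(steps):
        state = rk4_step(state, h, n)
    return state


def transported_density(f0, state, s):
    """Solve X(f) = 0 by tracing the integral curve back by parameter s."""
    return f0(flow(state, -s))


# Usage: one spatial dimension, start at x = 1 on the mass shell (V^0 = 1).
start = [0.0, 1.0, 1.0, 0.0]          # (x^0, x^1, V^0, V^1)
end = flow(start, 2.0)
print("V^0 after the flow:", end[2])  # stays on the mass shell
```

Because $dV^0/ds = 0$ for this connection, the component $V^0 = \epsilon_t(V)$ is carried along unchanged, which is the coordinate version of the statement that $\mathscr{X}$ is tangent to $\mathcal{V}$. The helper `transported_density` evaluates the initial density at the foot of the integral curve, which is all that "constant along the integral curves" asks for.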

Gravitational coupling

Now we need to fully couple to the Newton-Cartan theory. In view of Definition 4 above, to define how the particles produce gravity, it suffices that we write down a stress-energy tensor $T[f]$. For this I'll just appeal to physical intuition and not explain too deeply why this tensor is the appropriate one. To start with, we'll take the stress-energy tensor associated to a point-particle of mass $m$ and space-time velocity $V$ to be $ \frac12 m V\otimes V$.

Now, our distribution function $f$ is a function on $\mathcal{V}$, but we need to extract from it a tensor on $M$. To do so we need to somehow "integrate out the fibre". Let $\mathcal{V}_p$ denote the fibre of $\mathcal{V}$ over a point $p\in M$. While $\mathcal{V}$ is not a vector bundle, $\mathcal{V}_p$ has the structure of an affine space. Indeed, we observe that, given any vector $Y\in T_p\Sigma_\tau$ and $V\in \mathcal{V}_p$, we have that $\epsilon_t(Y+V) = 0 + 1 = 1$ and so $Y+V$ is also in $\mathcal{V}_p$. Now, using that $\sigma$ induces on $\Sigma_\tau$ a Riemannian structure, we see that $\mathcal{V}_p$ admits a flat, Euclidean metric. This metric induces a volume form $dvol(\mathcal{V}_p)$. The key point is that this volume form is invariant under local symmetry transformations, and hence we can use it to define integrals on $\mathcal{V}_p$. (N.b. in the Einstein-Vlasov case, the mass shell acquires the structure of hyperbolic space.)

(In terms of the language developed earlier, $\mathcal{V}_p$ is $n$ dimensional. The parallel volume form $\omega =\epsilon_t\wedge \mu_\sigma$, however, has degree $n+1$. The volume form on $\mathcal{V}_p$ is simply the vertical lift of $\iota_V \omega$.)

So now we can write down the stress-energy tensor \begin{equation}\label{eq:3:7} T[f](p) = \frac12 \int_{\mathcal{V}_p} V\otimes V f(p,V) dvol(\mathcal{V}_p) \end{equation} which is indeed divergence free.
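In adapted coordinates the components of \eqref{eq:3:7} are easy to read off. As a sketch, assume coordinates $(x^0 = \tau, x^1, \ldots, x^n)$ with $\epsilon_t = dx^0$ and with $\partial_1,\ldots,\partial_n$ orthonormal at $p$ for the induced metric on $\Sigma_\tau$ (the symbol $v$ below is introduced only for this illustration). Every $V\in\mathcal{V}_p$ can then be written as $V = \partial_0 + v^i\partial_i$ with $v\in\mathbb{R}^n$, the volume form $dvol(\mathcal{V}_p)$ becomes the Lebesgue measure $dv$, and \[ T^{00} = \frac12\int_{\mathbb{R}^n} f\, dv~,\quad T^{0i} = \frac12\int_{\mathbb{R}^n} v^i f\, dv~,\quad T^{ij} = \frac12\int_{\mathbb{R}^n} v^i v^j f\, dv. \] In particular $T(\epsilon_t,\epsilon_t) = T^{00}$ is one half of the spatial particle density $\int f\, dv$ at $p$.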

Now, using the fact that each $\Sigma_\tau$ is flat, we can pick a geodesic coordinate system around a point $p$ in the following way. Locally, since $\epsilon_t$ is closed, we can lift to a time function $\tau$ such that the $\Sigma_\tau$ are its level sets. Now, pick $V\in \mathcal{V}_p$, and let $\gamma_0$ denote its associated geodesic. (Note that $\nabla_{\dot\gamma_0}\tau = 1$ by construction.) We'll call the point $\gamma_0\cap \Sigma_\tau$ the "origin" of each slice. Now, at $p$ choose an orthonormal basis $e_i$ of $T_p\Sigma_\tau$. Extend $e_i$ to vector fields along $\gamma_0$ by parallel transport. Now using that each spatial slice is flat, we can locally build a coordinate system $(\tau = x^0, x^1, \ldots, x^n)$ such that the induced metric on each $\Sigma_\tau$ is Euclidean, and such that $\partial_i |_{\gamma_0} = e_i$.

Now, recall that the Christoffel symbols are defined by $\Gamma_{\alpha\beta}^\gamma \partial_\gamma = \nabla_{\partial_\alpha} \partial_\beta$. That each $\Sigma$ is flat and totally geodesic implies $\Gamma_{ij}^\gamma = 0$. Using that $\epsilon_t(\partial_0) = 1$ is constant and that $\epsilon_t$ is parallel, we have $\Gamma_{\alpha 0}^0 = 0$. A direct computation then yields \begin{align} \mathrm{Ric}_{00} &= \partial_i\Gamma^i_{00} - \partial_0\Gamma^i_{i0} - \Gamma_{i0}^j\Gamma_{0j}^i \newline \mathrm{Ric}_{i0} &= \partial_j\Gamma^j_{i0} - \partial_i\Gamma^j_{j0} \end{align} Evaluating at the point $p$, where by construction we have $\Gamma^i_{j0} = \partial_0\Gamma^i_{j0} = 0$, the last two terms of the first equation drop out. Now, using also that $\Gamma_{00}^\alpha = 0$ along $\gamma_0$, we have that $\partial_i\Gamma^{j}_{00} = \partial_j\Gamma^i_{00}$. So writing $E^i = \Gamma^i_{00}$, we see that the gravitational coupling formally reduces to the equations \begin{equation} \mathop{curl}_{\Sigma} E = 0~,\quad \mathop{div}_{\Sigma} E = 2\pi \int_{\mathcal{V}_p} f(p,V)dvol - \Lambda \end{equation} which, in the case where there is no cosmological constant, is of the same form as the Vlasov-Poisson equations. (In the special case where our initial data admits a vector field $v_0$ along $\Sigma_0$ with $\epsilon_t(v_0) = 1$ and such that for any $w\in T\Sigma_0$ we have $\nabla_w v_0 = 0$, we can actually make this precise.)
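Written out, the formal system looks as follows. This is a sketch under the same formal reduction as above: it assumes the adapted coordinates just constructed, writes admissible velocities as $V = \partial_0 + v^i\partial_i$ (the symbol $v$ is introduced only here), and drops the coefficients $\Gamma^i_{j0}$ from the geodesic equation, which vanish along $\gamma_0$ by construction but not necessarily elsewhere. \begin{align} \partial_\tau f + v^i\partial_{x^i} f - E^i \partial_{v^i} f &= 0 \newline \mathop{curl}_{\Sigma} E &= 0 \newline \mathop{div}_{\Sigma} E &= 2\pi\int_{\mathbb{R}^n} f\, dv - \Lambda \end{align} The curl-free condition means that locally $E = \vec{\nabla}\Phi$ for some potential $\Phi$, so that the last equation becomes the Poisson equation $\Delta\Phi = 2\pi\int f\, dv - \Lambda$, and the acceleration $-E$ in the transport equation is minus the gradient of the potential, as in the Newtonian theory recalled at the start of this post.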

  1. The Proposition below is discussed in more detail in a companion post to this one, concerning parallel volume forms in affine differential geometry.
  2. Technically, for $t\neq 0$ the mapping may not be defined on all of $TM$, when $(M,\nabla)$ is not geodesically complete. Over compact subsets $K \Subset TM$ this mapping is however well-defined (mapping into $TM$) for all sufficiently small $t$, and therefore the corresponding infinitesimal transformation is still well-defined. We will throughout ignore this technical point.
Willie WY Wong