*Remark: I wrote this note mainly as a way to put thought to paper and make sure I understand the basic notions of curvature and torsion of a general linear connection on the tangent bundle. The intended audience is therefore myself. I'll however be happy to answer any questions if they arise.*

## Basics / Review

Let's start with a smooth manifold \(M\).
The tangent bundle \(TM\) attaches to each \(p\in M\) a vector space \(T_p M\) representing the tangent directions.
One of the fundamental starting points of differential geometry is the fact that, while for two points \(p,q\in M\) the tangent spaces \(T_p M\) and \(T_q M\) are isomorphic as vector spaces, there is no canonical isomorphism between them.
In order to do analysis (calculus) on a manifold, specifically concerning "how vector fields change", we need to prescribe how to compare tangent spaces at different points.
This is done through the notion of a *connection* (or equivalently, a *parallel transport*).

Geometry, then, is the study of how this *connection* differs, in various properties, from what we expect from the classical Euclidean geometry.

## Probing geometry with loops

One way to probe the difference between geometries is to examine the behavior of "small loops".

First fix a base point \(p\in M\).
Let \(\gamma:[0,1] \to M\) be a curve that begins *and* ends at \(p\).
We wish to compare the "shape" of the loop \(\gamma\) with a "similar loop" in Euclidean geometry.
One way to effect this comparison is by building a "loop" in a Euclidean space using the parallel transport.

Denote by \(\phi_s\), for \(s\in [0,1]\), the parallel transport (along \(\gamma\)) of the tangent space \(T_p M\) to \(T_{\gamma(s)} M\). This map is an invertible linear operator between vector spaces. Now, the "velocity" of the curve, \(\dot{\gamma}(s)\), is an element of the tangent space \(T_{\gamma(s)} M\). So pulling back with the parallel transport operator we get a function \(b: [0,1] \to T_p M\), given by \[ b(s) := \phi_s^{-1} \dot{\gamma}(s).\]

One way to interpret this function \(b(s)\) is to think in terms of a "compass". Parallel transport allows us to "carry a frame" with us when we travel along a curve. This is analogous to carrying a compass; except, in our setting, not only is there no concept of "true north", but how the compass needle moves also depends on which path we take! For a traveller without an absolute notion of direction, the best we can record is our speed and heading (as measured relative to the compass that we are carrying with us). This recording is the function \(b(s)\).

Given \(b(s)\), we can now reconstruct a path, which we will call \(g(s)\), within the tangent space \(T_p M\), using the formula
\[ g(s) := \int_0^s b(\tau) ~\mathrm{d}\tau.\]
Since \(T_p M\) is a linear space we can *canonically* identify its tangent space at every point with itself; and we see that the curve \(g(s)\) has the property that its speed and heading are given by \(b(s)\).
In other words, \(g(s)\) is the reconstruction of the path \(\gamma(s)\) in a space with Euclidean geometry.

*While \(\gamma\) is a loop, \(g\) is not necessarily so!*

In fact, this is what distinguishes the manifold \(M\) with its connection from a Euclidean space. To examine this in more detail it is convenient to introduce a coordinate system on \(M\) so we can do computations.

## Coordinate representation

Fix now a coordinate system in a neighborhood of the point \(p\), which we denote by \((x^\mu)\). For convenience we will assume that the origin of our coordinate system is exactly the point \(p\). The vector fields corresponding to partial differentiation \( (\partial_\mu)\) give a basis of the tangent spaces. With respect to this basis, we can write a matrix valued function \(A^\mu_\nu(s)\) defined as the coordinate representation of the linear map given by the inverse parallel transport (here \(\gamma^\nu\) is the \(x^\nu\) component of the curve \(\gamma\)): \[ (\phi_s^{-1} \dot{\gamma}(s))^\mu = A^\mu_\nu(s) \dot{\gamma}^\nu(s).\] In this coordinate system we have then \[\begin{aligned} g^\mu(1) &= \int_0^1 A^\mu_\nu(s) \dot{\gamma}^\nu(s) ~\mathrm{d}s \newline & = A^\mu_\nu(1) \gamma^\nu(1) - A^\mu_\nu(0) \gamma^{\nu}(0) - \int_0^1 \dot{A}^\mu_\nu(s) \gamma^\nu(s) ~\mathrm{d}s \end{aligned}\] by integrating by parts. Since \(\gamma(0) =\gamma(1) = p\) is the origin of the coordinate system, we have that \(\gamma^{\nu}(0) = \gamma^{\nu}(1) = 0\), and the first two terms in the last line of the above expression drop out. Note also that by definition \(g(0) = 0\), and that \(A^\mu_\nu(0) = \mathrm{Id}\).

An immediate consequence is that **if \(\dot{A}^\mu_\nu \equiv 0\), then \(g(1) = 0\).**
Recalling that \(A^\mu_\nu\) is the linear transformation between tangent spaces at two different points on \(\gamma\), we see that when it doesn't change along \(\gamma\), we are in a situation where the coordinate frame is parallel along the curve, and hence from the point of view of the curve the geometry is flat.

To see what is the value of \(g(1)\) in general, it helps to rewrite \( \dot{A}^\mu_\nu(s)\).

Writing \(\nabla\) for the covariant differentiation operator corresponding to our connection, in terms of the coordinate system we have that
\[ \nabla_{\partial_\mu} (V^\nu \partial_\nu) = (\partial_\mu V^\nu) \partial_\nu + \Gamma^\nu_{\mu\lambda} V^\lambda \partial_\nu, \]
where the \(\Gamma\) are the connection coefficients.
Using this definition, and the definition of the parallel transport operator \(\phi_s\), we see that
\[ \dot{A}^\mu_\nu(s) = A^\mu_\lambda(s) \Gamma^\lambda_{\rho \nu}(\gamma(s)) \dot{\gamma}^\rho(s).\]
And so
\[ g(1) = - \int_0^1 A^\mu_\lambda(s)\, \Gamma^\lambda_{\rho \nu}(\gamma(s))\, \dot{\gamma}^\rho(s)\, \gamma^\nu(s) ~\mathrm{d}s.\]
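The differential equation for \(A\) and the definition of \(g\) translate directly into a small numerical experiment. The following is a minimal sketch and not part of the derivation itself: it integrates \(\dot{A} = A\,\Gamma\,\dot{\gamma}\) together with \(\dot{g} = A\dot{\gamma}\) using a classical Runge-Kutta step. The function name `develop`, the nested-list layout `Gamma(x)[lam][rho][nu]`, the circular sample loop, and the two sample connections are all assumptions made purely for illustration.

```python
import math


def develop(gamma, gamma_dot, Gamma, n_steps=1000):
    """Return (A(1), g(1)) for a curve gamma: [0, 1] -> R^n.

    Integrates dA/ds = A . Gamma(gamma) . gamma_dot and dg/ds = A . gamma_dot
    with classical RK4, starting from A(0) = Id and g(0) = 0.
    Gamma(x)[lam][rho][nu] stands for Gamma^lam_{rho nu} at the point x.
    """
    n = len(gamma(0.0))

    def rhs(s, y):
        A = [y[i * n:(i + 1) * n] for i in range(n)]
        v, G = gamma_dot(s), Gamma(gamma(s))
        # M[lam][nu] = Gamma^lam_{rho nu} * gamma_dot^rho
        M = [[sum(G[l][r][k] * v[r] for r in range(n)) for k in range(n)]
             for l in range(n)]
        dA = [sum(A[m][l] * M[l][k] for l in range(n))
              for m in range(n) for k in range(n)]
        dg = [sum(A[m][k] * v[k] for k in range(n)) for m in range(n)]
        return dA + dg

    # State vector: the n*n entries of A (row-major), then the n entries of g.
    y = [float(i == k) for i in range(n) for k in range(n)] + [0.0] * n
    h = 1.0 / n_steps
    for step in range(n_steps):
        s = step * h
        k1 = rhs(s, y)
        k2 = rhs(s + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(s + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(s + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + w)
             for a, p, q, r, w in zip(y, k1, k2, k3, k4)]
    return [y[i * n:(i + 1) * n] for i in range(n)], y[n * n:]


# Sample loop (an assumption): a circle of radius R through the origin
# in the x^1-x^2 plane.
R = 0.01


def circle(s):
    return [R * (math.cos(2 * math.pi * s) - 1), R * math.sin(2 * math.pi * s), 0.0]


def circle_dot(s):
    return [-2 * math.pi * R * math.sin(2 * math.pi * s),
            2 * math.pi * R * math.cos(2 * math.pi * s), 0.0]


def flat(x):
    """All coefficients zero: Euclidean space in Cartesian coordinates."""
    return [[[0.0] * 3 for _ in range(3)] for _ in range(3)]


def twisted(x):
    """Hypothetical constant connection: Gamma^lam_{rho nu} = Levi-Civita symbol."""
    return [[[(l - r) * (r - k) * (k - l) / 2 for k in range(3)]
             for r in range(3)] for l in range(3)]


_, g_flat = develop(circle, circle_dot, flat)
_, g_twisted = develop(circle, circle_dot, twisted)
print(g_flat)     # numerically zero: the reconstructed path closes
print(g_twisted)  # not zero: the reconstructed path fails to close
```

Integrating \(\dot{g} = A\dot{\gamma}\) directly, rather than the integrated-by-parts formula, is a design choice: it avoids relying on \(\gamma(0) = \gamma(1) = 0\) and so also works for curves that are not loops.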

## Shrinking the loops

To capture the "local" geometry near the point \( p\), we want to use small loops. One way to accomplish this is to first fix the coordinate system as above, and consider the family of loops \[ \gamma^\mu(s;\epsilon) = \epsilon \gamma^\mu(s) \] and its corresponding \(g(s;\epsilon)\). For this family, we see that \[ \gamma^\mu(s;\epsilon), \dot{\gamma}^\mu(s;\epsilon) = O(\epsilon). \]

Within this small neighborhood, we expect the connection coefficients to be all bounded \[ \Gamma^\mu_{\nu\lambda} = O(1) \] and so the differential equation satisfied by \(A^\mu_\nu(s;\epsilon)\) reads \[ \dot{A} = A \cdot O(1) \cdot O(\epsilon) \] hence integrating we get \[ A^\mu_\nu(s;\epsilon) = O(1).\] Hence we have that \[ g(1;\epsilon) = O(\epsilon^2).\]

The fact that the gap closes faster than \(O(\epsilon)\) is in line with our expectations that, on very small scales, the geometry of our manifold should not look that different from Euclidean geometry.

We can analyze this a bit further. We can split the connection coefficients into two parts: \[ \Gamma^\mu_{\nu\lambda} = \tilde{\Gamma}^\mu_{\nu\lambda} + \hat{\Gamma}^\mu_{\nu\lambda} \] where \(\tilde{\Gamma}^\mu_{\nu\lambda} = \tilde{\Gamma}^\mu_{\lambda\nu}\) is symmetric in the two lower indices, and \(\hat{\Gamma}^\mu_{\nu\lambda} = - \hat{\Gamma}^\mu_{\lambda\nu}\) is antisymmetric in the two lower indices. We can analyze their contributions to \(g(1;\epsilon)\).
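As a concrete illustration of this splitting, here is a minimal sketch that decomposes an array of coefficients at a single point. The function name `split_connection`, the index layout `Gamma[mu][nu][lam]`, and the sample numbers are hypothetical.

```python
def split_connection(Gamma):
    """Split Gamma[mu][nu][lam] into parts symmetric and antisymmetric in (nu, lam)."""
    n = len(Gamma)
    sym = [[[(Gamma[m][a][b] + Gamma[m][b][a]) / 2 for b in range(n)]
            for a in range(n)] for m in range(n)]
    anti = [[[(Gamma[m][a][b] - Gamma[m][b][a]) / 2 for b in range(n)]
             for a in range(n)] for m in range(n)]
    return sym, anti


# Hypothetical sample coefficients at one point of a 2-dimensional manifold.
Gamma = [[[0.0, 1.0], [3.0, 0.5]],
         [[2.0, -1.0], [4.0, 0.0]]]
sym, anti = split_connection(Gamma)
print(sym[0])   # [[0.0, 2.0], [2.0, 0.5]]
print(anti[0])  # [[0.0, -1.0], [1.0, 0.0]]
```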

First, the symmetric part:
\[ \tilde{g}(1;\epsilon) = - \int_0^1 A^{\mu}_\lambda(s;\epsilon) \tilde{\Gamma}^\lambda_{\rho\nu}(\gamma(s;\epsilon)) \dot{\gamma}^\rho(s;\epsilon) \gamma^\nu(s;\epsilon) ~\mathrm{d}s. \]
Using the symmetry between the \(\rho\) and \(\nu\) indices, this can be rewritten as
\[ \tilde{g}(1;\epsilon) = - \frac12 \int_0^1 A^{\mu}_\lambda(s;\epsilon) \tilde{\Gamma}^\lambda_{\rho\nu}(\gamma(s;\epsilon)) \frac{\mathrm{d}}{\mathrm{d}s} \left[ \gamma^\rho(s;\epsilon) \gamma^\nu(s;\epsilon)\right] ~\mathrm{d}s. \]
Hence we can integrate by parts once more to get
\[ \tilde{g}(1;\epsilon) = - \frac12 \left[ A^\mu_\lambda(s;\epsilon) \tilde{\Gamma}^\lambda_{\rho\nu}(\gamma(s;\epsilon)) \gamma^\rho(s;\epsilon) \gamma^\nu(s;\epsilon) \right]_{s = 0}^1 + \frac12 \int_0^1 \frac{\mathrm{d}}{\mathrm{d}s} \left[ A^{\mu}_\lambda(s;\epsilon) \tilde{\Gamma}^\lambda_{\rho\nu}(\gamma(s;\epsilon))\right] \gamma^\rho(s;\epsilon) \gamma^\nu(s;\epsilon) ~\mathrm{d}s. \]
For the boundary terms, again using that \(\gamma(0) = \gamma(1) = p\) we see that they must vanish.
Inside the integrand we get
\[ \frac{\mathrm{d}}{\mathrm{d}s} \left[ A^{\mu}_\lambda(s;\epsilon) \tilde{\Gamma}^\lambda_{\rho\nu}(\gamma(s;\epsilon))\right] =
A^\mu_{\lambda}(s;\epsilon) \left[ \Gamma^{\lambda}_{\tau\sigma}(\gamma) \tilde{\Gamma}^\sigma_{\rho\nu}(\gamma) + (\partial_{\tau} \tilde{\Gamma}^\lambda_{\rho\nu})(\gamma) \right] \dot{\gamma}^\tau(s;\epsilon).\]
For now, what's most important is that this formula implies that the integrand in the formula for \(\tilde{g}(1;\epsilon)\) has three factors of \(\gamma\) and hence is order \(O(\epsilon^3)\). In particular, *the symmetric part of the connection does not contribute to the leading order difference \(g(1;\epsilon) - g(0;\epsilon)\).* We will return to this a little bit later.

Now, let us look at the antisymmetric part. Here the integration-by-parts trick from above doesn't work, and we do expect an order \(\epsilon^2\) contribution. To see that this is the case, it is helpful to have an example.

### Example: pure torsion space

Let our manifold \(M\) be \(\mathbb{R}^3\) with its standard coordinate system. Suppose the connection coefficients are given by the constants \[ \Gamma^1_{23} = - \Gamma^1_{32} = \Gamma^2_{31} = -\Gamma^2_{13} = \Gamma^3_{12} = - \Gamma^3_{21} = 1 \] and the remaining 21 components are all assumed to vanish. (Note the antisymmetry!)

For convenience we will assume (ignoring differentiability issues) that \(\gamma\) traces out the square of sidelength \(1/4\) in the \(x^2\)-\(x^3\) plane, with the obvious parametrization. Then, for \(s\in [0,1/4)\), we have that \[ b(s;\epsilon) = \dot{\gamma}(s;\epsilon) = \epsilon \cdot (0,1,0).\] Along this segment the matrix \( A^\mu_\nu(s;\epsilon) \) looks like \[ \begin{pmatrix} \cos(\epsilon s) & & \sin(\epsilon s) \newline & 1 & \newline -\sin(\epsilon s) & & \cos(\epsilon s) \end{pmatrix}. \] We see that the coordinate frame appears to "twist" around the segment.

For \(s\in (1/4,1/2)\), we have \(\dot{\gamma} = \epsilon \cdot (0,0,1)\). The corresponding image in \(T_pM\) is \[ b(s;\epsilon) = \epsilon \cdot (\sin(\epsilon / 4), 0, \cos(\epsilon/4)). \] Integrating the equation for the derivative of \(A\), we have that the matrix \(A^\mu_{\nu}(s;\epsilon)\) on this interval looks like \[ \begin{pmatrix} \cos(\epsilon /4) & & \sin(\epsilon /4) \newline & 1 & \newline -\sin(\epsilon /4) & & \cos(\epsilon /4) \end{pmatrix} \cdot \begin{pmatrix} \cos(\epsilon (s - 1/4)) & -\sin(\epsilon(s-1/4)) \newline \sin(\epsilon(s-1/4)) & \cos(\epsilon (s-1/4)) \newline & & 1 \end{pmatrix} \] This implies that for \(s\in (1/2,3/4)\) \[ b(s;\epsilon) = \epsilon \cdot \begin{pmatrix} \sin(\epsilon / 4) \cos(\epsilon / 4) \newline - \cos(\epsilon / 4) \newline - \sin^2(\epsilon / 4) \end{pmatrix}.\] A similar final computation gives that for \(s \in (3/4,1]\) \[ b(s;\epsilon) = \epsilon \cdot \begin{pmatrix} \cos^2( \epsilon / 4) \sin(\epsilon / 4) - \cos(\epsilon / 4) \sin(\epsilon / 4) \newline \sin^2(\epsilon / 4) \newline - \cos(\epsilon / 4) \sin^2(\epsilon / 4) - \cos^2(\epsilon / 4) \end{pmatrix}.\] Each side has parameter length \(1/4\), so adding the four contributions gives \begin{multline} g(1;\epsilon) = \frac{\epsilon}{4} \cdot \begin{pmatrix} \sin(\epsilon / 4) \cdot \left[ 1 + \cos^2(\epsilon / 4) \right] \newline 1 - \cos(\epsilon / 4) + \sin^2(\epsilon / 4) \newline \cos(\epsilon / 4) - 1 - \cos(\epsilon / 4) \sin^2(\epsilon / 4) \end{pmatrix} \newline = \frac{\epsilon^2}{8} \begin{pmatrix} 1 \newline 0 \newline 0\end{pmatrix} + \frac{3\epsilon^3}{128}\begin{pmatrix} 0 \newline 1 \newline -1 \end{pmatrix} + O(\epsilon^4) . \notag\end{multline} Indeed, we see that from the antisymmetric contributions, the endpoint \(g(1;\epsilon)\) deviates from 0 by order \( \epsilon^2\).
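The side-by-side bookkeeping in this example is easy to get wrong by hand, so here is a minimal sketch that redoes it numerically. It multiplies the rotation matrices in the order in which the sides are traversed, accumulates \(g(1;\epsilon)\), and compares with the closed-form vector. The helper names (`rot_about_x2`, `rot_about_x3`, `endpoint`, `closed_form`) are hypothetical; the loop at the end prints the ratios \(g^1/\epsilon^2\), \(g^2/\epsilon^3\), \(g^3/\epsilon^3\) so that the leading coefficients can be read off.

```python
import math


def rot_about_x2(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_about_x3(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def endpoint(eps):
    """g(1; eps) for the square loop, assembled side by side."""
    t = eps / 4
    # (unit direction of travel, rotation A picks up over that side)
    sides = [((0, 1, 0), rot_about_x2(t)),
             ((0, 0, 1), rot_about_x3(t)),
             ((0, -1, 0), rot_about_x2(-t)),
             ((0, 0, -1), rot_about_x3(-t))]
    A = [[float(i == j) for j in range(3)] for i in range(3)]
    g = [0.0, 0.0, 0.0]
    for direction, turn in sides:
        # Along a side, A keeps rotating about that side's own axis,
        # so b = eps * A . direction stays constant on the side.
        b = [eps * sum(A[i][k] * direction[k] for k in range(3)) for i in range(3)]
        g = [g[i] + b[i] / 4 for i in range(3)]  # each side has parameter length 1/4
        A = matmul(A, turn)
    return g


def closed_form(eps):
    c, s = math.cos(eps / 4), math.sin(eps / 4)
    return [eps / 4 * s * (1 + c * c),
            eps / 4 * (1 - c + s * s),
            eps / 4 * (c - 1 - c * s * s)]


for eps in (0.4, 0.2, 0.1):
    g = endpoint(eps)
    print(eps, g[0] / eps**2, g[1] / eps**3, g[2] / eps**3)
```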

## Torsion

This leading-order deviation, contributed by the antisymmetric part of the connection coefficients, is precisely the *torsion* of our connection.

To see what happens in general, we can remember that we only need to keep information up to order \(\epsilon^2\) in the integral for \(\hat{g}(1;\epsilon)\).
Since we know that \(A^\mu_\nu(0) = \mathrm{Id}\) and it solves a differential equation, we have that
\[ A^{\mu}_{\nu}(s;\epsilon) = \mathrm{Id} + O(\epsilon). \]
Similarly, we have that
\[ \hat{\Gamma}^\mu_{\nu\lambda}(\gamma(s;\epsilon)) = \hat{\Gamma}^\mu_{\nu\lambda}(p) + O(\epsilon). \]
So the leading order contribution is
\[ \hat{g}(1;\epsilon) = - \epsilon^2 \hat{\Gamma}^\mu_{\nu\lambda}(p) \int_0^1 \dot{\gamma}^\nu(s) \gamma^\lambda(s) ~\mathrm{d}s + O(\epsilon^3).\]
Using the antisymmetry we can write this as
\[ \hat{g}(1;\epsilon) = - \epsilon^2 \hat{\Gamma}^\mu_{\nu\lambda}(p) \cdot \frac12 \int_0^1 \left[ \dot{\gamma}^\nu(s) \gamma^\lambda(s) - \dot{\gamma}^\lambda(s) \gamma^\nu(s) \right] ~\mathrm{d}s + O(\epsilon^3).\]
Now let us suppose the curve \(\gamma\) lies in a particular coordinate plane (so all but two of the coordinates of \(\gamma^\mu\) vanish identically).
From elementary calculus we see that in this case the integral \(\frac12 \int_0^1 \left[ \dot{\gamma}^\nu(s) \gamma^\lambda(s) - \dot{\gamma}^\lambda(s) \gamma^\nu(s) \right] ~\mathrm{d}s\) is precisely the *signed area* of the figure enclosed by the curve \(\gamma\), measured relative to the coordinate system (with the coordinate plane oriented so that \(x^\lambda\) is the first axis and \(x^\nu\) the second).
(Notice that this agrees with our computation in the example above.)
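To make the comparison with the example concrete, here is a minimal sketch that evaluates \(\int_0^1 \dot{\gamma}^\nu \gamma^\lambda ~\mathrm{d}s\) exactly for the unit-scale square, reads off the signed area, contracts with the example's coefficients, and compares with the closed-form first component of \(g(1;\epsilon)\). The names `first_moments`, `levi_civita` and `exact_first_component` are hypothetical, and the code uses 0-based indices, so `Q[2][1]` stands for the \((\nu,\lambda) = (3,2)\) entry.

```python
import math

# Corners of the unit-scale square from the example, in (x^1, x^2, x^3).
corners = [(0.0, 0.0, 0.0), (0.0, 0.25, 0.0), (0.0, 0.25, 0.25), (0.0, 0.0, 0.25)]


def first_moments(vertices):
    """Q[nu][lam] = integral of gamma_dot^nu * gamma^lam over the closed polygon."""
    n = len(vertices[0])
    Q = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(vertices):
        b = vertices[(i + 1) % len(vertices)]
        for nu in range(n):
            for lam in range(n):
                # gamma is linear on an edge, so the midpoint value is exact.
                Q[nu][lam] += (b[nu] - a[nu]) * (a[lam] + b[lam]) / 2
    return Q


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}; the example's Gamma^i_{jk}."""
    return (i - j) * (j - k) * (k - i) // 2


Q = first_moments(corners)
# Signed area in the x^2-x^3 plane: (1/2) * integral of (x^2 dx^3 - x^3 dx^2).
area = (Q[2][1] - Q[1][2]) / 2
# Coefficient of eps^2 in g(1; eps): -Gamma^mu_{nu lam}(p) * Q^{nu lam}.
leading = [-sum(levi_civita(mu, nu, lam) * Q[nu][lam]
                for nu in range(3) for lam in range(3)) for mu in range(3)]


def exact_first_component(eps):
    """First component of g(1; eps), from the closed form in the example."""
    c, s = math.cos(eps / 4), math.sin(eps / 4)
    return eps / 4 * s * (1 + c * c)


eps = 1e-3
print(area, leading, exact_first_component(eps) / eps**2)
```

The last printed number, computed from the closed form, should match the first entry of `leading` up to corrections of relative size \(O(\epsilon^2)\).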

How does this relate to the idea that the torsion tensor is the difference \[ (X,Y) \mapsto \nabla_X Y - \nabla_YX - [X,Y] \] for vector fields on \(M\)? In our coordinate setting, the coordinate vector fields commute \([\partial_\mu,\partial_\nu] = 0\). So we have \[ \nabla_{\partial_\mu} \partial_\nu - \nabla_{\partial_\nu} \partial_\mu = 2 \hat{\Gamma}^\lambda_{\mu\nu} \partial_\lambda.\] One way to understand the terms in the torsion tensor is as follows:

Consider two vector fields \(X,Y\) on \(M\). Consider the path that starts from a point \(p\) and consists of moving an \(\epsilon\) distance along \(X\), then an \(\epsilon\) distance along \(Y\), followed by an \(\epsilon\) distance along \(-X\), and an \(\epsilon\) distance along \(-Y\). In general this path will not return exactly to \(p\). The leading order (as \(\epsilon \to 0\)) deviation of the end point of this path from \(p\) is precisely measured by the Lie bracket \([X,Y]\) (note that in the coordinate case, this path will close up exactly).
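A small sketch of this four-step path, for a pair of vector fields chosen (as an assumption, purely for illustration) so that the flows are explicit: \(X = \partial_x\) and \(Y = x\,\partial_y\) on \(\mathbb{R}^2\), for which \([X,Y] = \partial_y\). Here "moving an \(\epsilon\) distance along \(X\)" is interpreted as following the flow of \(X\) for time \(\epsilon\). The function names are hypothetical.

```python
def flow_X(point, t):
    """Flow of X = d/dx for time t."""
    x, y = point
    return (x + t, y)


def flow_Y(point, t):
    """Flow of Y = x d/dy for time t (x is constant along this flow)."""
    x, y = point
    return (x, y + t * x)


def square_path_endpoint(p, eps):
    """Follow X, then Y, then -X, then -Y, each for time eps, starting at p."""
    q = flow_X(p, eps)
    q = flow_Y(q, eps)
    q = flow_X(q, -eps)
    q = flow_Y(q, -eps)
    return q


eps = 0.1
end = square_path_endpoint((0.0, 0.0), eps)
# Here [X, Y] = d/dy, so the gap should be eps^2 in the y-direction.
print(end, eps**2)
```

For this particular pair the gap is exactly \(\epsilon^2 [X,Y]\) with no higher-order corrections, which makes it a convenient test case; for general vector fields only the leading order is given by the bracket.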

We can treat this path exactly as we treated \(\gamma\) before, and look at the failure of the corresponding \(g\) to be a closed curve. However, as the curve generated from the vector fields \(X,Y\) already fails to close, we wish to isolate the two effects: one due to \(X,Y\) not commuting, and the other actually due to the connection. The torsion tensor precisely measures this difference, and captures the contribution, to the failure of \(g\) to close, due to twisting introduced by the connection (as opposed to the relative twisting of the vector fields \(X\) and \(Y\)).

Finally, as we saw in the computation in the example above, the torsion also contributes higher order effects. We can say a bit about them also. To see what the cubic contributions are, first we need to see what are the first order deviations from constants in \(A^\mu_{\nu}\) and \(\hat{\Gamma}^\mu_{\nu\lambda}\). In the former, using the ordinary differential equation satisfied by \(A^{\mu}_\nu\), we see that the order \(\epsilon\) contributions are given by \[ A^{\mu}_\nu(s;\epsilon) = \mathrm{Id}^\mu_\nu + \epsilon \Gamma^{\mu}_{\rho\nu}(p) \int_0^s \dot{\gamma}^\rho(\tau) ~\mathrm{d}\tau + O(\epsilon^2).\] The integral of course we can evaluate (using \(\gamma(0) = 0\)) to get \[ A^{\mu}_\nu(s;\epsilon) = \mathrm{Id}^\mu_\nu + \epsilon \Gamma^{\mu}_{\rho\nu}(p) \gamma^\rho(s) + O(\epsilon^2).\] For the connection coefficients, we simply Taylor expand: \[ \Gamma^{\mu}_{\nu\lambda}(\gamma(s;\epsilon)) = \Gamma^{\mu}_{\nu\lambda}(p) + \epsilon\gamma^\rho(s) \partial_\rho \Gamma^{\mu}_{\nu\lambda}(p) + O(\epsilon^2).\] And hence the cubic contribution from the antisymmetric part of the connection is \[ - \epsilon^3 \left[ \Gamma^{\mu}_{\tau\lambda}(p) \hat{\Gamma}^{\lambda}_{\rho\nu}(p) + \partial_\tau \hat{\Gamma}^{\mu}_{\rho\nu}(p)\right] \int_0^1 \gamma^\tau(s) \dot{\gamma}^\rho(s) \gamma^\nu(s) ~\mathrm{d}s .\]
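This cubic formula can be checked against the pure torsion example, where the coefficients are constant (so the derivative term vanishes) and entirely antisymmetric. The sketch below, with hypothetical helper names and 0-based indices, computes \(\int_0^1 \gamma^\tau \dot{\gamma}^\rho \gamma^\nu ~\mathrm{d}s\) exactly for the unit-scale square and contracts it with the coefficients; the result can be compared with the \(\epsilon^3\) coefficient \(\frac{3}{128}(0,1,-1)\) found in the example.

```python
# Corners of the unit-scale square from the example, in (x^1, x^2, x^3).
corners = [(0.0, 0.0, 0.0), (0.0, 0.25, 0.0), (0.0, 0.25, 0.25), (0.0, 0.0, 0.25)]


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}; the example's Gamma^i_{jk}."""
    return (i - j) * (j - k) * (k - i) // 2


def cubic_moments(vertices):
    """I[t][r][k] = integral of gamma^t * gamma_dot^r * gamma^k over the polygon."""
    n = len(vertices[0])
    I = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(vertices):
        b = vertices[(i + 1) % len(vertices)]
        mid = [(a[j] + b[j]) / 2 for j in range(n)]
        for t in range(n):
            for r in range(n):
                for k in range(n):
                    # gamma^t * gamma^k is quadratic along an edge,
                    # so Simpson's rule integrates it exactly.
                    simpson = (a[t] * a[k] + 4 * mid[t] * mid[k] + b[t] * b[k]) / 6
                    I[t][r][k] += (b[r] - a[r]) * simpson
    return I


I = cubic_moments(corners)
# cubic[mu] = -Gamma^mu_{t lam} Gamma^lam_{r k} I^{t r k}
cubic = [-sum(levi_civita(mu, t, lam) * levi_civita(lam, r, k) * I[t][r][k]
              for t in range(3) for lam in range(3)
              for r in range(3) for k in range(3))
         for mu in range(3)]
print(cubic, 3 / 128)
```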

## Curvature

What is the cubic contribution from the symmetric part of the connection? Based on what we have already computed before, we have that \[ \tilde{g}(1;\epsilon) = \frac12 \epsilon^3 \left[ \Gamma^\mu_{\tau\lambda}(p) \tilde{\Gamma}^\lambda_{\rho\nu}(p) + \partial_\tau \tilde{\Gamma}^\mu_{\rho\nu}(p)\right] \int_0^1 \dot{\gamma}^\tau(s) \gamma^\rho(s) \gamma^\nu(s) ~\mathrm{d}s + O(\epsilon^4).\] This formula is obviously very similar to the formula giving the \(\epsilon^3\) order contributions from the antisymmetric part, but looks slightly different. We can in fact improve the similarity: notice that \[ \int_0^1 \frac{\mathrm{d}}{\mathrm{d}s} [ \gamma^\tau(s) \gamma^\rho(s) \gamma^\nu(s) ]~\mathrm{d}s = 0, \] and hence \[ \int_0^1 \dot{\gamma}^\tau(s) \gamma^\rho(s) \gamma^\nu(s) ~\mathrm{d}s = - \int_0^1 \gamma^\tau(s) \dot{\gamma}^\rho(s) \gamma^\nu(s) ~\mathrm{d}s - \int_0^1 \gamma^\tau(s) \gamma^\rho(s) \dot{\gamma}^\nu(s) ~ \mathrm{d}s. \] Using the symmetry of \(\tilde{\Gamma}\) in its lower indices \(\rho,\nu\), the two integrals on the right contribute equally after contraction with the bracket, so equivalently we can write \[ \tilde{g}(1;\epsilon) = - \epsilon^3 \left[ \Gamma^\mu_{\tau\lambda}(p) \tilde{\Gamma}^\lambda_{\rho\nu}(p) + \partial_\tau \tilde{\Gamma}^\mu_{\rho\nu}(p)\right] \int_0^1 \gamma^\tau(s) \dot{\gamma}^\rho(s) \gamma^\nu(s) ~\mathrm{d}s + O(\epsilon^4).\]

This contribution is the *curvature* of the connection.
Making use of the symmetry again we have
\[ \int_0^1 \left( \dot{\gamma}^\tau(s) \gamma^\rho(s)- \gamma^\tau(s) \dot{\gamma}^\rho(s) \right) \gamma^\nu(s) ~\mathrm{d}s = - 2\int_0^1 \gamma^\tau(s) \dot{\gamma}^\rho(s) \gamma^\nu(s) ~\mathrm{d}s - \int_0^1 \gamma^\tau(s) \gamma^\rho(s) \dot{\gamma}^\nu(s) ~ \mathrm{d}s, \]
and so
\[ \tilde{g}(1;\epsilon) = - \frac13 \epsilon^3 \left[ \Gamma^\mu_{\tau\lambda}(p) \tilde{\Gamma}^\lambda_{\rho\nu}(p) + \partial_\tau \tilde{\Gamma}^\mu_{\rho\nu}(p) - \Gamma^\mu_{\rho\lambda}(p) \tilde{\Gamma}^\lambda_{\tau\nu}(p) - \partial_\rho \tilde{\Gamma}^\mu_{\tau\nu}(p) \right] \int_0^1 \gamma^\tau(s) \dot{\gamma}^\rho(s) \gamma^\nu(s) ~\mathrm{d}s + O(\epsilon^4).\]
In the case that \(\nabla\) is the Levi-Civita connection of some metric (and hence is torsion-free), the term in the brackets should be recognizable as the formula for the Riemann curvature tensor.
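The factor of three in the antisymmetrization step is easy to misplace, so here is a minimal numerical spot check. It assumes a hypothetical array `B[t][r][k]`, symmetric in its last two slots, standing in for the bracket \(\Gamma^\mu_{\tau\lambda} \tilde{\Gamma}^\lambda_{\rho\nu} + \partial_\tau \tilde{\Gamma}^\mu_{\rho\nu}\) at fixed \(\mu\), and uses a square loop of sidelength \(1/4\) in the \(x^2\)-\(x^3\) plane. The claim being checked is that contracting the antisymmetrized array against \(\int_0^1 \gamma^\tau \dot{\gamma}^\rho \gamma^\nu ~\mathrm{d}s\) gives three times the plain contraction.

```python
# Square loop of sidelength 1/4 in the x^2-x^3 plane, through the origin.
loop = [(0.0, 0.0, 0.0), (0.0, 0.25, 0.0), (0.0, 0.25, 0.25), (0.0, 0.0, 0.25)]


def cubic_moments(vertices):
    """I[t][r][k] = integral of gamma^t * gamma_dot^r * gamma^k over the polygon."""
    n = len(vertices[0])
    I = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(vertices):
        b = vertices[(i + 1) % len(vertices)]
        mid = [(a[j] + b[j]) / 2 for j in range(n)]
        for t in range(n):
            for r in range(n):
                for k in range(n):
                    # Quadratic integrand along each edge: Simpson's rule is exact.
                    simpson = (a[t] * a[k] + 4 * mid[t] * mid[k] + b[t] * b[k]) / 6
                    I[t][r][k] += (b[r] - a[r]) * simpson
    return I


# Hypothetical coefficients, symmetric under exchange of the last two indices.
B = [[[(t + 1) ** 2 * (r + k) + r * k for k in range(3)]
      for r in range(3)] for t in range(3)]

I = cubic_moments(loop)
idx = [(t, r, k) for t in range(3) for r in range(3) for k in range(3)]
plain = sum(B[t][r][k] * I[t][r][k] for t, r, k in idx)
antisymmetrized = sum((B[t][r][k] - B[r][t][k]) * I[t][r][k] for t, r, k in idx)
print(plain, antisymmetrized / 3)  # the two numbers should agree
```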