I recently answered a problem on MathOverflow. The discovery of the answer has some similarities to "research activities", and this post is to document the experience for interested students to see an example of mathematical problem solving.
Before it starts
The first part of the "process" is one I wasn't involved in, but it is typical of mathematical research.
The person who asked the question encountered the problem (which I will describe below) in the course of their research in geometric analysis. What makes this problem an interesting research problem as well as a fun problem solving exercise is that the question asker did a lot of homework (the same sort of homework that one should do in a research setting). This included:
- Firstly, and most importantly: the problem was formulated as a precise conjecture.
In research it is not profitable to say "I want to study (insert general topic here)." A much better starting point is "I wonder if (insert precise statement) is true?"
Usually such a statement is derived from inductive reasoning (by computing some examples and realizing their commonalities, you are led to formulate a conjecture that the observed phenomenon is general).
- Numerical tests: nowadays computers are powerful. A lot of mathematical research benefits from computer aid. You can go all in and try AI-powered literature search and reasoning models, or you can stick to the simple gambit of using the computer to calculate lots of numerical examples. This is also helpful in the "conjecture forming phase" of research.
- Solve simple cases: When one has a conjecture, one can look at special cases of the conjecture and test them explicitly. This may help find a counterexample. A typical example that happens a lot in analysis is:
- Conjecture: X is the only solution to problem Y
- Simple case: There are no other solutions to problem Y "near" X.
By localizing the problem to small perturbations of X, you can take advantage of the smallness. An example from calculus is that proving X is a global minimum of a function is harder than proving X is a local minimum. The latter only requires knowing that the function has vanishing derivative at X, and that the second derivative has the right sign.
This latter calculus example also showcases an important technique: linearization. When dealing with a nonlinear problem, if one were willing to take a step back and consider the "localized" problem, then we can often replace the nonlinear problem with a linear problem that is much more readily solvable.
- Solve similar problems: Perhaps the conjecture is hard because some of the terms in the expression are hard to analyze. Can we analyze the similar problem where those terms are not present? For this one needs a leap of faith that the "simplified expression" gives an accurate representation of the original problem. On the other hand, even without a full justification of the connection between the two problems, this procedure is often valuable. If the "similar problem" does in fact satisfy the conjecture, you've effectively isolated the difficulty to the parts that were changed. If the "similar problem" does not satisfy the conjecture, then you realize that for your conjecture to be true, you'd better understand what makes the two seemingly similar problems in fact different. In either case you gained some knowledge.
- See what your conjecture implies: another good sanity check is to assume that your conjecture is true, and see what some of its consequences would be. In the linked post, the original question is about an integral inequality in geometry. The veracity of the conjecture would imply that a certain geometric "flow" is one that "shortens length".
Frequently, these sorts of consequences will give more easily testable examples. In the case that your conjecture is in fact false, you can hope to find counterexamples by looking at these derived consequences.
Seeing the problem
The problem on MathOverflow is a very simple conjecture:
Given $r$ a $2\pi$ periodic smooth function with values in $(0,\infty)$, let $\kappa(\theta)$ be the signed curvature at the parameter $\theta$ when we interpret $(r(\theta),\theta)$ as a polar curve. The claim is: \[ 2\pi \int_0^{2\pi} \kappa(\theta) r^2(\theta) ~d\theta \overset{?}{\geq} \int_0^{2\pi} \kappa(\theta)~d\theta \cdot \int_0^{2\pi} r^2(\theta) ~d\theta \]
The curvature $\kappa$ has an explicit formula in terms of $r$: \[ \kappa = \frac{r^2 + 2 (r')^2 - r r'' }{(r^2 + (r')^2)^{3/2}}. \]
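As a quick sanity check of this formula, here is a minimal Python sketch (the function name `polar_curvature` and the two test curves are my own choices for illustration, not part of the original question). It evaluates the formula on a circle centered at the origin, where the curvature should be $1/R$, and on the straight line $x = 1$, written in polar form as $r = 1/\cos\theta$, where the curvature should vanish.

```python
import math

def polar_curvature(r, dr, d2r):
    """Signed curvature of the polar curve (r(theta), theta), given r, r', r''."""
    return (r * r + 2 * dr * dr - r * d2r) / (r * r + dr * dr) ** 1.5

# Circle of radius R centered at the origin: r = R, r' = r'' = 0.
R = 3.0
kappa_circle = polar_curvature(R, 0.0, 0.0)

# Straight line x = 1, i.e. r = 1/cos(theta), sampled at an arbitrary angle.
t = 0.4
r = 1 / math.cos(t)
dr = math.sin(t) / math.cos(t) ** 2                 # derivative of sec
d2r = (1 + math.sin(t) ** 2) / math.cos(t) ** 3     # second derivative of sec
kappa_line = polar_curvature(r, dr, d2r)

print(kappa_circle, kappa_line)
```

The first value should agree with $1/R$ and the second should be zero up to rounding error.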
Step 1: Understand the problem
Now, if we decouple $\kappa$ and $r$, the conjecture is certainly false: we just need $\kappa$ to be large where $r$ is small, and vice versa, to force the left hand side to be much smaller than the right hand side. So this conjecture hinges on the relation between $\kappa$ and $r$.
This being a geometric problem, I want to understand it "geometrically", and to me (personally) this means rewriting the problem in arc-length parametrization, since I feel that I know how curvature behaves better in arc-length parametrization.
When doing so, I made a discovery:
The integral $\int_0^{2\pi} \kappa(\theta) r^2(\theta) ~d\theta$ for a closed polar curve is equal to its length.
I am sure this formula can be found as an exercise in some intro differential geometry textbook somewhere, but I didn't know it, which to me makes it an exciting discovery. Additionally, recasting an integral as a geometric quantity makes it much more likely that we can make progress using "intuitive" reasoning rather than hard computations.
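One way to verify the identity directly in the $\theta$ parametrization (a sketch, assuming $r$ is smooth, positive and $2\pi$ periodic; this is not necessarily the arc-length route described above) is to subtract $\kappa r^2$ from the arc-length integrand and recognize a total derivative: \[ \sqrt{r^2 + (r')^2} - \kappa r^2 = \frac{(r')^4 + r^3 r''}{(r^2 + (r')^2)^{3/2}} = \frac{d}{d\theta} \left( \frac{r r'}{\sqrt{r^2 + (r')^2}} \right). \]
The right hand side integrates to zero over a full period, so $\int_0^{2\pi} \kappa r^2 ~d\theta = \int_0^{2\pi} \sqrt{r^2 + (r')^2} ~d\theta$, which is the length of the curve.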
Step 2: Seeing similar problems
Now that we have seen that the LHS is $2\pi$ times the circumference of the figure, can we say something similar about the RHS? The integral $\int_0^{2\pi} r^2(\theta) ~d\theta$ is well-known to be twice the area enclosed by a polar graph, so that the conjectured inequality is of the form \[ \pi \cdot \text{circumference} \overset{?}{\geq} \int_0^{2\pi} \kappa(\theta)~d\theta \cdot \text{area}. \]
Now dimensionally speaking this seems to be ok. If you take the figure and dilate it, the circumference goes up linearly in the dilation factor, the area goes up quadratically, and the curvature behaves as one over length, so goes down linearly. The angles are preserved under dilation, so the scaling on the two sides agrees.
Are there statements similar to this inequality that may help? It turns out that there are two.
Isoperimetric inequality
The isoperimetric inequality is a well-known result in plane geometry. Qualitatively it states that a string of given length encloses the maximum area when it is laid down as a circle. Quantitatively it is the statement that for planar figures \[ \text{circumference} \geq \frac{4\pi}{\text{circumference}} \cdot \text{area} \] So the conjecture being asked about would be true if we could prove that \[ \frac{4\pi^2}{\text{circumference}} \overset{?}{\geq} \int_0^{2\pi} \kappa(\theta)~d\theta.\]
Note that this final inequality does hold for the circle, with equality: a circle of radius $R$ centered at the origin has curvature $1/R$ and circumference $2\pi R$, so both sides are exactly $2\pi / R$.
Gauss-Bonnet
Another formally similar statement is the Gauss-Bonnet Theorem. When applied to regions in the plane, it states that, with $s$ the arc-length parameter of the boundary curve, \[ \int \kappa(s) ~ds = 2\pi. \] If we want to leverage Gauss-Bonnet to say something about the integral of $\kappa$ with respect to the angular parametrization $\theta$, it is useful to rewrite Gauss-Bonnet as \[ \int_0^{2\pi} \kappa(\theta) \sqrt{r(\theta)^2 + (r'(\theta))^2} ~d\theta = 2\pi.\] So for our idea of using the isoperimetric inequality to work, we need to be able to say something interesting about the Jacobian factor that shows up.
Problem!
Suppose now that our curve is convex, so that $\kappa \geq 0$ everywhere. Then, using Gauss-Bonnet for the first factor, we can write \[ \int_0^{2\pi} \kappa ~d\theta \leq \int_0^{2\pi} \kappa \sqrt{r^2 + (r')^2} ~d\theta \cdot \sup \frac{1}{\sqrt{r^2 + (r')^2}} = \frac{2\pi}{\inf \sqrt{r^2 + (r')^2}} \] The quantity on the right is equal to \[ \frac{2\pi}{\text{minimum value of }r(\theta)}, \] since $r' = 0$ where $r$ attains its minimum. Unfortunately this is at least as large as $4\pi^2 / \text{circumference}$ (the circumference is at least $2\pi$ times the minimum of $r$), so this approach is a no-go.
Step 3: Reset
There are two different reasons why we shouldn't yet give up:
- The estimate that led to the factor of $2\pi / \text{minimum value of }r(\theta)$ used Hölder's inequality in a naive way, and ignored the relation between the curvature $\kappa$ and $r$. It is possible that by exploiting this relation we can gain more information.
- Isoperimetric is sharp for circles, but there is some room when the figures are not circles, so maybe a bit of this extra room can be used to compensate.
On the other hand, given that we already have nice geometric meanings of three out of the four integrals involved, our focus should be on figuring out what we can do with the $\int \kappa ~d\theta$ integral. Taking some inspiration from the proof of Gauss-Bonnet, we can see if a similar argument can be made.
One way to prove Gauss-Bonnet is to start with the explicit expression of $\kappa$, and look at the integral \[ \int_0^{2\pi} \kappa(\theta) \sqrt{r(\theta)^2 + (r'(\theta))^2} ~d\theta = \int_0^{2\pi} \left( 1 + \frac{(r')^2 - r r''}{r^2 + (r')^2} \right) ~d\theta \] The second integrand turns out to be a total derivative, so integrates away, and the first integrand can be evaluated easily.
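To spell out the total derivative (a short worked step, assuming $r > 0$ so that $r'/r$ is well defined): \[ \frac{(r')^2 - r r''}{r^2 + (r')^2} = - \frac{d}{d\theta} \arctan\left( \frac{r'}{r} \right). \]
Since $\arctan(r'/r)$ is $2\pi$ periodic, this term contributes nothing over a full period, and what remains is $\int_0^{2\pi} 1 ~d\theta = 2\pi$.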
Without the Jacobian factor, the computation is not as clean. But it turns out we can still integrate away the second derivative term $r''$. The result is \[ \int_0^{2\pi} \kappa(\theta) ~d\theta = \int_0^{2\pi} \frac{r^2 - (r')^2}{r^2 \sqrt{r^2 + (r')^2}} ~d\theta. \]
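The identity above can be checked numerically. The following is a minimal sketch: the sample function $r(\theta) = 1 + 0.3\cos 2\theta + 0.1 \sin 3\theta$ and all the helper names are my own choices for illustration. It integrates both integrands over one period with an equally spaced Riemann sum, which is very accurate for smooth periodic functions.

```python
import math

def r_funcs(theta):
    """A sample smooth, positive, 2*pi-periodic r with its first two derivatives."""
    r = 1.0 + 0.3 * math.cos(2 * theta) + 0.1 * math.sin(3 * theta)
    dr = -0.6 * math.sin(2 * theta) + 0.3 * math.cos(3 * theta)
    d2r = -1.2 * math.cos(2 * theta) - 0.9 * math.sin(3 * theta)
    return r, dr, d2r

def periodic_integral(f, n=4000):
    """Equally spaced Riemann sum of f over [0, 2*pi)."""
    h = 2 * math.pi / n
    return h * sum(f(i * h) for i in range(n))

def kappa_integrand(theta):
    # Original curvature formula, which involves r''.
    r, dr, d2r = r_funcs(theta)
    return (r * r + 2 * dr * dr - r * d2r) / (r * r + dr * dr) ** 1.5

def first_order_integrand(theta):
    # Integrand after r'' has been integrated away.
    r, dr, _ = r_funcs(theta)
    return (r * r - dr * dr) / (r * r * math.sqrt(r * r + dr * dr))

with_second_derivative = periodic_integral(kappa_integrand)
without_second_derivative = periodic_integral(first_order_integrand)
print(with_second_derivative, without_second_derivative)
```

The two printed values should agree to many digits. Only the integrals agree, not the integrands pointwise, since the two differ by a total derivative.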
Step 4: Wait a second...
With this new formulation, we begin to realize that the conjecture may in fact be false: The curvature integral has a positive term (with numerator $r^2$) and a negative term (with numerator $(r')^2$).
- We can make the curvature integral large if we can ensure:
- For a large portion of angles, we have $0 \leq |r'| \ll r \ll 1$.
- Make it so that the angles at which $|r'|$ is large, or that $r$ is large, form a small set.
- So the question is: would such a shape necessarily have small total area? Or can we arrange for it to have large total area as well?
The fact that for a large portion of angles we have the radius small means that the area contribution there must be small. So the question boils down to whether, with a small set of angles to play with, we can generate a large area.
The good thing is that now everything in sight (the total circumference integral, the area integral, and the curvature integral) has been replaced by integral expressions involving only the function $r$ and its derivative $r'$. So we can test this result by looking at the case where $r$ is given by a piecewise linear function.
(The original numerical tests by the question asker used $r$ as given by a random trigonometric polynomial, to ensure that it is smooth. This is because the original formula for curvature involved second derivatives which will cause an issue when $r'$ has jump discontinuities.)
Our idea about making the curvature integral large leads us to consider $r(\theta)$ to be constant $\epsilon$, except for some really tall (linear) spikes. Lo and behold, evaluating the integrals we see that by making the spikes sufficiently tall and skinny, we can ensure not only a lower bound on the curvature integral, but also that the area of the figure enclosed grows quadratically in the height of the spike.
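Here is a minimal numerical sketch of such a test. The specific numbers (base radius `eps = 0.1`, a single spike of height `H = 10` and angular half-width `w = 0.2`) and the helper name `integrals_piecewise_linear` are hypothetical choices of mine, not the parameters used in the MathOverflow answer. The code uses only the first-order expressions: the length $\int \sqrt{r^2 + (r')^2} ~d\theta$ in place of $\int \kappa r^2 ~d\theta$, and the integrated-by-parts form of $\int \kappa ~d\theta$.

```python
import math

def integrals_piecewise_linear(knots, samples_per_piece=20000):
    """Return (length, int r^2 dtheta, int kappa dtheta) for a piecewise-linear r.

    knots is a list of (theta, r) pairs covering one full period [0, 2*pi].
    Each linear piece is integrated by the midpoint rule, using only r and r'.
    """
    length = r_squared = kappa_total = 0.0
    for (t0, r0), (t1, r1) in zip(knots, knots[1:]):
        slope = (r1 - r0) / (t1 - t0)        # r' is constant on each piece
        h = (t1 - t0) / samples_per_piece
        for i in range(samples_per_piece):
            r = r0 + slope * (i + 0.5) * h   # r at the midpoint of the cell
            speed = math.hypot(r, slope)     # sqrt(r^2 + r'^2)
            length += speed * h
            r_squared += r * r * h
            kappa_total += (r * r - slope * slope) / (r * r * speed) * h
    return length, r_squared, kappa_total

# Hypothetical parameters: r = eps except for one linear spike centered at pi.
eps, H, w = 0.1, 10.0, 0.2
center = math.pi
knots = [(0.0, eps), (center - w, eps), (center, H),
         (center + w, eps), (2 * math.pi, eps)]

length, r_squared, kappa_total = integrals_piecewise_linear(knots)
lhs = 2 * math.pi * length       # 2*pi * int kappa r^2, via the length identity
rhs = kappa_total * r_squared    # int kappa dtheta * int r^2 dtheta
print(f"LHS = {lhs:.1f}, RHS = {rhs:.1f}, conjecture holds: {lhs >= rhs}")
```

With these parameters the left hand side comes out near 129 and the right hand side near 528, so the conjectured inequality fails by a wide margin. The spike itself contributes negatively to the curvature integral, but the flat part contributes roughly $2\pi/\epsilon$, which dominates.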
Step 5: Huh?
This last statement should give you pause. If we have tall linear spikes, shouldn't the enclosed area grow only linearly in the spike height, and not quadratically?
The reason is that the area integral in polar form is $\frac{1}{2}\int r^2~d\theta$; compare this to the area under a curve in Cartesian form, $\int f ~dx$. The building blocks of area integration in polar form are similar pizza wedges, whose area increases quadratically as the height increases. The building blocks of area integration in Cartesian form are rectangles, with the width and height decoupled.
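To make this concrete, here is a worked computation under assumptions of my own choosing: a single spike in which $r$ rises linearly from $\epsilon$ to $H$ and back down, over an angular half-width $w$. On each side of the spike $d\theta = (w / (H - \epsilon)) ~dr$, so the polar area swept out by the spike is \[ \frac{1}{2} \int_{\text{spike}} r^2 ~d\theta = \frac{w}{H - \epsilon} \int_\epsilon^H r^2 ~dr = \frac{w}{3} \left( H^2 + H\epsilon + \epsilon^2 \right), \] which is quadratic in $H$ for fixed $w$. By contrast, a Cartesian triangle of half-width $w$ and height $H$ has area $wH$, which is linear in $H$.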
Okay, so with this sanity check out of the way, can we get a better understanding of what is going on with this specific example? Our computations involved some integration by parts tricks to get rid of the second derivative $r''$, but can we get the same result without this trick? And thereby reason more geometrically?
The idea is that the proof of Gauss-Bonnet even for non-smooth figures has a very clear geometric interpretation: where there is a corner you just add the angle deficit coming from the corner as a discrete value. If we can reproduce this for the $\int \kappa ~d\theta$ integral, maybe we can understand better what is going on and come up with more intuitive counterexamples.
This is one of the interesting parts of doing research. Once you get a result (either positive or negative), the first question to ask is "why does the result work?" Sometimes the reason is obvious: you just followed your nose all the way from the beginning to the end of the problem. But sometimes the reason is more subtle, since you took a number of detours in the process (like I did for this problem). To think like a mathematician is to be dissatisfied with just knowing the result. Someone once told me that a mathematical result should not be considered "proven" until we can find an argument that makes the result "obviously true".
This is what I did in the end of the post over at MathOverflow.
The kernel of truth in this whole thing turns out to be the following observation: for any positive angle, \[ \sin(\theta) < \theta \] So when evaluating the curvature integral $\int \kappa ~d\theta$, at any parameter $\theta$ where the osculating circle at $r(\theta)$ does not have center at the origin of the plane, the infinitesimal element $|\kappa ~d\theta|$ will be less than $|\kappa ~ds / r|$.
So in particular, turns that generate negative curvature will count a little bit less than turns that generate positive curvature.
Once we realize this "feature" (in a quantitative way), it is easy to construct a geometric figure using only circular arcs and straight lines and corners and obtain an "intuitive" counterexample.