In my book of classical mechanics (Mathematical methods of classical mechanics by V.I. Arnold), the Hamiltonian is introduced in this way (my translation):
Let us consider the system of Lagrange equations $\dot p = \partial L /\partial q$, where $p = \partial L /\partial \dot q$ ($p\in \mathbb R^n$, $q\in \mathbb R ^n$, the right-hand side of the second relation is the gradient of the Lagrangian with respect to $\dot q$), defined by a Lagrangian that we will suppose convex with respect to the second argument $\dot q$.
[...]
By definition, the Legendre transform in $\dot q$ of $L(q,\dot q ,t)$ is a function $H(p)=p\dot q-L(\dot q)$, where $\dot q$ is given by the relation: $$p=\dfrac{\partial L}{\partial \dot q}.$$
Now, my definition of the Legendre transform of a function $f:\mathbb R ^n \to \mathbb R $ is: $$g(p)=\sup _{x\in \mathbb R^n} (\langle p,x\rangle-f(x)).$$ I can see that the quoted definition coincides with mine if, for example, $f$ is a quadratic form $$f(x)=x^T A x$$ with $A$ a positive definite symmetric matrix. In the general case of a convex $f$ (which, to clarify, here means “positive definite Hessian matrix”), however, I don't see what guarantees that:
- The maximum is attained at a point $x\in \mathbb R ^n$ (we should at least require that $f$ is coercive, right? Counterexample: $f(x)=-\ln x$ for $x>0$)
- The equation $p=\partial f / \partial x$ has a unique solution.
What are sufficient (or maybe necessary and sufficient) conditions for the above formulas to properly define a function that coincides with the Legendre transform of $L$?
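To spell out the quadratic case mentioned above (assuming $A$ is symmetric and positive definite): the gradient condition reads $p = 2Ax$, so $x=\tfrac12 A^{-1}p$ is the unique critical point, and since $x\mapsto \langle p,x\rangle - f(x)$ is strictly concave, it is the maximizer. Substituting, $$g(p)=\left\langle p,\tfrac12 A^{-1}p\right\rangle-\tfrac14\, p^T A^{-1} A A^{-1} p=\tfrac12\, p^T A^{-1} p-\tfrac14\, p^T A^{-1} p=\tfrac14\, p^T A^{-1} p,$$ so here the supremum and the "solve $p=\partial f/\partial x$" recipe give the same function.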
I've made some progress:
Suppose that $f:\mathbb R ^n \to \mathbb R$ has a positive definite Hessian matrix $f''(x)$ for all $x$. Also suppose that $$\lim _{|x|\to \infty } \frac{f(x)}{|x|} = \infty .$$ Then the transform $f^*$ exists and is given by $$f^*(p)=\left\langle p,\xi (p)\right\rangle-f(\xi (p)),$$ where $\xi$ is the inverse of $f'$.
Indeed, if the limit holds, one can easily see that $G_p (x) = f(x)-\left\langle x,p\right\rangle$ is coercive and attains a minimum in $\mathbb R ^n$, whose value is $-f^*(p)$. At a minimizer the derivative of $G_p$ vanishes, so $f'(x)=p$ (this also proves that $f'$ is surjective). Finally, since $f''>0$, $f'$ is injective, so it has an inverse $\xi$, and the Legendre transform is as stated above.
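As a sanity check of the formula above, here is a minimal numerical sketch in one dimension. The test function $f(x)=x^4/4+x^2/2$ (which has $f''>0$ and superlinear growth), the Newton solver `xi`, and the grid parameters are illustrative assumptions of mine, not part of the statement. The sketch compares $\langle p,\xi(p)\rangle-f(\xi(p))$ with a brute-force supremum over a grid.

```python
def f(x):
    # Illustrative test function: strictly convex, f(x)/|x| -> infinity.
    return x**4 / 4 + x**2 / 2


def df(x):
    # First derivative f'(x); strictly increasing and onto R.
    return x**3 + x


def d2f(x):
    # Second derivative f''(x) = 3x^2 + 1 > 0 everywhere.
    return 3 * x**2 + 1


def xi(p, tol=1e-12, max_iter=100):
    """Approximate the inverse of f' at p by Newton's method on f'(x) = p."""
    x = 0.0
    for _ in range(max_iter):
        step = (df(x) - p) / d2f(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def legendre_via_inverse(p):
    """f*(p) computed as p * xi(p) - f(xi(p))."""
    x = xi(p)
    return p * x - f(x)


def legendre_via_sup(p, radius=10.0, n=200001):
    """f*(p) approximated as the maximum of p*x - f(x) over a uniform grid.

    The grid [-radius, radius] is assumed wide enough to contain the maximizer.
    """
    best = float("-inf")
    for i in range(n):
        x = -radius + 2 * radius * i / (n - 1)
        best = max(best, p * x - f(x))
    return best


if __name__ == "__main__":
    for p in (-3.0, 0.5, 2.0):
        print(p, legendre_via_inverse(p), legendre_via_sup(p))
```

For $p=2$ the equation $x^3+x=2$ has the solution $x=1$, so both routes should return a value close to $2\cdot 1-\tfrac34=\tfrac54$. This only checks one example numerically and proves nothing in general.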
It is clear from your examples that the supremum need not be attained (or even be finite).
However, the solution to the equation $\frac{\partial L}{\partial \dot q} = p$, if it exists, is unique. Write $L(x,y)$ for the Lagrangian, with $y$ playing the role of $\dot q$. Because the Hessian of $L(x,y)$ with respect to $y$ is positive definite, the function $f:\mathbb R \to \mathbb R$, $t\mapsto L(x,y_0+t(y_1-y_0))$ satisfies $f''>0$ whenever $y_0 \ne y_1$. Hence $f'$ is strictly increasing, so $f'(0) \ne f'(1)$. Since $f'(t) = (y_1-y_0) \cdot \frac{\partial L}{\partial y}(x,y_0+t(y_1-y_0))$, it follows that $\frac{\partial L}{\partial y}(x,y_0) \ne \frac{\partial L}{\partial y}(x,y_1)$.
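For a concrete one-dimensional illustration of uniqueness without existence (my own example, not taken from the question), take $h(y)=\sqrt{1+y^2}$. Then $h''(y)=(1+y^2)^{-3/2}>0$, but $h'(y)=y/\sqrt{1+y^2}$ only takes values in $(-1,1)$. So $h'(y)=p$ has exactly one solution, $y=p/\sqrt{1-p^2}$, when $|p|<1$, and none when $|p|\ge 1$. Correspondingly, $$\sup_{y\in\mathbb R}\,\bigl(py-h(y)\bigr)=\begin{cases}-\sqrt{1-p^2}, & |p|<1 \text{ (attained)},\\ 0, & |p|=1 \text{ (not attained)},\\ +\infty, & |p|>1.\end{cases}$$ Here $h$ grows only linearly, so the growth condition $h(y)/|y|\to\infty$ from your partial result fails.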
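You have shown that coercivity implies existence. What remains is the converse: if $\frac{\partial L}{\partial y}(x,y) = p$ has a solution for every $p$, then $L$ is coercive in $y$.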
Since the Hessian is positive definite, by the inverse function theorem the map $y \mapsto \frac{\partial L}{\partial y}(x,y)$ is locally invertible with a continuous local inverse. Since the map is also injective, it is a homeomorphism from $\mathbb R^n$ onto its image.
For now, write $G(y) = \frac{\partial L}{\partial y}(x,y)$.
Pick $M > 0$. Then for every $z \in \mathbb R^n$ with $|z|\le M$, there exists a unique $y_z$ such that $G(y_z) = z$. The map $z \mapsto y_z$ is well defined and continuous. Hence $N = \sup_{|z|\le M} |y_z|$ exists. In other words, $G^{-1}(\{|z|\le M\})$ is compact and hence bounded, and it is contained in the ball of radius $N$. Hence its complement $G^{-1}(\{|z|>M\})$ is unbounded. Therefore if $|y|>N$, then $|G(y)| > M$.
Now suppose that $|y| > N$. Follow the path defined by the ODE $\eta(0) = y$, $\eta'(t) = -G(\eta(t))/|G(\eta(t))|$, along which $L(x,\eta(t))$ decreases at rate $|G(\eta(t))|$. Let $T = \inf\{t:|G(\eta(t))| = M\}$ (and it's OK if $T = \infty$, but as it happens it won't be). Since $\eta$ moves at unit speed and $|G(\eta(t))| \ge M$ whenever $|\eta(t)| > N$, we get $T \ge |y|-N$. Then it can be seen that $L(x,y) \ge L(x,\eta(T)) + T M \ge T M + \inf_{\xi} L(x,\xi)$. Hence if $|y|$ is large enough, then $L(x,y) \ge \frac12M |y|$. Since $M$ was arbitrary, $L(x,y)$ is coercive in $y$.
Kind of a complicated argument. Maybe there is something simpler.