I've been going through notes on Morse theory and handlebody theory, and I'm having trouble with the definition of the Hessian they give. The notes are on pages 3-4 here: http://people.math.gatech.edu/~etnyre/preprints/papers/PCMI-HandlebodyA-B.pdf
For the setup, we have a manifold $W$, a point $p$ and a map $f : W \to \mathbb{R}$ that is Morse. My current definition of Morse is that at every critical point $p$, $df$ is transverse to the zero section $Z$. Considering $df$ as a map $W \to T^*W$, we can take the derivative of this map at the point $p$, i.e. $d_p(df):T_pW \to T_{df(p)}(T^*W)$. Since $p$ is a critical point, $df(p) = (p,0)$, so $d_p(df): T_pW \to T_{(p,0)}(T^*W)$. We also have a splitting $T_{(p,0)}(T^* W) \cong T_{(p,0)}(Z) \oplus T_p^*W$, which I am also comfortable with. This gives us a projection map $C_p: T_{(p,0)}(T^* W) \to T_p^*W$. Finally, the Hessian at the point $p$ is defined as $(d^2f)_p : T_pW \times T_pW \to \mathbb{R}$ by $(v,w) \mapsto C_p(d_p(df(v)))(w)$. I sort of understand what's going on, but not really, and I'm not able to show that this is symmetric (though I think the bilinearity is obvious).
I want to show that $C_p(d_p(df(v)))(w) = C_p(d_p(df(w)))(v)$.
My first thought is to use local coordinates. Let $U$ be a coordinate patch for $W$, so that we have $f: U \to \mathbb{R}$. In local coordinates, $df(p) = (p, \sum_i \frac{\partial f}{\partial x^i} dx^i)$, as a map from $W \to T^* W$. Now I'm going to try to compute $d_p(df):T_pW \to T_{(p,0)} (T^* W)$. This just differentiates with respect to coordinates on $W$, so I get $(\sum_i \frac{\partial p}{\partial x^i} dx^i, \sum_j \sum_i \frac{\partial^2 f}{\partial x^i \partial x^j} dx^i dx^j)$. This is where I start to get really confused. Is there supposed to be a wedge between $dx^i$ and $dx^j$? I feel like there shouldn't be, but isn't that expression meaningless without a wedge? Is it supposed to be $dx^i \otimes dx^j$? Anyway, moving on to projecting using $C_p$ gives me $C_p(d_p(df)) = \sum_j \sum_i \frac{\partial^2 f}{\partial x^i \partial x^j} dx^i dx^j$, which does indeed look like the Hessian in local coordinates. I know the definition wants me to contract this with $v = \sum v^i \frac{\partial}{\partial x^i}$ to get a covector and then apply it to $w = \sum w^i \frac{\partial}{\partial x^i}$. Either way you get $\sum_j \sum_i \frac{\partial^2 f}{\partial x^i \partial x^j} v^i w^j$. I feel like this only works if I had $dx^i \otimes dx^j$ in my computation. Can someone help clear this up?
I think you're getting understandably confused by the notation for the coordinates. Let's use $x^i$ for the coordinates on $W$ and $v^j$ for the vertical coordinates on $T^\ast W$ (you can think of $v^j$ as the coefficient on $dx^j$, but let's not write $dx^j$ yet). We're identifying $T^\ast U$ with $U \times \mathbb{R}^n$. The vertical part (the $\mathbb{R}^n$ part) of $df$ is expressed by $v^j = \frac{\partial f}{\partial x^j}$ (and the horizontal part is the identity map).
Now, $d_p(df): T_p W \to T_{(p,0)}(T^\ast W) \cong T_p W \oplus T_p^\ast W$ is a linear map between vector spaces. Let's make it clear that we're using the basis $\{ \dots, \frac{\partial}{\partial x^i}, \dots \}$ for $T_p W$ and the basis $\{\dots, dx^j, \dots \}$ for $T_p^\ast W$. The map $d_p(df)$ sends the basis vector $\frac{\partial}{\partial x^i}$ to $\frac{\partial}{\partial x^i} \oplus \sum_j \frac{\partial v^j}{\partial x^i} dx^j$. After projecting onto the second factor, we obtain the linear map $T_p W \to T_p^\ast W$ that sends $\frac{\partial}{\partial x^i}$ to $\sum_j \frac{\partial v^j}{\partial x^i} dx^j = \sum_j \frac{\partial^2 f}{\partial x^i \partial x^j} dx^j$. If you like, you can view this as an element of $T_p^\ast W \otimes T_p^\ast W$ and write it as $\sum_{i,j} \frac{\partial^2 f}{\partial x^i \partial x^j} dx^i \otimes dx^j$, as you suggested.
Note that under the identification $T_{(p,0)}(T^\ast W) \cong T_p W \oplus T_p^\ast W$, we are identifying the vertical tangent vector $\frac{\partial}{\partial v^j}$ with $dx^j$. Also, I am writing $T_p W$ instead of $T_{(p,0)} Z$ since we can identify the zero section $Z$ with $W$.
By the way, if I'm understanding this correctly, I think the notation should be $(d_p(df))(v)$, not $d_p(df(v))$.
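To spell out the symmetry you asked about, here is a sketch of the computation in these coordinates. I'm assuming $f$ is at least $C^2$ near $p$, and I'm writing $v = \sum_i a^i \frac{\partial}{\partial x^i}$ and $w = \sum_j b^j \frac{\partial}{\partial x^j}$; the component names $a^i$, $b^j$ are my own choice, made only to avoid a clash with the vertical coordinates $v^j$ above.

$$
\begin{aligned}
C_p\big((d_p(df))(v)\big) &= \sum_{i,j} a^i \, \frac{\partial^2 f}{\partial x^i \partial x^j}(p) \, dx^j, \\
(d^2 f)_p(v,w) &= \sum_{i,j} \frac{\partial^2 f}{\partial x^i \partial x^j}(p) \, a^i b^j
= \sum_{i,j} \frac{\partial^2 f}{\partial x^j \partial x^i}(p) \, b^j a^i
= (d^2 f)_p(w,v).
\end{aligned}
$$

The middle equality is just equality of mixed partial derivatives, so symmetry of the Hessian comes down to that fact once the projection $C_p$ has stripped off the horizontal part.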