I have a few questions. Thoughts on any of them would be appreciated.
Suppose $S\subseteq M$ is an embedded submanifold of $M$. There is a convenient characterization of the tangent spaces of $S$, given by $$T_{p}S = \{ v\in T_{p}M \;|\; v(f) = 0 \text{ for all } f\in C^{\infty}(M) \text{ constant on } S \}. $$ Under this identification, $T_{p}S$ is a subspace of $T_{p}M$. Furthermore, if $\iota: S\hookrightarrow M$ is the inclusion map, we know that the pushforward $\iota_{*}:T_{p}S\rightarrow T_{p}M$ is also the inclusion map.
$1)$ Does there exist as nice of a characterization of the cotangent space?
$2)$ I know the pullback $\iota^{*}: T_{p}^{*}M\rightarrow T_{p}^{*}S$ takes things in the "opposite" direction, so I always assumed it would just be a linear projection map... but which one? There can be infinitely many projections onto a subspace.
To show why this is such a confusing matter, I thought to look at an example.
Let $M = \mathbb{R}^{2}$ and $S = \mathbb{R}\times\{0\}$.
When we use the chart $(\mathbb{R}^{2}, \text{id})$ with coordinates denoted as $(x, y)$, we have $$ T_{0}M = \text{Span}(\partial_{x}, \partial_{y}), \quad T_{0}S = \text{Span}(\partial_{x}). $$ Now I assume the reasonable identification of $T_{0}^{*}S$ would be found by taking the basis vector $\partial_{x}$ of $T_{0}S$ and then taking its dual $\text{d}x$. This way I get $$ T_{0}^{*}M = \text{Span}(\text{d} x, \text{d} y), \quad T_{0}^{*}S = \text{Span}(\text{d} x). $$ However, this identification of $T_{p}^{*}S$ doesn't seem consistent or unique if you use another basis.
Consider the chart $(\mathbb{R}^{2}, f)$ where $f(x, y) = (u, v) = (x - y, y)$ and $f^{-1}(u, v) = (x, y) = (u+v, v)$, so the new coordinates $(u, v)$ are skewed. I calculate that \begin{align*} \begin{cases} \partial_{u} = \partial_{x}, \\ \partial_{v} = \partial_{x} + \partial_{y} , \end{cases} \quad\text{ and }\quad \begin{cases} \text{d}u = \text{d}x - \text{d}y, \\ \text{d}v = \text{d}y. \end{cases} \end{align*} This gives me \begin{align*} & T_{0}M = \text{Span}(\partial_{u}, \partial_{v}), \quad T_{0}S = \text{Span}(\partial_{u}), \\ & T_{0}^{*}M = \text{Span}(\text{d}u, \text{d}v), \quad T_{0}^{*}S = \text{Span}(\text{d}u + \text{d}v). \end{align*} However, if we take the basis vector $\partial_{u}$ of $T_{0}S$ and then take its dual vector $\text{d}u$, we should obtain $T_{0}^{*}S = \text{Span}(\text{d}u)$, which does not match the above.
The above example makes me think that there is no canonical (i.e. basis independent) way to identify $T_{p}^{*}S$ as a subspace of $T_{p}^{*}M$. Am I correct in thinking this?
This is essentially a question about linear algebra: the fact that the objects here arise in a differential-geometric setting plays no real role, but for convenience I'll still use the language of that setting.
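Given a smooth embedding $\iota : S \hookrightarrow M$ and a point $p \in S$, there is no preferred inclusion $T_p^* S \to T_p^* M$. In particular, you are right that there is no canonical (basis-independent) way to identify $T_p^* S$ with a subspace of $T_p^* M$.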
This is essentially for the reason you say: Given an element $\alpha \in T_p^* S$, that is, a linear map $T_p S \to \Bbb R$, there are many ways to extend it to a linear map $T_p M \to \Bbb R$, and without more data there is no way to choose one canonically; $\alpha$ only contains information about directions in $T_p S$.
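To make this concrete in your example (a sketch in your coordinates $(x, y)$ on $M = \mathbb{R}^{2}$, with $S$ the $x$-axis): let $\alpha \in T_0^* S$ be the functional with $\alpha(\partial_x) = 1$. For every $c \in \mathbb{R}$, the covector $$\tilde\alpha_c = \text{d}x + c \, \text{d}y \in T_0^* M$$ satisfies $\tilde\alpha_c(\partial_x) = 1$, so each $\tilde\alpha_c$ extends $\alpha$. The choice $c = 0$ gives $\text{d}x$, and the choice $c = -1$ gives $\text{d}u = \text{d}x - \text{d}y$. These are exactly the two candidates you found, and nothing in the linear structure alone prefers one over the other.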
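On the other hand, as you say, the inclusion $\iota$ induces a map $$\iota_p^* : T_p^* M \twoheadrightarrow T_p^* S$$ (the double arrow, $\twoheadrightarrow$, denotes that the map is surjective). Concretely, $\iota_p^*$ is just restriction, $\alpha \mapsto \alpha \circ T_p \iota = \alpha\vert_{T_p S}$, so it is a surjection onto a quotient of $T_p^* M$ rather than a projection onto a subspace of it.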
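The map $\iota_0^*$ can be computed in your example in both coordinate systems (a worked check, using only $T_0 S = \text{Span}(\partial_x) = \text{Span}(\partial_u)$ from your question): $$\iota_0^*(a \, \text{d}x + b \, \text{d}y)(\partial_x) = a, \qquad \iota_0^*(a' \, \text{d}u + b' \, \text{d}v)(\partial_u) = a' .$$ Since $\text{d}x = \text{d}u + \text{d}v$ and $\text{d}y = \text{d}v$, we have $a \, \text{d}x + b \, \text{d}y = a \, \text{d}u + (a + b) \, \text{d}v$, so $a' = a$ and the two computations agree. The pullback itself is therefore independent of the chart; what changed between your two computations was only the choice of a preimage in $T_0^* M$ ($\text{d}x$ versus $\text{d}u$), and both of those restrict to the same element of $T_0^* S$.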
There is a canonical inclusion in the cotangent picture, however. The map dual to the canonical quotient $$\pi_p : T_p M \twoheadrightarrow T_p M / T_p S$$ is the inclusion $$\pi_p^* : (T_p M / T_p S)^* \hookrightarrow T_p^* M .$$ By definition, $\pi_p^* : \beta \mapsto \beta \circ \pi_p$. Just as we can identify the elements in the image $T_p\iota(T_p S)$ precisely with the elements of $T_p M$ that are tangent to $\iota(S)$, we can identify the elements in the image $\pi_p^*((T_p M / T_p S)^*)$ precisely with the elements of $T_p^* M$ that annihilate $T_p S$; we denote the set of such elements by $(T_p S)^{\perp}$.
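In your example the annihilator is easy to write down (again a sketch in your notation). A covector $a \, \text{d}x + b \, \text{d}y$ vanishes on $T_0 S = \text{Span}(\partial_x)$ exactly when $a = 0$, so $$(T_0 S)^{\perp} = \text{Span}(\text{d}y) = \text{Span}(\text{d}v) .$$ Repeating the computation in the $(u, v)$ chart, $a' \, \text{d}u + b' \, \text{d}v$ vanishes on $\partial_u$ exactly when $a' = 0$, which gives the same line in $T_0^* M$. Unlike a complement of this line, the line itself does not depend on the chart.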
All of the above maps fit together in a compact picture, by the way: $T_p \iota$ and $\pi_p$ give a short exact sequence, $$0 \to T_p S \stackrel{T_p \iota}{\hookrightarrow} T_p M \stackrel{\pi_p}{\twoheadrightarrow} T_p M / T_p S \to 0 .$$ (Here, exactness just means that the image of every map in the sequence coincides with the kernel of the following map.) Dualizing gives a dual short exact sequence, $$0 \to (T_p M / T_p S)^* \stackrel{\pi_p^*}{\hookrightarrow} T_p^* M \stackrel{\iota_p^*}{\twoheadrightarrow} T_p^* S \to 0 .$$ In particular, $\ker \iota_p^* = (T_p S)^{\perp}$, so $T_p^* S$ is canonically the quotient $T_p^* M / (T_p S)^{\perp}$.
In the presence of a Riemannian metric $g$, however, we can identify $T_p M \leftrightarrow T_p^* M$ via the index-lowering operator, $\,\cdot\,^{\flat} : X \mapsto g_p(X, \,\cdot\,)$, and its inverse, the index-raising operator $\,\cdot\,^{\sharp}$; likewise, the induced metric $\iota^* g$ on $S$ identifies $T_p S \leftrightarrow T_p^* S$. Thus, $g$ gives us what we didn't have before: a preferred injective map $T_p^* S \hookrightarrow T_p^* M$, namely, $$T_p^* S \stackrel{\sharp}{\to} T_p S \stackrel{T_p \iota}{\hookrightarrow} T_p M \stackrel{\flat}{\to} T_p^* M ,$$ where the first $\sharp$ is taken with respect to $\iota^* g$. It's perhaps easier to describe the dual map $T_p M \twoheadrightarrow T_p S$, $$T_p M \stackrel{\flat}{\to} T_p^* M \stackrel{\iota_p^*}{\twoheadrightarrow} T_p^* S \stackrel{\sharp}{\to} T_p S .$$ This is nothing more than the orthogonal projection from $T_p M$ to $T_p S$ (with respect to $g$).
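Your example shows how the answer depends on the metric. As a sketch, compare the Euclidean metric $g$, for which $\partial_x, \partial_y$ are orthonormal, with a second metric $g'$ (introduced here purely for illustration) for which $\partial_u, \partial_v$ are orthonormal. Writing $\alpha \in T_0^* S$ for the functional with $\alpha(\partial_x) = 1$, the composite above gives $$g : \ \alpha \mapsto \partial_x \mapsto \text{d}x, \qquad g' : \ \alpha \mapsto \partial_u \mapsto \text{d}u = \text{d}x - \text{d}y .$$ The corresponding orthogonal projections $T_0 M \twoheadrightarrow T_0 S$ also differ: $$g : \ a \, \partial_x + b \, \partial_y \mapsto a \, \partial_x, \qquad g' : \ a \, \partial_x + b \, \partial_y = (a - b) \, \partial_u + b \, \partial_v \mapsto (a - b) \, \partial_x .$$ So each of your two candidate subspaces, $\text{Span}(\text{d}x)$ and $\text{Span}(\text{d}u)$, is the preferred one for some metric, and choosing a metric is precisely what picks out "which projection".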