Let $S = \{(x,y) \in \mathbb{R}^n \times \mathbb{R}^n:|x|=|y|=1\}$ and let $F: S \to \mathbb{R}$, $F(x,y) = x \cdot y$. Use the Lagrange multiplier theorem to find $\sup\limits_{(x,y) \in S} F(x,y)$.
Now using the multiple constraint Lagrange multiplier theorem, this should follow rather easily. My question is, can I avoid it and just use the single constraint Lagrange multiplier theorem?
Since $S$ is closed and bounded, and $F$ is continuous, it achieves a max on $S$, say at $(x_0,y_0)$. Now let $S' = \{x \in \mathbb{R}^n: |x|=1\}$ and define $f: \mathbb{R}^n \to \mathbb{R}$ by $f(x) = x \cdot y_0$. Also define $g:\mathbb{R}^n \to \mathbb{R}$ by $g(x) = |x|^2-1$. Now by the Lagrange multiplier theorem, I get that $\nabla f(x_0) = \lambda \nabla g(x_0)$, which implies that
$$y_j^0 = 2 \lambda x_j^0,$$
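which, after taking norms and using $|x_0|=|y_0|=1$, would imply that $|\lambda| = \frac{1}{2}$, i.e., $x_0 = \pm y_0$. Since $F(x_0,y_0)$ is a maximum, this gives $\lambda = \frac{1}{2}$ and $x_0 = y_0$. So the maximum of $F$ would be $1$.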
Is this a valid approach? Is it obvious that the maximum of $f$ on $S'$ (also closed and bounded) will be at $x_0$?
It is of course obvious that the maximum of $F$ on $S$ is $1$, and that this maximal value is attained at all diagonal pairs $(z,z)\in S$. This means that we do not have a discrete set of conditionally stationary points $\bigl(x^{(j)},y^{(j)}\bigr)$ $(1\leq j\leq m)$, but a continuum of them.
When we feign not to see this a priori then we have $2$ constraints in a $2n$-dimensional situation, i.e., we should set up the "Lagrangian" $$\Phi(x,y,\lambda,\mu):=F(x,y)-\lambda\bigl(|x|^2-1\bigr)-\mu\bigl(|y|^2-1\bigr)\ ,$$ then proceed according to the book: $$\Phi_{.x_i}=y_i-2\lambda x_i\quad(1\leq i\leq n),\qquad \Phi_{.y_k}=x_k-2\mu y_k\quad(1\leq k\leq n)\ .$$ Setting these partial derivatives to zero leads two times to $x\parallel y$, and together with $|x|=|y|=1$ this means $x=y$ or $x=-y$.
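As a sketch of how the stationarity system can be solved explicitly, assume the Lagrangian above, with $\lambda$ attached to the constraint $|x|^2=1$ and $\mu$ attached to $|y|^2=1$:

$$\begin{aligned}
&y = 2\lambda x,\qquad x = 2\mu y,\qquad |x|=|y|=1\\
&\Longrightarrow\quad |\lambda| = |\mu| = \tfrac12 \quad\text{(taking norms)},\\
&\Longrightarrow\quad x = 2\mu\,(2\lambda x) = 4\lambda\mu\, x \quad\Longrightarrow\quad \lambda\mu = \tfrac14 \quad\Longrightarrow\quad \lambda=\mu=\pm\tfrac12,\\
&\lambda=\mu=\tfrac12:\quad y = x,\quad F(x,x) = |x|^2 = 1,\\
&\lambda=\mu=-\tfrac12:\quad y = -x,\quad F(x,-x) = -|x|^2 = -1 .
\end{aligned}$$

So the conditionally stationary points are exactly the pairs $(z,z)$ and $(z,-z)$ with $|z|=1$, and comparing the two values gives $\sup_S F = 1$.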
In your approach you kept $y_0$ fixed. Normally you would obtain a few conditionally stationary points $x^{(j)}$, but these would belong to the chosen $y_0$. You would then compute the maximal $F$-value $F(x^{(j)},y_0)$ and obtain an auxiliary function $y_0\mapsto \phi(y_0)$, which you then would have to maximize again with respect to $y_0$.
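In this particular problem the two-stage procedure can be carried out by hand. The following is only a sketch; the name $f_{y_0}$ is notation introduced here for the function $x\mapsto x\cdot y_0$ with $y_0$ held fixed on the unit sphere:

$$\begin{aligned}
&f_{y_0}(x) := x\cdot y_0,\qquad |x|=1,\\
&\nabla f_{y_0}(x) = 2\lambda x \iff y_0 = 2\lambda x \iff x = \pm y_0 \quad(\text{since } |x|=|y_0|=1),\\
&\phi(y_0) = \max\bigl\{F(y_0,y_0),\,F(-y_0,y_0)\bigr\} = \max\{1,-1\} = 1 .
\end{aligned}$$

Here $\phi$ happens to be constant on the unit sphere, so the second maximization over $y_0$ is trivial and again yields the value $1$. In a less symmetric problem $\phi$ would depend on $y_0$ and the second step would require real work.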