Consider a function $L: \mathbb{R}^n \times H \to \mathbb{R}$ with minimal assumptions on $L$. Here, $H$ is some Hilbert space and we may assume its elements are smooth functions, so in one sense $L$ may be thought of as a functional on $H$ indexed by a real vector. I am wondering whether the global maximum of this object can be found by sequentially optimizing its arguments. If the second argument $f$ were finite-dimensional this would be the case under minimal assumptions on $L$, but with $f$ infinite-dimensional it is no longer obvious to me.
If we define $\hat{f}(x) \equiv \arg\max_{f \in H} L(x, f)$ for a given $x$, is it true that the global maximum of $L$ is given by $\max_x L(x, \hat{f}(x))$? I have been stuck trying to show this, so any pointer would be appreciated.
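In symbols, the question is whether the following iterated-optimization identity holds (written with suprema, which sidesteps the question of whether the maxima are attained):

```latex
\[
  \sup_{(x, f) \in \mathbb{R}^n \times H} L(x, f)
  \;=\;
  \sup_{x \in \mathbb{R}^n} \; \sup_{f \in H} L(x, f).
\]
```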
I may be missing some subtlety here...
I will assume $\hat{f}$ is well-defined, i.e. that $\max_{f \in H} L(x, f)$ is attained at some $f$ for each $x$.
Suppose $L$ attains a maximum at $(x^*, f^*)$.
We have $L(x^*, f^*) \ge L(x^*, \hat{f}(x^*))$ because $(x^*, f^*)$ is a maximizer. We also have $L(x^*, f^*) \le L(x^*, \hat{f}(x^*))$ by the definition of $\hat{f}(x^*)$. So $$L(x^*, f^*) = L(x^*, \hat{f}(x^*)) \tag{1}.$$
Now suppose $x_0$ is a maximizer of $x \mapsto L(x, \hat{f}(x))$. We have $L(x^*, f^*) \ge L(x_0, \hat{f}(x_0))$ because $(x^*, f^*)$ is a maximizer of $L$. We also have $L(x_0, \hat{f}(x_0)) \ge L(x^*, \hat{f}(x^*)) = L(x^*, f^*)$ by the definition of $x_0$ and by equation (1) above. Thus $$L(x^*, f^*) = L(x_0, \hat{f}(x_0)) = \max_x L(x, \hat{f}(x)).$$
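As a quick sanity check of the argument, here is a toy finite discretization (the grids are made up, standing in for $\mathbb{R}^n$ and $H$; on finite sets the iterated-maximization identity is exact, and the same two-inequality squeeze applies):

```python
import numpy as np

# Toy finite version: L sampled on a grid, rows indexing x, columns indexing f.
rng = np.random.default_rng(0)
L = rng.standard_normal((5, 7))

joint_max = L.max()                 # max over (x, f) jointly
f_hat = L.argmax(axis=1)            # inner maximizer f-hat(x) for each row x
sequential_max = max(L[i, f_hat[i]] for i in range(L.shape[0]))

print(joint_max == sequential_max)  # the two procedures agree
```

The joint maximum is some entry $L[i^*, j^*]$; the sequential procedure recovers it because row $i^*$'s inner maximum is at least $L[i^*, j^*]$, and no row maximum can exceed the joint maximum.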
I'm not certain that the above holds, but hopefully this can help someone clarify where things break.