Suppose $a,b$ are constants. $x,y$ are variables.
Simplify the maximization problem of this kind:
$\arg\max_y[(\arg\max_xf(x,y,a))\cdot(\arg\max_xf(x,y,b))]$.
Motivations:
- The following maximization problem with two "max" operator can be easily reduced: $\max_x\max_yf(x,y)=\max_{x,y}f(x,y)$.
But, wierd enough, the counterpart using "argmax" cannot be easily reduced. That is, $\arg\max_x\arg\max_yf(x,y)$ cannot be similarly reduced.
The functions involving $\arg\max$ seems to be always "misbehaved". Observe that we don't even have $\arg\max_x\arg\max_yf=\arg\max_y\arg\max_xf$.
- The reduction from two "argmaxs" to one "argmax" is very important practically because it saves computational power.
By definition argmax are the points in the domain of a function at which its values are maximized. In particular, $$ \arg\max_xf(x,y) $$ is the set of those points at which $f(\,.\,,y)$ is maximized. This set depends on $y\,.$ If that function has only a single maximum, say at $x_m$, we have $$ x_m(y)=\arg\max_xf(x,y)\, $$ and are able to define $$ \arg\max_y\arg\max_xf(x,y) $$ which is the set of points at which $x_m(y)$ is maximized. This is obviously something completely different than $$ \arg\max_{x,y}f(x,y) $$ (which is a set in $\mathbb R^2)\,.$ This is well defined. How to caculate it efficiently with a computer is clearly not achieved by calculating the univariate argmaxes as shown above.