I have the following optimization \begin{align} \max_{\|x\|^2\le1} \|Lx - y\| \end{align} where $L$ is a lower triangular and $y$ is a given vector. Does it admit a closed-form solution?
I am interested in the general case where $L$ may be not invertible and even not square, but any progress under some assumptions will be muchnappreciated.
The motivation is to understand the standard least squares method but as a game from the "adversarial" point of view: $y$ is a measurement sequence and $x$ is the set of possible states from $y = Lx + n$.
We can consider the following equivalent objective:
$$\|Lx - y\|^2=x^T\color{blue}{L^TL}x-2y^TLx+y^Ty.$$
As $L^TL$ is a symmetric (positive semidefinite) matrix, it admits the spectral decomposition: $$\color{blue}{L^TL = Q^T \Lambda Q }$$ for some orthogonal matrix $Q$ with $Q^TQ=I$ and diagonal matrix $\Lambda \, (\ge 0)$. Then, defining $\color{blue}{z=Qx}$ and $\color{blue}{b:=2QL^Ty}$, the problem is equivalent to:
$$\color{blue}{\max_{\|z\|^2=1} \quad z^T\Lambda z-b^Tz+y^Ty }.$$
Note that here we consider the equivalent constraint $$\|x\|^2=x^Tx=x^TIx=x^TQ^TQx=z^Tz=\|z\|^2=1.$$ As the problem is convex maximization (maximization of the convex quadratic function $z^T\Lambda z-b^Tz+y^Ty$ over the convex set $\|z\|^2 \le 1$), there is an optimal solution on the boundary (at some extreme point), so the constraint $\|z\|^2 \le 1$ can be replaced with $\|z\|^2 = 1$.
For the $\color{blue}{\text{new problem}}$, from the KKT conditions, the maximizer is among normalized vectors $z$ (with $\|z\|^2=1$) that satisfy the following system:
$$\Lambda z-b+2\mu z=0 \equiv \\ \color{blue}{(\Lambda+2\mu I)z=b}$$
for some $\mu \in \mathbb R$ (note that as we consider the equality constraint $\|z\|^2 = 1$, $\mu$ can be any number in $\mathbb R$). Now we should find all the feasible solutions of this system (feasible KKT points) by noting that $\color{blue}{\Lambda+2\mu I}$ is a diagonal matrix, which simplifies the search procedure. All the feasible KKT points can be obtained in the following two steps:
If some element $i$ of $b$ is $0$, then either $z_i$ is $0$ or $-2\mu$ is equal to the $i$th element of $\Lambda$, i.e., $\mu=-\frac{\lambda_i}{2}$. The first scenario is investigated in Step 2. Under the second scenario, $z_j=\frac{b_j}{\lambda_j- \lambda_i}$ for those $j$ with $b_j \neq 0$, $z_j=0$ for all other $j \neq i$. If this vector satisfies $\|z\|^2=1$, then it is a feasible KKT point.
After fixing $z_j=0$ for any $j$ for which $b_j=0$, for the other $j$ we have $$z_j=\frac{b_j}{\lambda_j+2\mu}$$ where $\mu$ can be any solution of the following equation: $$\sum_{j\in[n]: b_j\neq 0} \left (\frac{b_j}{\lambda_j+2\mu} \right )^2=1.$$
After, fining all the feasible KKT points $z^{1},\dots, z^{m}$ (note that $m\ge 1$ as the feasible space is compact and the objective is continuous), we need to determine the one (or those) that maximizes the objective $z^T\Lambda z-b^Tz+y^Ty$, let us denote it by $z^*$. Finally, $x^*$ is any solution that satisfies $z^*=Qx^*$.
Remark 1: In Step 2, we need to find the roots $\mu$ of the following equation $$\sum_{j=1}^n \left (\frac{b_j}{\lambda_j+2\mu} \right )^2=1,$$ which involves a polynomial of degree $2n$. As the function has a special structure, it can be proven that it always has at least two roots and can have up to $2n$ roots (see here). Thus, when none of the elements of $b$ is zero, we can have up to $2n$ candidate feasible KKT points. If $k$ elements of $b$ are zero, the maximum number of roots is $2(n-k)$ and we can have up to $k$ feasible KKT points from Step 1. Thus, the number of candidate feasible KKT points obtained in Steps 1 and 2 can be from 2 up to $2n$.
Remark 2: For the special case of $b = 0$, there are up to $n$ candidate feasible KKT points, obtained in Step 1 by setting $-2\mu$ to each of $\lambda_j, j=1,...,n$ (which are the eigenvalues of $L^TL$). The $n$ feasible KKT points are $$z^1=e_1,\dots,z^n=e_n.$$ Note that in this case, the feasible KKT corresponding to the largest $\lambda_j$ (which is the largest eigenvalue of $L^TL$) becomes the optimal solution and the optimal value becomes $\lambda_{\max}(L^TL)$, i.e., $$\max_{\|x\|^2 \le 1} \, x^TL^TLx= \max_{\|z\|^2=1} \, z^T\Lambda z=\lambda_{\max}(L^TL),$$ which is a known result.
Remark 3 ($\color{red}{\text{Update 1}}$):
I just realized that, when none of the elements of $b$ is zero, the KKT point defined by the smallest root (denote it by $\mu_{min}$) of the equation:
$$\sum_{j=1}^n \left (\frac{b_j}{\lambda_j+2\mu} \right )^2=1,$$
is the optimal solution. Let me first show that the smallest root is always negative. As all $\lambda_j, j=1,\dots, n$ are non-negative, the above equation always has a negative root in the interval
$$\left (-\infty, -\frac{\max_{j \in [n] }(\lambda_j)}{2} \right), $$
by observing that the LHS of the equation tends to $0$ as $\mu \to -\infty$ and tends to $+\infty$ as $2\mu \to \left (-\max_{j \in [n] }(\lambda_j) \right )^-$. Hence, we always have
$$2\mu_{min}<-\max_{j=1,\dots, n}\lambda_j \le 0.$$
Indeed, the feasible KKT vector $\tilde{z}$ defined by $\mu_{min}$ with
$$\tilde{z}_j=\frac{b_j}{\lambda_j+2\mu_{min}}, j=1,\dots, n$$
is the only candidate whose sign is opposite to the sign of the vector $b$ (located in the opposite orthant), that is,
$$\color{blue}{\text{sgn} (\tilde{z} )=-\text{sgn}(b)}.$$
The above property allows the feasible KKT point $\tilde{z}$ determined by the smallest root $\mu$ to have the longest distance from the point $b$ among the other feasible KKT points.
Remark 4 ($\color{red}{\text{Update 2}}$):
Using the observation made in Remark 3, we can conclude that if for some $i$ we have
$$b_i=0$$ $$\lambda_i =\max_{j \in [n] }(\lambda_j)$$ $$\sum_{j \in [n]: \lambda_j \neq \lambda_i }\left(\frac{b_j}{\lambda_j-\lambda_i} \right )^2<1,$$
then the feasible KKT point obtained for $2\mu=-\lambda_i$ from Step 1 is the optimal solution, that is, the following point (or points if we have $j \neq i$ with $\lambda_j=\lambda_i$):
$$\sum_{j \in [n]: \lambda_j=\lambda_i }z^2_j=1- \sum_{j \in [n]: \lambda_j \neq \lambda_i}\left(\frac{b_j}{\lambda_j-\lambda_i} \right )^2, \\ z_j= \frac{b_j}{\lambda_j-\lambda_i} ,j \in [n]: \lambda_j\neq\lambda_i.$$
Otherwise, the feasible KKT point obtained in Step 2 for the smallest root of the equation:
$$\sum_{j \in [n]: b_j \neq 0} \left (\frac{b_j}{\lambda_j+2\mu} \right )^2=1,$$
is the optimal solution.
This follows from the fact that the optimal solution of the problem continuously depends on $b$. To reach the above observation, for those $j$ with $b_j=0$, we can define $b_j=0+\epsilon $ and then use the point in Remark 3. Note that for $i$ with $b_i=\epsilon$ and $\lambda_i =\max_{j \in [n] }(\lambda_j)$, $2\mu_{\min} \to - \max_{j \in [n] }(\lambda_j)$ as $\epsilon \to 0^- .$
Remark 5: The problem is a non-convex optimization problem (known as convex maximization, or equivalently concave minimization), but it can be reformulated as a semi-definite programming model, which is a subclass of convex optimization, I have not provided it as the OP seeks a closed-form solution.