First of all, I am not sure whether the following problem is convex.
I am working on an optimization problem arising in compressive sensing:
$$\arg\min_{A,x} \quad \lVert Ax-y \rVert _{2}^{2}+\lambda _1\lVert A^{\top}Ax-x \rVert _{2}^{2}+\lambda _2\lVert x \rVert _1$$
where $A$ is an $m\times n$ $(m\ll n)$ matrix, $x$ is an $n$-dimensional vector, and $y$, $\lambda_1$, $\lambda_2$ are given. In particular, $A$ is a variable, not a constant matrix.
Can we solve this problem with an iterative method like ISTA?
The objective is not jointly convex in $A$ and $\mathbf{x}$, but for a fixed $A$ it is convex in $\mathbf{x}$, and for that subproblem you can definitely use a proximal gradient method such as ISTA. Recall that the general framework is $$\min_\mathbf{x} f(\mathbf{x}) + g(\mathbf{x})$$ where $f(\mathbf{x})$ is differentiable and $g(\mathbf{x})$ is convex. There are two natural ways to split your objective:

1. $f(\mathbf{x}) = \|A\mathbf{x}-\mathbf{y}\|_2^2 + \lambda_1\|A^\top A\mathbf{x}-\mathbf{x}\|_2^2$ and $g(\mathbf{x}) = \lambda_2\|\mathbf{x}\|_1$;
2. $f(\mathbf{x}) = \|A\mathbf{x}-\mathbf{y}\|_2^2$ and $g(\mathbf{x}) = \lambda_1\|A^\top A\mathbf{x}-\mathbf{x}\|_2^2 + \lambda_2\|\mathbf{x}\|_1$.
Notice that $\|A^\top A\mathbf{x}-\mathbf{x}\|^2_2=\|(A^\top A-I)\mathbf{x}\|^2_2$, which makes the gradient easier to compute. If you take the second approach, deriving a proximal operator for $g(\cdot)$ is also not difficult: under certain conditions, which hold in this case, $\text{prox}_{g_1+g_2}=\text{prox}_{g_1}\circ\text{prox}_{g_2}$. The proximal operator of the $\ell_1$ norm is the well-known soft-thresholding operator, and an explicit prox formula for the second term of $g(\cdot)$ is not hard to obtain either; then all you have to do is compose them.
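As a concrete illustration, here is a minimal ISTA sketch for the first split ($f$ smooth, $g = \lambda_2\|\mathbf{x}\|_1$), with $A$ held fixed; an outer loop would have to alternate updates of $A$ and $\mathbf{x}$. All function names and parameter values below are illustrative, not from the question:

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, y, lam1, lam2, n_iter=500):
    """Minimize ||Ax-y||^2 + lam1*||(A^T A - I)x||^2 + lam2*||x||_1 over x."""
    m, n = A.shape
    M = A.T @ A - np.eye(n)           # so ||A^T A x - x||^2 = ||M x||^2
    x = np.zeros(n)
    # A valid Lipschitz constant for grad f: 2||A||_2^2 + 2*lam1*||M||_2^2
    L = 2 * np.linalg.norm(A, 2) ** 2 + 2 * lam1 * np.linalg.norm(M, 2) ** 2
    t = 1.0 / L                       # step size
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ x - y) + 2 * lam1 * M.T @ (M @ x)
        x = soft_threshold(x - t * grad, t * lam2)
    return x
```

With the step size $t = 1/L$, each iteration is guaranteed not to increase the objective, so this is a safe (if slow) baseline.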
Lastly, I would recommend using an accelerated proximal gradient method such as FISTA, which converges considerably faster than plain ISTA.