I am confused about the issue of convexity in least squares estimation problems. Equivalent formulations of the problem seem to give rise to different convexity results.
As an illustration, consider a linear least squares problem with a positivity constraint:
$$\hat{\alpha} = \arg\min_{\alpha >0} F(\alpha) \qquad F(\alpha)=\sum^N_{n=1} (\alpha y_n - g_n)^2$$
where $y_n$ and $g_n$ are known. $F(\alpha)$ is convex in $\alpha$; this can be verified by noting that the second derivative $\frac{\partial ^2 F}{\partial \alpha^2}=2\sum^N_{n=1}y_n^2$ is non-negative.
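As a quick numerical sanity check, here is a sketch with arbitrary made-up data (the values of `y`, `g`, and the seed are illustrative assumptions, not part of the problem): the second derivative is a non-negative constant, and a midpoint-convexity test passes.

```python
import numpy as np

# Illustrative made-up data (any real values would do)
rng = np.random.default_rng(0)
y = rng.normal(size=50)
g = rng.normal(size=50)

def F(a):
    """Least-squares cost F(alpha) = sum_n (alpha*y_n - g_n)^2."""
    return np.sum((a * y - g) ** 2)

# Second derivative is the constant 2*sum(y_n^2), hence non-negative
second_deriv = 2 * np.sum(y ** 2)

# Midpoint convexity: F((a1+a2)/2) <= (F(a1)+F(a2))/2
a1, a2 = 0.3, 2.5
midpoint_ok = F(0.5 * (a1 + a2)) <= 0.5 * (F(a1) + F(a2)) + 1e-12

print(second_deriv >= 0, midpoint_ok)
```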
Now, consider an equivalent non-linear parameterization ($\tilde\alpha=1/\alpha$) of the same problem:
$$\hat{\tilde\alpha} = \arg\min_{\tilde\alpha>0} \tilde{F}(\tilde{\alpha}) \qquad \tilde{F}(\tilde\alpha)=\sum^N_{n=1} (y_n/\tilde{\alpha} - g_n)^2$$ with $\hat{\tilde\alpha}=1/\hat{\alpha}$. The first derivative of $ \tilde{F}(\tilde{\alpha})$ is:
$$\frac{\partial \tilde{F}}{\partial \tilde{\alpha}}=-2\sum^N_{n=1} y_n(y_n/\tilde{\alpha} - g_n)/\tilde{\alpha}^2. $$ Differentiating once again yields:
$$\frac{\partial^2 \tilde{F}}{\partial \tilde{\alpha}^2}=-2\sum^N_{n=1} \frac{\partial}{\partial \tilde{\alpha}}\left[y_n(y_n/\tilde{\alpha} - g_n)/\tilde{\alpha}^2\right] = \sum^N_{n=1} \left(6 y_n^2/\tilde{\alpha}^4 - 4 y_n g_n/\tilde{\alpha}^3\right)$$
which, in general, may be positive or negative, implying that this new (but equivalent) optimization problem is non-convex.
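To see the sign change concretely, take the special case $y_n = g_n = 1$ (an illustrative choice, not part of the original problem), for which each term of the second derivative reduces to $(6 - 4\tilde\alpha)/\tilde\alpha^4$: positive for $\tilde\alpha < 3/2$, negative beyond.

```python
def d2F_tilde(a, N=1):
    """Second derivative of F~ in the illustrative special case y_n = g_n = 1:
    sum_n (6/a^4 - 4/a^3) = N * (6 - 4a) / a^4."""
    return N * (6.0 - 4.0 * a) / a ** 4

convex_region = d2F_tilde(1.0)   # (6 - 4)/1 = 2.0 > 0: locally convex
concave_region = d2F_tilde(3.0)  # (6 - 12)/81 < 0: locally concave
print(convex_region, concave_region)
```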
Convexity requires:
$$\tilde{F}(c\tilde{\alpha}_1 + (1-c)\tilde{\alpha}_2) \le c\tilde{F}(\tilde{\alpha}_1) + (1-c)\tilde{F}(\tilde{\alpha}_2)\qquad\forall c\in[0,1],~~\forall \tilde{\alpha}_1,~\tilde{\alpha}_2>0$$
Next, consider a special case where:
$$y_n = g_n = 1,~~c=0.5,~~\tilde{\alpha}_1=2,~~\tilde{\alpha}_2=4.$$
Thus:
$$\tilde{F}(c\tilde{\alpha}_1 + (1-c)\tilde{\alpha}_2) =\tilde{F}(3)=N(1/3-1)^2=\frac{4N}{9}$$
$$c\tilde{F}(\tilde{\alpha}_1) + (1-c)\tilde{F}(\tilde{\alpha}_2) =0.5\tilde{F}(2) + 0.5\tilde{F}(4)=0.5N(1/4+9/16)=\frac{13N}{32}$$
Since $4/9 \approx 0.444 > 13/32 \approx 0.406$, this is a case where $\tilde{F}(c\tilde{\alpha}_1 + (1-c)\tilde{\alpha}_2) > c\tilde{F}(\tilde{\alpha}_1) + (1-c)\tilde{F}(\tilde{\alpha}_2)$, which implies that the cost function $\tilde{F}(\tilde{\alpha})$ is not convex.
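The counterexample can also be checked numerically. With $y_n = g_n = 1$ (the illustrative choice above), $\tilde{F}(\tilde\alpha) = N(1/\tilde\alpha - 1)^2$:

```python
def F_tilde(a, N=1):
    """F~ in the illustrative special case y_n = g_n = 1."""
    return N * (1.0 / a - 1.0) ** 2

lhs = F_tilde(0.5 * 2 + 0.5 * 4)           # F~(3) = 4/9
rhs = 0.5 * F_tilde(2) + 0.5 * F_tilde(4)  # 1/8 + 9/32 = 13/32
print(lhs > rhs)  # midpoint convexity is violated
```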
How is it that these two seemingly equivalent problem formulations have different convexity properties? And, if I haven't made errors above, does the lack of convexity of $\tilde{F}(\tilde{\alpha})$ mean I should be concerned about local minima? I don't think that happens in this simple example.