Practical limit on optimization tending to negative infinity


Consider $f:\mathbb R^n\to\mathbb R$, $$f(\mathbf x):=\sum_i{e^{x_i}}.$$ Each component of the gradient is $\partial f/\partial x_i = e^{x_i}$, so $\nabla f(\mathbf x) = (e^{x_1}, \dots, e^{x_n})$. Notably, when the goal is to minimize $f(\mathbf x)$, the gradient magnitude decays exponentially as the coordinates decrease. The infimum is never attained at any finite point: the theoretical minimizer tends to $-\infty$ in every coordinate.
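To make the setup concrete, here is a minimal sanity check (the test point `x` is an arbitrary choice) comparing the analytic gradient $e^{x_i}$ against central finite differences:

```python
import math

def f(x):
    # f(x) = sum_i exp(x_i)
    return sum(math.exp(xi) for xi in x)

def grad_f(x):
    # analytic gradient: the i-th component is exp(x_i)
    return [math.exp(xi) for xi in x]

def num_grad(x, h=1e-6):
    # central finite differences; h is an arbitrary small step
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

x = [0.5, -1.0, 2.0]        # arbitrary test point
analytic = grad_f(x)
numeric = num_grad(x)
```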

It would appear that for functions whose minimizers tend to $-\infty$, ignoring other factors such as the stopping criterion, learning rate, and time budget, where $\mathbf x$ eventually ends up is limited only by hardware precision, i.e., a very negative but finite value, not $-\infty$.
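One concrete hardware limit, assuming IEEE-754 double precision: `exp(x)` underflows to exactly `0.0` once $x$ drops below roughly $-745$ (the end of the subnormal range, near $5\times10^{-324}$), at which point a computed gradient step is exactly zero and the iterate cannot move at all:

```python
import math

# exp is still representable deep in the subnormal range...
still_positive = math.exp(-700)      # ~9.86e-305
# ...but underflows to exactly zero a bit past x = -745
underflowed = math.exp(-750)
smallest_subnormal = math.ulp(0.0)   # 5e-324, the smallest positive double
```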

Additionally, since $\nabla_{\mathbf x}f(\mathbf x)$ decreases exponentially, can such a practical "convergence" be expected to happen quickly in practice (relative to the distance to $-\infty$), subject to the learning rate and the stopping criterion, given that the gradient vanishes?
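A quick scalar experiment suggests the opposite of "quickly": with plain gradient descent at a fixed learning rate (here $\eta = 0.1$ and $x_0 = 0$, both arbitrary choices), the continuous-time limit $x'(t) = -\eta e^{x}$ solves to $x(t) = -\log(e^{-x_0} + \eta t)$, so progress toward $-\infty$ is only logarithmic in the number of steps:

```python
import math

eta = 0.1            # arbitrary fixed learning rate
x = 0.0              # arbitrary starting point
checkpoints = {}
for t in range(1, 1_000_001):
    x -= eta * math.exp(x)          # gradient of exp(x) is exp(x)
    if t in (1_000, 1_000_000):
        checkpoints[t] = x

# After 10^6 steps x is only around -log(1 + eta * 10^6) ~ -11.5:
# a 1000x increase in steps bought roughly 7 more units of progress.
```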

I suspect this might be a trivial question, but I couldn't find a definite answer. Is there anything else to consider in such cases?

Also, the properties of $f(\mathbf x)$ at this practical finite value might differ significantly from its theoretical behavior in the limit. In such cases, experiments might not give the expected results. So would it make sense to study $f(\mathbf x)$ as $\mathbf x\to \mathbf k$ for some very negative finite number $k$, with $\mathbf k= [k, ..., k]$?
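One observation that may bear on this: for this particular $f$, a uniform shift only rescales the function, since $f(\mathbf x + c\mathbf 1) = e^{c} f(\mathbf x)$. So the landscape near $\mathbf k$ is an exponentially shrunken copy of the landscape near the origin. A small check (the shift $c = -30$ and test point are arbitrary):

```python
import math

def f(x):
    # f(x) = sum_i exp(x_i)
    return sum(math.exp(xi) for xi in x)

x = [0.3, -0.7, 1.1]     # arbitrary test point
c = -30.0                # arbitrary uniform shift toward the "flat" region

shifted = f([xi + c for xi in x])   # f evaluated near k = -30
scaled = math.exp(c) * f(x)         # exponentially rescaled copy of f near 0
```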