Is this property a theorem somewhere?

86 Views Asked by At

Apologies if the question is trivial.

Consider the following statement.

Let $f(\mathbf{\theta})$ be a convex, smooth and differentiable function of parameters $\mathbf{\theta}\in \Re^p$. So, $arg \min\limits_{\mathbf{\theta}}\ f(\mathbf{\theta}) = \mathbf{\theta}^*$ exists unique.

Assume that there exists a function $g(\mathbf{\theta}^{i}) = \mathbf{\theta}^{i + 1}$ such that $f({\mathbf{\theta}}^{i + 1}) <$ $f(\mathbf{\theta}^{i})$. Then, $$ \lim\limits_{i\rightarrow\infty} f(g({\mathbf{\theta}}^{i})) = f(\mathbf{\theta}^{*}). $$ This should imply that $\lim\limits_{i\rightarrow\infty} g(\mathbf{\theta}^{i}) = \mathbf{\theta}^{*}$ and should be true for any such function $g$ (assuming some stop rule when $g(\mathbf{\theta}^{i}) = \mathbf{\theta}^{i}$).

Assuming that the above is true, I wonder if it is stated as a theorem somewhere. So, two questions:

  1. is the above correct for any such function $g$? (I can think of several specific examples, e.g. steepest descent)
  2. if so, is there a reference to it?
1

There are 1 best solutions below

1
On BEST ANSWER

It does not follow from $f(\theta_{i+1}) < f(\theta_i)$ that $\theta_i \rightarrow \theta^*$; Joe gives a good counterexample in the comments. However, it does follow that if $f(\theta_i) \rightarrow f(\theta^*)$, and $f$ is convex with a unique minimizer, then $\theta_i \rightarrow \theta^*$.

Intuition. From the definition of convexity, it follows that as we move in a straight line away from $\theta^*$, the value of $f(\theta)$ must increase; and conversely, as we move in a straight-line toward $\theta^*$ the value of $f$ decreases, so that as long as we stay away from $\theta^*$, $f$ can't get close to $f(\theta^*)$. This works fine in one dimension, but in more dimensions, the rate at which $f(\theta^*)$ increases as we move away $\theta^*$ may depend on the chosen line we travel. To get around this, we draw a circle around $\theta^*$ and take the minimal value of $f$, since that's the case where it's hardest to prove what we want, and use the straight-line argument in that direction from $\theta^*$.

Proof. Suppose $f(\theta_i) \rightarrow f(\theta^*)$. We may assume WLOG that $f(\theta^*) = 0$. Let $\delta > 0$; we want to show $\theta_i$ is eventually within $\delta$ of $\theta^*$. Let $\epsilon$ be the min** of $f(\theta)$ on the (hyper)-sphere $|\theta-\theta^*| = \delta$. By the uniqueness of the minimizer, $\epsilon > 0$. So eventually we must have $f(\theta_i) < \epsilon$. Then note that

$$t = \delta / |\theta_i - \theta^*|$$

gives

$$\big|((1-t)\theta^* + t\theta_i) - \theta^*\big| = \delta$$

and so $f((1-t)\theta^* + t\theta_i) \geq \epsilon$. Then

$$\epsilon \leq f((1-t)\theta^* + t\theta_i) \leq (1-t)f(\theta^*) + t f(\theta_i) = t f(\theta_i) = \frac{\delta f(\theta_i)}{|\theta_i - \theta^*|}$$

and so

$$|\theta_i - \theta^*| < \delta f(\theta_i) / \epsilon < \delta$$ since $f(\theta_i) < \epsilon$.

**: Taking the min assumes the sphere is compact, so let's say we live in a finite-dimensional space $\mathbb{R}^n$.