How to find the minimum of the function?


How to find the minimum of the following function

$$ {\rm f}\left(w\right) = {1 \over 2}\sum_{i = 1}^{n}\left({1 \over 1 + {\rm e}^{-x_{i}\,w}} -y_{i}\right)^{2} $$

where $x_{i}, y_{i} \in \left(0, 1\right)$ are constants, $w\in \mathbb{R}$?

Could you find an analytic or computational way to get the minimum of the function?

5


4
On BEST ANSWER

I suppose that this is a nonlinear least-squares fitting problem in which you have data points [x(i), y(i)] and you want to adjust the parameter w. Setting the derivative of f(w) with respect to w to zero gives a single (not too complex) equation, which can easily be solved using Newton's method. The difficulty is to start from a reasonable value; you can get one by rewriting x(i) as a function of y(i). Taking logarithms, y(i) = 1 / (1 + Exp[-w x(i)]) becomes Log[y(i) / (1 - y(i))] = w x(i), so Log[y(i) / (1 - y(i))] plotted against x(i) lies along a straight line through the origin, and the slope of this line is w. So, you have everything you need to start.
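As a rough illustration of that starting guess, here is a minimal Python sketch. The sample data are the ten points quoted in another answer on this page; the helper names (`logit`, `initial_slope`) and the choice of a least-squares line through the origin are my own assumptions, not something this answer prescribes.

```python
import math

# Sample data quoted elsewhere on this page: x(i) = i / 10 with noisy y(i).
X = [i / 10 for i in range(1, 11)]
Y = [0.58, 0.65, 0.72, 0.78, 0.83, 0.87, 0.90, 0.92, 0.94, 0.96]


def logit(p):
    """Log[p / (1 - p)], defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def initial_slope(xs, ys):
    """Least-squares slope of logit(y) against x for a line through the origin.

    Assumption: the fit is forced through the origin because the model
    gives logit(y) = w * x with no intercept.
    """
    numerator = sum(x * logit(y) for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


w0 = initial_slope(X, Y)
print(f"starting value for Newton: w0 = {w0:.4f}")
```

The value of `w0` is only a starting point: the logit transform reweights the residuals, so it will generally differ a little from the true least-squares minimiser.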

4
On

If you are finding the minimum with respect to $w$, the domain of $w$ will affect what your minimum will be. So is $w\in \mathbb{R}$, or can $w$ only take a particular set of values?

8
On

Given your function $f(w)$, its derivative with respect to $w$ is given by $$\begin{align} \frac{df}{dw}&=\sum\limits_{i=1}^n\left(\frac{1}{1+\exp(-x_iw)}-y_i\right)\cdot\frac{x_i\exp(-x_iw)}{\left(1+\exp(-x_iw)\right)^2}\\ &=\sum\limits_{i=1}^n\frac{\left(1-y_i\left(1+\exp(-x_iw)\right)\right)\cdot x_i\exp(-x_iw)}{\left(1+\exp(-x_iw)\right)^3}, \end{align}$$ which is quite a complicated expression if you attempt to find its root analytically.
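To make the derivative concrete, the following Python sketch evaluates both forms of $df/dw$ and compares them with a central finite difference. The sample data are the ten points quoted in another answer on this page, and the function names (`f`, `df`, `df_factored`) are hypothetical, chosen only for this illustration.

```python
import math

# Sample data quoted elsewhere on this page (an assumption for this sketch).
X = [i / 10 for i in range(1, 11)]
Y = [0.58, 0.65, 0.72, 0.78, 0.83, 0.87, 0.90, 0.92, 0.94, 0.96]


def f(w):
    """Objective: half the sum of squared residuals of the logistic model."""
    return 0.5 * sum((1.0 / (1.0 + math.exp(-x * w)) - y) ** 2
                     for x, y in zip(X, Y))


def df(w):
    """First form of the derivative: residual times derivative of the sigmoid."""
    total = 0.0
    for x, y in zip(X, Y):
        e = math.exp(-x * w)
        total += (1.0 / (1.0 + e) - y) * x * e / (1.0 + e) ** 2
    return total


def df_factored(w):
    """Second form, with everything over the common denominator (1 + e)^3."""
    total = 0.0
    for x, y in zip(X, Y):
        e = math.exp(-x * w)
        total += (1.0 - y * (1.0 + e)) * x * e / (1.0 + e) ** 3
    return total


# Central finite difference as an independent check of the formulas.
w, h = 2.0, 1e-5
numeric = (f(w + h) - f(w - h)) / (2.0 * h)
print(df(w), df_factored(w), numeric)
```

The three printed numbers should agree to many digits, which is a cheap way to catch a sign or exponent slip before handing the derivative to a root finder.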

A number of computational methods exist for finding the minimum of $f(w)$. This is an unconstrained nonlinear optimisation problem, and a simple web search for solvers of that class reveals many possibilities.
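As one example of such a method, here is a minimal sketch of a golden-section search, a derivative-free minimiser for one-dimensional problems. It is my choice for illustration, not a method named in this answer. It assumes $f$ has a single minimum inside the bracket $[0, 10]$; the bracket, the tolerance and the sample data (taken from another answer on this page) are all assumptions.

```python
import math

# Sample data quoted elsewhere on this page (an assumption for this sketch).
X = [i / 10 for i in range(1, 11)]
Y = [0.58, 0.65, 0.72, 0.78, 0.83, 0.87, 0.90, 0.92, 0.94, 0.96]


def f(w):
    """Objective: half the sum of squared residuals of the logistic model."""
    return 0.5 * sum((1.0 / (1.0 + math.exp(-x * w)) - y) ** 2
                     for x, y in zip(X, Y))


def golden_section_min(func, lo, hi, tol=1e-8):
    """Shrink the bracket [lo, hi] around the minimum of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)


w_min = golden_section_min(f, 0.0, 10.0)
print(f"w = {w_min:.5f}, f(w) = {f(w_min):.3e}")
```

Each iteration costs one new function evaluation and needs no derivative, which makes this a reasonable fallback when the derivative is awkward to code; a library optimiser would normally be the more practical choice.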

9
On

To try to clarify things, I generated 10 values x(i) = i / 10 (i = 1, ..., 10); the corresponding values y(i) are [0.58, 0.65, 0.72, 0.78, 0.83, 0.87, 0.90, 0.92, 0.94, 0.96].

Based on these, I wrote the function f(w) as given in your post (a sum of 10 terms). When plotted as a function of w, f(w) exhibits (as is entirely normal) a parabola-like shape with a marked minimum around 3.15. To give you an idea: f(2.0)=0.0337713, f(2.5)=0.00852217, f(3.0)=0.000357495, f(3.5)=0.00181296, f(4.0)=0.00843496. The absolute minimum corresponds to w=3.13938, for which f(w)=0.000031942, and this is the solution.

Using the second approach, I wrote the function f'(w). Plotted against w, this function has a very nice shape and crosses zero at w=3.13938, which is the same solution.

You must remember that solving an equation is much simpler than minimizing a function (at least when there are no constraints). For illustration purposes, to solve f'(w)=0, I used Newton's method starting at w=2.0 (a value which is far away from the starting guess I suggested you generate from the logarithmic transformation). The successive iterates for w are 2.45293, 2.75188, 2.93280, 3.03609, 3.09582, 3.13997.
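For readers who want to try this, here is a minimal Python sketch of Newton's method applied to f'(w) = 0 on the data above, starting at w = 2.0. The analytic second derivative, the stopping rule and the function names are my own assumptions; the answer does not say how its derivatives were evaluated, so the intermediate iterates printed here need not match the ones quoted.

```python
import math

# Data from this answer: x(i) = i / 10 and the quoted y(i).
X = [i / 10 for i in range(1, 11)]
Y = [0.58, 0.65, 0.72, 0.78, 0.83, 0.87, 0.90, 0.92, 0.94, 0.96]


def derivatives(w):
    """Return (f'(w), f''(w)) for f(w) = 1/2 * sum (s(x w) - y)^2."""
    g = 0.0  # first derivative
    h = 0.0  # second derivative
    for x, y in zip(X, Y):
        s = 1.0 / (1.0 + math.exp(-x * w))                # sigmoid value
        ds = x * s * (1.0 - s)                            # ds/dw
        d2s = x * x * s * (1.0 - s) * (1.0 - 2.0 * s)     # d2s/dw2
        g += (s - y) * ds
        h += ds * ds + (s - y) * d2s
    return g, h


def newton(w, tol=1e-10, max_iter=50):
    """Plain Newton iteration on f'(w) = 0; returns the root and all iterates."""
    iterates = [w]
    for _ in range(max_iter):
        g, h = derivatives(w)
        step = g / h
        w -= step
        iterates.append(w)
        if abs(step) < tol:
            break
    return w, iterates


w_star, path = newton(2.0)
print("iterates:", [round(v, 5) for v in path])
print("root of f':", round(w_star, 5))
```

No safeguard such as a line search is included, so this sketch relies on f''(w) staying positive along the path, which holds for this data set when starting from w = 2.0.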

3
On

From a numerical point of view, your problem looks quite interesting and I played with it, my concern being to get a quick estimate of the parameter "w" which has to be adjusted to your data.

What I did was to start a Newton procedure at w=0. A single Newton step from there gives a rough estimate of "w" in closed form:
w = 2 (2 S3 - S1) / S2
in which S1 is the sum of the x(i), S2 is the sum of the x(i)^2 and S3 is the sum of the x(i) * y(i).

For exact values of w = 1, 2 and 3, the corresponding estimates are 0.95, 1.66 and 2.11. The corresponding "experimental" data were x(i) = i / 10, and the y(i) were calculated using the exact formula.

In the case I previously illustrated (with noise in the y data), this procedure would lead to an estimate equal to 2.16 for an exact value of 3.14.
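The estimate is easy to check numerically. The short Python sketch below evaluates the formula on the noisy data from my earlier answer and on noise-free data generated with x(i) = i / 10; the function names are hypothetical and only serve this illustration.

```python
import math


def closed_form_estimate(xs, ys):
    """Rough estimate w = 2 (2 S3 - S1) / S2 from the sums defined above."""
    s1 = sum(xs)                              # S1: sum of x(i)
    s2 = sum(x * x for x in xs)               # S2: sum of x(i)^2
    s3 = sum(x * y for x, y in zip(xs, ys))   # S3: sum of x(i) * y(i)
    return 2.0 * (2.0 * s3 - s1) / s2


def exact_y(xs, w):
    """Noise-free y(i) from the model 1 / (1 + exp(-x(i) w))."""
    return [1.0 / (1.0 + math.exp(-x * w)) for x in xs]


X = [i / 10 for i in range(1, 11)]
Y_NOISY = [0.58, 0.65, 0.72, 0.78, 0.83, 0.87, 0.90, 0.92, 0.94, 0.96]

estimate_noisy = closed_form_estimate(X, Y_NOISY)
print("noisy data:", round(estimate_noisy, 2))
for w_true in (1.0, 2.0, 3.0):
    estimate = closed_form_estimate(X, exact_y(X, w_true))
    print("exact w =", w_true, "estimate =", round(estimate, 2))
```

The estimate gets worse as the true w grows, as the figures above show, so it is best treated as a cheap starting value for Newton's method rather than as an answer in itself.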