What is this optimization method called?


I am using a technique to do numerical optimization of a system but I don't know what it's called. I would like to be able to look it up in literature or books, but can't without knowing what I'm doing! Also, I'm not a mathematician so I apologise in advance for what is certainly non-standard notation...

I have an unknown function: $$f(x, \boldsymbol{p})$$

where $\boldsymbol{p} = (p_1, p_2, p_3, p_4, ..., p_n)$ is a vector of unknown parameters and $x$ is a spatial variable

I also have a known objective function $g(x)$.

I want to find the parameters $\boldsymbol{p}$ such that the integral square error is minimized:

$$\epsilon = \int_{x=x_{min}}^{x_{max}} \left(f(x, \boldsymbol{p}) - g(x) \right)^{2} dx$$

For example, if $g(x) = 1$, then I'm trying to find the parameters such that $f(x, \boldsymbol{p}) \approx 1$ in the region $x_{min} < x < x_{max}$.
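In practice I evaluate this integral on a grid. As a minimal sketch (the toy linear $f$ and the grid are my own assumptions, not the real FEA model), the discretized error looks like:

```python
import numpy as np

# Toy stand-ins for the expensive FEA model f(x, p) and the target g(x);
# both are assumptions chosen only to make the sketch runnable.
def f(x, p):
    return p[0] + p[1] * x

def g(x):
    return np.ones_like(x)

# Discretize the integral squared error as a Riemann sum on a uniform grid.
x = np.linspace(0.0, 1.0, 201)                  # grid over [x_min, x_max]
dx = x[1] - x[0]
residual = f(x, np.array([1.0, 0.0])) - g(x)
epsilon = np.sum(residual**2) * dx              # ≈ ∫ (f - g)^2 dx
```

With $p = (1, 0)$ this toy $f$ matches $g$ exactly, so the error is zero.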

$f(x, \boldsymbol{p})$ is very slow to compute for different parameters (it's an FEA model) but once solved, I have its value for all $x$.

At this point I could throw a black box optimizer at the problem and try to minimize $\epsilon$, but this generally takes many iterations to converge for even a modest number of parameters. I realized that this is throwing away a lot of spatial information. So instead I did the following:

Let's say I start with a guess for my parameters, $\boldsymbol{p_0}$.

I can compute my result at $m$ discrete $x$ points to get a vector:

$\boldsymbol{b_0} = (f(x_1,\boldsymbol{p_0}), f(x_2, \boldsymbol{p_0}), \ldots, f(x_m,\boldsymbol{p_0}))$

I then compute some vectors of numerical partial derivatives at the same x points, which I've been calling "basis vectors":

$$\boldsymbol{b_1} = \left(\frac{\partial f(x_1)}{\partial p_1}, \frac{\partial f(x_2)}{\partial p_1}, \ldots\right)$$ $$\boldsymbol{b_2} = \left(\frac{\partial f(x_1)}{\partial p_2}, \frac{\partial f(x_2)}{\partial p_2}, \ldots\right)$$ $$\boldsymbol{b_n} = \left(\frac{\partial f(x_1)}{\partial p_n}, \frac{\partial f(x_2)}{\partial p_n}, \ldots\right)$$

So that's one evaluation of $f$ for each parameter to find a numerical gradient at each point in space.
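The basis vectors above can be sketched in code. This is a hedged illustration, assuming forward differences with a fixed step $h$ and a toy $f$ that stands in for the FEA solve:

```python
import numpy as np

# Toy stand-in for f(x, p); in the question each call is an expensive FEA solve.
def f(x, p):
    return p[0] * np.sin(x) + p[1] * np.cos(x)

def basis_vectors(f, x, p, h=1e-6):
    """One forward-difference derivative vector b_k per parameter p_k."""
    b0 = f(x, p)                            # baseline solution at the current p
    B = np.empty((len(x), len(p)))
    for k in range(len(p)):
        p_step = p.copy()
        p_step[k] += h
        B[:, k] = (f(x, p_step) - b0) / h   # ∂f/∂p_k sampled at every x point
    return b0, B

x = np.linspace(0.0, np.pi, 50)
b0, B = basis_vectors(f, x, np.array([1.0, 2.0]))
```

Stacked as columns, these vectors form the (finite-difference) Jacobian of $f$ with respect to $\boldsymbol{p}$, which costs one extra evaluation of $f$ per parameter, as noted above.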

I can then find a linear combination of these basis vectors that minimizes the error:

$$\epsilon = \sum_{i} \left(b_{0,i} + \sum_{n} a_n \, b_{n,i} - g_i\right)^{2}$$

My next guess for each parameter is then $p_n' = p_n + a_n $, and we repeat until convergence. This is kind of like a really basic finite-difference Newton method, but vectorized in space.
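One full iteration of the scheme above can be sketched as follows. This is only an illustration under my own assumptions (a toy model linear in $\boldsymbol{p}$, a synthetic target, forward differences); the linear least-squares subproblem is solved with `numpy.linalg.lstsq`:

```python
import numpy as np

# Toy model and target (assumptions for the sketch, not the real FEA setup).
def f(x, p):
    return p[0] * np.sin(x) + p[1] * x**2

x = np.linspace(0.0, 1.0, 100)
g = 2.0 * np.sin(x) + 0.5 * x**2        # target generated with p = (2, 0.5)

p = np.array([0.0, 0.0])                # initial guess p_0
h = 1e-6                                # finite-difference step
for _ in range(10):
    b0 = f(x, p)
    # Forward-difference basis vectors, stacked as Jacobian columns.
    B = np.column_stack([(f(x, p + h * np.eye(len(p))[k]) - b0) / h
                         for k in range(len(p))])
    # Linear least squares for the coefficients a_n:
    # minimize || b0 + B a - g ||^2  over a.
    a, *_ = np.linalg.lstsq(B, g - b0, rcond=None)
    p = p + a                           # update: p_n' = p_n + a_n
    if np.linalg.norm(a) < 1e-8:        # stop when the correction is tiny
        break
```

Because this toy $f$ is exactly linear in its parameters, the loop recovers $p = (2, 0.5)$ essentially in one step; the real nonlinear case needs a few iterations.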

In this way, I have been able to converge in a few steps... much faster than black-box single-parameter optimization.

What am I doing here?

edited to add: I don't think this is a general-purpose solution. I think it only works because my unknown function can be roughly approximated by a linear superposition of functions like so:

$$f(x, \boldsymbol{p}) \approx p_1 \cdot f_1(x) + p_2 \cdot f_2(x) + \ldots + p_n \cdot f_n(x)$$