Given input and output values of a function with unknown coefficients, find the optimal coefficients


Given the function $y = \frac{k_1x_1 + k_2x_2 + k_3x_4}{k_1 + k_2(x_2+x_3) + k_3}$ and many tuples $(x_1, x_2, x_3, x_4)$ with corresponding outputs $y$, how can one find the optimal values of $k_1, k_2, k_3$?

Can anyone help me with it? Thanks.

Constraints: $k_1 + k_2 + k_3 = 1$ and $0 < k_1 < 1$, $0 < k_2 < 1$, $0 < k_3 < 1$.



BEST ANSWER

I have changed my first answer to take the revised wording of the question into account.

With the constraint $$k_1+k_2+k_3=1 \quad\to\quad k_3=1-k_1-k_2$$

The data is : $(x_{1,j}\:,\:x_{2,j}\:,\:x_{3,j}\:,\:x_{4,j}\:;\:y_{j})$ from $j=1$ to $j=n$. $$y_j \simeq \frac{ k_1x_{1,j} + k_2x_{2,j} +(1-k_1- k_2)x_{4,j} }{ k_1 + k_2(x_{2,j}+x_{3,j}) +(1-k_1- k_2) }$$

$$\left( k_1 + k_2(x_{2,j}+x_{3,j}) +(1-k_1- k_2) \right)y_j \:\simeq\; k_1x_{1,j} + k_2x_{2,j} +(1-k_1- k_2)x_{4,j}$$

$$ y_j-x_{4,j} \simeq k_1(x_{1,j}-x_{4,j}) + k_2\left(x_{2,j} - x_{4,j} +(1-x_{2,j}-x_{3,j})y_j \right) $$

Transform the initial data to the new data $(X_{1,j}\:,\:X_{2,j}\:;\:Y_j)$ $$\begin{cases} Y_j=y_j-x_{4,j}\\ X_{1,j}=x_{1,j}-x_{4,j}\\ X_{2,j}=x_{2,j} - x_{4,j} +(1-x_{2,j}-x_{3,j})y_j \end{cases}$$ $$Y_j\simeq k_1X_{1,j}+k_2X_{2,j}$$ An ordinary linear regression then directly gives optimized values of $k_1$ and $k_2$, and hence $k_3=1-k_1-k_2$.
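For illustration, here is a minimal NumPy sketch of this linearized fit; the function name `fit_k_linear` and the array layout are my own choices, not part of the question:

```python
import numpy as np

def fit_k_linear(x1, x2, x3, x4, y):
    """Linearized fit: regress Y on (X1, X2) as derived above, then
    recover k3 from the constraint k1 + k2 + k3 = 1.
    All arguments are 1-D arrays of equal length (one entry per data point)."""
    Y = y - x4
    X1 = x1 - x4
    X2 = x2 - x4 + (1.0 - x2 - x3) * y
    A = np.column_stack([X1, X2])              # design matrix, shape (n, 2)
    (k1, k2), *_ = np.linalg.lstsq(A, Y, rcond=None)
    return k1, k2, 1.0 - k1 - k2
```

With exact (noise-free) data the transformed system holds exactly, so the regression recovers the true coefficients up to rounding.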

This very simple method is valid because no fitting criterion is specified in the wording of the question.

If a particular fitting criterion were specified, the problem would be more difficult. One would have to use a nonlinear regression method (for example of Levenberg-Marquardt kind), involving an iterative process starting from guessed values of the parameters. Such a guess is not always easy. The above method is useful to compute initial values of the parameters instead of guessed values.
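As a sketch of that refinement step, assuming SciPy is available: the hypothetical helper below minimizes the squared residuals of the original (nonlinear) model, eliminating $k_3$ through the constraint, and can be started from the linearized estimate rather than a blind guess.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_k_nonlinear(x1, x2, x3, x4, y, k0=(1/3, 1/3)):
    """Nonlinear least squares on the original model.
    k0 = initial guess for (k1, k2); k3 = 1 - k1 - k2 by the constraint."""
    def residuals(k):
        k1, k2 = k
        k3 = 1.0 - k1 - k2
        num = k1 * x1 + k2 * x2 + k3 * x4
        den = k1 + k2 * (x2 + x3) + k3
        return y - num / den
    # Box bounds 0 < k1, k2 < 1; the simplex condition k1 + k2 < 1
    # is NOT enforced here and should be checked on the result.
    sol = least_squares(residuals, x0=np.asarray(k0), bounds=([0, 0], [1, 1]))
    k1, k2 = sol.x
    return k1, k2, 1.0 - k1 - k2
```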


This is a typical parameter-fitting problem that can be solved with optimization methods. Let's denote your relationship as $y = f(\vec{k},\vec{x})$ and your data as $(y_n, \vec{x}_n)$. For some hypothesis $\vec{k}_\text{H}$, express the squared error as a cost function $$C(\vec{k}_\text{H}) = \sum_{n=1}^N (y_n - f(\vec{k}_\text{H},\vec{x}_n))^2$$ and find your true coefficients $\vec{k}$ by cost-function minimization $$\vec{k} \in \operatorname{argmin} \ C(\vec{k}_\text{H}),$$ which can be tackled with gradient-based solvers such as Levenberg-Marquardt or trust-region methods. You'll encounter the typical problems with local minima of the cost function.
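A minimal sketch of this cost-minimization approach, assuming SciPy; the names `cost` and `fit_k` are mine. Note that scaling all three $k$-values by a common factor leaves $f$ unchanged, so the minimizer is only determined up to scale:

```python
import numpy as np
from scipy.optimize import minimize

def cost(kH, x, y):
    """Squared-error cost C(k_H) summed over all N data points.
    x has shape (4, N) with rows x1, x2, x3, x4; kH = (k1, k2, k3)."""
    k1, k2, k3 = kH
    x1, x2, x3, x4 = x
    f = (k1 * x1 + k2 * x2 + k3 * x4) / (k1 + k2 * (x2 + x3) + k3)
    return np.sum((y - f) ** 2)

def fit_k(x, y, k0=(1/3, 1/3, 1/3)):
    """Gradient-based minimization from a guessed starting point."""
    res = minimize(cost, x0=np.asarray(k0), args=(x, y),
                   method="L-BFGS-B", bounds=[(1e-6, 1.0)] * 3)
    return res.x
```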

EDIT

Aniruddha Deshmukh is right, this can be transformed into a linear problem. Each data point gives you an equation $$(x_{1,n} - y_n) k_1 + (x_{2,n} - y_n x_{2,n} - y_n x_{3,n}) k_2 + (x_{4,n}-y_n ) k_3 = 0$$ for $n = 1 \ldots N$, so you get an overdetermined linear system of $N$ equations in three variables and can compute the least-squares solution (e.g., with the pseudo-inverse). The catch is that this is a homogeneous system, whose least-squares solution is the all-zero vector. However, note that because of the fraction you can apply any common scaling to your $k$-values without changing the function. Thus you can fix one of the coefficients, e.g., set $k_3 = 1$. This way you obtain an inhomogeneous system of equations $$(x_{1,n} - y_n) k_1 + (x_{2,n} - y_n x_{2,n} - y_n x_{3,n}) k_2 = y_n -x_{4,n}$$ and non-vanishing $k_1, k_2$.
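A short NumPy sketch of this fixed-$k_3$ system (the function name is my own); after solving, the result can be rescaled so the coefficients sum to one, which the scaling invariance permits:

```python
import numpy as np

def fit_k_fixed_k3(x1, x2, x3, x4, y):
    """Least-squares solution of the inhomogeneous system with k3 := 1,
    then rescaled onto k1 + k2 + k3 = 1 (allowed: f is scale-invariant in k)."""
    A = np.column_stack([x1 - y, x2 - y * x2 - y * x3])   # N x 2 system
    b = y - x4
    (k1, k2), *_ = np.linalg.lstsq(A, b, rcond=None)
    k = np.array([k1, k2, 1.0])
    return k / k.sum()     # rescale onto the simplex
```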