How to solve equations involving linear combinations of sigmoid functions?


Let $\sigma(x)=\frac{1}{1+e^{-x}}$ be the sigmoid function.

How can one solve a system of equations of this kind?
\begin{align*}
\sigma(x+y)+\sigma(x-y)&=a\\
\sigma(2x+y)+3\sigma(3x-y)&=b
\end{align*}

I guess equations of this kind are related to neural networks. I don't know how to solve them. Thank you very much.


There are 2 best solutions below

Best answer

If $e^{-x} = s$ and $e^{-y} = t$, the system becomes
$$ \begin{aligned}
\frac{st^2+s+2t}{(st+1)(t+s)} &= a\\
\frac{3s^2t^2+s^3+4t}{(s^2t+1)(s^3+t)} &= b
\end{aligned} $$
Multiply by the denominators and you have a system of polynomial equations. Take the resultant with respect to $t$, disregard factors of $s$ and $s-1$, and you get a rather awful irreducible polynomial in one variable $s$ of degree $9$:
$$ \begin{aligned}
&\left(a^2b^2-2ab^2+b^2\right)s^9\\
&+\left(a^2b-ab^2-ab+b^2\right)s^8\\
&+\left(-a^2b^2+4a^2b-4ab+b^2\right)s^7\\
&+\left(-a^2b^2+a^2b+3ab^2+3a^2-7ab-b^2+2b\right)s^6\\
&+\left(-a^2b^2+2a^2b+4ab^2+4a^2-12ab-2b^2-2a+6b+1\right)s^5\\
&+\left(a^2b^2-6a^2b+4a^2+4ab-2b^2-2a+6b-5\right)s^4\\
&+\left(a^2b^2-7a^2b-ab^2+9a^2+11ab-b^2-16a+4\right)s^3\\
&+\left(a^2b^2-4a^2b-4ab^2+20ab+3b^2-16a-16b+16\right)s^2\\
&+\left(a^2b-ab^2-4a^2+5ab+b^2-4a-6b+8\right)s\\
&-a^2b^2+8a^2b+2ab^2-16a^2-16ab-b^2+32a+8b-16=0
\end{aligned} $$
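To see where the two rational expressions at the start of this answer come from, note that $e^{-(x+y)} = st$ and $e^{-(x-y)} = s/t$, and similarly for the second equation, so each sigmoid term is a rational function of $s$ and $t$:
$$ \begin{aligned}
\sigma(x+y) &= \frac{1}{1+st}, & \sigma(x-y) &= \frac{1}{1+s/t} = \frac{t}{t+s},\\
\sigma(2x+y) &= \frac{1}{1+s^2t}, & 3\,\sigma(3x-y) &= \frac{3}{1+s^3/t} = \frac{3t}{s^3+t}.
\end{aligned} $$
Adding each pair over a common denominator gives the numerators $st^2+s+2t$ and $3s^2t^2+s^3+4t$.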

So you're not going to get a nice "closed form" solution, although this still might be useful in some cases. Numerical methods are probably the way to go.

EDIT: Given particular values of $a$ and $b$, things might not be too bad. For example, let's try $a = 5/4$, $b = 31/10$. After multiplying through by $1600$ to clear denominators, the polynomial in $s$ is $$961\,s^9-2294\,s^8-2449\,s^7+29\,s^6+563\,s^5-667\,s^4+279\,s^3+513\,s^2-54\,s-81 $$ This has three real roots (as can be confirmed using Sturm's theorem), which have to be found numerically. In this case the values are approximately $s = 0.4709506665$, $0.5829071971$, $3.174092780$. The corresponding $t$ values are approximately $0.2667682226$, $0.6205142034$, $-0.06730377237$. We're not interested in the third solution since we need $s,t>0$. The first two give $(x,y) = (-\log(s), -\log(t)) = (0.7530019325, 1.321375078)$ and $(0.5397272869, 0.4772067843)$ respectively.
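The numbers in this example can be reproduced with a few lines of standard-library Python. The sketch below is only one way to do it: it refines the three real roots by plain bisection and then plugs the reported $(x,y)$ pairs back into the original system. The brackets in `BRACKETS` were picked by hand for this particular polynomial and are an assumption of the sketch, as are all the function names.

```python
import math

# Coefficients of the degree-9 polynomial for a = 5/4, b = 31/10,
# highest power first, copied from the example above.
COEFFS = [961, -2294, -2449, 29, 563, -667, 279, 513, -54, -81]

def p(s):
    """Evaluate the polynomial at s with Horner's rule."""
    acc = 0.0
    for c in COEFFS:
        acc = acc * s + c
    return acc

def bisect(f, lo, hi, tol=1e-13):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    flo = f(lo)
    assert flo * f(hi) < 0, "bracket must contain a sign change"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid            # the root lies in [lo, mid]
        else:
            lo, flo = mid, fmid  # the root lies in [mid, hi]
    return 0.5 * (lo + hi)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def residuals(x, y, a=1.25, b=3.1):
    """Left-hand side minus right-hand side of the two original equations."""
    r1 = sigmoid(x + y) + sigmoid(x - y) - a
    r2 = sigmoid(2 * x + y) + 3 * sigmoid(3 * x - y) - b
    return r1, r2

# Hand-picked brackets, one per real root (an assumption of this sketch).
BRACKETS = [(0.4, 0.5), (0.5, 0.7), (3.0, 4.0)]
roots = [bisect(p, lo, hi) for lo, hi in BRACKETS]

# The two admissible (x, y) pairs reported above.
reported = [(0.7530019325, 1.321375078), (0.5397272869, 0.4772067843)]

for s in roots:
    print("root s =", s, " x = -log(s) =", -math.log(s))
for x, y in reported:
    print("(x, y) =", (x, y), " residuals =", residuals(x, y))
```

Bisection is slow but cannot jump out of its bracket, which is convenient here because the two small roots are fairly close together.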

Another answer

If $z$ is small, you can write $$\sigma(z)=\frac{1}{1+e^{-z}}=\frac{1}{2}+\frac{z}{4}+O\left(z^3\right)$$ Applying this to every term makes the system linear, namely $1+\frac{x}{2}=a$ and $2+\frac{11x-2y}{4}=b$, and its solution is $$x=2 (a-1)$$ $$y=11 a-2 b-7$$ If the arguments are not small, Robert Israel gave the rigorous answer.
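Even when the arguments are not especially small, the linear estimate can serve as a starting point for an iterative solver. Here is a minimal sketch, assuming Newton's method with the analytic Jacobian (using $\sigma'(z)=\sigma(z)(1-\sigma(z))$) and the example values $a=5/4$, $b=31/10$ from the first answer. The names `linear_guess` and `newton` are made up for illustration, and convergence is not guaranteed for arbitrary $a$, $b$ or starting points.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dsigmoid(z):
    """Derivative of the sigmoid: sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def linear_guess(a, b):
    """Solution of the linearised system sigma(z) ~ 1/2 + z/4."""
    return 2.0 * (a - 1.0), 11.0 * a - 2.0 * b - 7.0

def newton(a, b, x, y, tol=1e-12, max_iter=50):
    """Newton iteration on F(x, y) = 0 for the two-equation system."""
    for _ in range(max_iter):
        f1 = sigmoid(x + y) + sigmoid(x - y) - a
        f2 = sigmoid(2 * x + y) + 3 * sigmoid(3 * x - y) - b
        if max(abs(f1), abs(f2)) < tol:
            return x, y
        # Jacobian entries dF_i/dx and dF_i/dy.
        j11 = dsigmoid(x + y) + dsigmoid(x - y)
        j12 = dsigmoid(x + y) - dsigmoid(x - y)
        j21 = 2 * dsigmoid(2 * x + y) + 9 * dsigmoid(3 * x - y)
        j22 = dsigmoid(2 * x + y) - 3 * dsigmoid(3 * x - y)
        det = j11 * j22 - j12 * j21
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * (dx, dy) = -(f1, f2) by Cramer's rule.
        dx = (-f1 * j22 + f2 * j12) / det
        dy = (-j11 * f2 + j21 * f1) / det
        x, y = x + dx, y + dy
    raise RuntimeError("Newton iteration did not converge")

a, b = 1.25, 3.1
x0, y0 = linear_guess(a, b)   # starting point from the linear estimate
x, y = newton(a, b, x0, y0)
print("start:", (x0, y0))
print("refined:", (x, y))
```

For these values the iteration should settle near the second solution reported in the first answer, $(x,y)\approx(0.5397, 0.4772)$. Which solution Newton's method finds depends on the starting point, so the other one, near $(0.7530, 1.3214)$, needs a different initial guess.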