Sigmoid function with a longer, straighter middle


Can I modify the following function $y = 1 / (1 + e^{-x})$ such that I can make the curve straighter in the middle?

I would like the function to resemble a straight line until it approaches $0$ or $1$.

Here is an example diagram; the red line is the desired curve:

example diagram


0
On BEST ANSWER

An Answer: There is no Answer

We could try to generalize the expression a bit to get a "straighter" curve. For example, the expression can be generalized to $\newcommand{\eqd}{\triangleq}$ $\newcommand{\brp}[1]{{\left(#1\right)}}$ $\newcommand{\brs}[1]{{\left[#1\right]}}$ $\newcommand{\brl}[1]{{\left.#1\right|}}$ $\newcommand{\deriv} [2] {{\frac{\mathrm{d}#1}{\mathrm{d}#2} }}$ $\newcommand{\R}{\Bbb{R}}$ $$y \eqd \frac{a_1}{a_2 + a_3e^{-bx}} + a_4$$ And the minimum constraints are $$\begin{align*} \lim_{x\to-\infty}y &= 0 \\\lim_{x\to+\infty}y &= 1 \\\brl{y}_{x=0} &= \frac{1}{2} \end{align*}$$ But under the constraints given/implied, there is no solution; that is, there is no way, under the above constraints, to "straighten" it while holding its "steepness" unchanged. Here is the reason$\ldots$

$$\begin{align*} 0 &= \lim_{x\to-\infty}y \\&\eqd \lim_{x\to-\infty} \brs{\frac{a_1}{a_2 + a_3e^{-bx}} + a_4 } && \text{by definition of $y$} \\&= \brs{0 + a_4 } \\&\implies \boxed{a_4=0} \implies y = \frac{a_1}{a_2 + a_3e^{-bx}} \\\\ 1 &= \lim_{x\to\infty}y \\&\eqd \lim_{x\to\infty} \brs{\frac{a_1}{a_2 + a_3e^{-bx}} + 0 } \\&= \lim_{x\to\infty} \brs{\frac{a_1}{a_2 + 0}} \\&\implies \boxed{a_1=a_2}\implies y = \frac{a_1}{a_1 + a_3e^{-bx}} \\\\ \frac{1}{2} &= \brl{y}_{x=0} \\&= \frac{a_1}{a_1 + a_3e^{-b\cdot0}} + 0 \\&= \frac{a_1}{a_1 + a_3} \\&\implies 2a_1=a_1+a_3 \\&\implies\boxed{a_1=a_3} \\&\implies y = \frac{a_1}{a_1 + a_1e^{-bx}} \\&\implies y = \frac{1}{1 + e^{-bx}} \end{align*}$$

You still have one parameter $b$ that you can change$\ldots$ but changing it changes the "steepness" of the curve, where steepness is defined here as the derivative $\deriv{y}{x}$ at $x=0$: $$\brl{\deriv{}{x}y}_{x=0} = \brl{\frac{-(-be^{-bx})}{(1+e^{-bx})^2}}_{x=0}=\frac{b}{4}$$ But if you don't want to change that (as you implied in your previous comment), then you have no degrees of freedom left, and there is no way to straighten the curve without resorting to more "drastic" measures such as changing the function altogether.
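As a quick numerical sanity check of the $b/4$ result, here is a minimal R sketch. The helper names (`logistic`, `slope_at_zero`, `width_mid`) are made up for this illustration. It also prints the slope multiplied by the width of the interval on which $y$ rises from $0.25$ to $0.75$. That product comes out as $\ln(3)/2$ for every $b$, which is one way to see that changing $b$ only rescales the $x$-axis and does not straighten the curve (a genuinely straight segment would give $0.5$).

```r
# Logistic curve with the single remaining free parameter b
logistic <- function(x, b) 1 / (1 + exp(-b * x))

# Central-difference estimate of dy/dx at x = 0
slope_at_zero <- function(b, h = 1e-6) {
  (logistic(h, b) - logistic(-h, b)) / (2 * h)
}

# Width of the x-interval on which y rises from 0.25 to 0.75:
# solving 1 / (1 + exp(-b * x)) = 0.75 gives x = log(3) / b
width_mid <- function(b) 2 * log(3) / b

bs <- c(1, 4, 8)
res <- data.frame(
  b                 = bs,
  slope_numeric     = sapply(bs, slope_at_zero),
  slope_formula     = bs / 4,
  slope_times_width = (bs / 4) * width_mid(bs)  # same value for every b
)
print(res)
```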

1
On

You can try plugging this into Desmos.com: $$f(x) = \frac{1}{1+e^{-cx}}$$ Add a slider for $c$ and then vary it to see how the function changes. I would say that by setting $c>4$, you get closer to your desired red line.
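If you would rather stay offline, roughly the same experiment can be sketched in base R. The values of $c$ and the plotting range below are arbitrary choices for illustration, not anything taken from the question.

```r
# f(x) = 1 / (1 + exp(-c x)); the rate is called c_val to avoid shadowing c()
f <- function(x, c_val) 1 / (1 + exp(-c_val * x))

x  <- seq(-2, 2, length.out = 401)
cs <- c(1, 2, 4, 8)                          # illustrative "slider" positions

# One column of y-values per value of c
y <- sapply(cs, function(c_val) f(x, c_val))

matplot(x, y, type = "l", lty = 1, col = seq_along(cs),
        xlab = "x", ylab = "f(x)")
legend("topleft", legend = paste("c =", cs), col = seq_along(cs), lty = 1)
```

Larger values of $c$ steepen the transition, but each curve is the same shape compressed horizontally.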

0
On

A function is not the same thing as a mathematical expression. Every curve that you can draw/think of is already a function (as long as it does not cross the same vertical line twice), so the picture in your question already defines a function.

Once a function is defined, as in your case, there are many mathematical methods that help you describe it via algebraic expressions. Examples are the Stone-Weierstrass theorem (polynomial approximations), Fourier series (sines and cosines), and all sorts of other powerful methods!
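To make the polynomial route concrete, here is a rough R sketch. It assumes, purely for illustration, that the drawn red line is a ramp of slope $1$ clipped to $[0,1]$; the `ramp` function and the interval $[-1.5, 1.5]$ are my assumptions, not something stated in the question. It fits least-squares polynomials of increasing degree and prints the residual sum of squares for each.

```r
# Hypothetical stand-in for the red line: slope-1 ramp clipped to [0, 1]
ramp <- function(x) pmin(pmax(x + 0.5, 0), 1)

x <- seq(-1.5, 1.5, length.out = 301)
y <- ramp(x)

# Least-squares polynomial fits of degree 3, 7 and 15 on this interval
degrees <- c(3, 7, 15)
rss <- sapply(degrees, function(d) {
  fit <- lm(y ~ poly(x, d))      # orthogonal polynomial basis of degree d
  sum(residuals(fit)^2)          # residual sum of squares
})
print(data.frame(degree = degrees, rss = rss))
```

Keep in mind that a polynomial fit like this is only meaningful on the bounded interval it was fitted on; outside it the polynomial will run off to $\pm\infty$ rather than level off at $0$ and $1$.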

0
On

Alternate Answer: Change the Question

$\newcommand{\eqd}{\triangleq}$ $\newcommand{\eqa}{\approx}$ $\newcommand{\abs}[1]{{\left\lvert #1 \right\rvert}}$ $\newcommand{\brp}[1]{{\left(#1\right)}}$ $\newcommand{\brs}[1]{{\left[#1\right]}}$ $\newcommand{\brlr}[1]{\left.#1\right|}$ $\newcommand{\deriv} [2] {{\frac{\mathrm{d}#1}{\mathrm{d}#2} }}$ $\newcommand{\R}{\Bbb{R}}$ $\newcommand{\intcc} [2] {{\left[#1:#2\right]}}$ $\newcommand{\intoo} [2] {{\left(#1:#2\right)}}$ $\newcommand{\intoc} [2] {{\left(#1:#2\right]}}$ $\newcommand{\intco} [2] {{\left[#1:#2\right)}}$ $\newcommand{\ff}{\mathrm{f}}$ $\newcommand{\fg}{\mathrm{g}}$ $\newcommand{\fphi}{\mathrm{\phi}}$ $\newcommand{\dx}{\mathrm{dx}}$ $\newcommand{\du}{\mathrm{du}}$ $\newcommand{\dv}{\mathrm{dv}}$

The original question was, in essence, how to modify the function $y\eqd\frac{1}{1+e^{-4mx}}$ such that its slope $m$ about $x=0$ is approximately maintained over a larger interval $x\in\intcc{-a}{a}$. Previously it was shown that this game is not winnable under the rules of the game specified by the equation. So, given that there is no solution, of course one solution is to change the question.

To find a function $\ff(x)$ that more or less satisfies the original requirements, we could start by listing what the requirements are, and/or what we think they should be, and/or what would help make life interesting: $$\begin{align*} \lim_{x\to-\infty}\ff(x) &= 0 && \text{(constraint 1)} \\\lim_{x\to+\infty}\ff(x) &= 1 && \text{(constraint 2)} \\\ff(0) &= \frac{1}{2} && \text{(constraint 3)} \\\forall x\in\R\,,\ff(x) &> 0 && \text{(constraint 4)} && \text{(positive)} \\\forall x\in\R\,,\ff(x) &< 1 && \text{(constraint 5)} && \text{(bounded)} \\x_1<x_2\implies \ff(x_1)&<\ff(x_2) && \text{(constraint 6)} && \text{(strictly monotonically increasing)} \\\forall \abs{x}<0.45\,,\deriv{}{x}\ff(x) &\eqa 1 && \text{(constraint 7)} \\\forall \abs{x}>0.55\,,\deriv{}{x}\ff(x) &\text{ is "very small"} && \text{(constraint 8)} \\\ff(x)&\in C^{\infty} && \text{(constraint 9)} && \text{(smooth: derivatives of all orders are continuous)} \end{align*}$$ Since so much of what we want in $\ff(x)$ is constrained by the derivative $\fphi(x)\eqd\deriv{}{x}\ff(x)$, we could start by defining $\fphi(x)$ and then integrating $\fphi(x)$ to recover $\ff(x)$, as in $\ff(x)=\int_{-\infty}^x\fphi(u)\du$. To do this, we can look at the constraints and see what they imply ($\implies$) about $\fphi(x)$: \begin{align*} \text{(constraint 7)} &\implies &\fphi(x) &\eqa 1 && \text{for $\abs{x}<0.45$} \\\text{(constraint 8)} &\implies &\fphi(x) &\eqa 0 && \text{for $\abs{x}>0.55$} \\\text{(constraints 1, 2)} &\implies &\int_{-\infty}^{\infty}\fphi(x)\dx &=1 \\\text{(constraints 4, 6)} &\implies &\fphi(x) &> 0 && \forall x\in\R \\\text{(constraint 1)} &\implies &\lim_{x\to-\infty}\fphi(x) &=0 \\\text{(constraint 3)} &\implies &\fphi(-x)&=\fphi(+x) && \text{(symmetric about $x=0$)} \end{align*}

The (constraint 9) should fall out naturally from the method, using integration, which tends to smooth out discontinuous functions into smooth ones (as reflected in the legacy of the Fourier Expansion in some ways trumping the Taylor Expansion).

Finding such a $\fphi(x)$ is a challenge in and of itself. But in looking for a sigmoid function $\ff(x)$, we may want to start with a $\fphi(x)$ that is itself a function of a sigmoid, but with perhaps better promise of finding its anti-derivative in the literature (with, say, a $\tanh$ function or something). One sigmoid function that is closely related to the standard logistic function $\sigma(x)=\frac{1}{1+e^{-x}}=\frac{e^{x}}{1+e^{x}}$ (the original $y$ is $\sigma(4mx)$) is \begin{align*} \fg(x) &= \frac{1}{2} \brs{ 1+\tanh(bx) } = \frac{1}{2} \brs{ 1+\frac{e^{bx}-e^{-bx}}{e^{bx}+e^{-bx}} } = \frac{1}{2} \brs{ 1+\frac{1-e^{-2bx}}{1+e^{-2bx}} } = \frac{1}{2} \brs{ 1+\frac{2-1-e^{-2bx}}{1+e^{-2bx}} } \\&= \frac{1}{2} \brs{ 1+2\frac{1}{1+e^{-2bx}} - \frac{1+e^{-2bx}}{1+e^{-2bx}} } = \frac{1}{2} \brs{ 1+2\sigma(2bx) - 1 } = \sigma(2bx) \end{align*} enter image description here

Well, that certainly doesn't look like the $\phi$ we are desperately looking for; but we could use the old trick of forming a polynomial, not of $x$, but of the function $\fg(x)$. Say, let $y\eqd\fg(x)$ and let $\phi(x)\eqd y^3$, for example. This is an "old" trick because polynomials of cosines were used for approximation by Chebyshev (Chebyshev polynomials, where the harmonic form can be converted to polynomial form) and in the computation of Daubechies-$p$ scaling functions (in wavelet theory).

You can experiment with the function $\fphi(x)\eqd 1-\tanh^p(bx)$ to find values of $b$ and $p$ (with $p$ even, so that $\tanh^p(bx)\ge0$ and $\fphi$ is symmetric about $x=0$) that give you a $\phi(x)$ close to the requirements listed above (you are welcome to use the R code below). If you happen to pick $b=10$ and $p=6180$, then you just so happen to have picked the same one I did (amazing!):

enter image description here
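A quick way to see how well a given $(b, p)$ pair meets constraints 7 and 8 is to evaluate $\phi$ at a few points and estimate its total area. The sketch below does that in base R for $b=10$, $p=6180$. The sample points, the trapezoid step and the cutoff at $|x|=3$ are arbitrary choices for illustration.

```r
# phi(x) = 1 - tanh^p(b x), with p even
phi <- function(x, b = 10, p = 6180) 1 - tanh(b * x)^p

# Spot checks: inside the plateau, around the transition, and in the tail
xs <- c(0, 0.3, 0.45, 0.55, 0.8)
print(data.frame(x = xs, phi = phi(xs)))

# Trapezoid-rule estimate of the total area under phi.
# phi is negligible beyond |x| = 3 for these parameters.
h    <- 1e-3
grid <- seq(-3, 3, by = h)
v    <- phi(grid)
area <- h * (sum(v) - (v[1] + v[length(v)]) / 2)
print(area)
```

The area should come out close to $1$, as the integral condition on $\phi$ requires, while the values at $0.45$ and $0.55$ show that the drop from $1$ to $0$ is gradual rather than abrupt.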

To get $\ff(x)$ from this $\phi(x)$, simply integrate from $-\infty$ to $x$: $$\ff(x)\eqd \int_{-\infty}^x \phi(u)\du = \int_{-\infty}^x \brs{1 - (\tanh(bu))^p}\du$$ I say "simply" in the sense that a mathematician might say it, which means it's not really that simple. But with a little help from I.S. Gradshteyn and I.M. Ryzhik, the integral can be solved (but maybe not how one might like it to be solved): \begin{align} \boxed{\ff(x)} &\eqd \int_{u=-\infty}^{u=x} \phi(u)\du \\&\eqd \int_{u=-\infty}^{u=x} \brs{1 - \tanh^p(bu)}\du \\&= \int_{u=-\infty}^{u=x}1\du - \int_{u=-\infty}^{u=x} \tanh^p(bu)\du \\&= \int_{u=-\infty}^{u=x}1\du - \int_{v=-\infty}^{v=bx} \tanh^p(v)\frac{1}{b}\dv && \text{where $v\eqd bu$ $\implies$ $\du = \frac{1}{b}\dv$} \\&= \int_{u=-\infty}^{u=x}1\du - \frac{1}{b}\int_{v=-\infty}^{v=bx} \tanh^{2n}(v) \dv && \text{where $n\eqd p/2$ ($p$ even)} \\&= \brlr{u}_{u=-\infty}^{u=x} - \frac{1}{b}\brlr{v}_{v=-\infty}^{v=bx} + \brlr{\frac{1}{b}\sum_{k=1}^n \frac{\tanh^{2n-2k+1}(v)}{2n-2k+1}}_{v=-\infty}^{v=bx} && \text{by Gradshteyn and Ryzhik page 119} \\&= \boxed{\frac{1}{b}\sum_{k=1}^n \frac{\tanh^{2n-2k+1}(bx)}{2n-2k+1} - \frac{1}{b}\sum_{k=1}^n \frac{(-1)}{2n-2k+1}} \end{align} And so now we should have something close to what was originally asked for:

enter image description here
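As a cross-check on the boxed closed form, the following R sketch compares it with a plain numerical integration of $\phi$. The name `f_closed`, the trapezoid step and the lower cutoff at $x=-3$ are choices made for this illustration only. The sum is written over the odd exponents $m = 2n-2k+1$, which run from $1$ to $p-1$.

```r
# Boxed closed form: f(x) = (1/b) * sum over odd m of (tanh(b x)^m + 1) / m
f_closed <- function(x, b = 10, p = 6180) {
  stopifnot(p %% 2 == 0)                  # the formula needs n = p/2 integer
  m <- seq(1, p - 1, by = 2)              # exponents 2n-2k+1 for k = 1..n
  sapply(tanh(b * x), function(th) sum((th^m + 1) / m)) / b
}

# Independent check: cumulative trapezoid rule applied to phi from x = -3
phi   <- function(x, b = 10, p = 6180) 1 - tanh(b * x)^p
h     <- 1e-3
grid  <- seq(-3, 3, by = h)
v     <- phi(grid)
f_num <- c(0, cumsum(h * (v[-length(v)] + v[-1]) / 2))

idx <- c(which.min(abs(grid)), length(grid))   # grid points nearest 0 and 3
print(cbind(x = grid[idx],
            closed  = f_closed(grid[idx]),
            numeric = f_num[idx]))
```

The two columns should agree closely, with values near $1/2$ at $x=0$ and near $1$ at $x=3$.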

#============================================================================
# R script file
#============================================================================
#---------------------------------------
# packages
#---------------------------------------
#install.packages("stats");
#install.packages("R.utils");
#install.packages("rootSolve");
 require(stats);
 require(R.utils);
 require(rootSolve);
 rm(list=objects());

#---------------------------------------
# Data
#---------------------------------------
 x = seq( from=-2.5, to=2.5, length=1000 )

#---------------------------------------
# Straight line (reference): y = slope * x + yintercept
#---------------------------------------
linear = function(x,slope,yintercept)
{
   result = slope * x + yintercept
}

#---------------------------------------
# Logistic sigmoid function
#---------------------------------------
sigmoid_logistic = function(x,slope)
{
   b = 4 * slope
   result = 1 / (1 + exp(-b*x))
}

#---------------------------------------
# tanh-based sigmoid g(x) = (1 + tanh(b*x)) / 2
# (b = 2*slope gives derivative `slope` at x = 0)
#---------------------------------------
sigmoid_tanh = function(x,slope)
{
   b = 2 * slope
   result = 0.5 * (1 + tanh(b*x))
}

#---------------------------------------
# Logistic sigmoid to power p
#---------------------------------------
sigmoid_logp = function(x,slope,p)
{
   a = 2^(1/p)-1
   b = slope * (1 + a)^(2 + p -1) / (a * p )
   result = (1 / (1 + a*exp(-b*x)))^p 
}

#---------------------------------------
# tanh^p(bx)
#---------------------------------------
#tanhp = function(x,b,p)
#{
#   result = 0.5 * (((exp(b*x) - exp(-b*x)) / (exp(b*x) + exp(-b*x)))^p + 1)
#}

#---------------------------------------
# phi(x) = 1 - tanh^p(b*x)  (derivative of f; p should be even)
# https://books.google.com/books?id=OeUKAAAAYAAJ&pg=PA15
# https://archive.org/details/integralstable00peirrich/page/59/
# https://archive.org/details/integralstable00peirrich/page/81/
#---------------------------------------
tanhp = function(x,b,p)
{
  #result = 1 - ((exp(b*x) - exp(-b*x)) / (exp(b*x) + exp(-b*x)))^p
   result = 1 - (tanh(b*x))^p
}

#---------------------------------------
# f(x) = integral of phi from -Inf to x (closed form; p must be even)
# https://ia800806.us.archive.org/7/items/GradshteinI.S.RyzhikI.M.TablesOfIntegralsSeriesAndProducts/Gradshtein_I.S.%2C_Ryzhik_I.M.-Tables_of_integrals%2C_series_and_products.pdf
# 2.424 (3) page 119
#---------------------------------------
inttanhp = function(x,b,p)
{
   n = p/2
   inta = 0
   for( k in c(1:n) )
   {
     inta = inta + (-1) / (2*n-2*k+1)
   }
   inta = inta / b
   intx = 0
   for( k in c(1:n) )
   {
     intx = intx + (tanh(b*x))^(2*n-2*k+1) / (2*n-2*k+1)
   }
   intx = intx / b
   result = intx - inta
}

#---------------------------------------
# Display
#---------------------------------------
 colors = c( "blue", "red", "orange", "black", "purple","green");
#plot ( x, tanhp(x,2,1) , col=colors[1], lwd=3, type='l', xlab="x", ylab="y" )
#plot ( x, tanhp(x,1/(0.55-0.45),6180) , col=colors[1], lwd=3, type='l', xlab="x", ylab="y", ylim=c(-0.1,1.1), xlim=c(-1.5,1.5))
 plot ( x, inttanhp(x,1/(0.55-0.45),6180) , col=colors[2], lwd=3, type='l', xlab="x", ylab="y" )
#lines( x, linear(x,1,1/2)       , col=colors[6], lwd=2, type='l' )
#lines( x, linear(x,0,1/2)       , col=colors[6], lwd=2, type='l' )
 legend("topleft", legend="f(x); b=10; p=6180", col="red", lwd=3, lty=1:1)
 grid()