Using a generalized logistic curve to create a mathematical model from data


[Data table from the image:]

$$\begin{array}{c|ccccccccccc} t & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\\hline N & 18 & 33 & 56 & 90 & 130 & 170 & 203 & 225 & 239 & 247 & 251 \end{array}$$

The first row is time and the second row is the height of a plant. We need to use a generalized logistic curve to model the behavior of the plant. The equation of the logistic curve is: $$N = \frac{N_*}{1+(N_*/N_0-1)e^{-a_0t}}$$ where $N_*$ is the maximum height of the plant, in other words the supremum of the logistic function.

$a_0$ determines how fast the function increases. My question is the following:

I know how to determine $N_*$: we just pick a number that is above $251$ and less than $251 + (251 - 247)$. But how exactly do I determine $a_0$? I am quite lost here. My idea is to calculate an $a_0$ for every pair of time and height, with the equation $a_0 = \frac{1}{t}\ln(\frac{N}{N_0})$. In this equation $N$ would be a difference of heights; so, say, between $18$ and $33$ the $N$ for time $1$ would be $33 - 18$. Is the process then to create a table of $\sum_{k = 1}^{10}(10 - k)$ difference entries, taking successive values of $N_0$ until we run out of data from the primary table in the picture (so until $N_0$ is the $9$th entry)? Then we can use the weighted formula $a_0 = \frac{\sum_l a_{0l}\, t_l}{\sum_l t_l}$, where $l$ runs over the entries.
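
As a quick per-point sanity check (a sketch, not the difference-table procedure above; it assumes $N_* = 252$ and $N_0 = 18$ and uses the heights from the table), one can solve the model for $a_0$ at each data point:

```python
import math

# data from the table: t = 0..10, heights y
t_data = list(range(11))
y_data = [18, 33, 56, 90, 130, 170, 203, 225, 239, 247, 251]
N_star, N0 = 252.0, 18.0  # assumed supremum and initial height

# Solving N = N*/(1 + (N*/N0 - 1) e^(-a0 t)) for a0 gives
#   a0 = (1/t) * ln[ (N*/N0 - 1) / (N*/N - 1) ]
a0_estimates = [
    math.log((N_star / N0 - 1) / (N_star / y - 1)) / t
    for t, y in zip(t_data[1:], y_data[1:])  # skip t = 0
]
```

For this data the estimates cluster around $0.65$ to $0.7$, drifting higher for the last points (where $N$ gets close to the assumed $N_*$), which already suggests a single $a_0$ can fit the data reasonably well.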

Is my procedure for creating a logistic model out of the provided data correct?

There are 4 answers below.

Best answer

Method 1: One degree of freedom

$\newcommand{\eqd}{\triangleq}$ $\newcommand{\brp}[1]{{\left(#1\right)}}$ $\newcommand{\brs}[1]{{\left[#1\right]}}$ $\newcommand{\norm}[1]{\left\lVert #1 \right\rVert}$ $\newcommand{\pderiv}[2]{{\frac{\partial#1}{\partial#2} }}$ Have you considered using the Method of Least Squares to find $a_0$? For that method you can define an error function $$e(t_n)\eqd N(t_n)-y_n\eqd \frac{N_{*}}{1+\brs{N_{*}/N_0-1}e^{-a_0t_n}}-y_n$$ and a cost function (the error cost) as the norm squared of $e(t_n)$: $$cost(a_0)\eqd\norm{e}^2\eqd\sum_{n=0}^{n=10}e^2(t_n)$$ You want to find the $a_0$ that minimizes that cost; that is, you want to find where (with respect to $a_0$) the cost function "goes to the lowest point". Suppose you set $N_0\eqd N(0)=18$ and $N_*\eqd252$ $\ldots$ [plot of $cost(a_0)$ versus $a_0$] From the plot, it appears that $cost(a_0)$ is minimized around $a_0=0.65$.

To get a more accurate optimal $a_0$, we can differentiate $cost(a_0)$ with respect to $a_0$ and set the expression to $0$. To do that, this (lemma) may come in handy: $$\begin{align} \boxed{N'(t)} &\eqd \pderiv{}{a_0}N(t) \\&\eqd \pderiv{}{a_0}\brs{\frac{N_{*}}{1+\brp{\frac{N_*}{N_0}-1}e^{-a_0t}}} && \text{by definition of $N(t)$} \\&= \frac{0-N_*\brs{\frac{N_*}{N_0}-1}e^{-a_0t}(-t)}{\brp{1+\brs{\frac{N_*}{N_0}-1}e^{-a_0t}}^2} && \text{by Quotient Rule} \\&= \frac{N_*^2}{\brp{1+\brs{\frac{N_*}{N_0}-1}e^{-a_0t}}^2} \brs{\frac{1}{N_0}-\frac{1}{N_*}}te^{-a_0t} \\&\eqd \boxed{\brs{\frac{1}{N_0}-\frac{1}{N_*}}N^2(t)te^{-a_0t}} && \text{by definition of $N(t)$} \end{align}$$ Then$\ldots$ $$\begin{align} \boxed{0}&= \frac{1}{2\brp{\frac{1}{N_0}-\frac{1}{N_*}}}\cdot0 \\&=\frac{1}{2\brp{\frac{1}{N_0}-\frac{1}{N_*}}}\pderiv{}{a_0}\norm{e}^2 \\&\eqd \frac{1}{2\brp{\frac{1}{N_0}-\frac{1}{N_*}}}\pderiv{}{a_0}\sum_{n=0}^{n=10}e^2(t_n) && \text{by definition of $\norm{\cdot}$} \\&\eqd \frac{1}{2\brp{\frac{1}{N_0}-\frac{1}{N_*}}}\pderiv{}{a_0}\sum_{n=0}^{n=10}\brs{N(t_n)-y_n}^2 && \text{by definition of $e$} \\&= \frac{1}{2\brp{\frac{1}{N_0}-\frac{1}{N_*}}}\sum_{n=0}^{n=10}2\brs{N(t_n)-y_n}N'(t_n) && \text{by Chain Rule} \\&= \frac{1}{2\brp{\frac{1}{N_0}-\frac{1}{N_*}}}\sum_{n=0}^{n=10}2\brs{N(t_n)-y_n}\brs{\frac{1}{N_0}-\frac{1}{N_*}}N^2(t_n)t_ne^{-a_0t_n} && \text{by (lemma)} \\&= \boxed{\sum_{n=0}^{n=10}N^2(t_n)\brs{N(t_n)-y_n}t_ne^{-a_0t_n}} \\&\eqd Dcost(a_0) && \text{(call the sum $Dcost(a_0)$)} \end{align}$$ Plotting $Dcost$ with respect to $a_0$, it appears that $Dcost(a_0)$ crosses $0$ at around $a_0=0.66$: [plot of $Dcost(a_0)$ versus $a_0$]
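
The derivation above implies the identity $\pderiv{}{a_0}cost(a_0)=2\brp{\frac{1}{N_0}-\frac{1}{N_*}}Dcost(a_0)$, which can be checked against a central finite difference. A small Python sketch of that check (the R code for the actual fit follows below):

```python
import math

tn = list(range(11))
yn = [18, 33, 56, 90, 130, 170, 203, 225, 239, 247, 251]
N0, Ns = 18.0, 252.0  # N0 = N(0), Ns = N*

def N(t, a0):
    # logistic curve for the fixed N0, N*
    return Ns / (1 + (Ns / N0 - 1) * math.exp(-a0 * t))

def cost(a0):
    # sum of squared errors over the 11 data points
    return sum((N(t, a0) - y) ** 2 for t, y in zip(tn, yn))

def Dcost(a0):
    # the boxed sum from the derivation above
    return sum(N(t, a0) ** 2 * (N(t, a0) - y) * t * math.exp(-a0 * t)
               for t, y in zip(tn, yn))

a0, h = 0.5, 1e-6
numeric = (cost(a0 + h) - cost(a0 - h)) / (2 * h)  # central difference
analytic = 2 * (1 / N0 - 1 / Ns) * Dcost(a0)       # from the identity
```

At $a_0=0.5$ (left of the minimum near $0.66$) the derivative is negative, and the two values agree to several digits.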

The uniroot function from the R stats package indicates that $Dcost(a_0)$ crosses $0$ at $a_0=0.6631183$ with estim.prec=6.103516e-05.

Using $N_0\eqd18$, $N_*\eqd252$, and $a_0\eqd0.6631183$, $N(t)$ seems to fit the 11 data points fairly well ($cost(0.6631183)=31.32307$) $\ldots$ [plot of the fitted $N(t)$ with the data points]

Some R code supporting Reproducible Research:

#---------------------------------------
# packages
#---------------------------------------
#install.packages("stats");
 require(stats);
 rm(list=objects());

#---------------------------------------
# Data
#---------------------------------------
 tn = c(0:10)
 yn = c(18,33,56,90,130,170,203,225,239,247,251)
 t  = seq( from=min(tn), to=max(tn), length=1000 )

#---------------------------------------
# Estimate Function N(t)
#---------------------------------------
 N0=18
 Nh=252
 N = function(t,a) Nh / (1 + (Nh/N0-1)*exp(-a*t))

#---------------------------------------
# Cost Function
#---------------------------------------
 cost = function(x,tn,yn) 
 {
  #sum((N(tn,x)-yn)^2)
   summ = 0;
   for (i in c(1:11))
   { 
     summ = summ + (N(tn[i],x)-yn[i])^2
   }
   result = summ
 }

#---------------------------------------
# Partial derivative of Cost Function
#---------------------------------------
 Dcost = function(x) 
 {
   summ = 0;
   for (i in c(1:11))
   { 
     summ = summ + (N(tn[i],x))^2 * (N(tn[i],x)-yn[i]) * tn[i]*exp(-x*tn[i])
   }
   result = summ
 }

#---------------------------------------
# Find 0 crossing
#---------------------------------------
 x=seq(from=0.3, to=0.9, length=1000)
 aRoot = uniroot( Dcost, c(0.3, 0.9) )
 a0 = aRoot$root

#---------------------------------------
# Graphics
#---------------------------------------
 colors = c( "red" , "blue", "orange", "green"    );
 traces = c( "N(t)", "data", "cost"  , "Dcost"    ); 
 plot ( t , N(t, a0)      , col=colors[1], lwd=2, type='l', xlab="t", ylab="y", ylim=c(0,max(yn)+10) ) 
 lines( tn, yn            , col=colors[2], lwd=3, type='p' )
 lines( t , Dcost(t)      , col=colors[4], lwd=2, type='l' )
 lines( t , cost(t,tn,yn) , col=colors[3], lwd=2, type='l' )
 legend("topleft", legend=traces, col=colors, lwd=3, lty=1:1)
 grid()

The answer by @DanielJ.Greenhoe is very complete, but it goes along with the initial premise that you simply pick $N_*$ in a compatible way. However, there is no reason not to perform the minimisation over both parameters $(N_*, a_0)$ instead of just $a_0$.

This leads to the set of parameters $N_* = 255.844$, $a_0 = 0.65392$.
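
That joint minimisation can be sketched, for example, with scipy.optimize.curve_fit in Python (an illustration, not the code used for the numbers above), keeping $N_0 = 18$ fixed and fitting $(N_*, a_0)$ by least squares:

```python
import numpy as np
from scipy.optimize import curve_fit

tn = np.arange(11.0)
yn = np.array([18, 33, 56, 90, 130, 170, 203, 225, 239, 247, 251], dtype=float)
N0 = 18.0  # kept fixed, as in the initial premise

def N(t, Ns, a0):
    # logistic curve with free parameters Ns = N* and a0
    return Ns / (1 + (Ns / N0 - 1) * np.exp(-a0 * t))

# least-squares fit of (N*, a0), started near the one-parameter solution
(Ns_hat, a0_hat), _ = curve_fit(N, tn, yn, p0=(252.0, 0.6))
```

Since curve_fit minimises the same sum of squared errors, it should land on essentially the same $(N_*, a_0)$ as above.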

[plot of the two-parameter fit with the data]


Method 2: Two degrees of freedom

$\newcommand{\eqd}{\triangleq}$ $\newcommand{\brp}[1]{{\left(#1\right)}}$ $\newcommand{\brs}[1]{{\left[#1\right]}}$ $\newcommand{\norm}[1]{\left\lVert #1 \right\rVert}$ $\newcommand{\pderiv}[2]{{\frac{\partial#1}{\partial#2} }}$ $\newcommand{\opair}[2]{\left( #1,#2\right)}$ $\newcommand{\R}{\Bbb{R}}$

Method 1 gave an expression for $\pderiv{}{a_0}N(t)$; here is a similar lemma for $\pderiv{}{N_*}N(t)$: $$\begin{align*} \boxed{\pderiv{}{N_*}N(t)} &\eqd \pderiv{}{N_* }\brs{\frac{N_*}{1+\brp{\frac{N_*}{N_0}-1}e^{-a_0 t}}} && \text{by definition of $N(t)$} \\&= \frac{\brs{1+\brp{\frac{N_*}{N_0}-1}e^{-a_0 t}} - N_*\brs{\frac{1}{N_0}}e^{-a_0 t}} {\brp{1+\brs{\frac{N_*}{N_0}-1}e^{-a_0 t}}^2} && \text{by Quotient Rule} \\&= \frac{1 - e^{-a_0 t}} {\brp{1 + \brs{\frac{N_*}{N_0}-1}e^{-a_0 t}}^2} \\&= \brs{\frac{1 - e^{-a_0 t}} {N_*^2}} \brs{\frac{N_*} {1 + \brs{\frac{N_*}{N_0}-1}e^{-a_0 t}}}^2 \\&\eqd \boxed{\brs{\frac{1 - e^{-a_0 t}}{N_*^2}} N^2(t)} && \text{by definition of $N(t)$} \end{align*} $$ Then$\ldots$ $$\begin{align*} \boxed{0} &= \frac{N_*^2}{2}\cdot0 \\&=\frac{N_*^2}{2} \pderiv{}{N_*}\norm{e}^2 \\&\eqd \frac{N_*^2}{2} \pderiv{}{N_*}\sum_{n=0}^{n=10}e^2(t_n) && \text{by definition of $\norm{\cdot}$} \\&\eqd \frac{N_*^2}{2} \pderiv{}{N_*}\sum_{n=0}^{n=10}\brs{N(t_n)-y_n}^2 && \text{by definition of $e$} \\&= \frac{N_*^2}{2} \sum_{n=0}^{n=10}2\brs{N(t_n)-y_n}\pderiv{}{N_*}N(t_n) && \text{by Chain Rule} \\&= N_*^2 \sum_{n=0}^{n=10} \brs{N(t_n)-y_n} \brs{\frac{1 - e^{-a_0 t_n}}{N_*^2}} N^2(t_n) && \text{by lemma} \\&= \boxed{\sum_{n=0}^{n=10} N^2(t_n) \brs{N(t_n)-y_n} \brs{1 - e^{-a_0 t_n} }} \end{align*}$$

So now we have two equations in two unknowns ($N_*$ and $a_0$): $\pderiv{}{N_*}\norm{e}^2=0$ and $\pderiv{}{a_0}\norm{e}^2=0$ (from Method 1). You can use a multi-root solver to find the solution $\opair{N_*}{a_0}$ to the equation pair $\opair{\pderiv{}{N_*}\norm{e}^2=0}{\pderiv{}{a_0}\norm{e}^2=0}$. The result using multiroot from the R package rootSolve is $\opair{N_*}{a_0}=\opair{255.8436595023}{0.6539203544}$. This solution is "better" than the Method 1 solution in the sense that it results in a lower cost: $$\begin{array}{|c|rcl|} \text{Method} & \multicolumn{3}{l|}{cost(N_0,N_*,a_0)} \\\hline \\1 & cost(18,252,0.6631183) &=& 31.32307 \\2 & cost(18,255.8436595023,0.6539203544) &=& 0.7140973 \end{array}$$
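
As a numerical sanity check (a Python sketch, independent of the R code below), plugging the reported $\opair{N_*}{a_0}$ into the two stationarity sums should give values near zero, and the cost should come out near $0.714$:

```python
import math

tn = list(range(11))
yn = [18, 33, 56, 90, 130, 170, 203, 225, 239, 247, 251]
N0, Ns, a0 = 18.0, 255.8436595023, 0.6539203544  # Method 2 solution

def N(t):
    return Ns / (1 + (Ns / N0 - 1) * math.exp(-a0 * t))

cost = sum((N(t) - y) ** 2 for t, y in zip(tn, yn))
# the two stationarity sums (gradients w.r.t. a0 and N*, up to constants)
grad_a0 = sum(N(t) ** 2 * (N(t) - y) * t * math.exp(-a0 * t)
              for t, y in zip(tn, yn))
grad_Ns = sum(N(t) ** 2 * (N(t) - y) * (1 - math.exp(-a0 * t))
              for t, y in zip(tn, yn))
```

The individual terms of each sum are on the order of $10^4$, so sums this close to zero confirm the reported root.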

Some R code supporting Reproducible Research:

 require(stats);
 require(R.utils);
 require(rootSolve);
 rm(list=objects());

#---------------------------------------
# Data
#---------------------------------------
 tdata = c(0:10)
 ydata = c(18,33,56,90,130,170,203,225,239,247,251)
 t     = seq( from=min(tdata), to=max(tdata), length=1000 )

#---------------------------------------
# Estimate Function N(t)
#---------------------------------------
 N0 = ydata[1]
 N = function(t,N0,Nh,a0)
 {
   result = Nh / ( 1 + (Nh/N0-1)*exp(-a0*t) )
 }

#---------------------------------------
# Cost Function
#---------------------------------------
 cost = function(N0,Nh,a0)
 {
   summ = 0;
   for (i in c(1:length(tdata)))
   {
     summ = summ + ( N(tdata[i],N0,Nh,a0) - ydata[i] )^2
   }
   result = summ
 }

#---------------------------------------
# Partial derivative with respect to a0 of Cost Function
#---------------------------------------
 Pcosta0 = function(N0, Nh, a0)
 {
   summ = 0;
   for (i in c(1:length(tdata)))
   {
     summ = summ + ( N(tdata[i],N0,Nh,a0) )^2 *
                   ( N(tdata[i],N0,Nh,a0) - ydata[i] ) *
                   ( tdata[i] * exp(-a0*tdata[i]) )
   }
   result = summ
 }

#---------------------------------------
# Partial derivative with respect to Nh of Cost Function
#---------------------------------------
 PcostNh = function(N0, Nh, a0)
 {
   summ = 0;
   for (i in c(1:length(tdata)))
   {
     summ = summ + ( 1 - exp(-a0*tdata[i]) ) *
                   ( N(tdata[i],N0, Nh, a0) )^2 *
                   ( N(tdata[i],N0, Nh, a0)-ydata[i] )
   }
   result = summ
 }

#---------------------------------------
# Partial derivative vector of cost
#---------------------------------------
Pcost = function(x)
{
   N0 = 18
   Nh = x[1]
   a0 = x[2]
   F1 = Pcosta0( N0, Nh, a0 );
   F2 = PcostNh( N0, Nh, a0 );
   result = c(F1, F2);
}

#---------------------------------------
# Calculate roots
#---------------------------------------
 Roots = multiroot( f=Pcost, start=c(ydata[11-1],0.4) );
 Nh = Roots$root[1]
 a0 = Roots$root[2]

#---------------------------------------
# Display
#---------------------------------------
 printf("(N0, Nh, a0) = (%.2f, %.10f, %.10f) with estim.precis=%.2e\n", N0, Nh, a0, Roots$estim.precis )
 colors = c( "red" , "blue" );
 traces = c( "N(t)", "data" );
 plot ( t , N(t, N0, Nh, a0), col=colors[1], lwd=2, type='l', xlab="t", ylab="y", ylim=c(0,max(ydata)+10) )
 lines( tdata, ydata        , col=colors[2], lwd=5, type='p' )
 legend("topleft", legend=traces, col=colors, lwd=3, lty=1:1)
 grid()

Method 3: Three degrees of freedom

$\newcommand{\eqd}{\triangleq}$ $\newcommand{\brp}[1]{{\left(#1\right)}}$ $\newcommand{\brs}[1]{{\left[#1\right]}}$ $\newcommand{\norm}[1]{\left\lVert #1 \right\rVert}$ $\newcommand{\pderiv}[2]{{\frac{\partial#1}{\partial#2} }}$ $\newcommand{\opair}[2]{\left( #1,#2\right)}$ $\newcommand{\otriple}[3]{\left( #1,#2,#3\right)}$ $\newcommand{\R}{\Bbb{R}}$ $\newcommand{\esth}{N}$

Method 1 optimized with respect to $a_0$ and Method 2 with respect to $\opair{N_*}{a_0}$. Both methods assume $N_0\eqd y_0 \eqd 18$. However, in data modeling it is often assumed that the measured data is the actual true value being measured plus some measurement noise. As such, it is highly unlikely that the true height of the plant at time $t=0$ is exactly $y_0=18.000\ldots$; rather, it is $18+n(0)$, where $n(0)$ is a random noise variable at time $t=0$.

Method 1 gave a lemma for $\pderiv{}{a_0}N(t)$ and Method 2 for $\pderiv{}{N_*}N(t)$; here is a similar lemma for $\pderiv{}{N_0}N(t)$: $$\begin{align*} \boxed{\pderiv{}{N_0}\esth(t)} &\eqd \pderiv{}{N_0 }\brs{\frac{N_*}{1+\brp{\frac{N_*}{N_0}-1}e^{-a_0 t}}} && \text{by definition of $\esth(t)$} \\&= \frac{ - N_*\brp{\frac{-N_*}{N_0^2}}e^{-a_0 t}} {\brs{1+\brp{\frac{N_*}{N_0}-1}e^{-a_0 t}}^2} && \text{by Quotient Rule} \\&= \brs{ \frac{e^{-a_0 t}}{N_0^2}} \brs{\frac{N_*}{1+\brs{\frac{N_*}{N_0}-1}e^{-a_0 t}}}^2 \\&= \brs{ \frac{e^{-a_0 t}}{N_0^2}} \esth^2(t) && \text{by definition of $\esth(t)$} \end{align*}$$

$$\begin{align*} \boxed{0} &= \frac{N_0^2}{2}\cdot0 \\&=\frac{N_0^2}{2} \pderiv{}{N_0}\norm{e}^2 \\&\eqd \frac{N_0^2}{2} \pderiv{}{N_0}\sum_{n=0}^{n=10}e^2(t_n) && \text{by definition of $\norm{\cdot}$} \\&\eqd \frac{N_0^2}{2} \pderiv{}{N_0}\sum_{n=0}^{n=10}\brs{\esth(t_n)-y_n}^2 && \text{by definition of $e$} \\&= \frac{N_0^2}{2} \sum_{n=0}^{n=10}2\brs{\esth(t_n)-y_n} \pderiv{}{N_0}\esth(t_n) && \text{by Chain Rule} \\&= \frac{N_0^2}{2} \sum_{n=0}^{n=10}2\brs{\esth(t_n)-y_n} \brs{ \frac{e^{-a_0 t_n}}{N_0^2}}\esth^2(t_n) && \text{by lemma} \\&= \boxed{\sum_{n=0}^{n=10} \esth^2(t_n)\brs{\esth(t_n)-y_n} e^{-a_0 t_n} } \end{align*}$$
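
The $\pderiv{}{N_0}$ lemma can be checked with a central finite difference; in this Python sketch the analytic value uses $\brs{e^{-a_0 t}/N_0^2}\,\esth^2(t)$ (note there is no $1/N_*$ factor, since the quotient-rule numerator is $N_*^2 e^{-a_0 t}/N_0^2$):

```python
import math

def N(t, N0, Ns, a0):
    # logistic curve with all three parameters free
    return Ns / (1 + (Ns / N0 - 1) * math.exp(-a0 * t))

N0, Ns, a0, t, h = 18.0, 252.0, 0.65, 4.0, 1e-6  # arbitrary test point

# central difference in N0 versus the lemma's closed form
numeric = (N(t, N0 + h, Ns, a0) - N(t, N0 - h, Ns, a0)) / (2 * h)
analytic = math.exp(-a0 * t) / N0 ** 2 * N(t, N0, Ns, a0) ** 2
```

The two values agree to several digits, which also confirms that the boxed stationarity sum (and the R function PcostN0 below) is correct as written.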

So now we have a triple of equations $\otriple{\pderiv{}{N_*}\norm{e}^2=0}{\pderiv{}{N_0}\norm{e}^2=0}{\pderiv{}{a_0}\norm{e}^2=0}$ in the triple of variables $\otriple{N_*}{N_0}{a_0}$. The result using multiroot from the R package rootSolve is $\otriple{N_0}{N_*}{a_0}=\otriple{18.1994673377}{256.0554740324}{0.6508899900}$. This solution is "better" than both Methods 1 and 2 in the sense that it results in a lower cost: $$\begin{array}{|c|rcl|} \text{Method} & \multicolumn{3}{l|}{cost(N_0,N_*,a_0)} \\\hline \\1 & cost(18,252,0.6631183) &=& 31.32307 \\2 & cost(18,255.8436595023,0.6539203544) &=& 0.7140973 \\3 & cost(18.1994673377, 256.0554740324, 0.6508899900) &=& 0.4533383 \end{array}$$

[plot of the three-parameter fit with the data]

Some R code supporting Reproducible Research:

#---------------------------------------
# packages
#---------------------------------------
#install.packages("stats");
#install.packages("R.utils");
#install.packages("rootSolve");
 require(stats);
 require(R.utils);
 require(rootSolve);
 rm(list=objects());

#---------------------------------------
# Data
#---------------------------------------
 tdata = c(0:10)
 ydata = c(18,33,56,90,130,170,203,225,239,247,251)
 t     = seq( from=min(tdata), to=max(tdata), length=1000 )

#---------------------------------------
# Estimate Function N(t)
#---------------------------------------
 N0 = ydata[1]
 N = function(t,N0,Nh,a0)
 {
   result = Nh / ( 1 + (Nh/N0-1)*exp(-a0*t) )
 }

#---------------------------------------
# Cost Function
#---------------------------------------
 cost = function(N0,Nh,a0)
 {
   summ = 0;
   for (i in c(1:length(tdata)))
   {
     summ = summ + ( N(tdata[i],N0,Nh,a0) - ydata[i] )^2
   }
   result = summ
 }

#---------------------------------------
# Partial derivative with respect to a0 of Cost Function
#---------------------------------------
 Pcosta0 = function(N0, Nh, a0)
 {
   summ = 0;
   for (i in c(1:length(tdata)))
   {
     summ = summ + ( N(tdata[i],N0,Nh,a0) )^2 *
                   ( N(tdata[i],N0,Nh,a0) - ydata[i] ) *
                   ( tdata[i] * exp(-a0*tdata[i]) )
   }
   result = summ
 }

#---------------------------------------
# Partial derivative with respect to Nh of Cost Function
#---------------------------------------
 PcostNh = function(N0, Nh, a0)
 {
   summ = 0;
   for (i in c(1:length(tdata)))
   {
     summ = summ + ( 1 - exp(-a0*tdata[i]) ) *
                   ( N(tdata[i],N0, Nh, a0) )^2 *
                   ( N(tdata[i],N0, Nh, a0) - ydata[i] )
   }
   result = summ
 }

#---------------------------------------
# Partial derivative with respect to N0 of Cost Function
#---------------------------------------
 PcostN0 = function(N0, Nh, a0)
 {
   summ = 0;
   for (i in c(1:length(tdata)))
   {
     summ = summ + ( exp(-a0*tdata[i]) ) *
                   ( N(tdata[i],N0, Nh, a0) )^2 *
                   ( N(tdata[i],N0, Nh, a0) - ydata[i] )
   }
   result = summ
 }

#---------------------------------------
# Partial derivative vector of cost
#---------------------------------------
Pcost = function(x)
{
   N0 = x[1]
   Nh = x[2]
   a0 = x[3]
   F1 = Pcosta0( N0, Nh, a0 );
   F2 = PcostNh( N0, Nh, a0 );
   F3 = PcostN0( N0, Nh, a0 );
   result = c(F1, F2, F3);
}

#---------------------------------------
# Calculate roots
#---------------------------------------
 Roots = multiroot( f=Pcost, start=c(ydata[1], ydata[11], 0.6) );
 N0 = Roots$root[1]
 Nh = Roots$root[2]
 a0 = Roots$root[3]

#---------------------------------------
# Display
#---------------------------------------
 printf("(N0, Nh, a0) = (%.10f, %.10f, %.10f) with estim.precis=%.2e\n", N0, Nh, a0, Roots$estim.precis )
 colors = c( "red" , "blue" );
 traces = c( "N(t)", "data" );
 plot ( t , N(t, N0, Nh, a0), col=colors[1], lwd=2, type='l', xlab="t", ylab="y", ylim=c(0,max(ydata)+10) )
 lines( tdata, ydata        , col=colors[2], lwd=5, type='p' )
 legend("topleft", legend=traces, col=colors, lwd=3, lty=1:1)
 grid()