I'm trying to understand a calculation made in a paper (Section 2 of the supplementary material of Kreutz C, Raue A and Timmer J, "Likelihood based observability analysis and confidence intervals for predictions of dynamic models", BMC Systems Biology 6, 120, 2012) for analytically determining a maximum likelihood estimate for a simple differential equation:
$[1]$ $$\frac{dx}{dt}=-θx$$
where $x(0)=1$, and we have a single data point $y=0.9$ for $x$ at time point $t=1$ (with assumed Gaussian measurement noise $N(0,0.1^2)$).
The solution to this equation is:
$[2]$
$$ x(t)=\exp(-θt) $$
The log-likelihood is:
$[3]$
$$ LL(y|θ)=\log\left[\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(0.9 - \exp(-θ))^2}{2\sigma^2}\right)\right] $$
$[4]$
$$=-50(0.9-\exp(-θ))^2+c $$
where $c = \frac{1}{\sqrt{2\pi\sigma^2}}$.
$Question 1$: What steps were taken to get from equation $[3]$ to equation $[4]$?
The maximum likelihood estimate is obtained by:
$[5]$
$$\frac{\partial{LL(y|θ)}}{\partial{θ}}=0 $$
therefore (the paper uses the if-and-only-if symbol, $\Leftrightarrow$, here)
$[6]$
$$ -50θ(0.9 -exp(-θ))=0$$
and the estimated $θ$ is (the paper uses the implies arrow, $\Rightarrow$, here):
$[7]$
$$ θ = -\log(0.9) \approx 0.1054 $$
$Question 2$: What steps were taken to get from $[5]$ to $[6]$? Where did the constant $c$ go? Why is the bracketed term from $[4]$ no longer squared in $[6]$, and why is 50 multiplied by $θ$ in $[6]$?
$Question 3$: What steps were taken to get from $[6]$ to $[7]$?
I apologize if these questions seem a bit basic or obvious but I don't have a mathematical background and I have to try and understand this. Given that, it'd be great if you'd kindly be explicit in your answers. Thank you.
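Q1: Use the property of the log function $\log (a\cdot b) = \log(a) + \log(b)$. With $c$ as you defined it, you obtain \begin{equation} LL(y\vert \theta) = \log(c) + \log\left(\exp\left(-\frac{(0.9-\exp(-\theta))^2}{2 \sigma^2}\right)\right). \end{equation} Then, use the fact that the log function is the inverse of the exponential: $\log(\exp(a)) = a$. This leaves $-\frac{(0.9-\exp(-\theta))^2}{2\sigma^2}$, and with $\sigma = 0.1$ you have $\frac{1}{2\sigma^2} = \frac{1}{0.02} = 50$, which gives the factor $-50$ in [4]. Strictly speaking, the additive constant in [4] is $\log(c)$ rather than $c$, but either way it does not depend on $\theta$.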
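If it helps to see this numerically, here is a small Python sketch that evaluates the unsimplified expression [3] and the simplified form [4] side by side. The function names `log_likelihood` and `simplified` are only illustrative, and the constant is taken to be the log of the Gaussian prefactor, as discussed above.

```python
import math

SIGMA = 0.1  # measurement noise standard deviation from the question
Y = 0.9      # the single data point at t = 1


def log_likelihood(theta):
    """Equation [3]: log of the Gaussian density of y around exp(-theta)."""
    residual = Y - math.exp(-theta)
    prefactor = 1 / math.sqrt(2 * math.pi * SIGMA**2)
    return math.log(prefactor * math.exp(-residual**2 / (2 * SIGMA**2)))


def simplified(theta):
    """Equation [4]: -50 * residual^2 plus a constant independent of theta."""
    constant = math.log(1 / math.sqrt(2 * math.pi * SIGMA**2))
    return -50 * (Y - math.exp(-theta))**2 + constant


# The two columns should agree up to floating-point rounding.
for theta in (0.0, 0.1, 0.5, 1.0):
    print(theta, log_likelihood(theta), simplified(theta))
```

The two printed columns should match for every $\theta$, which is just the statement that [3] and [4] are the same function.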
Q2: Equation [4] gives the function $LL(y \vert \theta)$. According to [5], the derivative of that expression with respect to $\theta$ should be zero. Use the property that the derivative of a sum is the sum of the derivatives: $\frac{\partial}{\partial \theta}\left[f(y \vert\theta) + g(y\vert\theta)\right] = \frac{\partial}{\partial \theta}f(y \vert\theta) + \frac{\partial}{\partial \theta}g(y \vert\theta)$. Then, you can use the property that the derivative of a constant is zero. In other words, $c$ does not depend on $\theta$, so the derivative of $c$ with respect to $\theta$ is zero, which is why $c$ disappears. To calculate the derivative with respect to $\theta$ of \begin{equation} f(y\vert \theta) = -50(0.9 - \exp(-\theta))^2, \end{equation} use the fact that you can write $f(y \vert \theta)$ as \begin{equation} f(y\vert \theta) = F ( G (y\vert \theta)), \end{equation} with $F(x) = - 50 x^2$ and $G(y \vert \theta) = 0.9 - \exp(-\theta)$. The chain rule for derivatives then gives $\frac{\partial f}{\partial \theta} = F'(G)\cdot\frac{\partial G}{\partial \theta}$, where $F'(x) = -100x$ and $\frac{\partial G}{\partial \theta} = \exp(-\theta)$. The square is gone because the derivative of $x^2$ is $2x$. Note that there is a typo in [6]: it should say \begin{equation} -100\,\exp(-\theta)(0.9 - \exp (-\theta)) = 0. \tag{*} \end{equation}
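As a sanity check on the chain-rule result, the following Python sketch compares the analytic derivative on the left-hand side of $(*)$ with a central finite-difference approximation. The helper names and the step size `h` are my own choices for illustration.

```python
import math


def f(theta):
    """The theta-dependent part of [4]."""
    return -50 * (0.9 - math.exp(-theta))**2


def df_analytic(theta):
    """Chain-rule derivative: F'(G) * dG/dtheta, the left-hand side of (*)."""
    return -100 * math.exp(-theta) * (0.9 - math.exp(-theta))


def df_numeric(theta, h=1e-6):
    """Central finite-difference approximation of df/dtheta."""
    return (f(theta + h) - f(theta - h)) / (2 * h)


# The analytic and numeric values should agree to several decimal places.
for theta in (0.0, 0.1, 0.5, 1.0):
    print(theta, df_analytic(theta), df_numeric(theta))
```

If the expression printed in [6] (with $\theta$ in place of $\exp(-\theta)$) were the true derivative, the two columns would not agree.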
Q3: Equation $(*)$ can be solved for $\theta$. Note that it has the form $a \cdot b = 0$, which means that $a=0$ or $b=0$. Now, $\exp(-\theta)$ can never be zero because an exponential can never be zero, so this means that \begin{equation} 0.9 - \exp(-\theta) = 0 \quad \Rightarrow \quad \exp(-\theta) = 0.9. \end{equation} Taking the log of both sides of the equation gives $-\theta = \log(0.9)$, that is, $\theta = -\log(0.9) \approx 0.1054$, which is the result in [7].
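To confirm the final number without doing the algebra, one option is to solve $0.9 - \exp(-\theta) = 0$ numerically. The sketch below uses a plain bisection search; the bracket $[0, 1]$ and the tolerance are assumptions chosen for this example, not something taken from the paper.

```python
import math


def g(theta):
    """The factor of (*) that can actually be zero."""
    return 0.9 - math.exp(-theta)


def bisect(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming a sign change on the bracket."""
    if func(lo) * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid  # the root lies in the left half
        else:
            lo = mid  # the root lies in the right half
    return (lo + hi) / 2


theta_hat = bisect(g, 0.0, 1.0)
print(round(theta_hat, 4))  # numerical root
print(round(-math.log(0.9), 4))  # closed-form answer from [7]
```

Both printed values should agree with the 0.1054 quoted in [7].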