Approximation of second derivative of $\exp(f(x))$ for $f(x) \ll 1$


Suppose $f(x)$ is a positive, "arbitrarily-nice" function over the reals, bounded such that $0<f(x)<M$. Suppose also that $M \ll 1$. I am interested in approximations of $\partial_{xx} e^{f(x)}$. Given that $f(x)$ is much less than one everywhere, I presume you can make the usual linear approximation for the exponential function, such that $\partial_{xx} e^{f(x)} \approx \partial_{xx} (1+f(x)) = \partial_{xx}f(x)$.

What I'm not sure about is how this approximation works if you expand the second derivative: $\partial_{xx} e^{f(x)} = e^{f(x)}[\partial_{xx} f(x) + (\partial_x f(x))^2]$. If the approximation above using $\exp(x)\approx1+x$ for small $x$ holds, why can we neglect the term proportional to the squared first derivative? For such a function $f(x)$, is there something we can say about the size of $|\partial_{xx}f|$ vs. $|\partial_{x}f|$?
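The expanded form of the second derivative can be checked numerically before asking which term matters. The sketch below is only an illustration: the sample function $f(x)=\tfrac{M}{2}\,\frac{1}{1+x^2}$, the value of `M`, and all function names are assumptions made for the example, not part of the question.

```python
import math

M = 0.01  # assumed bound on f; any small positive value works


def f(x):
    # Hypothetical sample function with 0 < f(x) <= M/2 < M
    return 0.5 * M / (1.0 + x * x)


def fp(x):
    # First derivative of f, computed by hand
    return -M * x / (1.0 + x * x) ** 2


def fpp(x):
    # Second derivative of f, computed by hand
    return 0.5 * M * (6.0 * x * x - 2.0) / (1.0 + x * x) ** 3


def second_derivative_fd(func, x, h=1e-3):
    # Central finite difference for the second derivative
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)


def exact_formula(x):
    # Chain-rule expression e^f (f'' + f'^2)
    return math.exp(f(x)) * (fpp(x) + fp(x) ** 2)


for x in (0.0, 0.5, 2.0):
    numeric = second_derivative_fd(lambda t: math.exp(f(t)), x)
    # Columns: x, finite difference, chain-rule formula, f'', (f')^2
    print(x, numeric, exact_formula(x), fpp(x), fp(x) ** 2)
```

For this particular $f$ the squared term is of order $M^2$ and really is negligible, but only because this $f$ also has a small slope. Smallness of $f$ alone does not guarantee that.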


There are 3 best solutions below

8
On

First, we need to show that if some $g\approx0,$ then $\partial_xg\approx0.$

\begin{eqnarray} \partial_xg&=&\lim_{h\to0}\frac{g(x+h)-g(x)}h\\ &\approx&\lim_{h\to0}\frac{0-0}h\\ &=&0 \end{eqnarray}

Notice that the approximation $e^{f}\approx1+f$ assumes $f^2$ is negligible, or

\begin{eqnarray} f^2&\approx&0\\ f\partial_xf&\approx&0\\ f\partial_{xx}f+(\partial_xf)^2&\approx&0\\ (\partial_xf)^2&\approx&-f\partial_{xx}f. \end{eqnarray}

Finally, we can show that the two forms of the second derivative are equivalent in the context of the approximation.

\begin{eqnarray} \partial_{xx}e^{f}&=&e^{f}(\partial_{xx}f+(\partial_xf)^2)\\ &\approx&(1+f)(\partial_{xx}f-f\partial_{xx}f)\\ &=&(1-f^2)\partial_{xx}f\\ &\approx&\partial_{xx}f\\ &=&\partial_{xx}(1+f)\tag*{$\square$} \end{eqnarray}

4
On

It is not a good idea to approximate first and differentiate afterwards. As an example, take

$$ f(x)=\frac{M}{2}(\sin(Bx)+1)\\ f'(x)=\frac{MB}{2}\cos(Bx)\\ f''(x)=-\frac{MB^2}{2}\sin(Bx) $$

Now let's define the two approximations as

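$$ g_1(x)=f''(x)\\ g_2(x)=(f''(x)+(f'(x))^2)(1+f(x)) $$

and see what happens at the points $x_0=\tfrac{n\pi}{B}$ for $n\in\mathbb{N}$, where $f(x_0)=\tfrac{M}{2}$, $f'(x_0)=\pm\tfrac{MB}{2}$ and $f''(x_0)=0$:

$$ \left|g_1(x_0)-(e^{f(x)})''\big|_{x=x_0}\right|=\left|g_1(x_0)-e^{f(x_0)}\left(f''(x_0)+(f'(x_0))^2\right)\right|\\ =\left|0-e^{M/2}\frac{M^2B^2}{4}\right|=e^{M/2}\frac{M^2B^2}{4}\\ \left|g_2(x_0)-(e^{f(x)})''\big|_{x=x_0}\right|=\left|g_2(x_0)-e^{f(x_0)}\left(f''(x_0)+(f'(x_0))^2\right)\right|\\ =\left|\left(1+f(x_0)-e^{f(x_0)}\right)\left(f''(x_0)+(f'(x_0))^2\right)\right|\\ =\left(e^{M/2}-1-\tfrac{M}{2}\right)\frac{M^2B^2}{4}\approx\frac{M^4B^2}{32} $$

The error of $g_2$ is therefore smaller than the exact value by a factor of about $M^2/8$, whereas $g_1$ misses the exact value entirely.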

By choosing e.g. $B=1/M^2$, we have shown that the first approximation does not reduce the pointwise error as $M\to 0$; in fact, the error blows up like $1/(4M^2)$. So at such points the term $f'^2$ clearly dominates the other terms.
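The comparison can also be run numerically. The following is a minimal sketch, assuming the choice $B=1/M^2$ and the point $x_0=\pi/B$ (that is, $n=1$); the function name `errors_at_x0` is hypothetical.

```python
import math


def errors_at_x0(M, n=1):
    # Assumption for illustration: B = 1/M^2, as suggested in the text
    B = 1.0 / M**2
    x0 = n * math.pi / B

    f = 0.5 * M * (math.sin(B * x0) + 1.0)
    fp = 0.5 * M * B * math.cos(B * x0)
    fpp = -0.5 * M * B**2 * math.sin(B * x0)

    exact = math.exp(f) * (fpp + fp**2)  # true second derivative of e^f
    g1 = fpp                             # linearize first, then differentiate
    g2 = (fpp + fp**2) * (1.0 + f)       # differentiate first, then linearize

    # Absolute error of g1 and relative error of g2
    return abs(g1 - exact), abs(g2 - exact) / abs(exact)


for M in (0.1, 0.05, 0.01):
    abs_err_g1, rel_err_g2 = errors_at_x0(M)
    print(f"M={M}: |g1 - exact| = {abs_err_g1:.4g}, "
          f"relative error of g2 = {rel_err_g2:.3g}")
```

As $M$ decreases, the absolute error of $g_1$ grows, while the relative error of $g_2$ shrinks (it is about $M^2/8$ to leading order).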

This is just one example, but unfortunately there is nothing "special" about functions for which $f(x_0)$ and $f''(x_0)$ are both small but $|f'(x_0)|$ is not (at some points $x_0$).

Regarding your comment: an easy way to get a class of functions for which the approximation $g_1$ is suitable is, of course, to bound both the function and its derivative by something small, e.g. $\{f\in C^2\mid (0<f(x)<M \land |f'(x)|<M )\quad\forall x\}$ (twice differentiable, since $g_1=f''$ must exist).

0
On

To show what goes wrong, let's see what happens if you include a few more terms in your initial approximation, say $e^x \approx 1+x+\frac {x^2}2 + \frac {x^3}6$. If we insert $f(x)$ and differentiate twice, we get

$$ \frac {d^2}{dx^2} e^{f(x)} \approx 0 + f''(x) + \left[f''(x)f(x) + f'(x)^2\right] + \left[\frac12 f''(x)f(x)^2 + f'(x)^2f(x)\right]. $$

Here we see the problem with the naive approximation. Even if $f(x)$ is so small that we are comfortable with setting it to zero, the first bracket above leaves an extra term that doesn't disappear:

$$ \frac {d^2}{dx^2} e^{f(x)}\approx f''(x) + f'(x)^2 \qquad \textrm{for} \qquad|f(x)|\approx 0. $$

We can't assume that $f'$ is small just because $f$ is small, so we must keep this extra term. We might even want to keep the first-order terms in $f$ (compare with how we normally say $e^x\approx 1+x$ and not just $e^x \approx 1$), which would give

$$ \frac {d^2}{dx^2} e^{f(x)}\approx (f''(x) + f'(x)^2)(1+f(x)). $$

(Note how this is related to your own expression $\partial_{xx} e^{f(x)} = e^{f(x)}[\partial_{xx} f(x) + (\partial_x f(x))^2]$).

This is all to say that it would have worked much better if you included just one more term from the start. It shows that it pays off to have an awareness of where the approximation comes from, and in particular of the approximate magnitude of the error terms.

Actually, I want to make a point of that last bit: how could you have known that it wouldn't work? You should really think of the initial approximation as $e^x = 1+x+(\textrm{something like }x^2)$. (That last term is actually $O(x^2)$ in big O notation, but you don't necessarily need to know that.) Then, if you differentiate that expression twice, including the error term, you notice the $f'(x)^2$ popping up. This means the error term doesn't get small with $f$, which tells you that you need a better approximation to start with.
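One way to make this concrete is to carry the error term along exactly. The symbol $R$ below is introduced only for this sketch and stands for everything the linear approximation drops:

$$ R(u) = e^u - 1 - u, \qquad R'(u) = e^u - 1, \qquad R''(u) = e^u. $$

Writing $e^{f} = 1 + f + R(f)$ and differentiating twice with the chain rule gives

$$ \frac {d^2}{dx^2} e^{f} = f'' + R'(f)\,f'' + R''(f)\,(f')^2 = f'' + \left(e^{f}-1\right)f'' + e^{f}(f')^2. $$

The middle term is roughly $f\,f''$ and shrinks with $f$, but the last term is roughly $(f')^2$ no matter how small $f$ is, because $R''(0)=1$.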


Let's take a look at the general case as well. Let $g$ be a function with the power series $g(x) = \sum_{n\ge 0} a_n x^n$. Then, writing just $f$ in place of $f(x)$, we have

$$\begin{split} \frac {d^2}{dx^2} g(f(x)) &= \frac {d^2}{dx^2}\sum a_n f^n \\&= \frac {d}{dx}\sum n a_n f'f^{n-1} \\&= \sum \left[n a_n f''f^{n-1} + n(n-1) a_n (f')^2f^{n-2}\right] \\&= \sum \left[(n+1)a_{n+1} f'' + (n+2)(n+1)a_{n+2} (f')^2\right] f^n \\ &= \left[a_{1} f'' + 2a_{2}(f')^2\right] + \left[2a_{2} f'' + 6a_{3}(f')^2\right]f + \left[3a_{3} f'' + 12a_{4}(f')^2\right]f^2 + \ldots \end{split}$$

If $f(x)$ is small, it is reasonable to truncate the above series to any order we like. For quantitative error terms, we should invoke some form of Taylor's theorem.
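The truncated series is easy to try out numerically. The sketch below assumes $g=\exp$, so $a_n = 1/n!$, and uses made-up pointwise values for $f$, $f'$ and $f''$. The function name `truncated_second_derivative` and those values are illustrative assumptions only.

```python
import math


def truncated_second_derivative(a, f, fp, fpp, order):
    # Sum of [(n+1) a_{n+1} f'' + (n+2)(n+1) a_{n+2} (f')^2] f^n for n = 0..order
    total = 0.0
    for n in range(order + 1):
        bracket = (n + 1) * a[n + 1] * fpp + (n + 2) * (n + 1) * a[n + 2] * fp**2
        total += bracket * f**n
    return total


# Coefficients of g = exp, i.e. a_n = 1/n!
a_exp = [1.0 / math.factorial(n) for n in range(12)]

# Hypothetical pointwise values: f small, f' and f'' not small
f, fp, fpp = 0.05, 3.0, -2.0
exact = math.exp(f) * (fpp + fp**2)

for order in range(4):
    approx = truncated_second_derivative(a_exp, f, fp, fpp, order)
    # Columns: truncation order, truncated value, absolute error
    print(order, approx, abs(approx - exact))
```

Order 0 already contains the $(f')^2$ contribution through $2a_2(f')^2$, which is exactly the term the linear approximation $g(f)\approx a_0+a_1 f$ loses.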