I want to implement the penalized regression parameter estimation algorithm of Fan & Li (http://sites.stat.psu.edu/~rli/research/penlike.pdf, section 3.3, [1]), but I cannot follow some of the details. More specifically, I don't quite understand how the local quadratic approximation is derived.
Assume that $p(x)$ is a concave penalty function such that $p(0)=0$ and $p$ is not differentiable at the origin. It is stated that, given an initial value $x_0$, we can approximate
$\left[p(|x|)\right]'=p'(|x|)\text{sgn}(x)\approx \lbrace{p'(|x_0|)/|x_0|\rbrace}x$, $(1)$
when $x\neq 0$ (shouldn't this actually require both $x\neq 0$ and $x_0\neq 0$?), and that this approximation leads to
$p(|x|)\approx p(|x_0|)+\tfrac{1}{2}\lbrace{p'(|x_0|)/|x_0|\rbrace}(x^2-x_0^2)$, for $x\approx x_0$. $(2)$
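To make $(1)$ and $(2)$ concrete, here is a minimal numerical sketch. It assumes the simplest possible penalty, $p(t)=\lambda t$ (the $L_1$ case), purely for illustration; this is not the SCAD penalty from the paper, and the function names `penalty`, `penalty_deriv`, `lqa_gradient` and `lqa_penalty` are my own hypothetical choices.

```python
# Illustrative assumption: L1 penalty p(t) = lam * t for t >= 0,
# so p'(t) = lam. This is NOT the SCAD penalty of Fan & Li.

def penalty(t, lam=1.0):
    """p(t) for t >= 0."""
    return lam * t

def penalty_deriv(t, lam=1.0):
    """p'(t) for t > 0 (constant for the L1 penalty)."""
    return lam

def lqa_gradient(x, x0, lam=1.0):
    """Right-hand side of (1): {p'(|x0|)/|x0|} * x, requires x0 != 0."""
    return penalty_deriv(abs(x0), lam) / abs(x0) * x

def lqa_penalty(x, x0, lam=1.0):
    """Right-hand side of (2): p(|x0|) + 0.5 * {p'(|x0|)/|x0|} * (x^2 - x0^2)."""
    weight = penalty_deriv(abs(x0), lam) / abs(x0)
    return penalty(abs(x0), lam) + 0.5 * weight * (x ** 2 - x0 ** 2)

if __name__ == "__main__":
    x0 = 2.0
    for x in (1.5, 1.9, 2.0, 2.1, 2.5):
        exact = penalty(abs(x))
        approx = lqa_penalty(x, x0)
        print(f"x={x:4.1f}  p(|x|)={exact:.4f}  LQA={approx:.4f}  gap={approx - exact:.4f}")
```

For this particular penalty the gap between $(2)$ and $p(|x|)$ works out to $\lambda(|x|-|x_0|)^2/(2|x_0|)$, so the quadratic touches the penalty at $x=x_0$ and lies above it elsewhere. Whether the same picture holds for a general concave $p$ is part of what I am trying to understand.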
I think that $(1)$ follows from the assumption that if $x\approx x_0$, then
$p'(|x|)\text{sgn}(x)/x\approx p'(|x_0|)/|x_0|$.
But how does one get to $(2)$? Why is the second term of the Taylor approximation lost, and how is the second derivative approximated?
[1]: Jianqing Fan and Runze Li. Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties. Journal of the American Statistical Association, December 2001, vol. 96, no. 456, pp. 1348-1360.