On page 206 of the book 'Elements of Statistical Learning', the author writes:
The local log-likelihood for this $J$ class model can be written
$\sum_{i=1}^N K_\lambda (x_0, x_i)\left\{\beta_{g_i0}(x_0) + \beta_{g_i}(x_0)^T(x_i-x_0) - \log\left[1+\sum_{k=1}^{J-1}\exp\left(\beta_{k0}(x_0)+\beta_k(x_0)^T(x_i-x_0)\right)\right]\right\}$
I can understand that the term $K_\lambda (x_0, x_i)$ is there to weight down the log-likelihood of each individual observation, but I really don't know why the term is $(x_i-x_0)$ instead of just $x_i$, which is what I was expecting.
Is this just a typo or am I misunderstanding something?
It is not a typo. Notice that at the bottom of pg. 206 the authors note that they are centering the local regressions at $x_0$. This is precisely why the term $x_i-x_0$ appears in the regression. The advantage of centering in this instance is that when you evaluate the fitted model at $x = x_0$, the linear terms $\beta_k(x_0)^T(x-x_0)$ vanish, so all the $\beta_k$ terms disappear from the posterior probabilities, leaving you only with a function of the $\beta_{k0}(x_0)$'s: $\hat{\Pr}(G = j \mid X = x_0) = e^{\beta_{j0}(x_0)} / \left(1+\sum_{k=1}^{J-1} e^{\beta_{k0}(x_0)}\right)$.
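To see the centering trick concretely, here is a minimal numerical sketch, not from the book: it assumes a two-class problem ($J = 2$, so the sum over $k$ has one term), a Gaussian kernel for $K_\lambda$, synthetic data, and a generic optimizer in place of the book's IRLS-style fitting. Because the predictor is centered, the fitted posterior probability at $x_0$ depends only on the intercept $\beta_{00}(x_0)$.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-2, 2, n)
# synthetic binary responses from a smooth nonlinear class probability
p_true = 1 / (1 + np.exp(-np.sin(2 * x)))
y = rng.binomial(1, p_true)

x0, lam = 0.5, 0.4                          # target point and bandwidth (illustrative values)
w = np.exp(-0.5 * ((x - x0) / lam) ** 2)    # Gaussian kernel weights K_lambda(x0, x_i)
z = x - x0                                  # centered predictor, the (x_i - x_0) in the question

def neg_local_loglik(beta):
    b0, b1 = beta
    eta = b0 + b1 * z
    # negative of the weighted Bernoulli log-likelihood: the J = 2 case of
    # sum_i K(x0, x_i) { y_i * eta_i - log(1 + exp(eta_i)) }
    return -np.sum(w * (y * eta - np.log1p(np.exp(eta))))

beta_hat = minimize(neg_local_loglik, [0.0, 0.0]).x
# at x = x0 the centered term is zero, so the posterior uses only the intercept
p_hat = 1 / (1 + np.exp(-beta_hat[0]))
print(p_hat)
```

With uncentered predictors you would instead have to compute $\beta_{00}(x_0) + \beta_0(x_0)^T x_0$ to get the same probability; centering just moves that arithmetic into the intercept.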