Definition 1.1 in Information Geometry and Its Applications (2016), by Shun-ichi Amari, defines a divergence, that is, an asymmetric measure of distance $D$ on a manifold, by the following properties:
$1.~~~~D(P, Q) \ge 0$
$2. ~~~~D(P, Q) = 0 \text{ iff } P = Q$
$3.~~~~$ When $P$ and $Q$ are sufficiently close, writing their coordinates as $$\xi_{P}, ~~~~~\xi_{Q} = \xi_{P} + d\xi,$$ the divergence can be written, to leading order in $d\xi$, as $$D(\xi_{P}, \xi_{P} + d\xi) = g_{ij}~d\xi^{i}d\xi^{j},$$ where the matrix $g_{ij}$ is positive definite.
This means that the divergence is a quadratic form in the infinitesimal difference $d\xi$. Why can there be no linear term? What would go wrong if a tentative divergence took, for instance, the form:
$$D(\xi_{P}, \xi_{P} + d\xi) = g_{i}^{1}d\xi^{i} + g_{ij}^{2}d\xi^{i}d\xi^{j}?$$
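Let $D$ be a divergence defined on a smooth manifold $M$ (i.e. it satisfies your properties 1 and 2), and suppose that $D$ is $C^1$. Fix some $p\in M$ and consider the function $\varphi_p(q)=D(p||q)$. Since $D$ is a divergence, $\varphi_p(p)=0$ and $\varphi_p(q)>0$ for all $q\neq p$, hence $p$ is a (global) minimum of $\varphi_p$. Since $\varphi_p$ is differentiable, its differential vanishes at $p$, i.e. $(d\varphi_p)_p=0$. So the first-order term must vanish. Moreover, if $D$ is $C^2$, the Hessian of $\varphi_p$ at $p$ is well-defined (independently of the choice of coordinates, because $p$ is a critical point) and positive semidefinite. Properties 1 and 2 alone do not make it positive definite: on $\mathbb{R}$, $D(p||q)=(p-q)^4$ satisfies both, yet its Hessian at $q=p$ is zero. Positive definiteness is therefore an extra nondegeneracy assumption, which Amari's definition builds in by requiring $g_{ij}$ to be positive definite. Under that assumption, and if $D$ is even $C^3$, $p\mapsto g_p=(\text{Hess}\,\varphi_p)_p$ defines a Riemannian metric on $M$.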
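The same conclusion can be checked in coordinates against the tentative form in the question. This is only a sketch: the scaling parameter $t$ and the particular displacement are introduced here for illustration, and the expansion is assumed to hold up to second order. Suppose the linear coefficient $g_{i}^{1}$ were nonzero at $\xi_{P}$, and take the displacement $d\xi^{i} = -t\,\delta^{ij}g_{j}^{1}$ with $t>0$. Then
$$D(\xi_{P}, \xi_{P} + d\xi) = -t\sum_{i}\left(g_{i}^{1}\right)^{2} + t^{2}\,g_{ij}^{2}\,\delta^{ik}g_{k}^{1}\,\delta^{jl}g_{l}^{1} + o(t^{2}).$$
For $t$ small enough the linear term dominates, so the right-hand side is negative, which contradicts property 1. Hence $g_{i}^{1}=0$ at every point.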
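In short, as soon as $D$ is differentiable, the first-order term in the expansion must vanish, because each point is a minimum of $q\mapsto D(p||q)$. If $D$ is moreover smooth enough and its Hessian is nondegenerate, the second-order term induces a Riemannian metric.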
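As a concrete illustration, here is a minimal numerical sketch using the Kullback-Leibler divergence between two Bernoulli distributions. That example is not part of the argument above, and the function names `kl_bernoulli` and `expansion_coefficients` are made up for this sketch. It estimates the first and second derivatives of $q\mapsto D(p||q)$ at $q=p$ by central finite differences: the first should be close to zero, and the second should be close to $1/(p(1-p))$, the Fisher information of the Bernoulli family.

```python
import math


def kl_bernoulli(p: float, q: float) -> float:
    """KL divergence D(p || q) between Bernoulli(p) and Bernoulli(q), for 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def expansion_coefficients(p: float, h: float = 1e-4) -> tuple[float, float]:
    """Finite-difference estimates of the first and second derivative
    of q -> D(p || q), evaluated at q = p."""
    d_plus = kl_bernoulli(p, p + h)
    d_minus = kl_bernoulli(p, p - h)
    d_zero = kl_bernoulli(p, p)  # the divergence of a point from itself is 0
    first = (d_plus - d_minus) / (2 * h)             # linear coefficient
    second = (d_plus - 2 * d_zero + d_minus) / h**2  # Hessian (1-dimensional)
    return first, second


p = 0.3
first, second = expansion_coefficients(p)
fisher = 1 / (p * (1 - p))  # Fisher information of Bernoulli(p)

print(f"first-order coefficient  ~ {first:.2e}")
print(f"second-order coefficient ~ {second:.6f}")
print(f"Fisher information       = {fisher:.6f}")
```

The first-order estimate is zero up to discretization error, while the second-order one matches the Fisher information, in line with the vanishing differential and the Hessian-induced metric.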