A basis vector in the tangent space at a point of a smooth manifold is given by a differential operator such as $\partial_{i}=\cfrac{\partial }{\partial \xi^{i}}$. On a statistical manifold consisting of probability distributions $p(x; \pmb{\xi})$, on the other hand, the basis vectors are often taken to be the functions $\partial_{i} \log{p}$, which are not differential operators.
Vector fields are defined to be linear maps on smooth functions that satisfy the product rule (that is, derivations). Why, then, is $X^{i}\partial_{i}\log{p}$ considered to be a vector field on a statistical manifold? How do we reconcile the defining properties of a vector field with the choice of $\partial_{i} \log{p}$ as basis vectors?
I'm not at all an expert, but here's what I think I've understood from a little (not-so) light reading; please correct me if I've gotten anything wrong.
Let $(X,\mu)$ be a $\sigma$-finite measure space and (by Jensen's inequality) let $$ \log\operatorname{PDF}(X,\mu) = \left\{c \in L^1(X,\mu) \mid \text{$c(x) \in \mathbb{R}$ for $\mu$-a.e. $x \in X$},\; \int_X \exp c \, d\mu = 1 \right\}, $$ which is in bijection, in the obvious way, with the set of all $\mu$-a.e. positive PDFs on $(X,\mu)$. Then an $n$-dimensional statistical manifold is a subset $S$ of $\log\operatorname{PDF}(X,\mu)$ with the structure of an $n$-dimensional smooth manifold defined by a $C^\infty$ atlas $\{(\Xi_k,\ell_k)\}$, such that for every local chart $\ell_k : \Xi_k \to S$ (which you should think of as the pointwise logarithm $\ell_k = \log p_k$ of a map $p_k = \exp \ell_k$ from $\Xi_k \subset \mathbb{R}^n$ to the set of all $\mu$-a.e. positive PDFs on $(X,\mu)$),
However, what seems to get assumed but swept under the rug is that one usually wants a strengthened version of condition 1., namely, that each $\ell_k$ actually defines a smooth embedding of the open subset $\Xi_k$ of $\mathbb{R}^n$ into the Banach space $L^1(X,\mu)$, viewed trivially as a Banach manifold modelled on $L^1(X,\mu)$. This, then, implies that the inclusion of $S$ into $L^1(X,\mu)$ is itself a smooth embedding, i.e., that $S$ really is an $n$-dimensional submanifold of $L^1(X,\mu)$. Thus, as far as I can tell, you should be able to rewrite this slightly strengthened definition of $n$-dimensional statistical manifold in the following way:
Now, given the trivial Banach manifold structure on $L^1(X,\mu)$, one can canonically identify $T_\phi L^1(X,\mu)$ with $L^1(X,\mu)$ for any $\phi \in L^1(X,\mu)$ (just as one identifies $T_v \mathbb{R}^n$ with $\mathbb{R}^n$ for all $v \in \mathbb{R}^n$), so that the derivative $d\ell : TS \to TL^1(X,\mu)$ of the inclusion $\ell : S \hookrightarrow L^1(X,\mu)$ can be identified with an $L^1(X,\mu)$-valued $1$-form on $S$, allowing one, for instance, to define the Fisher metric as $$ \forall a, b \in \Gamma(TS), \; \forall s \in S \quad g(a,b)_s := \int_X d\ell(a)(s) \, d\ell(b)(s) \, \exp\ell(s) \, d\mu $$ as soon as $\ell$ is sufficiently well-behaved for the integral defining $g(a,b)_s$ to always converge. In terms of a local chart $\ell_k : \Xi_k \to S \subset L^1(X,\mu)$ as in the first main paragraph, for each $\xi \in \Xi_k$, this simply boils down to $$ d(\ell_k)_\xi : \mathbb{R}^n \cong T_\xi \Xi_k \to T_{\ell_k(\xi)} S \subset T_{\ell_k(\xi)} L^1(X,\mu) \cong L^1(X,\mu), \quad v \mapsto v^i \partial_i \ell_k(\cdot,\xi); $$ in particular, we see that the partial derivatives $\partial_i \ell_k(\cdot,\xi)$ do live in $T_{\ell_k(\xi)}S$ when the latter is viewed as a real linear subspace of $L^1(X,\mu)$. So, as far as I can tell, $\partial_i \log p$ does not replace the derivation $\partial_i$; it is the image of $\partial_i$ under $d(\ell_k)_\xi$, in the same way that a tangent vector to a surface in $\mathbb{R}^3$ can be viewed either as a derivation or as an honest vector in $\mathbb{R}^3$.
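As a sanity check on the last two formulas, here is a minimal numerical sketch in Python. It assumes the univariate normal family on $X = \mathbb{R}$ with Lebesgue measure and chart $\xi = (\mu, \sigma)$ (with $\mu$ here the mean, not the measure), which is my own choice of example and not something taken from the discussion above. The function names (`log_density`, `scores`, `pushforward`, `fisher_metric`) are likewise made up for illustration, and the integral is approximated by a plain midpoint rule on a truncated interval.

```python
import math


def log_density(x, mu, sigma):
    """ell(x; xi) = log p(x; xi) for the normal family, xi = (mu, sigma)."""
    return (-math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
            - (x - mu) ** 2 / (2.0 * sigma ** 2))


def scores(x, mu, sigma):
    """The partial derivatives of ell in mu and in sigma, evaluated at x."""
    d_mu = (x - mu) / sigma ** 2
    d_sigma = -1.0 / sigma + (x - mu) ** 2 / sigma ** 3
    return (d_mu, d_sigma)


def pushforward(v, x, mu, sigma):
    """d(ell)_xi(v) = v^i * d_i ell(x; xi): a function of x, evaluated at x."""
    s = scores(x, mu, sigma)
    return v[0] * s[0] + v[1] * s[1]


def fisher_metric(mu, sigma, half_width=12.0, steps=20000):
    """Approximate g_ij = integral of d_i ell * d_j ell * exp(ell) dx.

    Assumption: the mass outside mu +/- half_width * sigma is negligible.
    """
    left = mu - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        x = left + (k + 0.5) * h          # midpoint of the k-th cell
        s = scores(x, mu, sigma)
        weight = math.exp(log_density(x, mu, sigma)) * h
        for i in range(2):
            for j in range(2):
                g[i][j] += s[i] * s[j] * weight
    return g


if __name__ == "__main__":
    for row in fisher_metric(0.5, 2.0):
        print(row)
```

For this family the Fisher metric has the well-known closed form $\operatorname{diag}(1/\sigma^2, 2/\sigma^2)$ in the coordinates $(\mu,\sigma)$, so at $(\mu,\sigma) = (0.5, 2)$ the printed matrix should be close to $\operatorname{diag}(0.25, 0.5)$. The point of `pushforward` is that a coordinate vector $v \in \mathbb{R}^2$ is sent to an actual function of $x$, namely $v^i \partial_i \ell(\cdot,\xi)$, which agrees with the directional derivative of $\xi \mapsto \ell(x;\xi)$ along $v$.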