In short: how does one obtain the average velocity from the Fokker-Planck equation in the overdamped regime? (That is, when the probability density is $P(\mathbf{x},t)$ and not $P(\mathbf{x},\mathbf{v},t)$; otherwise we could simply take the first moment of the variable $\mathbf{v}$.)
Background: the Langevin equation in the overdamped regime (i.e. there is no $\ddot{\mathbf{x}}$ term) is
$$ \dot{\mathbf{x}}(t) = \mathbf{V}(\mathbf{x}(t)) + \boldsymbol{\eta}(t) \, , $$
where $\mathbf{V}:\mathbb{R}^n\rightarrow \mathbb{R}^n$ is a smooth field and $\boldsymbol{\eta}$ is the usual white-noise term,
$$ \langle \eta_i(t) \eta_j(t') \rangle_{\text{noise}} = 2 D \delta_{ij} \delta(t-t') \, . $$
The related Fokker-Planck equation for the particle distribution $P(\mathbf{x},t)$ is the conservation equation for the total probability:
$$ \partial_t P(\mathbf{x},t) = - \nabla \cdot ( \mathbf{J}_a +\mathbf{J}_d) $$
where
\begin{align} \mathbf{J}_a(\mathbf{x},t) &= P(\mathbf{x},t) \, \mathbf{V}(\mathbf{x}), \\ \mathbf{J}_d(\mathbf{x},t) &= - D \, \nabla P(\mathbf{x},t) \end{align}
are the "advection" and "diffusion" contributions to the total probability current.
Question: working with the Langevin ODE for many particles or with the Fokker-Planck PDE should be equivalent, at least in the limit of many particles (i.e. many realizations of the Langevin dynamics). How does one get the average velocity of the particles in the two descriptions (Langevin vs. Fokker-Planck)?
Langevin: it seems natural to solve the ODE for $N$ different particles, with different initial conditions ${\mathbf{x}}_i(0)$ (say, uniformly distributed in the domain $\Omega$ at $t=0$) and different realizations of the noise $\boldsymbol{\eta}$. The particles cannot leave $\Omega$, so that $N$ is constant. Hence, the average velocity is
$$ \langle \dot{\mathbf{x}}(t) \rangle_N = \frac{1}{N} \sum_{i=1}^{N} \dot{\mathbf{x}}_i(t) $$
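The ensemble average above can be sketched numerically with an Euler-Maruyama discretization. This is only an illustration under stated assumptions: one dimension, a hypothetical drift $V(x) = 1 - x$, and an unbounded line instead of a bounded domain $\Omega$ (boundaries are ignored). The velocity is estimated as a finite difference $\Delta x / \Delta t$ over one step and compared with the ensemble mean of $V(x_i)$.

```python
import numpy as np

def drift(x):
    # Hypothetical drift field V(x) = 1 - x (an assumption for illustration).
    return 1.0 - x

def average_velocity(n_particles=100_000, D=1.0, dt=1e-2, n_steps=50, seed=0):
    """Return (finite-difference mean velocity, mean drift) after n_steps."""
    rng = np.random.default_rng(seed)
    # Uniform initial positions on [-1, 1]; no boundaries are enforced later.
    x = rng.uniform(-1.0, 1.0, size=n_particles)
    noise_amp = np.sqrt(2.0 * D * dt)
    for _ in range(n_steps):
        # Euler-Maruyama step: dx = V(x) dt + sqrt(2 D dt) * xi
        x = x + drift(x) * dt + noise_amp * rng.standard_normal(n_particles)
    # Ensemble mean of the drift at the current positions.
    mean_drift = drift(x).mean()
    # One further displacement, used as a finite-difference velocity.
    dx = drift(x) * dt + noise_amp * rng.standard_normal(n_particles)
    return dx.mean() / dt, mean_drift

v_fd, v_drift = average_velocity()
print("finite-difference mean velocity:", v_fd)
print("ensemble mean of V(x):          ", v_drift)
```

The finite-difference estimate carries a statistical error of order $\sqrt{2D/(N\,\Delta t)}$, so it needs many particles (or averaging over many steps) before it settles near the mean drift.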
Fokker-Planck: at $t=0$ we could choose a certain $P(\mathbf{x},0)$, say uniform (because in the Langevin picture the initial positions of the particles were uniformly distributed), $P(\mathbf{x},0) = 1/|\Omega|$, where $|\Omega|$ is the measure of the domain $\Omega$. Solving the Fokker-Planck equation gives $P$ at later times, $P(\mathbf{x},t)$. What is the average velocity of the particles? For large $N$, do we have
$$ \langle \dot{\mathbf{x}}(t) \rangle_N \approx \int_\Omega d^nx \, \mathbf{J}_a(\mathbf{x},t) = \int_\Omega d^nx \, P(\mathbf{x},t) \mathbf{V}(\mathbf{x}) $$
or do we have to consider the full probability current
$$ \langle \dot{\mathbf{x}}(t) \rangle_N \approx \int_\Omega d^nx \, (\mathbf{J}_a(\mathbf{x},t)+\mathbf{J}_d(\mathbf{x},t)) \, \, ? $$
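As a minimal sketch of how the two candidates differ, both integrals can be evaluated on a grid in one dimension. Everything specific here is an assumption for illustration: the interval $\Omega = [-1, 1]$, the drift $V(x) = 1 - x$, and the density $P(x) \propto e^{x}$, which is just a non-uniform snapshot and not a solution of the Fokker-Planck equation. In one dimension the diffusive integral reduces to a boundary term, $\int_a^b J_d \, dx = -D\,[P(b) - P(a)]$, so the two candidates differ only through the values of $P$ at the boundary.

```python
import numpy as np

D = 1.0
a, b = -1.0, 1.0                 # hypothetical 1D domain Omega = [a, b]
x = np.linspace(a, b, 2001)
dx = x[1] - x[0]

def trapezoid(f, dx):
    # Composite trapezoidal rule on a uniform grid.
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))

def V(x):
    # Hypothetical drift field (an assumption for illustration).
    return 1.0 - x

# Hypothetical non-uniform density snapshot, normalized on the grid.
P = np.exp(x)
P /= trapezoid(P, dx)

J_a = P * V(x)                   # advective current
J_d = -D * np.gradient(P, dx)    # diffusive current (finite differences)

int_Ja = trapezoid(J_a, dx)
int_Jd = trapezoid(J_d, dx)
boundary = -D * (P[-1] - P[0])   # -D [P(b) - P(a)]

print("integral of J_a:       ", int_Ja)
print("integral of J_d:       ", int_Jd)
print("boundary term:         ", boundary)
print("integral of J_a + J_d: ", int_Ja + int_Jd)
```

If $P$ takes the same value at both ends of the interval (for example with periodic boundary conditions), the diffusive integral vanishes and the two candidates coincide.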
I assume you are familiar with Itô calculus and stochastic differential equations, the standard mathematical tools for dynamics driven by white noise, such as the Langevin equation.
1. Overdamped Langevin dynamics differs essentially from standard Langevin dynamics. In particular, it cannot be regarded as simply dropping the inertia term.
The difference can be seen as follows.
Consider the standard Langevin dynamics, which is usually written in physics as $$ \mu\,\ddot{x}(t)=-\dot{x}(t)-\nabla\phi(x(t))+\sqrt{2D}\,\eta(t), $$ and in mathematics as \begin{align} {\rm d}x_t&=v_t\,{\rm d}t,\\ \mu\,{\rm d}v_t&=-v_t\,{\rm d}t-\nabla\phi(x_t)\,{\rm d}t+\sqrt{2D}\,{\rm d}W_t, \end{align} where $\mu=m/\gamma$ is the reduced mass, $\phi=\Phi/\gamma$ is the scaled potential, $\eta(t)$ denotes the normalized white noise, $D$ is the diffusion constant, and $W_t$ is the Wiener process (i.e., standard Brownian motion).
In the overdamped regime, one assumes $\mu\to 0^+$, and would thus expect the standard Langevin dynamics to reduce to, in physics, $$ \dot{x}(t)=-\nabla\phi(x(t))+\sqrt{2D}\,\eta(t), $$ or equivalently, in mathematics, to \begin{align} {\rm d}x_t&=v_t\,{\rm d}t,\\ v_t\,{\rm d}t&=-\nabla\phi(x_t)\,{\rm d}t+\sqrt{2D}\,{\rm d}W_t. \end{align} Using the first sub-equation, the reduced equations can also be written as \begin{align} {\rm d}x_t&=v_t\,{\rm d}t,\\ {\rm d}x_t&=-\nabla\phi(x_t)\,{\rm d}t+\sqrt{2D}\,{\rm d}W_t. \end{align} The second sub-equation here describes the overdamped Langevin dynamics, also known as Brownian dynamics.
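However, when speaking of the "overdamped regime", one should keep in mind not only the second sub-equation but also the first one. Unfortunately, these two sub-equations contradict each other: the first requires $x_t$ to be differentiable in time, with a finite velocity $v_t$, whereas the second makes $x_t$ inherit the roughness of $W_t$, whose paths are nowhere differentiable, so no finite $v_t$ can satisfy both.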
Therefore, it is not self-consistent to regard the overdamped Langevin dynamics as the standard Langevin dynamics with $\mu\to 0^+$.
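The lack of a well-defined velocity can be seen numerically. The sketch below assumes pure diffusion ($\phi = 0$, an assumption made so that the displacement over a time step $\Delta t$ is exactly Gaussian with variance $2D\,\Delta t$) and shows that the root-mean-square finite-difference velocity $\Delta x/\Delta t$ grows like $\sqrt{2D/\Delta t}$ instead of converging as $\Delta t \to 0$.

```python
import numpy as np

D = 1.0
n_samples = 200_000
rng = np.random.default_rng(0)

def rms_velocity(dt):
    # Pure diffusion (phi = 0): the displacement over dt is exactly N(0, 2 D dt).
    dx = np.sqrt(2.0 * D * dt) * rng.standard_normal(n_samples)
    # Root-mean-square of the finite-difference velocity dx / dt.
    return np.sqrt(np.mean((dx / dt) ** 2))

for dt in (1e-1, 1e-2, 1e-3, 1e-4):
    # Compare the sampled value with the prediction sqrt(2 D / dt).
    print(dt, rms_velocity(dt), np.sqrt(2.0 * D / dt))
```

Each tenfold reduction of the time step increases the apparent velocity by a factor of about $\sqrt{10}$, which is why $v_t = {\rm d}x_t/{\rm d}t$ has no finite limit for Brownian dynamics.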
2. Drift velocity or total kinetic energy: it depends on what one actually wants.
Focus on the Brownian dynamics $$ {\rm d}x_t=-\nabla\phi(x_t)\,{\rm d}t+\sqrt{2D}\,{\rm d}W_t. $$ This single equation still makes sense. The question is: What is the proper definition of velocity for $x_t$ that solves this equation?
Consider two specific cases.
Back to the Brownian dynamics. Which velocity to use depends on what one actually wants. If one wants the drift velocity only, then it should be $$ v_t=-\nabla\phi(x_t). $$ In this case, only $\mathbf{J}_a$ should be included when using the Fokker-Planck equation. By contrast, if one wants the total kinetic energy, then this must include both the part arising from the drift velocity and the part contributed by the diffusion. In this case, both $\mathbf{J}_a$ and $\mathbf{J}_d$ should be included.