Wikipedia (and different books too) seem to give two different definitions of what a regular conditional probability is. What is the correct definition and how do they relate? It seems to me that the first definition is the correct one, while the second is actually the definition of a regular conditional distribution?
Definition 1 can be found here.
Definition 1: Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, $\mathcal{G}\subseteq\mathcal{F}$ a sub-$\sigma$-algebra, and $\mathsf{A}\in\mathcal{F}$. Let $\mathbb{1}_{\mathsf{A}}:\Omega\to\{0, 1\}$ be the indicator random variable. A conditional probability of $\mathsf{A}$ given $\mathcal{G}$ is defined as a version of $\mathbb{E}[\mathbb{1}_{\mathsf{A}} \mid \mathcal{G}]$ and denoted $\mathbb{P}(\mathsf{A} \mid \mathcal{G})$; that is, it is a $\mathcal{G}$-measurable random variable satisfying $$ \int_{G} \mathbb{P}(\mathsf{A}\mid \mathcal{G}) \,d\mathbb{P} = \mathbb{P}(\mathsf{A}\cap G) \qquad \forall \, G\in\mathcal{G}. \qquad \qquad \qquad (1) $$ A conditional probability is said to be regular if $\mathbb{P}(\cdot \mid \mathcal{G})(\omega)$ is a probability measure on $(\Omega,\mathcal{F})$ for every $\omega\in\Omega$. Essentially, it is a Markov kernel satisfying condition $(1)$.
Definition 2 can be found here. This involves an additional random variable $X$.
Definition 2: Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, $(E, \mathcal{E})$ be a measurable space, and $X:\Omega\to E$ be a random variable with distribution $\mathbb{P}_X = X_*\mathbb{P}$. A regular conditional probability is a Markov kernel $\nu:E\times\mathcal{F}\to [0, 1]$ satisfying $$ \mathbb{P}(\mathsf{A}\cap X^{-1}(\mathsf{B})) = \int_{\mathsf{B}} \nu(x, \mathsf{A}) \,d \mathbb{P}_X(x) \qquad \forall \, \mathsf{A}\in\mathcal{F},\ \mathsf{B}\in\mathcal{E}. $$
In general, I could not find an explanation anywhere of the difference or relationship between a regular conditional probability and a regular conditional distribution.
The second one is the more general definition. If you take $X$ to be the identity map from the measure space $(\Omega,\mathcal{F},\mathbb{P})$ to the measure space $(\Omega,\mathcal{G},\mathbb{P}|_{\mathcal{G}})$, you recover the first definition, with $\nu(x,A)=\mathbb{P}(A\,|\,\mathcal{G})(x)$ for $\mathbb{P}$-a.e. $x$.
Added:
(Distribution) Conditional distribution is a term that makes sense for a triple $(X,\mathcal{G},\mathbb{P})$, where $X$ is a random variable $X:(\Omega,\mathcal{F})\to (E,\mathcal{E})$, $\mathcal{G}\subset\mathcal{F}$ is a sub-$\sigma$-algebra, and $\mathbb{P}$ is a probability measure on $(\Omega,\mathcal{F})$. The $\mathbb{P}$-conditional distribution of the random variable $X$ is the assignment, to each $B\in \mathcal{E}$, of an RV represented $\mathbb{P}$-a.e. by the function $\mathbb{E}[\mathbb{1}_B\circ X\,|\,\mathcal{G}]$.
(Density function) For $A\in\mathcal{F}$, $$m_A(B)=\int_\Omega \mathbb{1}_A\cdot\mathbb{E}[\mathbb{1}_B\circ X\,|\,\mathcal{G}]\;d\mathbb{P},$$ where $B\in\mathcal{E}$, is a measure on $(E,\mathcal{E})$ that is absolutely continuous with respect to $X_\ast \mathbb{P}$. Denote the Radon-Nikodym derivative of $m_A$ w.r.t. $X_\ast \mathbb{P}$ by $\nu^\mathcal{G}(\cdot,A)$. The $\mathbb{P}$-conditional probability density function of the random variable $X$ with respect to the measure $X_\ast \mathbb{P}$ is the assignment, to each $A\in \mathcal{F}$, of an RV represented $X_\ast\mathbb{P}$-a.e. by the function $\nu^{\mathcal{G}}(\cdot,A)$ (this is the kernel in the question if $\mathcal{G}=\sigma(X)$).
So the conditional probability density function and conditional distribution carry the same information.
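To spell out the parenthetical remark about $\mathcal{G}=\sigma(X)$, here is a short check (a sketch that uses only the definitions above). Since $\mathbb{1}_B\circ X$ is $\sigma(X)$-measurable, $\mathbb{E}[\mathbb{1}_B\circ X\,|\,\sigma(X)]=\mathbb{1}_B\circ X$ holds $\mathbb{P}$-a.e., so $$m_A(B)=\int_\Omega \mathbb{1}_A\cdot \mathbb{1}_B\circ X\;d\mathbb{P}=\mathbb{P}(A\cap X^{-1}(B)),$$ and the defining property of the Radon-Nikodym derivative gives $$\mathbb{P}(A\cap X^{-1}(B))=\int_B \nu^{\sigma(X)}(x,A)\;d(X_\ast\mathbb{P})(x)\qquad \forall\, B\in\mathcal{E},$$ which is the identity in Definition 2. Note that this determines $\nu^{\sigma(X)}(\cdot,A)$ only up to an $X_\ast\mathbb{P}$-null set for each fixed $A$; being able to choose the versions so that $A\mapsto\nu^{\sigma(X)}(x,A)$ is a probability measure for every $x$ is exactly the extra "regular" requirement.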
(Probability) Conditional probability makes sense for a pair $(\mathcal{G},\mathbb{P})$, where $\mathcal{G}\subset \mathcal{F}$ is a sub-$\sigma$-algebra and $\mathbb{P}$ is a probability measure on $(\Omega,\mathcal{F})$. The $(\mathbb{P},\mathcal{G})$-conditional probability on $(\Omega,\mathcal{F})$ is the assignment, to each $A\in\mathcal{F}$, of an RV represented $\mathbb{P}$-a.e. by $\mathbb{E}[\mathbb{1}_A\,|\,\mathcal{G}]=:\mathbb{P}[A\,|\,\mathcal{G}]$.
These three notions extend, respectively, "distribution of a random variable", "probability density function of a random variable" and "probability" (the case of trivial conditioning, $\mathcal{G}=\{\emptyset,\Omega\}$) to general $\mathcal{G}\subset \mathcal{F}$.
Since, for all $B\in\mathcal{E}$, $$\int_\Omega \mathbb{1}_B\circ X\cdot\mathbb{E}[\mathbb{1}_A\,|\,\mathcal{G}]\;d\mathbb{P}=\int_\Omega \mathbb{1}_A\cdot\mathbb{E}[\mathbb{1}_B\circ X\,|\,\mathcal{G}]\;d\mathbb{P}=\int_E \mathbb{1}_B\cdot \nu^{\mathcal{G}}(x,A)\;d(X_\ast \mathbb{P})(x),$$ it follows that if $X=id:(\Omega,\mathcal{F})\to (\Omega,\mathcal{G})$, with $\mathcal{G}\subset\mathcal{F}$, then $$\nu^{\mathcal{G}}(\omega,A)=\mathbb{E}[\mathbb{1}_A\,|\,\mathcal{G}](\omega)\quad \text{for}\; \mathbb{P}\text{-a.e.}\; \omega\in\Omega.$$
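For a concrete sanity check, here is a minimal sketch on a finite space, where everything can be computed exactly. The sample space, the partition generating $\mathcal{G}$, the event `A`, and all function names are hypothetical choices made for illustration only. When $\mathcal{G}$ is generated by a finite partition into blocks of positive probability, $\mathbb{E}[\mathbb{1}_A\,|\,\mathcal{G}](\omega)$ equals $\mathbb{P}(A\cap \text{block})/\mathbb{P}(\text{block})$ on the block containing $\omega$. The code verifies condition $(1)$ for every set in $\mathcal{G}$ and checks that $A\mapsto\mathbb{P}(A\,|\,\mathcal{G})(\omega)$ has total mass one for each $\omega$.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical toy model: six equally likely outcomes.
OMEGA = frozenset(range(6))
P = {w: Fraction(1, 6) for w in OMEGA}

# G is the sigma-algebra generated by this partition of OMEGA.
PARTITION = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]


def prob(event):
    """P(event) for a subset of OMEGA."""
    return sum((P[w] for w in event), Fraction(0))


def cond_prob(event, w):
    """P(event | G)(w): constant on the partition block containing w."""
    block = next(b for b in PARTITION if w in b)
    return prob(event & block) / prob(block)


def sigma_algebra(partition):
    """All unions of blocks, i.e. every set in G."""
    sets = []
    for r in range(len(partition) + 1):
        for blocks in combinations(partition, r):
            sets.append(frozenset(chain.from_iterable(blocks)))
    return sets


def satisfies_condition_1(event):
    """Check that integrating P(event | G) over g gives P(event & g), for all g in G."""
    return all(
        sum((cond_prob(event, w) * P[w] for w in g), Fraction(0)) == prob(event & g)
        for g in sigma_algebra(PARTITION)
    )


def is_probability_measure(w):
    """Regularity at w: singleton masses are nonnegative and sum to one."""
    masses = [cond_prob(frozenset({v}), w) for v in OMEGA]
    return sum(masses, Fraction(0)) == 1 and all(m >= 0 for m in masses)


A = frozenset({1, 2, 5})
print(cond_prob(A, 3))                                  # value on the block {2, 3, 4}
print(satisfies_condition_1(A))                         # condition (1) for every g in G
print(all(is_probability_measure(w) for w in OMEGA))    # regularity at every w
```

In the finite case regularity is automatic, because there are no null-set ambiguities to reconcile across uncountably many events; the distinction only becomes substantive on general spaces.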