The Bapat–Beg theorem gives the joint probability distribution of the order statistics of independent but not necessarily identically distributed random variables in terms of their cumulative distribution functions.
I am having trouble deriving the case of two order statistics $X_{(1)} \le X_{(2)}$ from the general formula. Is this correct?
$$F_{X_{(1)}, X_{(2)}}(x_1, x_2) = \frac{P_{n_1}(x_1) + P_{n_2}(x_2)}{n_1! (n_2 - n_1)! \times (n_2 - n_2)! }$$
What you wrote doesn’t make much sense. On the left, you chose $k=2$, $n_1=1$ and $n_2=2$, but on the right you have $n_1$ and $n_2$ as free variables. (Also $(n_2-n_2)!=0!=1$.)
Substituting $k=2$, $n_1=1$ and $n_2=2$ into the definition in the Wikipedia article you linked to yields
\begin{eqnarray} F_{X_{(1)},X_{(2)}}(x_1,x_2) &=& \operatorname{Pr}\left(X_{(1)}\le x_1\land X_{(2)}\le x_2\right) \\ &=& \sum_{i_2=2}^n\sum_{i_1=1}^{i_2}\frac{P_{i_1,i_2}(x_1,x_2)}{i_1!(i_2-i_1)!(n-i_2)!}\;, \end{eqnarray}
where
$$ P_{i_1,i_2}(x_1,x_2)= \operatorname{per}\, \begin{bmatrix} F_1(x_1) \;\cdots\; F_1(x_1) & F_1(x_2)-F_1(x_1) \;\cdots\; F_1(x_2)-F_1(x_1) & 1-F_1(x_2) \;\cdots\; 1-F_1(x_2) \\ F_2(x_1) \;\cdots\; F_2(x_1) & F_2(x_2)-F_2(x_1) \;\cdots\; F_2(x_2)-F_2(x_1) & 1-F_2(x_2) \;\cdots\; 1-F_2(x_2)\\ \vdots & \vdots & \vdots \\ \underbrace{F_n(x_1) \;\cdots\; F_n(x_1) }_{i_1} & \underbrace{F_n(x_2)-F_n(x_1)\;\cdots\;F_n(x_2)-F_n(x_1)}_{i_2-i_1} & \underbrace{1-F_n(x_2) \;\cdots\; 1-F_n(x_2) }_{n-i_2} \end{bmatrix}\;. $$
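The double sum above can be evaluated numerically for small $n$. The following is only a minimal sketch: the function names `permanent` and `joint_cdf_first_two` are made up for illustration, the permanent is computed by brute force over all permutations, and the inputs are assumed to be the values $F_j(x_1)$ and $F_j(x_2)$ with $x_1\le x_2$.

```python
from itertools import permutations
from math import factorial, prod


def permanent(matrix):
    """Permanent of a square matrix, summed over all permutations (small n only)."""
    n = len(matrix)
    return sum(
        prod(matrix[row][perm[row]] for row in range(n))
        for perm in permutations(range(n))
    )


def joint_cdf_first_two(cdf_at_x1, cdf_at_x2):
    """Joint CDF of X_(1), X_(2) for x1 <= x2 (hypothetical helper).

    cdf_at_x1[j] is F_j(x1) and cdf_at_x2[j] is F_j(x2) for the j-th variable.
    """
    n = len(cdf_at_x1)
    total = 0.0
    for i2 in range(2, n + 1):
        for i1 in range(1, i2 + 1):
            # Each row repeats F_j(x1) i1 times, F_j(x2) - F_j(x1) (i2 - i1) times,
            # and 1 - F_j(x2) (n - i2) times, as in the block matrix above.
            rows = [
                [a] * i1 + [b - a] * (i2 - i1) + [1.0 - b] * (n - i2)
                for a, b in zip(cdf_at_x1, cdf_at_x2)
            ]
            denom = factorial(i1) * factorial(i2 - i1) * factorial(n - i2)
            total += permanent(rows) / denom
    return total
```

For example, `joint_cdf_first_two([0.3, 0.5], [0.6, 0.9])` evaluates the $n=2$ case with $F_1(x_1)=0.3$, $F_2(x_1)=0.5$, $F_1(x_2)=0.6$, $F_2(x_2)=0.9$. The brute-force permanent costs $O(n\cdot n!)$, so this is only meant for checking small cases.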
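I’m not sure whether you also intended to set $n=2$. In that case only $i_2=2$ contributes, the denominator $i_1!\,(2-i_1)!\,0!$ equals $i_1$ for $i_1\in\{1,2\}$, and the result is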
$$ F_{X_{(1)},X_{(2)}}(x_1,x_2) =\sum_{i=1}^2\frac{P_{i,2}(x_1,x_2)}i\;, $$
where
$$ P_{i,2}(x_1,x_2)= \operatorname{per}\, \begin{bmatrix} F_1(x_1) \;\cdots\; F_1(x_1) & F_1(x_2)-F_1(x_1) \;\cdots\; F_1(x_2)-F_1(x_1)\\ \underbrace{F_2(x_1) \;\cdots\; F_2(x_1)}_i & \underbrace{F_2(x_2)-F_2(x_1) \;\cdots\; F_2(x_2)-F_2(x_1)}_{2-i} \end{bmatrix}\;. $$
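As an intermediate step, the two permanents can be spelled out explicitly, using $\operatorname{per}\begin{bmatrix}a&b\\c&d\end{bmatrix}=ad+bc$:

$$ P_{1,2}(x_1,x_2)= \operatorname{per}\, \begin{bmatrix} F_1(x_1) & F_1(x_2)-F_1(x_1)\\ F_2(x_1) & F_2(x_2)-F_2(x_1) \end{bmatrix} =F_1(x_1)(F_2(x_2)-F_2(x_1))+F_2(x_1)(F_1(x_2)-F_1(x_1))\;, $$

$$ P_{2,2}(x_1,x_2)= \operatorname{per}\, \begin{bmatrix} F_1(x_1) & F_1(x_1)\\ F_2(x_1) & F_2(x_1) \end{bmatrix} =2F_1(x_1)F_2(x_1)\;. $$

The factor $\frac12$ that multiplies $P_{2,2}$ cancels the $2$ coming from the two identical columns.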
Writing out the sum yields
\begin{eqnarray} F_{X_{(1)},X_{(2)}}(x_1,x_2) &=&F_1(x_1)(F_2(x_2)-F_2(x_1))+F_2(x_1)(F_1(x_2)-F_1(x_1))+F_1(x_1)F_2(x_1) \\ &=& F_1(x_1)F_2(x_2)+F_2(x_1)F_1(x_2)-F_1(x_1)F_2(x_1)\;. \end{eqnarray}
That makes sense for $x_1\le x_2$, as it adds the two ways in which the order statistics can be related to the variables ($X_{(1)}=X_1$ and $X_{(2)}=X_2$, or $X_{(1)}=X_2$ and $X_{(2)}=X_1$) and then subtracts the double-counted contribution where both variables are at most $x_1$.
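The $n=2$ result can also be sanity-checked by simulation. The sketch below is purely illustrative: the choice of $X_1\sim\text{Uniform}(0,1)$ and $X_2\sim\text{Exponential}(1)$, the evaluation point $(x_1,x_2)=(0.3,0.8)$, and the helper names `closed_form` and `monte_carlo` are all assumptions made for the example.

```python
import math
import random


def closed_form(F1, F2, x1, x2):
    """F1(x1)F2(x2) + F2(x1)F1(x2) - F1(x1)F2(x1), valid for x1 <= x2."""
    return F1(x1) * F2(x2) + F2(x1) * F1(x2) - F1(x1) * F2(x1)


def monte_carlo(sample1, sample2, x1, x2, trials=200_000, seed=0):
    """Estimate Pr(X_(1) <= x1 and X_(2) <= x2) from independent draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b = sample1(rng), sample2(rng)
        if min(a, b) <= x1 and max(a, b) <= x2:
            hits += 1
    return hits / trials


def uniform_cdf(x):
    """CDF of Uniform(0, 1) (assumed distribution of X_1)."""
    return min(max(x, 0.0), 1.0)


def exponential_cdf(x):
    """CDF of Exponential(1) (assumed distribution of X_2)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


exact = closed_form(uniform_cdf, exponential_cdf, 0.3, 0.8)
estimate = monte_carlo(
    lambda rng: rng.random(),          # draw from Uniform(0, 1)
    lambda rng: rng.expovariate(1.0),  # draw from Exponential(1)
    0.3,
    0.8,
)
print(exact, estimate)
```

With this many trials the two printed numbers should agree to roughly two decimal places. The agreement is only statistical, so a small discrepancy is expected.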