Given $n,k \in \mathbb{N}, \ k < n$, we assume that there is an optimal partition $P$ of $n$ elements into $k$ bins.
Given a random partition $Q$, what is the expectation value $E(r)$ of $r$, the number of elements placed in incorrect bins by a random partition. The order of the bins doesn't matter.
Background
The goal is to estimate the quality in clustering by comparing several clusterings generated by stochastic gradient descent. If they are highly correlated, then we have a good clustering. If they are not, then the elements would seem to be largely independent.
I would like to calculate the expectation value of $r$ for random clusterings in order to set a baseline for determining whether the algorithmically generated clusters do much better than random clusters.
What I've tried
I think that my trouble may have something to do with the bin order not mattering, and therefore the optimal pairing of bins for comparison between $P$ and $Q$ is not necessarily obvious.
Let's consider the number of partitions with a given $r$ value. There are $k!$ orderings for the bins. For each of these, we will have $(n-r)$ elements in the one correct bin for them, and $r$ elements in any of $(k-1)$ incorrect bins. The number of ways of doing this is
$$ k!\left(\begin{array}{c} n\\ r \end{array}\right)\left(k-1\right)^{r} $$
There are $k^n$ different assignments of elements to bins, so, according to my reasoning, the probability of the error rate being a given value $r$ is
$$\frac{k!\left(\begin{array}{c} n\\ r \end{array}\right)\left(k-1\right)^{r}}{k^{n}} $$
and we have
$$ E(r)=\sum_{r=0}^{r=n}r\left(\frac{k!\left(\begin{array}{c} n\\ r \end{array}\right)\left(k-1\right)^{r}}{k^{n}}\right) $$
This is problematic since, with the bins unordered, $r$ shouldn't reach $n$ (or tighter bounds depending on $n$ and $k$) since the bin order can be chosen so at least $1$ element is in the correct bin. However, the $r=n$ term in the sum above is non-zero, and contributes incorrectly to the $E(r)$. Besides, this formula very often produces $E(r)>n$, which is impossible. Dropping the $k!$ and using
$$ E(r)=\sum_{r=0}^{r=n}r\left(\frac{\left(\begin{array}{c} n\\ r \end{array}\right)\left(k-1\right)^{r}}{k^{n}}\right) $$
gives much more reasonable results, but I'm having trouble justifying ignoring permutations of the bins.
Another approach
Compare the graphs, $G(P)$ and $G(Q)$ with two nodes connected if they are in the same cluster. We might provide some measure of $r$ for a given partition by considering
$$R = G(P^c)\cap G(Q^c)$$
with $c$ denoting the complement. I'd like to explore the value of
$$\text{comp}(P,Q)=\frac{\#R}{\#P\cdot\#Q}$$
where $\#$ denotes cardinality, as akin to some product related to $r^2$ as a measure of variance in the partitions. There is complication in that this depends on both $r$ and the cluster sizes.