I've recently been diving into a bit of machine learning, and lots of probability theory has been coming up as a result. I took one course on probability and statistics in my first year of university and never touched it again, so, needless to say, I'm a bit rusty.
To train a certain neural network, I first need to generate training data. The inputs take the form of binary strings $x$ of length $N$, so that the sample space is just $\{0, 1\}^N$. I would like to take a random sample so that there is roughly an equal number of strings with the same taxicab norm: $$\|x\|_1 = \sum_{i=1}^N x_i, \qquad \text{ where } x = (x_1, \dots, x_N) \in \{0, 1\}^N.$$ It's not too hard to work out that the number of binary strings of length $N$ with $\|x\|_1 = k$ is just $N \choose k$, and that there are $2^N$ binary strings of length $N$ in total, so this could be rephrased as finding the distribution such that the probability of selecting some string $X \in \{0, 1\}^N$ given that $\|X\|_1 = k$ is $$P(X \ \ \vert \ \ \|X\|_1 = k) = \frac{1}{N \choose k}.$$ From Bayes' Theorem, $$P(X) = \frac{P(\|X\|_1 = k)}{{N \choose k} P(\|X\|_1 = k \ \ \vert \ \ X)}.$$ By design, $P(\|X\|_1 = k) = \frac1{N+1}$. Moreover, $P(\|X\|_1 = k \ \ \vert \ \ X) = 1$, so that $$P(X) = \frac{1}{(N+1){N \choose k} }.$$ Is this argument correct? Does this distribution have some particular name?
Edit: By "binary string of length $N$," I really mean a vector in $\mathbb{R}^n$ whose components are all either $0$ or $1$.For $N=4$, there are $2^4 = 16$ possible binary strings of length $4$, some examples of which are: $$x_1 = (0, 0, 1, 0), \ x_2 = (1, 1, 0, 0), \ x_3 = (0, 1, 1, 1).$$ The taxicab metric is introduced to count the numbers of $1$s that appear in the string. In the above examples, we see that $$\|x_1\|_1 = 1, \ \|x_2\|_1 = 2, \ \| x_3\|_1 = 3.$$ An example of my goal in the case that $N=4$ would be to sample, say, $100$ points in such a way that there are $20$ strings containing no $1$s, $20$ strings containing one $1$, $20$ strings containing two $1$s, etc. (allowing for repetition).
NOTE. This answer pertains to the original version and 1st revision of OP's question, in which OP required both $\|X\|_1$ and $(X \mid \|X\|_1 = k)$ are uniformly distributed on their respective range.
Your original argument is correct, modulo some rectification:
This formula, although mathematically correct, does not directly tell us how to sample binary strings from the distribution of $X_N$ efficiently (without needing to tabulate the values of $\mathbf{P}(X_N = x)$ for all the $2^N$ possible values of $x$).
Thankfully, there is an efficient way of sampling from the distribution of $X_N$ via making a connection between the distribution of $X_N$ and a well-known stochastic process.
Below is a python code for simulating this process (assuming you have imported
numpylibrary asnp, as per the sacred tradition):