Pseudorandom permutation of 60000 elements with a long period

39 Views Asked by Bumbble Comm At 28 Mar 2026 - 7:36

I have a programming assignment that asks me to do mini-batch training. In particular, we are working with the MNIST dataset, which contains 60000 training samples. I would like to figure out the most efficient way to shuffle these images. The idea is to find a bijective hash (or permutation) $H$ on $\{0, 1, \ldots, 59999\}$ such that $X_{\text{shuffled}}[H[i]] = X_{\text{original}}[i]$ effectively shuffles the dataset. In other words, $H$ maps the $i$-th element of the original dataset to the $H[i]$-th element of the shuffled dataset. Additionally, the permutation $H$ should have a long cycle, so that I won't get the same order back every few shuffles. To clarify, each successive shuffle is applied to the result of the previous one, e.g.

$$\begin{aligned} X_{\text{shuffled}}[H[i]] &= X_{\text{original}}[i] \\ X_{\text{doubly shuffled}}[H[i]] &= X_{\text{shuffled}}[i] \\ X_{\text{triply shuffled}}[H[i]] &= X_{\text{doubly shuffled}}[i] \\ &\;\;\vdots \end{aligned}$$

By "$H$ should have a long cycle", I mean I don't want to see something like $X_{\text{triply shuffled}} = X_{\text{original}}$.
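To make the requirement concrete, here is a minimal sketch of the successive shuffling and of how many shuffles it takes before the original order comes back. The helper names `apply_shuffle` and `shuffles_until_repeat` are made up for illustration. The count is the least common multiple of the cycle lengths of $H$, which is a standard fact about permutations.

```python
import math

def apply_shuffle(x, h):
    """Return y with y[h[i]] = x[i], the convention used in the question."""
    y = [None] * len(x)
    for i, target in enumerate(h):
        y[target] = x[i]
    return y

def shuffles_until_repeat(h):
    """Number of successive shuffles after which the original order returns.

    Computed as the least common multiple of the cycle lengths of h.
    """
    seen = [False] * len(h)
    order = 1
    for start in range(len(h)):
        length = 0
        j = start
        while not seen[j]:          # walk the cycle containing `start`
            seen[j] = True
            j = h[j]
            length += 1
        if length:                  # length is 0 if the cycle was already visited
            order = order * length // math.gcd(order, length)
    return order

# Toy permutation with one 3-cycle and one 2-cycle (hypothetical example data).
h = [1, 2, 0, 4, 3]
print(shuffles_until_repeat(h))  # 6
```

With this toy $H$, the sixth shuffle reproduces the original order, so a "long cycle" in the sense above means making this number large.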

I heard that I can let $H[i] = (A \times i) \bmod 60000$, where $A$ is an integer coprime with 60000. I picked $A = 999999000001$, hoping that such a large prime would give me some randomness, but it just maps every index to itself:

>>> import numpy as np
>>> np.all(np.array([(999999000001 * i) % 60000 for i in range(60000)]) == np.arange(60000))
True
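A quick check suggests why this happens: only the residue of $A$ modulo 60000 matters, because $(A \times i) \bmod N = ((A \bmod N) \times i) \bmod N$. The sketch below uses the names `A` and `N` for the multiplier and the dataset size.

```python
A = 999999000001
N = 60000

# Only A mod N affects (A * i) % N, so reduce A first.
print(A % N)  # 1

# With residue 1, the map i -> (A * i) % N is the identity.
print(all((A * i) % N == i for i in range(N)))  # True
```

So the size or primality of $A$ does not help by itself. Any $A$ behaves exactly like its remainder in $\{0, \ldots, 59999\}$.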

On the other hand, a small $A$ gives a more promising result, but it still does not look very random:

>>> np.sum(np.array([(11 * i) % 60000 for i in range(60000)]) == np.arange(60000))
10
>>> [(11 * i) % 60000 for i in range(60000)][-10:]
[59890, 59901, 59912, 59923, 59934, 59945, 59956, 59967, 59978, 59989]

There are other methods, such as running a Fisher-Yates shuffle on the indices $0, 1, \ldots, 59999$ and using the result as $H$, but I am not sure how well that would work, since I would essentially be calling Fisher-Yates once and then reusing its result for every successive shuffle.
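As a sketch of how one might measure this, the snippet below draws a single permutation with Python's `random.shuffle` (a Fisher-Yates shuffle), treats it as a fixed $H$, and reports its longest cycles together with the number of successive applications needed to return to the original order. The seed `0` and the helper name `cycle_lengths` are arbitrary choices for illustration, not something prescribed by the assignment.

```python
import math
import random

N = 60000

def cycle_lengths(h):
    """Return the lengths of all cycles of the permutation h."""
    seen = [False] * len(h)
    lengths = []
    for start in range(len(h)):
        length = 0
        j = start
        while not seen[j]:          # walk the cycle containing `start`
            seen[j] = True
            j = h[j]
            length += 1
        if length:
            lengths.append(length)
    return lengths

rng = random.Random(0)              # arbitrary seed, for reproducibility only
h = list(range(N))
rng.shuffle(h)                      # one Fisher-Yates shuffle, reused as H

lengths = cycle_lengths(h)
period = 1
for length in lengths:              # period = lcm of all cycle lengths
    period = period * length // math.gcd(period, length)

print("longest cycles:", sorted(lengths, reverse=True)[:5])
print("shuffles until the original order returns:", period)
```

The printed period depends on the seed, so it is worth checking for the specific permutation that ends up being used.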

How can I improve the randomness and period of $H$?

As a bonus, why is $(A \times i) \bmod 60000$ guaranteed to be bijective when $\gcd(A, 60000) = 1$? I know nothing about number theory but I am curious.
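The claim can at least be checked empirically. The sketch below, with a made-up helper `is_bijection`, compares a few multipliers that are coprime with 60000 against a few that are not. The usual argument behind it is cancellation: if $A i \equiv A j \pmod{N}$ and $\gcd(A, N) = 1$, then $N$ divides $i - j$, which forces $i = j$ for indices in $\{0, \ldots, N-1\}$, and an injective map from a finite set to itself is bijective.

```python
from math import gcd

def is_bijection(a, n):
    """Check whether i -> (a * i) % n hits every value in 0..n-1 exactly once."""
    return len({(a * i) % n for i in range(n)}) == n

N = 60000
for a in (11, 7, 10, 15):
    print(a, gcd(a, N), is_bijection(a, N))
# 11 1 True
# 7 1 True
# 10 10 False
# 15 15 False
```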
