I am part of a role playing game where we roll dice to set our statistics.
Our current system is to roll 4d6, reroll the lowest of the 4, then keep the highest 3 out of the original 3 and the new one you rolled.
I'm trying to convince the group that rolling 5d6 and keeping the highest 3 is the exact same thing.
Please help me statistically prove this, one way or the other. Preferably this would show the probability of each possible outcome (3 to 18).
It may be easier to convince the DM and group that $B$ (the new method) is equal to $A$ (the old method) rather than that $A$ is equal to $B$.
What I suggest is to start from the "simple" method of rolling five dice and picking out the top three.
Now we convince them that this is actually the same as the old procedure by doing it in the following fashion. Roll four of the dice from an open hand, and roll a fifth die in a cup, leaving the cup turned down on the die so no one can see the result (yet).
Go through the old procedure with the four dice that were hand-rolled. Thus, pick out one of the four dice showing the minimum value rolled and place it on top of the cup. Now "re-roll" that die by simply picking up the cup to reveal the fifth die, "as if" it were the re-rolled die.
The possible replacement of one of the top 3-out-of-4 first dice with that fifth die is exactly the old procedure, but it is also the new procedure of picking the top 3-out-of-5. The die set aside on the cup was no higher than the other three hand-rolled dice, so leaving it out never changes the sum of the best three of all five.
I computed the probabilities of the sum-of-top-3 values $3,\ldots,18$, for both methods, but it would be a little tedious to explain the computations to your group. This risks "proof by appeal to authority" when you really want them to be convinced by their own insight, but see my tabulation at the end of this post.
Another approach to convincing them would be in the nature of experiment, or a "battle of dice". Have the group pick sides according to which method they suspect has the best chance in a head-to-head comparison. In twelve battles where draws are discarded (so twelve competitions where one method or the other gives a higher sum on the best three dice), the equality of the two approaches means each has an equal chance of winning any given battle. The number of wins for the Old method then follows a binomial distribution with twelve trials and success probability $1/2$, so the wins should be roughly equal, although with only twelve battles a split such as 8 to 4 would not be unusual.
Just keep making them do battles of dice until everyone is bored and gives in to your demand that they change to the simpler New method. Proof by exhaustion!
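To give a feel for how much the win counts can wander in such a short contest, here is a minimal Python sketch of the binomial distribution for twelve decisive battles, assuming each battle is independent and each method wins with probability $1/2$. The names `BATTLES` and `win_distribution` are my own, chosen for illustration.

```python
from fractions import Fraction
from math import comb

BATTLES = 12  # number of decisive (non-drawn) battles; illustrative choice


def win_distribution(n):
    """Exact probabilities that the Old method wins k of n fair battles."""
    return [Fraction(comb(n, k), 2**n) for k in range(n + 1)]


dist = win_distribution(BATTLES)

# Print the chance of each possible split between the two methods.
for k, p in enumerate(dist):
    print(f"Old {k:2d} - New {BATTLES - k:2d}: {float(p):.4f}")
```

Exact fractions are used so that the probabilities sum to exactly one; converting to `float` is only for display.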
Addendum: Counts of Outcomes in Both Old and New Ways
The old way of obtaining a sum of three dice involved first rolling four dice, then re-rolling one of them (the one of least face value). The new way simply asks to roll five dice, then sum the top three of their face values.
In either way the probability of getting $X$ as the sum of the three "best" dice is most easily found by counting the number of ways $X$ can occur out of the possible $6^5 = 7776$ ways the five die rolls can happen. So the following table gives those counts for $X = 3,\ldots,18$, comparing both procedures. Since these (whole number) counts agree, the probabilities agree.
$$ \begin{array}{r|r|r} \text{Sum} & \text{New} & \text{Old} \\ \hline 3 & 1 & 1 \\ 4 & 5 & 5 \\ 5 & 15 & 15 \\ 6 & 41 & 41 \\ 7 & 90 & 90 \\ 8 & 170 & 170 \\ 9 & 296 & 296 \\ 10 & 470 & 470 \\ 11 & 665 & 665 \\ 12 & 881 & 881 \\ 13 & 1055 & 1055 \\ 14 & 1155 & 1155 \\ 15 & 1111 & 1111 \\ 16 & 935 & 935 \\ 17 & 610 & 610 \\ 18 & 276 & 276 \\ \hline \text{All} & 7776 & 7776 \end{array} $$
The implementation is deterministic: essentially nested loops that generate the outcomes of rolling four or five dice and sort the values, with an additional re-roll step for the lowest value in the "old" method. I used Amzi! Prolog to code this, but the task doesn't take particular advantage of backtracking logic, so an implementation in C or Python should be about as terse.
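For anyone who wants to check the counts themselves, here is a rough Python sketch of the same kind of exhaustive enumeration. It is not the Prolog code I used. The function names are my own, and it treats the fifth value of each outcome as the re-rolled die in the old method.

```python
from collections import Counter
from itertools import product

FACES = range(1, 7)


def top3_sum(dice):
    """Sum of the three highest values among the given dice."""
    return sum(sorted(dice)[-3:])


def new_method(roll):
    """New way: roll five dice, keep the best three."""
    return top3_sum(roll)


def old_method(roll):
    """Old way: roll four dice, set one lowest die aside, add the re-roll
    (taken here to be the fifth value), then keep the best three."""
    first_four = sorted(roll[:4])
    kept = first_four[1:]  # drop one die showing the minimum
    return top3_sum(kept + [roll[4]])


def tabulate(method):
    """Count how often each sum occurs over all 6^5 equally likely outcomes."""
    return Counter(method(roll) for roll in product(FACES, repeat=5))


new_counts = tabulate(new_method)
old_counts = tabulate(old_method)
total = sum(new_counts.values())

# Counts for both methods, plus the probability of each sum.
print("Sum   New   Old   Probability")
for s in range(3, 19):
    print(f"{s:3d} {new_counts[s]:5d} {old_counts[s]:5d}   {new_counts[s] / total:.4f}")
print("Counts agree:", new_counts == old_counts)
```

Dividing each count by the total also gives the probability of each sum from 3 to 18, which is what the question asked for.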