Explicit formula for tournament sequence

Question

Explicit formula for tournament sequence

440 Views Asked by Bumbble Comm At 23 Feb 2026 - 11:59

I am looking for an explicit formula for a sequence. The sequence is generated as follows:

There is a tournament with $10$ teams. In the beginning, all teams have a 0-0 win-loss record. The teams are paired up, and when the first round concludes, $5$ of the teams have a 0-1 record and $5$ of the teams have a 1-0 record. The second round is arranged so that only teams with the same win-loss record are allowed to face each other, and when there are an odd number of teams having a certain win-loss record, the extra team from one win-loss record plays the extra team from the nearby win-loss record. This means that for the second round, four of the 0-1 teams are paired up with each other, $4$ of the 1-0 teams are paired up with each other, and there is one pairing between a 0-1 team and a 1-0 team. We also assume that when there is a pairing across different win-loss records, the team with the better record always wins. This means that at the end of round 2, there will be $3$ teams with a 0-2 record, there will be $4$ teams with a 1-1 record, and $3$ teams with a 2-0 record.

I am looking for an explicit formula giving the win-loss record distribution at round $r$. I am also looking for a general formula for $T$ teams (I used "10" earlier just for illustration purposes).

I have python code which explicitly calculates this sequence for $T$ teams and round $r$, but I would like a mathematical expression for this.

I have an explicit formula for the first $4$ rows using ceiling/floor functions, but the formula does not work beyond $4$ rows. I also have an explicit formula giving the win-loss record in the long term, but it only works after about $T^2$ rows.

In terms of problem solving techniques that I've tried, I've tried various generating series approaches. For example, for the case of $10$ teams, I wrote round $0$ as

$$f_0(x,y) = y^0 + y^1 + y^2 + ... + y^9$$

and round $1$ as

$$f_1(x,y) = y^0 + y^1 + y^2 + y^3 + y^ 4 + (y^5 + y^6 + y^7 + y^8 + y^9)x$$

and round $2$ as

$$f_2(x,y) = y^0 + y^1 + y^2 + (y^3 + y^4 + y^5 + y^6)x + (y^7 + y^8 + y^9)x^2$$

and so on. I don't have a method for moving from $f_k$ to $f_{k+1}$, but I have lots of near misses.

I have also tried using a probabilistic approach. I have started from several invariants, and I have a function that when minimized gives a (sometimes) close approximation. The invariants are as follows:

The number of teams $T$ does not change from round to round.
The number of total wins at each round is always $T r / 2$.
The win-loss record is symmetric around $r/2$.

These invariants restrict the number of possible choices for round $r$. I then took the choice that most nearly matches a binomial distribution (using the euclidean distance as a metric, although any similarity metric could work). This metric isn't best, since I believe the win-loss records become "less binomial" over time.

Another thing I tried was various kinds of graphing. I can post some graphs if requested.

I am open to any approach, I look forward to seeing your ideas.

Updates and extra information

Here are the first few terms of the sequence for $T=10$:

$$10$$

$$5, 5$$

$$3, 4, 3$$

$$2, 3, 3, 2$$

$$1, 3, 2, 3, 1$$

$$1, 1, 3, 3, 1, 1$$

$$1, 0, 3, 2, 3, 0, 1$$

$$1, 0, 1, 3, 3, 1, 0, 1$$

$$1, 0, 0, 3, 2, 3, 0, 0, 1$$

$$1, 0, 0, 1, 3, 3, 1, 0, 0, 1$$

$$...$$

Another clarification is that the number of teams $T$ must be even.

Also, I would like to add the formula for the long-term behavior of the sequences. The long-term behavior shows a 2-term repetition, and this repetition depends on the value of $T \pmod 8$. Here are the four cases:

Case 1. ($T \equiv 0 \pmod 8$): $1444...444...441$, $3444...4444....443$

Case 2. ($T \equiv 2 \pmod 8$): $3444...424...443$, $1444...4334....441$

Case 3. ($T \equiv 4 \pmod 8$): $3444...444...443$, $1444...4444....441$

Case 4. ($T \equiv 6 \pmod{8}$): $1444...424...441$, $3444...4334....443$

Note that I have left off the preceding $1000...$ and the trailing $...0001$, since the number of zeros can be easily calculated. Not also that the number of $4$s can be calculated easily from the original number of teams $T$.

Regarding the generating series idea previously, one thing I found is that if we have for example

$$f_2(x,y) = y^0 + y^1 + y^2 + (y^3 + y^4 + y^5 + y^6)x + (y^7 + y^8 + y^9)x^2$$

then we can calculate something very close to $f_3(x,y)$ by performing the following steps:

$$\frac{f_2(x,y) + f_2(x,-y)}{2} = y^0 + y^2 + (y^4 + y^6)x + (y^8)x^2$$
$$\frac{f_2(x,y) - f_2(x,-y)}{2} \cdot x = y^1 x + (y^3 + y^5) x^2 + (y^7 + y^9) x^3$$
$$\widetilde{f}_3(x,y) = \frac{f_2(x,y) + f_2(x,-y)}{2} + \frac{f_2(x,y) - f_2(x,-y)}{2} \cdot x = y^0 + y^2 + (y^1 + y^4 + y^6)x + (y^3 + y^5 + y^8)x^2 + (y^7 + y^9)x^3$$

which when evaluated at $y = 1$ gives the correct sequence $2,3,3,2$. It is not exactly correct, since the $y$ terms are no longer ordered in terms of exponents, so this process is not repeatable unless we improve on it somehow.

Another Update

Let's add a variant to the original rules. The variant will be called "underdog", whereas the original rules will be called "overdog". The "overdog" rules stipulate that the team with the higher win-loss record will always win when it crosses over and faces a team with a lower win-loss ratio; in contrast, the "underdog" rules stipulate that the team with the lower win-loss ratio will win.

Let's call the overdog sequence with $T$ teams the sequence $O(T)$, and each term in the sequence will be identified as $O_{r,w}(T)$ where $r$ denotes the round and $w$ denotes the number of wins at that round. For the underdog variant, we define $U(T)$ and $U_{r,w}(T)$ in the same way.

I just discovered the following properties:

$$O(2^k + 2) = U(2^k) + O(2)$$

$$O(2^k + 4) = U(2^k) + U(2) + O(2)$$

This is helpful since $U(2^k)$ and $O(2^k)$ are easy to calculate for $k$ rows, and $O(2)$ and $U(2)$ are easily calculated for all rows.

I'm wondering if we can use the binary representation of $T$ and some kind of summing procedure to improve the number of rows that we can directly calculate. It would be fascinating to try to reduce $O(T)$ and $U(T)$ to the sum of terms of the form $U(2^K)$ or $O(2^K)$ for any arbitrarily large $K >> 0$. This would give us a way to calculate arbitrarily many rows for any $T$.

Correction to Previous Update

I should edit my previous update to be more precise. The powers of two were a bit of a red herring — all that matters is $T \pmod{4}$. Here are the updated relations:

$O(4k + 2) = U(4k) + O(2)$.
$U(4k + 2) = U(4k) + U(2)$.
$O(4k + 4) = U(4k) + U(2) + O(2)$.

This means that the problem is essentially reduced to finding $U(4k)$ instead of finding $U(2k)$. Not much of a reduction, but it's progress!

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2023-03-06 04:02:16

Here is a generating function approach that works: Let $f_r(x)$ be the generating function for win-loss records after $r$ rounds. Then: \begin{align*} f_0(x) &= 1\\ f_1(x) &= 1 + x\\ f_r(x) &= f_{r-1}(x)(1 + x)(1 + x^2)\ldots(1 + x^{r-1}) \end{align*} The coefficient of $x^k$ in $f_r(x)$ is the number of teams with $k$ wins after $r$ rounds. This works because in each round, teams are paired up with others of the same win-loss record (when possible), and the team with more wins always wins. So you're really just multiplying the generating function by $(1+x)$ shifted versions of itself. From this you can extract explicit formulas for the number of teams with $k$ wins after $r$ rounds by expanding the generating function and collecting coefficients.

Here are some additional thoughts:

The generating function approach gives a compact formula, but evaluating it to get explicit terms quickly becomes tedious. An alternative is to think about the process directly:

After $r$ rounds, consider teams with $k$ wins. These came from teams with $k-1$ wins that won their match, and teams with $k$ wins that lost their match (if possible). So you have a recurrence like: \begin{align*} a_{r,k} &= a_{r-1,k-1} + a_{r-1,k}\\ a_{r,0} &= a_{r-1,1}\\ a_{r,T/2} &= a_{r-1,T/2-1} \end{align*} where $a_{r,k}$ is the number of teams with $k$ wins after round $r$. This is easier to evaluate directly, and generalizes easily to other numbers of teams/wins.

The long-term behavior also has a nice explanation: after enough rounds, almost all matches are between teams with nearly equal win counts (close to $T/2$). In that case, wins and losses essentially balance out, and you just see an oscillation between the two closest win counts to $T/2$. The period of this oscillation depends on the parity of $T$ (mod 8) because that determines how the win counts are arranged around $T/2$.

Those recurrence relations are very helpful. A few more thoughts:

The binary representation of $T$ isn't too useful here, since the recurrences relate sequences for $4k+2$ and $4k+4$ to those for $4k$. The powers of 4 are really what matter, not the individual bits. You can extend the recurrences to get formulas for arbitrary $T$: \begin{align*} O(T) &= O((T-2)/4) + O(2) + (T \mod 4 = 2)O(2)\\ U(T) &= U((T-4)/4) + U(2) + (T \mod 4 = 2)(U(2) + U(2)) \end{align*} These let you calculate the sequences for any $T$, not just powers of 4. The long-term behavior is similar, but the period doubles from 4 to 8 when switching from overdog to underdog. This is because in the underdog case, teams with win counts close to $T/2$ can switch between the two closest counts after either winning or losing, so you get oscillations at half the rate.

**Bumbble Comm** · Answer 2 · 2023-03-07 22:03:29

(Not a solution. Too long to be a comment).

Let $f_n(x)$ be the generating function for performance after $n$ rounds.
We start with $f_0 = nx^0$, as all $n$ teams have won 0 matches.
Let $g_n(x) $ be $f_n(x)$ evaluated in $\mathbb{F}_2[x]$.
Let $H$ be a function that takes a polynomial with coefficients $\{0, 1\}$, an even number of them being 1, and returns:

Start with the term with highest exponent, multiply that term by $x$.
Multiply the term with the next highest exponent by $1$.
Multiply the term with the next highest exponent by $x$.
Multiply the term with the next highest exponent by $1$.
So on and so forth till we've gone through all the coefficients.

As an explicit example, $H(1+x^2+x^3+x^6) = 1 + 2x^3 + x^7$.
In a sense, $H(p(x)) \sim \frac{1+x}{2} p(x)$, but then we have bunch the coefficients so that they become integers.

With those definitions,

$$f_{n+1} (x) = \left(\frac{1+x}{2}\right) [f_n(x) - g_n(x) ] + H[g_n(x)].$$

The issue is the I don't know of a way to evaluate $H[g_n(x)]$, in order to attempt to find a closed form.

This gives us $f_n(x) \sim n(\frac{1+x}{2})^n$, which collaborates OP's observation that "it seems to be a binomial distribution, but gets less binomial over time".

In fact, if we let $H$ act on all polynomials by "multiplying by $(1+x)/2$ and then bunching coefficients accordingly", then $$f_{n+1}(x) = H[f_n(x)].$$ The above just reflects that $H$ acts very nicely on even coefficients.

Explicit formula for tournament sequence

There are 2 best solutions below

Related Questions in PROBABILITY

Related Questions in COMBINATORICS

Related Questions in NUMBER-THEORY

Related Questions in GENERATING-FUNCTIONS

Related Questions in INTEGER-SEQUENCES

Trending Questions

Popular # Hahtags

Popular Questions