Formula for the number of elements in exactly one of $n$ sets $$N_1 = S_1 - 2S_2 + 3S_3 - \ldots \pm nS_n$$ where $S_i$ is the sum of cardinalities of $i$-wise intersections except $S_1$ which is the sum of cardinalities of the given sets.
Below is an example problem that the formula above solves:
Find the number of five-letter words that use letters from a four-letter set in which exactly one letter is missing.
Solution: $972 - 384 + 12 = 600$
The solution above makes sense if we consider the set $\{x, y, z, w\}$ s.t. $A$ is the set of $5$-letter words that don't contain $x, \ B$ the set of $5$-letter words that don't contain $y, \ C$ the set of $5$-letter words that don't contain $z, \ D$ the the set of $5$-letter words that don't contain $w$ where we want to find $|\{\text{elements of $A$ that are not in $B, C, D$}\}| + |\{\text{elements of $B$ that are not in $A, C, D$}\}| + |\{\text{elements of $C$ that are not in $B, A, D$}\}| + |\{\text{elements of $D$ that are not in $A, B, C$}\}|$
My questions:
- Given the way $A, B, C, D$ are defined, if an element is in, say, $A$, then it cannot be in any one of $B, C, D.$ Similarly for the other three sets. This means these sets are pairwise disjoint implying the answer should be $|A| +|B| + |C| + |D| = 4\times 3^5 = 972.$ But that's not the given answer meaning some of these sets intersect. But where?
- Given the confusion above, now I am doubting we are looking for the number $|\{\text{elements of $A$ that are not in $B, C, D$}\}| + |\{\text{elements of $B$ that are not in $A, C, D$}\}| + |\{\text{elements of $C$ that are not in $B, A, D$}\}| + |\{\text{elements of $D$ that are not in $A, B, C$}\}|$ in the problem above. Is this the correct number whose value we want in the problem above? If not, why do we use the formula above to solve the given problem especially if $A, B, C, D$ are disjoint?
We can find the number of words in which exactly one letter is missing by subtracting the number of words in which at least two letters are missing from the number of words in which at least one letter is missing.
Number of words in which at least one letter is missing
$|A|$: If x is missing, then the five positions must be filled with y, z, or w. There are $3^5$ such words.
By symmetry, $$|A| = |B| = |C| = |D| = 3^5$$
The $3^5$ words that do not contain x include yyzzz (which is also missing w), yyyyw (which is also missing z), and wwwww (which is also missing x, y, and z). We have counted each word that is missing two letters twice, once for each way we could have designated one of the missing letters as the missing letter. We only want to count them once, so we must subtract them from the total.
$|A \cap B|$: These words include neither x nor y. Therefore, we must fill each of the five positions in the word with w or z. There are $2^5$ such words.
By symmetry, $$|A \cap B| = |A \cap C| = |A \cap D| = |B \cap C| = |B \cap D| = |C \cap D| = 2^5$$
However, if we subtract those words that are missing two letters from the total, we will have subtracted too much. This is because we first counted each word in which three letters are missing three times, once for each way we could have designated one of the three missing letters as the missing letter, then subtracted them three times, once for each of the $\binom{3}{2}$ ways we could have designated two of the three missing letters as the missing pair of letters. Therefore, we must add words from which three letters are missing to the total.
$|A \cap B \cap C|$: If x, y, and z are all missing, then each of the five positions must be filled with a w. This can happen in one way.
$|A \cap B \cap C \cap D|:$ It is not possible for all four letters to be missing from the word, so $$|A \cap B \cap C \cap D| = 0$$
By symmetry,
$$|A \cap B \cap C| = |A \cap B \cap D| = |A \cap C \cap D| = |B \cap C \cap D| = 1^5$$
Thus, by the Inclusion-Exclusion Principle, the number of words from which at least one letter is missing is \begin{align*} & |A \cup B \cup C \cup D|\\ & \quad = |A| + |B| + |C| + |D|\\ & \qquad - |A \cap B| - |A \cap C| - |A \cap D| - |B \cap C| - |B \cap C| - |B \cap D| - |C \cap D|\\ & \quad \qquad + |A \cap B \cap C| + |A \cap B \cap D| + |A \cap C \cap D| + |B \cap C \cap D|\\ & \qquad \qquad - |A \cap B \cap C \cap D|\\ & \quad = 4|A| - 6|A \cap B| + 4|A \cap B \cap C| - |A \cap B \cap C \cap D|\\ & \quad = 4 \cdot 3^5 - 6 \cdot 2^5 + 4 \cdot 1^5 \end{align*}
Number of words in which at least two letters are missing:
From above, we know that there are $$6|A \cap B| = 6 \cdot 2^5$$ words in which two letters are missing. However, we have counted each word in which three letters are missing three times, once for each of the $\binom{3}{2}$ ways we could have designated two of those three letters as the pair of missing letters. Therefore, we need to subtract twice the number of words in which three letters are missing from the total.
From above, we know that the number of words in which three letters are missing is $$4|A \cap B \cap C| = 4 \cdot 1^5$$
Thus, the number of words in which two letters are missing is $$6|A \cap B| - 2 \cdot 4|A \cap B \cap C| = 6 \cdot 2^5 - 8 \cdot 1^5$$
Number of words in which exactly one letter is missing:
Subtracting the number of words in which at least two letters are missing from the number of words in which at least one letter is missing gives: \begin{align*} 4|A| & - 6|A \cap B| + 4|A \cap B \cap C| - (6|A \cap B| - 8|A \cap B \cap C|)\\ & \quad = 4|A| - 12|A \cap B| + 12|A \cap C|\\ & \quad = 4 \cdot 3^5 - 12 \cdot 2^5 + 12 \cdot 1^5\\ & \quad = 972 - 384 + 12\\ & \quad = 600 \end{align*}