I am a biologist, not a mathematician, so some of the answers featured here are too complicated for me.
My question is:
I have a set of 8 genes named PBX1, ESX1, PIM1, HBB, HBG, BCL11A, KLF4, GATA2.
First I wanted to know how many different combinations of 6 I could make without caring about the order. I know now that this can be explained by:
(8x7x6x5x4x3)/(6x5x4x3x2x1) = 28 different combinations
However, the second problem is a bit different. I want to know within these sets of 6 genes how frequently subsets of 4 or 3 or 2 genes are represented (again without caring about the order).
For example: how often does the combination (PBX1, ESX1, PIM1, HBB) occur within these larger sets of 6?
I hope someone is able to help. Please consider my lack of real mathematical knowledge in your answer.
Thanks in advance!!
The labels just make things look more complicated than they are. Say the genes are $(g_1, g_2, \cdots, g_8)$. Then, as you say, the number of ways to choose an unordered collection of $6$ of these is $$\binom 86=28$$
Having singled out, say, $g_1, g_2, g_3, g_4$ and insisting that these be part of your collection, we now just have to choose the other $2$ from $g_5, g_6, g_7, g_8$. The number of ways to do that is $$\binom 42=6$$
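In general, if you select $i \le 6$ genes and require them to be in your collection of $6$, then you have to choose the other $6-i$ from the remaining $8-i$, so the answer in that case is $$\binom {8-i}{6-i}$$ For a subset of $3$ genes this gives $\binom 53=10$, and for a subset of $2$ genes it gives $\binom 64=15$.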
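If it helps to see this without any formulas, the counts can be checked by brute force. The following is only a minimal Python sketch: it lists every set of 6 genes and counts how many contain a given subset. The helper name `count_containing` is made up for this illustration, and `math.comb` assumes Python 3.8 or newer.

```python
from itertools import combinations
from math import comb

genes = ["PBX1", "ESX1", "PIM1", "HBB", "HBG", "BCL11A", "KLF4", "GATA2"]

# Every unordered set of 6 genes chosen from the 8.
sets_of_six = [frozenset(c) for c in combinations(genes, 6)]


def count_containing(subset):
    """Count the 6-gene sets that contain every gene in `subset` (hypothetical helper)."""
    wanted = frozenset(subset)
    return sum(1 for s in sets_of_six if wanted <= s)


print(len(sets_of_six))  # total number of 6-gene sets

# Compare the brute-force count with the formula C(8 - i, 6 - i).
for i in (2, 3, 4):
    subset = genes[:i]
    print(i, count_containing(subset), comb(8 - i, 6 - i))
```

Calling `count_containing(["PBX1", "ESX1", "PIM1", "HBB"])` answers the example from the question directly. Which genes you pick does not matter, only how many, so any subset of the same size gives the same count.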