Given a family $\mathcal{F}$ of subsets of some finite set $X$, its cover number of $\mathcal{F}$, $\text{Cov}(\mathcal{F})$, is the minimum number of members of $\mathcal{F}$ whose union covers all points of $X$.
Theorem 2.16. If each member of $\mathcal{F}$ has at most $a$ elements, and each point $x\in X$ belongs to at least $v$ of the sets in $\mathcal{F}$, then $$\text{Cov}(\mathcal{F})\leq \dfrac{|\mathcal{F}|}{v}(1+\ln a).$$
Proof. Let $|X|=N, |\mathcal{F}|=M$ and consider the $N\times M$ $0$-$1$ matrix $A=(a_{x,i})$, where $a_{x,i}=1$ iff $x\in X$ belongs to the $i$-th member of $\mathcal{F}$. By our assumption, each row of $A$ has at least $v$ ones and each column at most $a$ ones. By double couting, we have that $Nv\leq Ma$, or equivalently, $$\frac{M}{v}\geq \frac{N}{a} \quad \quad (*).$$ Our goal is to show that $A$ must contain an $N\times K $ submatrix $C$ with no all-$0$ rows and such that $$K\leq (M/v)(1+\ln a).$$
We describe a constructive procedure for producing the desired submatrix $C$. Let $A_a=A$ and define $A'_a$ to be any maximal set of columns from $A_a$ whose supports are pairwise disjoint and whose columns each have $a$ ones. Let $K_a=|A'_a|$. Discard from $A_a$ the columns of $A'_a$ and any row with a one in $A'_a$. We are left with a $k_a\times (M-K_a)$ matrix $A_{a-1}$, where $k_a=N-aK_a$. Clearly, the columns of $A_{a-1}$ have at most $a-1$ ones (indeed, otherwise such a column could be added to the previously discarded set, contradicting its maximality). We continue by doing to $A_{a-1}$ what we did to $A_a$. That is we define $A'_{a-1}$ to be any maximal set of columns from $A_{a-1}$, whose supports are pairwise disjoint and whose columns each have $a-1$ ones. Let $K_{a-1}=|A'_{a-1}|$, then discard from $A_{a-1}$ the columns of $A'_{a-1}$ and any row with a one in $A'_{a-1}$ getting a $k_{a-1}\times (M-K_a-K_{a-1})$ matrix $A_{a-2}$, where $k_{a-1}=N-aK_a-(a-1)K_{a-1}$.
The process will terminate after at most $a$ steps. The union of the columns of the discarded sets form the desired submatrix $C$ with $K=\sum \limits_{i=1}^a K_i$. The first step of the algorithm gives $k_a=N-aK_a$, which we rewrite, setting $k_{a+1}=N$, as $$K_a=\frac{k_{a+1}-k_a}{a}.$$ Analogously, $$K_i=\frac{k_{i+1}-k_i}{i} \ \text{for} \ i=1,\dots,a.$$ Now we derive an upper bound for $k_i$ by counting the number of ones in $A_{i-1}$ in two ways: every row of $A_{i-1}$ contains at least $v$ ones, and every column at most $i-1$ ones, thus $$vk_i\leq (i-1)(M-K_a-\dots-K_{i+1})\leq (i-1)M,$$ or equivalently $$k_i\leq \frac{(i-1)M}{v}.$$
So, $$K=\sum\limits_{i=1}^a K_i=\sum\limits_{i=1}^a \frac{k_{i+1}-k_i}{i}=$$ $$=\frac{k_{a+1}}{a}+\frac{k_{a}}{a(a-1)}+\frac{k_{a-1}}{(a-1)(a-2)}+\dots+\frac{k_{2}}{2\cdot 1}-k_1\leq $$ $$\leq \frac{N}{a}+\frac{M}{v}(\frac{1}{a}+\dots+\frac{1}{2})\leq \frac{N}{a}+\frac{M}{v}\ln a.$$ The last inequality here follows because $1+\frac{1}{2}+\dots+\frac{1}{n}$ is the $n$-th harmonic number which is known to lie between $\ln n$ and $\ln n+1$. Together with $(*)$, this yields $K\leq (M/v)(1+\ln a)$, as desired.
This proof is copied from the book of Stasys Jukna "Extremal combinatorics" and I would like to clarify some moments of this proof.
My questions:
I am a bit confused with the procedure and how it works. In the first step we need to find any maximal set of columns which have pairwise disjoint supports and each of them have $a$ ones. What if there no columns with $a$ ones?
Intuitively I can understand why this process terminates after at most $a$ steps but can anyone explain it in a rigorous way, please?
How we take the union of the columns of the discarded sets if those columns have different size?
If we take $C$ to be the union of the columns of the discarded sets. why any row of $C$ has $1$ among its entries?
I would be very thankful if someone can explain those questions! Thank you so much for your attention!