A set $C \subseteq E$ in a graph $G = (V, E)$ is called an edge cover if each vertex $v \in V$ is contained in at least one edge $e \in C$. Let us look for a small edge cover by a greedy algorithm: if there is an edge containing two uncovered vertices, take an arbitrary such edge; otherwise take any edge covering some yet-uncovered vertex, and repeat until all vertices are covered. Show that the number of edges in a cover thus found is at most $\frac{3}{2}$ times the size of the smallest possible cover.
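For concreteness, the greedy rule might be sketched like this (my own hypothetical helper, not from the book; edges are simply tried in list order):

```python
def greedy_edge_cover(vertices, edges):
    """Greedy edge cover: prefer an edge with two uncovered endpoints,
    otherwise take any edge covering a still-uncovered vertex."""
    covered, chosen = set(), []
    while covered != set(vertices):
        # an edge whose endpoints are both uncovered, if one exists
        e = next((e for e in edges if covered.isdisjoint(e)), None)
        if e is None:
            # otherwise any edge covering at least one new vertex
            e = next(e for e in edges if not covered.issuperset(e))
        chosen.append(e)
        covered.update(e)
    return chosen
```

On the path $1$-$2$-$3$-$4$ with edges `[(1, 2), (2, 3), (3, 4)]` this returns the two-edge cover `[(1, 2), (3, 4)]`.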
Hint from the book:
Let $e_1, \dots, e_k$ be the edges selected by the greedy algorithm and $\hat e_1, \dots, \hat e_t$ edges of some smallest edge cover. Let $k_1 = |\{i: e_i \cap (e_1 \cup \dots \cup e_{i-1}) = \emptyset\}|$, and similarly $t_1 = |\{i: \hat e_i \cap (\hat e_1 \cup \dots \cup \hat e_{i-1}) = \emptyset\}|$. We have $|V| = k + k_1 = t + t_1$. Key observation: in the steps of the greedy algorithm for $i > k_1$, at least one point of each edge $\hat e_j$ contributing to $t_1$ must have been covered, and hence $k_1 \geq \frac{1}{2}t_1$. From this we get $k = t + t_1 - k_1 \leq t + \frac{1}{2}t_1 \leq \frac{3}{2}t$.
Although the book essentially gives me the answer in the hint, I have difficulty understanding the construction of $k_1$ and $t_1$. I can justify why $|V| = k + k_1$, but I can't do the same for $t + t_1$. Additionally, my justification is an attempt to make the hint work, and not intuitive at all. To put it simply, I wouldn't have thought of constructing the quantities $k_1$ and $t_1$ to solve this problem. I can't grasp the key observation in the hint, and I am hoping that through understanding the construction of these quantities it'd become clearer.
Thank you in advance for your help.
Also, if you don't mind: I am trying to assess my level in graph theory before I enroll in a more advanced class, and I'd appreciate it if you could do your best to rate the difficulty of this question from 1 to 10 (10 being very difficult). I know rating questions is subjective and it might not tell me a lot, but I'd still appreciate your input. Thank you.
The intuitive idea behind the construction is the following: the greedy algorithm yields a small edge cover when it adds many edges whose two vertices were both uncovered before the edge was added. $k_1$ is the number of such edges (edges that contributed 2 new vertices to the cover), and $t_1$ is the analogous number for the smallest edge cover.
Now if there are $k_1$ edges that contribute $2$ new vertices each, then there are $k - k_1$ edges which contribute only $1$ new vertex each. In total the edges contribute $|V|$ new vertices, since every vertex becomes covered exactly once. Thus $2k_1 + (k - k_1) = |V|$, i.e. $|V| = k + k_1$. The argument for $t, t_1$ is analogous.
I'll try to put the key observation in my own words: in our algorithm there is a point where we can no longer find any edge that would contribute $2$ new vertices to the cover. This happens exactly after we have added the first $k_1$ edges. Note also that the edges counted in $t_1$ are pairwise disjoint, since each of them is disjoint from the union of all earlier ones. Now assume that $2k_1 < t_1$. The first $k_1$ greedy edges cover only $2k_1$ vertices, and each covered vertex lies in at most one of the $t_1$ disjoint edges, so even in the "worst" case, where every one of those $2k_1$ vertices hits a different edge counted in $t_1$, there must still be some edge with both vertices not yet in our cover. But that contradicts our inability to find such an edge. Thus $k_1 \geq \frac{1}{2}t_1$.
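To make the bookkeeping concrete, here is a small self-contained sketch (my own, not from the book; the minimum cover is found by brute force, so it only works for tiny graphs) that computes $k$, $k_1$ and the optimum $t$:

```python
from itertools import combinations

def cover_stats(vertices, edges):
    """Return (k, k1, t): greedy cover size, number of greedy edges
    covering two new vertices, and the minimum edge-cover size."""
    V = set(vertices)
    covered, k, k1 = set(), 0, 0
    while covered != V:
        e = next((e for e in edges if covered.isdisjoint(e)), None)
        if e is not None:
            k1 += 1                      # contributed 2 new vertices
        else:
            e = next(e for e in edges if not covered.issuperset(e))
        covered.update(e)
        k += 1
    # brute-force minimum edge cover: smallest r whose some r-subset covers V
    t = next(r for r in range(1, len(edges) + 1)
             if any(set().union(*c) == V for c in combinations(edges, r)))
    return k, k1, t
```

On the path $a$-$b$-$c$-$d$ with the edges listed as `[('b','c'), ('a','b'), ('c','d')]`, greedy first takes $bc$ and then must spend one edge each on $a$ and $d$, giving $(k, k_1, t) = (3, 1, 2)$; so $k + k_1 = |V| = 4$ and $k = \frac{3}{2}t$, i.e. the bound is tight.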
I did not take any proper graph theory course yet (only CS courses focused on algorithms). I think this problem could have appeared in my courses as a homework or exam question, so it probably is not too hard compared to problems from graph theory classes. But there is some creative element to it which might make it harder than expected.