This might be a silly question, but I got stuck here. For context, see S. K. Donaldson's Riemann Surfaces, chapter 4, section 4.2.3.
Suppose $P(z,w)=a_0(z)+a_1(z)w+\dotsb+a_n(z)w^n\in\mathbb C[z,w]$ is an irreducible polynomial, where $n>0$ and $a_n(z)\neq0$ for some $z\in\mathbb C$. Consider the zero set $X$ of $P$ and the projection map $\pi\colon X\to\mathbb C,(z,w)\mapsto z$. Let $S$ be the set of singular points of $X$, i.e. the points where $P=\partial P/\partial z=\partial P/\partial w=0$, and let $F$ be the set of roots of $a_n(z)$. By the implicit function theorem, $X\setminus S$ is a Riemann surface. Furthermore, set $S^+=\pi^{-1}(\pi(S)\cup F)$; since $\pi(S)\cup F$ is finite, $S^+$ is finite as well. Now let $E=\pi(S)\cup F\cup\{\infty\}$. We obtain the projection map $\pi\colon X\setminus S^+\to S^2\setminus E$, which is a holomorphic map between Riemann surfaces.
The book claims that $\pi$ is also a proper holomorphic map, but I cannot see a good reason for this. I would appreciate some help with the reasoning.
Any idea? Thanks.
EDIT1: I think the following generalization should be true:
Suppose $X,Y$ are two connected Riemann surfaces, and $F\colon X\to Y$ is a non-constant holomorphic map. If $F$ is onto and there's an integer $N$ such that the preimage $F^{-1}(y)$ is finite and the cardinality $\#F^{-1}(y)\le N$ for all $y\in Y$, then $F$ is proper.
This is false, as Daniel Fischer pointed out.
EDIT2: Suppose $X,Y$ are locally compact Hausdorff spaces and $p\colon X\to Y$ is a proper continuous map. Given $y\in Y$ and an open set $V\supseteq p^{-1}(y)$, there's an open set $W$ containing $y$ such that $p^{-1}(W)\subseteq V$. This follows from the fact that $p$ is closed: $W=Y\setminus p(X\setminus V)$ is open, and if $p(x)\in W$, then $p(x)\not\in p(X\setminus V)$, therefore $x\in V$. I don't know whether this property holds under other hypotheses, say when $p$ is holomorphic and the cardinalities of the fibers are bounded.
EDIT3:
The following should be true:
Suppose $X,Y$ are two connected Riemann surfaces, and $F\colon X\to Y$ is a non-constant holomorphic map. If $F$ is onto and there's an integer $N$ such that the preimage $F^{-1}(y)$ is finite and the sum of the multiplicities of the points of $F^{-1}(y)$ is exactly $N$ for all $y\in Y$, then $F$ is proper.
Note that $F$ is locally $z\mapsto z^m$ for some $m$ with appropriate charts and coordinates. That's the local behavior of a non-constant holomorphic map.
If $K$ is compact in $Y$ and there's an open covering $\mathcal C$ of $F^{-1}(K)$, then we only need to find a finite refinement $\mathcal C'$ of $\mathcal C$ (i.e., for each $\mathcal O'\in\mathcal C'$, there's an $\mathcal O\in\mathcal C$ such that $\mathcal O\supseteq\mathcal O'$), which is also a covering of $F^{-1}(K)$.
For each $y\in K$, let $F^{-1}(y)=\{x_1,\dotsc,x_n\}$ where $n\le N$. We choose a small neighborhood $V_y$ of $y$ and disjoint neighborhoods $W_1,\dotsc,W_n$ of $x_1,\dotsc,x_n$ such that each $W_k$ is contained in some open set in $\mathcal C$, $F(W_k)=V_y$, and in local coordinates $F$ acts on $W_k$ as $z\mapsto z^{m_k}$. Moreover, for each $y'\in V_y\setminus\{y\}$, $\#(F^{-1}(y')\cap W_k)=m_k$. (We can choose them step by step. First, choose disjoint local charts around $x_1,\dotsc,x_n$ and a chart around $y$ such that the map $F$ restricted to the chart around $x_k$ is just $z\mapsto z^{m_k}$. WLOG, the charts are discs in these local coordinates. Let $r$ be the smallest radius among the images of these discs, let $V_y$ be the disc of radius $r$ centered at $y$, and take $W_k$ to be the part of the chart around $x_k$ that maps onto $V_y$.) By assumption, $m_1+\dotsb+m_n=N$.
We claim that $F^{-1}(V_y)=\bigcup_{k=1}^n W_k$. It's a counting trick. Note that for $y'\in V_y\setminus\{y\}$, $$\#F^{-1}(y')\ge\#\left(\bigcup_{k=1}^n(F^{-1}(y')\cap W_k)\right)=\sum_{k=1}^n\#(F^{-1}(y')\cap W_k)\ge m_1+\dotsb+m_n=N.$$ But by assumption the sum of multiplicities over $F^{-1}(y')$ is $N$, so $\#F^{-1}(y')\le N$. Hence equality holds throughout, and $F^{-1}(y')=\bigcup_{k=1}^n(F^{-1}(y')\cap W_k)$.
Now $\bigcup_{y\in K}V_y\supseteq K$, so we can choose a finite subcovering of $K$; the $W_k$'s belonging to the finitely many $y$'s in this subcovering then form a finite refinement of $\mathcal C$ covering $F^{-1}(K)$. Q.E.D.
Without looking at the structure of $X$, we can argue elementarily by noting that the roots of a polynomial $p(z) = c_n z^n + \dotsc + c_1 z + c_0$ with $c_n\neq 0$ all lie in the closed disk $\{ z : \lvert z\rvert \leqslant R\}$, where
$$R = \max \left\{ 1, \lvert c_n\rvert^{-1}\sum_{k=0}^{n-1}\lvert c_k\rvert \right\}.$$
For, if $\lvert z\rvert > R$, then
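$$\lvert p(z)\rvert \geqslant \lvert c_n\rvert \left(\lvert z\rvert^n - \lvert c_n\rvert^{-1}\sum_{k=0}^{n-1}\lvert c_k\rvert\,\lvert z\rvert^k\right) \geqslant \lvert c_n\rvert \lvert z\rvert^{n-1}\left(\lvert z\rvert - \lvert c_n\rvert^{-1}\sum_{k=0}^{n-1} \lvert c_k\rvert\right) > 0.$$
Here the middle estimate uses $\lvert z\rvert^k \leqslant \lvert z\rvert^{n-1}$ for $k \leqslant n-1$, which holds because $\lvert z\rvert > 1$, and the last one uses $\lvert z\rvert > R$.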
Let $\tilde{\pi} \colon \mathbb{C}^2 \to\mathbb{C}$ be the projection to the first component, so $\pi$ is the restriction of $\tilde{\pi}$ to $X\setminus S^+$. If $K \subset \widehat{\mathbb{C}}\setminus E$ is compact, then $K$ is a compact subset of $\mathbb{C}$ disjoint from $\pi(S)\cup F$, so no point of $S^+$ lies above $K$ and $\pi^{-1}(K) = X \cap \tilde{\pi}^{-1}(K)$. This set is closed in $\mathbb{C}^2$, since $X$ and $\tilde{\pi}^{-1}(K)$ are closed. Since $a_n(z) \neq 0$ on $K$, the continuous function
$$R(z) = \max \left\{ 1, \lvert a_n(z)\rvert^{-1}\sum_{k=0}^{n-1} \lvert a_k(z)\rvert\right\}$$ is bounded on $K$. Every $(z,w)\in X$ with $z\in K$ satisfies $\lvert w\rvert \leqslant R(z)$, and $K$ itself is bounded, hence $X \cap \tilde{\pi}^{-1}(K)$ is bounded. Thus $\pi^{-1}(K)$ is compact.
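As a quick numerical sanity check of the root bound, here is a minimal Python sketch. The helper names `root_bound` and `poly_from_roots` and the sample roots are made up for illustration; the polynomial is built from known roots, so the bound can be compared against them directly.

```python
def root_bound(coeffs):
    """Return R = max(1, sum_{k<n} |c_k| / |c_n|) for coeffs = [c_0, ..., c_n]."""
    *lower, lead = coeffs
    if lead == 0:
        raise ValueError("leading coefficient must be non-zero")
    return max(1.0, sum(abs(c) for c in lower) / abs(lead))


def poly_from_roots(roots, lead=1):
    """Coefficients [c_0, ..., c_n] of lead * prod_r (w - r)."""
    coeffs = [lead]
    for r in roots:
        shifted = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            shifted[k + 1] += c      # contribution of c * w^k * w
            shifted[k] -= r * c      # contribution of c * w^k * (-r)
        coeffs = shifted
    return coeffs


roots = [2, -3, 1 + 1j]              # hypothetical sample roots
coeffs = poly_from_roots(roots, lead=0.5)
R = root_bound(coeffs)
print(R, all(abs(r) <= R for r in roots))
```

In the setting above one would apply `root_bound` to the coefficients $a_0(z),\dotsc,a_n(z)$ for each fixed $z \in K$.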
Alternatively, we can look at the geometry of $X$ to gain perhaps a little more insight:
Let $K \subset \widehat{\mathbb{C}}\setminus E$ be compact, and $L = \pi^{-1}(K)$. Since we're dealing with metrisable spaces, compactness coincides with sequential compactness. We show that $L$ is sequentially compact. So let $\bigl((z_k,w_k)\bigr)_{k\in\mathbb{N}}$ be a sequence in $L$. Since $K$ is compact, we can without loss of generality assume that $z_k \to z^\ast \in K$.
The hand-waved argument showing the reason that $\pi$ is proper goes:
$\pi$ is basically an $n$-sheeted (possibly branched) covering, so at least one sheet above a small neighbourhood of $z^\ast$ must contain infinitely many $(z_k,w_k)$, and thus we can extract a subsequence $\bigl((z_{k_m},w_{k_m})\bigr)_{m\in\mathbb{N}}$ in a single sheet, and that means that the $(z_{k_m},w_{k_m})$ converge to the point $(z^\ast,w^\ast)$ above $z^\ast$ in that sheet.
We must get rid of the hand-waving now:
If $P(z^\ast,\cdot)$ has only simple zeros, $\pi$ is unbranched above an open neighbourhood $U$ of $z^\ast$, and we have $n$ holomorphic functions $\varphi_j \colon U \to \mathbb{C}$ with disjoint ranges such that for every $z\in U$ we have
$$P(z,w) = 0 \iff \bigl(\exists j\in \{1,\dotsc,n\}\bigr)\bigl( w = \varphi_j(z)\bigr).$$
Namely, let $\zeta_1,\dotsc,\zeta_n$ be the $n$ distinct zeros of $P(z^\ast,\cdot)$. Pick a $\delta > 0$ such that $\lvert \zeta_j - \zeta_k\rvert > 2\delta$ for all $j \neq k$, and let
$$\varepsilon_j = \inf \left\{\lvert P(z^\ast, w)\rvert : \lvert w-\zeta_j\rvert = \delta\right\},$$
and $\varepsilon = \min \{ \varepsilon_1,\dotsc,\varepsilon_n\}$. Then $\varepsilon > 0$, and by continuity, there is an open neighbourhood $U$ of $z^\ast$ such that
$$\lvert P(z,w) - P(z^\ast,w)\rvert < \varepsilon$$
for $z\in U$ and $w \in \bigcup\limits_{j=1}^n \overline{B_\delta(\zeta_j)}$. By Rouché's theorem, $P(z,\cdot)$ has exactly one zero in $B_\delta(\zeta_j)$ for each $j$, and that zero is given by
$$\varphi_j(z) = \frac{1}{2\pi i} \int_{\lvert \zeta-\zeta_j\rvert = \delta} \zeta\cdot\frac{\frac{\partial P}{\partial w}(z,\zeta)}{P(z,\zeta)}\,d\zeta.$$
Since $z_k \to z^\ast$, by dropping some initial terms, we can assume that $z_k \in U$ for all $k$, and then we have $w_k = \varphi_{j_k}(z_k)$ for a uniquely determined $j_k \in \{1,\dotsc,n\}$. There must be at least one $j$ with $j = j_k$ for infinitely many $k$, and for such a subsequence we have $w_{k_m} = \varphi_j(z_{k_m}) \to \varphi_j(z^\ast) = \zeta_j$, and $(z^\ast,\zeta_j)\in L$.
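To see the integral formula for $\varphi_j$ at work, here is a small numerical sketch in Python. The example $P(z,w) = w^2 - z$ with $z^\ast = 1$, $\zeta_1 = 1$, $\zeta_2 = -1$, $\delta = 1/2$ is my own choice for illustration, and the integral is approximated by the trapezoidal rule on the circle.

```python
import cmath


def P(z, w):
    """Illustrative example polynomial P(z, w) = w^2 - z."""
    return w * w - z


def dP_dw(z, w):
    return 2 * w


def phi(z, centre, delta, samples=256):
    """Approximate (1/(2 pi i)) * integral of zeta * P_w / P over |zeta - centre| = delta."""
    total = 0
    for m in range(samples):
        offset = delta * cmath.exp(2j * cmath.pi * m / samples)
        zeta = centre + offset
        # d zeta = i * offset * d theta, and the factor i cancels against 1/(2 pi i)
        total += zeta * dP_dw(z, zeta) / P(z, zeta) * offset
    return total / samples


z = 1.1
print(phi(z, 1, 0.5), cmath.sqrt(z))     # branch near zeta_1 = 1
print(phi(z, -1, 0.5), -cmath.sqrt(z))   # branch near zeta_2 = -1
```

For this example $\varepsilon = 3/4$, so any $z$ with $\lvert z - 1\rvert < 3/4$ lies in $U$, and the two computed values should agree with $\pm\sqrt{z}$ up to rounding.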
If $P(z^\ast,\cdot)$ has zeros with multiplicity greater than $1$, the argument needs to be modified a little. Let the distinct zeros be $\zeta_1,\dotsc,\zeta_r$, with multiplicities $\mu_1,\dotsc,\mu_r$. With the same construction as above, for all small enough $\delta > 0$, we find an open neighbourhood $U(\delta)$ of $z^\ast$ such that $P(z,\cdot)$ has exactly $\mu_j$ zeros (counting multiplicity) in $B_\delta(\zeta_j)$, and hence no zeros outside $\bigcup\limits_{j=1}^r B_\delta(\zeta_j)$ (since $\sum\limits_{\rho=1}^r \mu_\rho = n$).
Thus we find a $j$ and a subsequence of $\bigl((z_k,w_k)\bigr)$ with $\lvert w_{k_m}-\zeta_j\rvert < \delta$ for all $m$, and repeating the argument for a sequence $\delta_\ell \downarrow 0$ shows that then $w_{k_m} \to \zeta_j$, so $(z_{k_m},w_{k_m}) \to (z^\ast,\zeta_j) \in L$.
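The zero count in the discs $B_\delta(\zeta_j)$ can be checked numerically with the argument principle. The following Python sketch uses an example of my own choosing, $P(z,w) = w^3 - w^2 - z$, for which $P(0,\cdot)$ has a double zero at $\zeta_1 = 0$ and a simple zero at $\zeta_2 = 1$; the function name `count_zeros` is likewise only illustrative.

```python
import cmath


def P(z, w):
    """Illustrative example polynomial P(z, w) = w^3 - w^2 - z."""
    return w**3 - w**2 - z


def dP_dw(z, w):
    return 3 * w**2 - 2 * w


def count_zeros(z, centre, delta, samples=512):
    """Zeros of P(z, .) in B_delta(centre), counted with multiplicity,
    via a trapezoidal approximation of (1/(2 pi i)) * integral of P_w / P."""
    total = 0
    for m in range(samples):
        offset = delta * cmath.exp(2j * cmath.pi * m / samples)
        total += dP_dw(z, centre + offset) / P(z, centre + offset) * offset
    return round((total / samples).real)


z = 0.01                                  # a point close to z* = 0
print(count_zeros(z, 0, 0.3), count_zeros(z, 1, 0.3))
```

With $\delta = 0.3$ the two discs together should account for all three zeros of $P(z,\cdot)$, matching $\mu_1 + \mu_2 = 3$.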
The argument generalises to (non-constant) holomorphic maps between Riemann surfaces where all fibres have the same finite cardinality (counting multiplicities), in essentially the same way. Since all fibres have the same cardinality, $f^{-1}(y)$ is contained in a small neighbourhood of $f^{-1}(x)$ when $y$ is close to $x$, and that ensures that every sequence $(z_k)$ with $f(z_k) \to x$ has one of the points in $f^{-1}(x)$ as an accumulation point.
If the fibres don't all have the same cardinality, the map need not be proper: consider the map $f\colon z \mapsto z^3$ from the upper half-plane to $\mathbb{C}\setminus\{0\}$, where for example $f^{-1}(\overline{B_{1/2}(1)})$ is not compact.
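The counterexample can be explored numerically. In this Python sketch the helper `fibre` and the tolerance `tol` are my own illustrative choices; it lists the cube roots of $w$ lying in the open upper half-plane, and shows that points of $f^{-1}(\overline{B_{1/2}(1)})$ can approach the real axis.

```python
import cmath


def fibre(w, tol=1e-12):
    """Cube roots of w in the open upper half-plane (tol guards against rounding)."""
    r = abs(w) ** (1 / 3)
    theta = cmath.phase(w)
    roots = [r * cmath.exp(1j * (theta + 2 * cmath.pi * k) / 3) for k in range(3)]
    return [x for x in roots if x.imag > tol]


# Fibre sizes are bounded but not constant.
for w in (1, 1j, -1j):
    print(w, len(fibre(w)))

# Points 1 + i*eps lie in the closed disc of radius 1/2 around 1, yet one of
# their preimages approaches the boundary point 1 of the upper half-plane.
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, min(fibre(1 + eps * 1j), key=lambda x: x.imag))
```

The second loop produces a sequence in $f^{-1}(\overline{B_{1/2}(1)})$ with no accumulation point in the upper half-plane, which is why that preimage is not compact.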
With regard to the third edit, the statement is correct, as mentioned above. Concerning your proof, it is not necessary to demand $F(W_k) = V_y$ for all $k$; it suffices to have $V_y \subset F(W_k)$, which simplifies it a little, I think.