Let $C_1$ and $C_2$ be two algebraic curves in $\mathbb CP^2$, of degrees $n$ and $m$. They both realize homology classes in $H_2(\mathbb CP^2)$, namely $[C_1] = n \cdot [\mathbb CP^1]$ and $[C_2] = m \cdot [\mathbb CP^1]$.
If the curves intersect transversely, then we have that $n \cdot m=[C_1]^* \smile [C_2]^*=[C_1 \cap C_2]^* \in H^4(\mathbb CP^2)$.
By dualizing, one sees that $n \cdot m=[C_1 \cap C_2] \in H_0(\mathbb CP^2)$, which proves Bézout's theorem in the case of transverse intersections.
Can one generalize this argument to curves intersecting non-transversely?
My guess is that since this number depends only on the homology classes, one can perturb such cases in order to reduce to this proof.
This is a good question, and you're right that you can perturb the curves to make them transverse without changing the intersection number. The crucial question is why doing this changes a multiplicity $n$ intersection (an $(n-1)$-fold tangency) into $n$ positive transverse intersections. The point is that near an intersection point of curves in 2-space, we may model the intersection as the zero set of a holomorphic map $\Bbb C \to \Bbb C$ defined near zero. Nonconstant holomorphic maps vanishing at the origin are, up to a coordinate change, given by $z \mapsto z^k$; a multiplicity $n$ intersection corresponds to the map $z \mapsto z^n$. Now consider the homotopy $f_t(z) = z^n - t$, for $t \in [0, \varepsilon]$. For $t > 0$, this has $n$ distinct zeroes $t^{1/n} e^{2\pi i k/n}$, $k = 0, \dots, n-1$. All of them are transverse, since $f_t'(z) = n z^{n-1} \neq 0$ there, and positive (zeroes of holomorphic functions always are).
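As a quick numerical sanity check of the local model, here is a minimal Python sketch. The function names and the sample values $n = 5$, $t = 10^{-3}$ are my own choices for illustration. It lists the zeroes of $z^n - t$ and computes the local sign at each one, using the fact that the real Jacobian determinant of a holomorphic map $f$ at $z$ is $|f'(z)|^2$.

```python
import numpy as np

def perturbed_zeros(n, t):
    """Zeros of f_t(z) = z**n - t, i.e. t**(1/n) * exp(2*pi*i*k/n) for k = 0..n-1."""
    k = np.arange(n)
    return t ** (1.0 / n) * np.exp(2j * np.pi * k / n)

def local_sign(n, z):
    """Sign of the real Jacobian determinant of z -> z**n - t at z.

    For a holomorphic map this determinant equals |f'(z)|**2,
    so it is positive wherever f'(z) = n * z**(n-1) is nonzero.
    """
    deriv = n * z ** (n - 1)
    return int(np.sign(abs(deriv) ** 2))

# Illustrative values (assumptions, not taken from the argument above).
n, t = 5, 1e-3
zeros = perturbed_zeros(n, t)
residuals = np.abs(zeros ** n - t)          # how close each point is to a true zero
signs = [local_sign(n, z) for z in zeros]   # +1 for each positive transverse zero

print("zeros:", zeros)
print("max residual:", residuals.max())
print("local signs:", signs, "sum:", sum(signs))
```

The sum of the local signs is $n$, matching the multiplicity of the original zero of $z^n$.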
Now observe that $f_t$ has no zeroes outside of $|z|\leq t^{1/n}$, so we may 'dampen out the effects' of the homotopy away from the origin to get a compactly supported homotopy from $z^n$ to a function $f(z)$ which is $z^n - t$ for small $|z|$ and $z^n$ for large $|z|$ (and the homotopy is through functions identical to $z^n$ for large $|z|$); I don't want to write out a formula, but it is not hard to see this is possible. In particular, we may change a non-transverse multiplicity $n$ zero to $n$ positive transverse zeroes without changing the curve far away from the zero in question.
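For what it's worth, here is one way the damping could be written down, as a sketch under assumptions of my own: $\rho$ is a hypothetical smooth cutoff function with $0 \le \rho \le 1$, $\rho(r) = 1$ for $r \le 1$ and $\rho(r) = 0$ for $r \ge 2$, and $\varepsilon < 1$. Set

$$g_t(z) = z^n - t\,\rho(|z|), \qquad t \in [0, \varepsilon].$$

Then $g_t(z) = z^n - t$ for $|z| \le 1$ and $g_t(z) = z^n$ for $|z| \ge 2$. If $g_t(z) = 0$, then $|z|^n = t\,\rho(|z|) \le t < 1$, so $|z| < 1$, where $\rho = 1$. Hence the zeroes of $g_t$ are exactly the $n$-th roots of $t$, and no new zeroes appear in the region where the cutoff is active. Note that $g_t$ is only smooth, not holomorphic, on the annulus $1 < |z| < 2$, which is fine for counting intersections.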
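Doing this at each non-transverse intersection point, your argument shows that the perturbed curves meet in $nm$ points, so the original curves meet in $nm$ points counted with multiplicity.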