Let $B \in \operatorname{GL}_n(\mathbb{C})$. In a paper I'm reading, the author appears to claim the following:
Lemma: To show that $B$ is diagonalizable, it suffices to prove the following: if $\lambda$ is an eigenvalue of $B$ with algebraic multiplicity $\geq 2$ and $x, y \in \mathbb{C}^n$ satisfy \begin{align*} & (B - \lambda I)x = y \\ & (B - \lambda I)y = 0, \end{align*} then $y = 0$.
Why is this true? Does it have to do with the Jordan normal form?
Basically, the condition in the lemma, which is sufficient for diagonalisability (and, as it turns out, equivalent to it), boils down to the following: $$\operatorname{ker}(B - \lambda I)^2 = \operatorname{ker}(B - \lambda I),$$ where $\operatorname{ker}$ denotes the kernel (or nullspace) of the matrix. To see this, consider $x$ in the statement of the lemma. Substituting the first equation into the second gives $(B - \lambda I)^2 x = 0$, that is, $x \in \operatorname{ker}(B - \lambda I)^2$. The lemma requires that $y = (B - \lambda I)x = 0$ in this case, that is, $x \in \operatorname{ker}(B - \lambda I)$. Thus, $\operatorname{ker}(B - \lambda I)^2 \subseteq \operatorname{ker}(B - \lambda I)$. The reverse inclusion is always true, and easy to show.
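To make the failure case concrete, here's a quick numerical sketch of my own (not from the paper) using NumPy: for a $2 \times 2$ Jordan block, the pair $x, y$ from the lemma exists with $y \neq 0$, matching the fact that a Jordan block is not diagonalisable.

```python
import numpy as np

# A 2x2 Jordan block: eigenvalue lam has algebraic multiplicity 2,
# but only a one-dimensional eigenspace -- not diagonalizable.
lam = 3.0
B = np.array([[lam, 1.0],
              [0.0, lam]])

N = B - lam * np.eye(2)   # N = B - lam*I

x = np.array([0.0, 1.0])  # a generalised eigenvector
y = N @ x                 # y = (B - lam*I) x

print(y)                  # [1. 0.] -- nonzero, so the lemma's condition fails
print(N @ y)              # [0. 0.] -- yet (B - lam*I) y = 0
```

So here $x \in \operatorname{ker}(B - \lambda I)^2$ but $x \notin \operatorname{ker}(B - \lambda I)$, exactly the situation the lemma rules out.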
Why does this condition imply diagonalisability? Well, regardless of the matrix $B$, we have the following chain of subspace inclusions: $$\lbrace 0 \rbrace \subseteq \operatorname{ker}(B - \lambda I) \subseteq \operatorname{ker}(B - \lambda I)^2 \subseteq \operatorname{ker}(B - \lambda I)^3 \subseteq \ldots$$ This is straightforward to prove: if applying $(B - \lambda I)^i$ to a vector gives $0$, then applying $(B - \lambda I)$ once more still gives $0$. Slightly less trivial to show is that once $\operatorname{ker}(B - \lambda I)^i = \operatorname{ker}(B - \lambda I)^{i+1}$, then $$\operatorname{ker}(B - \lambda I)^i = \operatorname{ker}(B - \lambda I)^{i+1} = \operatorname{ker}(B - \lambda I)^{i+2} = \ldots$$ That is, once the kernel stops growing in one step, it stops growing for good. The stabilised kernel is the generalised eigenspace of $B$ with respect to $\lambda$, provided $\lambda$ is an eigenvalue (if it isn't, all of the above kernels are trivial). This isn't too hard to prove, but I'll leave it out of the answer (I'm happy to provide a proof if you want one; it's a good exercise). So we have
$$\lbrace 0 \rbrace \subset \operatorname{ker}(B - \lambda I) \subset \operatorname{ker}(B - \lambda I)^2 \subset \ldots \subset \operatorname{ker}(B - \lambda I)^i = \operatorname{ker}(B - \lambda I)^{i+1} = \ldots$$
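If it helps, here is a small NumPy sketch of my own illustrating that chain: for a $3 \times 3$ nilpotent Jordan block, the nullities of the powers of $B - \lambda I$ grow step by step until the kernel fills the generalised eigenspace, and then stay put.

```python
import numpy as np

# A 3x3 Jordan block with eigenvalue 0; here B - lam*I = B itself.
B = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
lam = 0.0
N = B - lam * np.eye(3)

# nullity of N^i = dim ker (B - lam*I)^i, computed as 3 - rank(N^i)
nullities = [int(3 - np.linalg.matrix_rank(np.linalg.matrix_power(N, i)))
             for i in range(1, 5)]
print(nullities)   # [1, 2, 3, 3] -- strictly growing, then stable for good
```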
But what does our condition say? That we reach equality already at $i = 1$. So we have
$$\lbrace 0 \rbrace \subset \operatorname{ker}(B - \lambda I) = \operatorname{ker}(B - \lambda I)^2 = \operatorname{ker}(B - \lambda I)^3 = \ldots$$
The generalised eigenspace is therefore $\operatorname{ker}(B - \lambda I)$, which is literally the (not generalised) eigenspace of $B$ corresponding to eigenvalue $\lambda$. Every generalised eigenvector is a (not generalised) eigenvector.
Now, the generalised eigenspaces direct sum to the entirety of $\mathbb{C}^n$. Another way to see this is to look at an arbitrary Jordan basis. (Neither of these facts can I elegantly prove here.) Either way, you can form a basis of generalised eigenvectors; but since every generalised eigenvector is an eigenvector, you can form a basis of eigenvectors. It's easy to see that expressing $B$ in terms of this basis of eigenvectors makes $B$ diagonal, so $B$ is indeed diagonalisable.
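Conversely, here is a NumPy sketch of my own (again, just an illustration) of a matrix with a repeated eigenvalue that *does* satisfy the condition: the kernel stabilises at $i = 1$, the eigenvectors span $\mathbb{C}^3$, and changing to that basis diagonalises $B$.

```python
import numpy as np

# Eigenvalue 2 with algebraic multiplicity 2, eigenvalue 5 simple.
B = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
N = B - 2.0 * np.eye(3)

# ker(B - 2I) and ker(B - 2I)^2 have the same dimension: condition holds.
print(int(3 - np.linalg.matrix_rank(N)))       # 2
print(int(3 - np.linalg.matrix_rank(N @ N)))   # 2

# A basis of eigenvectors therefore exists, and V^{-1} B V is diagonal.
vals, V = np.linalg.eig(B)
D = np.linalg.inv(V) @ B @ V
print(np.round(D, 10))   # diagonal, with 2, 2, 5 on the diagonal
```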
Hope that helps!