On Wikipedia it says that a binary relation $R$ on a class $X$ is well-founded if every non-empty subset $S \subseteq X$ contains a minimal element, i.e. there is some $x \in S$ such that $(y,x) \notin R$ for all $y \in S$. Then if we have a set-like (for every $x \in X$ the subclass $\{ y \colon (y,x) \in R \}$ is a set), well-founded relation $R$ on a class $X$, then we can perform induction on $R$.
I'm wondering if the above definition of well-founded relation is really sufficient for induction on a class. Is it not actually necessary for every subclass of $X$ to have a minimal element?
I understand that the sketch of the proof that the induction works goes as follows. Suppose that for some proposition we have proven that for any $x \in X$ if this proposition is true for all $y \in X$ such that $(y,x) \in R$, then it is also true for $x$. Now we can prove by contradiction that the proposition is therefore true for all $x \in X$. Assume that there exists some $x \in X$ for which the proposition fails. Let $C \subseteq X$ be the non-empty subclass of $X$ consisting of all counterexamples. There is a minimal element $x \in C$ and so the proposition is true for all $(y,x) \in R$. But by our assumption if the proposotion is true for all $(y,x) \in R$, then it is also true for $x$, which contradicts that $x$ is a counterexample.
In this proof I used the stronger version of well-foundedness that says all subclasses have minimal elements. My question is if there is an alternative proof that does not need this stronger version of well-foundedness, or if the proper definition of well-founded induction on a class actually needs this stronger version.
If it is the latter then is there a name for such a relation, e.g. a strongly well-founded relation?
EDIT: I'm asking if well-foundedness in the weaker sense + set-likeness is sufficient for induction on a class, or if one needs well-foundedness in the stronger sense + set-likeness.
You don't need the stronger version of well-foundedness. The idea is that any failure of well-foundedness is in some sense "local" and can be witnessed on just a set.
Let me first sketch the argument intuitively. Suppose there is a nonempty subclass $C\subseteq X$ with no minimal element. Pick an element $c_0\in C$. Since $c_0$ is not minimal, you can pick $c_1\in C$ such that $c_1Rc_0$. Since $c_1$ is not minimal, you can pick $c_2\in C$ such that $c_2Rc_1$. Continuing recursively, you can pick a sequence $(c_n)_{n\in\mathbb{N}}$ of elements of $C$ such that $c_{n+1}Rc_n$ for each $n$. But now consider $\{c_n\}_{n\in\mathbb{N}}$. This is a set by Replacement since $\mathbb{N}$ is a set, and has no minimal element, which is a contradiction.
Now, there are some technical issues with this argument as stated. To actually carry out this recursion, we need to use the axiom of choice to choose our infinitely many $c_n$'s. Moreover, doing so requires us to fix a choice function ahead of time, which must be defined on a set (the axiom of choice only applies to sets, not classes), so we need to know ahead of time there is a set from which we will be making all our choices. The trick now is that we can use the set-like condition on $R$ to find such a set. And in fact, once we've found such a set, we can formulate the argument to avoid using the axiom of choice.
Here's how it works. After picking $c_0$, instead of picking a single $c_1\in C$ such that $c_1Rc_0$, we take the set $C_1$ of all such elements $c_1$, which is a set since $R$ is set-like. For convenience, let me give a name to the operation we just did: if $c\in C$, let $F(c)$ be the set of all $d\in C$ such that $dRc$. So we've defined $C_1=F(c_0)$
Now for the next step, we'd like to do the same thing and define $C_2=F(c_1)$, except that we don't have a single element $c_1$ but instead an entire set $C_1$ of "candidate" values for $c_1$. That's no problem; we just do it for all our candidates. That is, we define $C_2=\bigcup_{c\in C_1}F(c)$. We can continue this recursively to define a sequence $(C_n)_{n\in\mathbb{N}}$ of sets such that $C_0=\{c_0\}$ and $C_{n+1}=\bigcup_{c\in C_n}F(c)$ for each $n$.
Now if we take $S=\bigcup_{n\in\mathbb{N}} C_n$, this $S$ is a set and we can carry out our original intuitive argument by fixing a choice function on its nonempty sets ahead of time. Actually, though, we don't even need to make that argument. We can just directly observe that $S$ has no minimal element: any $c\in S$ is in some $C_n$, and then $F(c)\subseteq C_{n+1}\subseteq S$. Since $c$ is not minimal in $C$, $F(c)$ is nonempty, and any element of it witnesses that $c$ is not minimal in $S$.
Finally, let me remark that assuming the axiom of regularity, you don't even need to assume that $R$ is set-like, since you can use Scott's trick. Namely, in the argument above, instead of letting $F(c)$ be the set of all $d\in C$ such that $dRc$ (which may not be a set if $R$ is not set-like), you can let $F(c)$ be the set of all such $d$ of minimal rank. That is, let $\alpha$ be the least ordinal such that there exists $d\in C\cap V_\alpha$ such that $dRc$, and then let $F(c)$ be the set of all such $d\in C\cap V_\alpha$ (which is a set since $V_\alpha$ is a set). With this modified definition of $F$, the rest of the argument still works just as before.