First a quick question regarding the definition of the axiom of choice. Do the sets have to be mutually disjoint nonempty sets or just non-empty? One source states: "For any set X of nonempty sets, there exists a choice function f defined on X." But another source states that the sets have to be mutually disjoint.
Secondly, pardon me if I sound ignorant (I'm learning this as a hobby so I don't have much background or time for it) but isn't it a really obvious/self-evident concept? I mean essentially, it is saying that if you have a collection of non-empty sets, then you can pick an element out of each set. I realize that there are difficulties when we cannot make explicit choices because we cannot create an explicit algorithm for the choice function (for example the collection of all nonempty subsets of the real line), but does that really matter?
I mean, just like the number 5, the existence of the function 'f' is purely formal. Math isn't able to fully describe or prove everything, but that doesn't mean it doesn't exist.
It doesn’t matter whether you require the sets to be pairwise disjoint or not: the two versions are equivalent. The general version trivially implies the version for pairwise disjoint sets, so only the other direction needs an argument. Suppose that you have only the version for pairwise disjoint sets, and let $\mathscr{A}$ be any set of non-empty sets. For each $A\in\mathscr{A}$ let $A'=A\times\{A\}$, and let $\mathscr{A}'=\{A':A\in\mathscr{A}\}$. Each $A'$ is non-empty because $A$ is, and if $A\ne B$ then $A'$ and $B'$ are disjoint, since every element of $A'$ has second component $A$ and every element of $B'$ has second component $B$. Thus $\mathscr{A}'$ is a set of pairwise disjoint non-empty sets, so it has a choice function $\varphi:\mathscr{A}'\to\bigcup\mathscr{A}'$ such that $\varphi(A')\in A'$ for each $A'\in\mathscr{A}'$. But then $\varphi(A')=\langle a,A\rangle$ for some $a\in A$, so the map $A\mapsto\pi(\varphi(A'))$ is a choice function for $\mathscr{A}$, where $\pi$ is the projection function that picks out the first component of an ordered pair.
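If it helps to see the tagging trick concretely, here is a small Python sketch on a finite family of overlapping sets. For finite families no axiom of choice is needed, so this only illustrates the construction, not the axiom itself. The names `tag`, `choice_on_disjoint` and `choice_function` are my own, and `choice_on_disjoint` is a hypothetical stand-in for "the version for pairwise disjoint sets"; the sets are assumed to be non-empty.

```python
from itertools import combinations

def tag(family):
    """Send each set A to A' = A x {A}, represented as pairs (a, A)."""
    return {A: frozenset((a, A) for a in A) for A in family}

def choice_on_disjoint(disjoint_family):
    """Stand-in for the disjoint version: picks one element from each set.

    It refuses to work unless the sets really are pairwise disjoint.
    """
    for X, Y in combinations(disjoint_family, 2):
        assert X.isdisjoint(Y), "sets must be pairwise disjoint"
    # Assumes every set is non-empty; any element will do.
    return {X: next(iter(X)) for X in disjoint_family}

def choice_function(family):
    """Build a choice function for an arbitrary family of non-empty sets."""
    tagged = tag(family)                               # A -> A'
    phi = choice_on_disjoint(list(tagged.values()))    # phi(A') in A'
    # Project onto the first component: A -> pi(phi(A')).
    return {A: phi[tagged[A]][0] for A in family}

# The sets overlap, but their tagged copies do not.
family = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 2, 3})]
f = choice_function(family)
assert all(f[A] in A for A in family)
```

The element 2 belongs to all three sets, yet the pairs `(2, A)` are distinct for distinct `A`, which is exactly why the tagged copies are pairwise disjoint.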
The axiom of choice does seem self-evident at first sight, but it has some consequences that are far from self-evident and indeed seem very unlikely at first sight. For instance, you might like to read about the Banach-Tarski paradoxical decomposition of the sphere. And it turns out that the axiom of choice is neither a consequence of nor in conflict with the other usual axioms of set theory: assuming those axioms are themselves consistent, they can neither prove it nor refute it. In other words, it is independent of them.