The probability mass function of the Bernoulli distribution looks like this:
\begin{equation} p : \mathcal{X} \to [0,1], x\mapsto p(x) := \mu^{x}(1 - \mu)^{1-x} \mbox{ for } \mu \in [0,1]. \end{equation}
I am interested in its derivation. How did someone determine that it has exactly this form? Are there any mathematical steps that lead to this function?
The expression $\mu^x(1-\mu)^{1-x}$ is equal to $\mu$ when $x = 1$ and to $1-\mu$ when $x = 0$, because whichever factor has exponent $0$ becomes $1$ and drops out. It's just a more compact way of writing $P[x=1] = \mu$ and $P[x=0] = 1-\mu$:
$$ P[x=0] = \mu^0(1-\mu)^{1-0} = 1-\mu , $$
$$ P[x=1] = \mu^1(1-\mu)^{1-1} = \mu . $$
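As a quick numerical check of the two cases above, here is a minimal Python sketch. The function name `bernoulli_pmf` and the test values of $\mu$ are illustrative choices, not part of the question; the code only confirms that the compact formula agrees with the two-case definition.

```python
def bernoulli_pmf(x: int, mu: float) -> float:
    """Compact form mu**x * (1 - mu)**(1 - x), defined for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    # The exponent x switches the mu factor on, and 1 - x switches
    # the (1 - mu) factor on; the other factor becomes 1.
    return mu**x * (1 - mu) ** (1 - x)


# Compare against the piecewise definition for a few illustrative values of mu.
for mu in (0.0, 0.25, 0.5, 0.9, 1.0):
    assert abs(bernoulli_pmf(1, mu) - mu) < 1e-12
    assert abs(bernoulli_pmf(0, mu) - (1 - mu)) < 1e-12
    # The two probabilities sum to 1, as a PMF must.
    assert abs(bernoulli_pmf(0, mu) + bernoulli_pmf(1, mu) - 1.0) < 1e-12
```

Note that the boundary cases $\mu = 0$ and $\mu = 1$ rely on the convention $0^0 = 1$, which Python's `**` operator also follows.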