I am trying to recover the definition of ‘monad’ from a more basic setting. (The reason for doing so, is that this would nicely illustrate the "necessity" of the concept of a 'monad'.)
Let $\mathcal{C}$ be any given, fixed category. Let $T$ be an endofunctor on $\mathcal{C}$. Define the "Kleisli wannabe-category" $\mathcal{C}_T$ as having the same objects as $\mathcal{C}$, and as arrows from $X$ to $Y$ in $\mathcal{C}_T$ it has precisely the arrows from $X$ to $TY$ in $\mathcal{C}$.
Our aim is the following: We assume that we are given all the ingredients to make $\mathcal{C}_T$ into an actual category (i.e., (a) we have an identity arrow for each object, and (b) the composition of any two arrows in the Kleisli category is defined and associative), and we wish to derive from this that there must be a monad at play.
To be precise, let us assume that we have an operation $$\circ_T : \mathcal{C}_T ( Y,Z) \times \mathcal{C}_T ( X,Y) \to \mathcal{C}_T ( X,Z) \qquad \qquad (\ast)$$ which is natural in $X,Y,Z$ (this represents the composition in the Kleisli category) [EDIT: as was pointed out in the comments, for technical reasons we must additionally assume that it is dinatural in $Y$] and satisfies associativity, and furthermore, there is an identity arrow for each object (so an arrow $\eta_X : X \to TX$ in $\mathcal C$ for each object $X$ which when composed in the Kleisli category with any other arrow leaves it untouched).
Recall the following definition:
Definition: A Kleisli triple consists of a category $\mathcal{C}$, a mapping of the objects of $\mathcal C$ given by $T : \mathcal{C}\to \mathcal{C}$ , $\eta_A : A \to TA$ for each object $A$ of $\mathcal C$, and $(-)^* : \mathsf{Hom}(A,TB)\to\mathsf{Hom}(TA,TB)$ for all objects $A, B$ of $\mathcal{C}$, satisfying
(1) $\eta_A^* = \mbox{Id}_{TA}$,
(2) $f^*\circ\eta_A = f$, and
(3) $f^* \circ g^* = (f^* \circ g)^*$.
It is well known (see Eugenio Moggi's Notions of Computation and Monads) that there is a one-one correspondence between Kleisli triples and monads. Hence it suffices for us to come up with a Kleisli triple.
This has been my approach so far: If we instantiate $(\ast)$ for $X=TA, Y=A, Z=B$, then we get $$ \circ_T : \mathcal C ( A,TB) \times \mathcal C ( TA,TA) \to \mathcal C ( TA,TB), $$ so for any $f : A \to TB$ we can define $f^* : TA\to TB$ by $$ f^* = f \circ_T \mbox{Id}_{TA}. $$ Then we indeed get that (1) is satisfied, as $ \eta_A^* = \eta_A \circ_T \mbox{Id}_{TA}= \mbox{Id}_{TA}. $ To show (2), we have to prove that $$ f^* \circ \eta_A = (f \circ_T \mbox{Id}_{TA} ) \circ \eta_A $$ is equal to $f$. However, I haven't been able to prove this (nor point (3)).
Any help would be much appreciated! Thanks in advance.
Naturality of $\circ$ in X says that for $\alpha : X' \to X$, $f: Y \to Z$ and $g : X \to Y$, $(f \circ_T g) \circ \alpha = f \circ_T (g \circ \alpha)$. Basically, we have something that looks like associativity when both kinds of composition are mixed. You should try to write down what the corresponding equations are for $Y$ and $Z$. This will involve $T$ more directly.
Applying that, we get $(f \circ_T \mbox{Id}_{TA} ) \circ \eta_A = f \circ_T (\mbox{Id}_{TA} \circ \eta_A)$. This is equal to $f \circ_T \eta_A$ since $\mbox{Id}_{TA}$ is the identity for normal composition. Then $\eta_A$ is (by hypothesis) the identity for $\circ_T$, so this results in $f$.
$f^* \circ g^* = (f^* \circ g)^*$ can be proven in a similar way using naturality in $X$ as well as associativity of $\circ_T$. Unpack the definitions and simplify using the identity for $\circ$.
Dinaturality/extranaturality in $Y$ (they're the same here) of composition is not necessary to get a monad, but the action of T on morphisms induced by $(-)^*$ and the original action of T on morphisms may not match otherwise. In other words, we want $T(f) = (\eta_Y \circ f)^*$. Using our definition of $(-)^*$, this is equivalent to $T(f) = (\eta_Y \circ f) \circ_T \mbox{Id}_{TX}$, which follows from extranaturality in $Y$: $(\eta_Y \circ f) \circ_T \mbox{Id}_{TX} = \eta_Y \circ_T (T(f) \circ \mbox{Id}_{TX}) = T(f)$.
Naturality in $Z$ should follow from everything else. I've checked this by using $T(f) = (\eta_Y \circ f)^*$, $f \circ_T g = f^* \circ g$ and the equations (1) - (3).