Question. Do you know a specific example which demonstrates that the tensor product of monoids (as defined below) is not associative?
Let $C$ be the category of algebraic structures of a fixed type, and let us denote by $|~|$ the underlying functor $C \to \mathsf{Set}$. For $M,N \in C$ we have a functor $\mathrm{BiHom}(M,N;-) : C \to \mathsf{Set}$ which sends an object $K \in C$ to the set of bihomomorphisms $M \times N \to K$, i.e. maps $|M| \times |N| \to |K|$ which are homomorphisms in each variable when the other one is fixed. One can then show as usual that $\mathrm{BiHom}(M,N;-)$ is representable, and we call the universal bihomomorphism $M \times N \to M \otimes N$ the tensor product of $M$ and $N$. This is a straightforward generalization of the well-known case $C=\mathsf{Mod}(R)$ for a commutative ring $R$.
Actually, this is a special case of a more general tensor product in concrete categories, studied in the paper "Tensor products and bimorphisms", Canad. Math. Bull. 19 (1976) 385-401, by B. Banaschewski and E. Nelson.
Here are some examples: For $C=\mathsf{Set}$, the tensor product equals the usual cartesian product. This is also true for $C=\mathsf{Set}_*$. For $C=\mathsf{Grp}$, we get $G \otimes H \cong G^{\mathsf{ab}} \otimes_{\mathbb{Z}} H^{\mathsf{ab}}$, using the Eckmann-Hilton argument. (This differs from the "tensor product of groups" studied in the literature.) The case $C=\mathsf{CMon}$ is very similar to the well-known case $C=\mathsf{Ab}$ and is spelled out here; namely, we have internal homs and therefore a hom-tensor adjunction. The same is true for $C=\mathsf{Mod}(\Lambda)$ for a commutative algebraic monad $\Lambda$, see here, Section 5.3.
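The Eckmann-Hilton step for $C=\mathsf{Grp}$ can be sketched as follows (the names $f$, $g,g'$, $h,h'$ are introduced here only for illustration). For a bihomomorphism $f : G \times H \to K$, expand $f(gg',hh')$ in two ways, first in the second variable and then first in the first variable:

$$\begin{aligned} f(gg',hh') &= f(gg',h)\, f(gg',h') = f(g,h)\, f(g',h)\, f(g,h')\, f(g',h'), \\ f(gg',hh') &= f(g,hh')\, f(g',hh') = f(g,h)\, f(g,h')\, f(g',h)\, f(g',h'). \end{aligned}$$

Since $K$ is a group, cancelling $f(g,h)$ on the left and $f(g',h')$ on the right gives $f(g',h)\, f(g,h') = f(g,h')\, f(g',h)$. Hence any two values of $f$ commute, so $f$ takes values in an abelian subgroup of $K$ and factors through $G^{\mathsf{ab}} \times H^{\mathsf{ab}}$. In a monoid this cancellation is not available in general.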
Note that the tensor product is commutative, and that it commutes with filtered colimits in each variable. However, the case $C=\mathsf{Grp}$ shows that it does not have to commute with coproducts. In particular, tensoring with a fixed object need not be a left adjoint. Also, the free object on one generator is not a unit in general:
Let us consider $C=\mathsf{Mon}$. Then, we have
$\mathbb{N} \otimes M = M / \{ (mn)^p = m^p n^p \}_{m,n \in M, p \in \mathbb{N}}$
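A sketch of where this presentation comes from (the names $f$ and $g$ are introduced here only for illustration): a bihomomorphism $f : \mathbb{N} \times M \to K$ is determined by $g := f(1,-)$, because additivity in the first variable forces

$$f(p,m) = f(1,m)^p = g(m)^p .$$

Here $g : M \to K$ is a homomorphism, and $f(p,-)$ is a homomorphism for every $p$ exactly when

$$(g(m)\,g(n))^p = g(mn)^p = g(m)^p\, g(n)^p \quad \text{for all } m,n \in M,\ p \in \mathbb{N}.$$

So bihomomorphisms $\mathbb{N} \times M \to K$ correspond to homomorphisms out of the displayed quotient of $M$. For instance, if $M$ is the free monoid on two generators $a,b$, then $(ab)^2 \neq a^2 b^2$ in $M$, whereas all elements $x,y$ of $\mathbb{N} \otimes M$ satisfy $(xy)^2 = x^2 y^2$, so $\mathbb{N} \otimes M \not\cong M$.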
The usual proof of the associativity of the tensor product breaks down: There is a map $\beta : M \times (N \otimes K) \to (M \otimes N) \otimes K$ mapping $(m, n \otimes k) \mapsto (m \otimes n) \otimes k$, which is a homomorphism in the second variable. But what about the first variable? The equation $\beta(mm',t) = \beta(m,t) \beta(m',t)$ is clear if $t \in N \otimes K$ is a pure tensor. But for $t=(n \otimes k) (n' \otimes k')$ we end up with the unlikely equation
$((m \otimes n) \otimes k) ((m' \otimes n) \otimes k) ((m \otimes n') \otimes k') ((m' \otimes n') \otimes k')$ $=((m \otimes n) \otimes k) ((m \otimes n') \otimes k') ((m' \otimes n) \otimes k) ((m' \otimes n') \otimes k')$
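For reference, the two sides arise as follows, using only that $\beta$ is a homomorphism in the second variable and that the universal maps $\otimes$ are bihomomorphisms:

$$\begin{aligned} \beta(mm',t) &= ((mm' \otimes n) \otimes k)\,((mm' \otimes n') \otimes k') \\ &= \big((m \otimes n)(m' \otimes n) \otimes k\big)\,\big((m \otimes n')(m' \otimes n') \otimes k'\big) \\ &= ((m \otimes n) \otimes k)\,((m' \otimes n) \otimes k)\,((m \otimes n') \otimes k')\,((m' \otimes n') \otimes k'), \\ \beta(m,t)\,\beta(m',t) &= ((m \otimes n) \otimes k)\,((m \otimes n') \otimes k')\,((m' \otimes n) \otimes k)\,((m' \otimes n') \otimes k'). \end{aligned}$$

The two products differ only in the order of the two middle factors, so the equation would follow if $(m' \otimes n) \otimes k$ and $(m \otimes n') \otimes k'$ commuted, and there is no evident reason for that in a general monoid.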
This does not exactly answer your question, but it should be pointed out that in some situations such as groups, Lie algebras, ... one wants to consider other kinds of tensor products in which the key notion is that of a biderivation. An example of this is the commutator map $[\; ,\; ]: M \times N \to G$ where $M,N$ are normal subgroups of the group $G$. See a bibliography on this nonabelian tensor product with 120 items.