Authors usually write that $*$ is associative on a set $S$ if,
$(a*b)*c=a*(b*c)$ $\forall a,b,c \in S$
I think it should have been,
$(a*b)*c=a*(b*c)=(a*c)*b$ $\forall a,b,c \in S$
I formed all possible groupings, i.e. $(a,b)$, $(b,c)$ and $(a,c)$, so that, for example, if the set were $\mathbb R$ and the operation were multiplication "$\cdot$", I would be able to rewrite the expression in three ways,
$$1\cdot(2\cdot 3)=(1\cdot 2)\cdot 3=(1\cdot 3)\cdot 2,$$
which is actually true.
The problem with binary operations is that they are defined on just two elements, i.e. $a*b$. Therefore, when it comes to calculating $a*b*c$, we don't know how to apply the operation, because there are now three elements. This is why we use brackets and say "we do $a*b$ first, then combine the result with $c$", i.e. $(a*b)*c$. Then, in a set where associativity of the binary operation holds, we know it's okay to do $a*(b*c)$ instead of $(a*b)*c$: both groupings give the same answer, so writing $a*b*c$ without brackets is unambiguous.
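To see why the brackets matter at all, here is a small sketch using subtraction on integers, an operation that is not associative. The helper name `sub` and the values 5, 3, 2 are just my own choices for illustration.

```python
def sub(x, y):
    # Ordinary subtraction, written as a two-argument (binary) operation.
    return x - y

a, b, c = 5, 3, 2

first_grouping = sub(sub(a, b), c)   # (5 - 3) - 2
second_grouping = sub(a, sub(b, c))  # 5 - (3 - 2)

print(first_grouping)   # 0
print(second_grouping)  # 4
```

The two groupings disagree, so "$5-3-2$" only has a meaning once we agree on where the brackets go. Associativity is exactly the property that removes this ambiguity.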
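The reason $(a*c)*b$ isn't included is that it answers a different question, namely "what is $a*c*b$?". Given associativity, this is guaranteed to be the same question as "what is $a*b*c$?" when $b$ and $c$ also commute, i.e. $b*c=c*b$, and commutativity is a separate property that is not part of the definition of associativity. Your example works because multiplication on $\mathbb R$ happens to be commutative as well as associative.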
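A quick way to convince yourself is to try an operation that is associative but not commutative. The sketch below uses string concatenation as $*$; the function name `op` and the sample strings are my own, chosen only for illustration.

```python
def op(x, y):
    # String concatenation: associative, but not commutative.
    return x + y

a, b, c = "a", "b", "c"

left = op(op(a, b), c)     # (a*b)*c
right = op(a, op(b, c))    # a*(b*c)
swapped = op(op(a, c), b)  # (a*c)*b

print(left == right)    # True: both are "abc"
print(left == swapped)  # False: "abc" versus "acb"
```

So the usual definition holds for concatenation, while the proposed extra equality $(a*b)*c=(a*c)*b$ fails. Requiring it would wrongly exclude operations like this one from being called associative.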
Hope this helps.