So the definition I know for metric compatibility is:
$$Xg(Y,Z)=g(\nabla_XY,Z)+g(Y,\nabla_XZ),$$
which makes sense, since $g(Y,Z)$ is a smooth function from the manifold to the reals and we think of $X$ as a derivation. I have now read an apparently equivalent definition, namely $\nabla g=0$. Can someone explain what this means? How can I apply $\nabla$ to $g$? I thought $g_p$ is an element of $T_p^*M\otimes T_p^*M$ at every point. Furthermore, once the meaning is explained, can you show me that these two definitions are indeed equivalent?
It seems you are missing some necessary background, namely, how to extend a connection on $TM$ to all tensor bundles. I will summarise this construction.
Given a connection
\begin{align*} \nabla : \Gamma(TM) \times \Gamma(TM) &\to \Gamma(TM)\\ (X, Y) &\mapsto \nabla_XY \end{align*}
on $TM$, there is an associated connection (which I will also denote $\nabla$) on $T^*M$ given by
\begin{align*} \nabla : \Gamma(TM) \times \Gamma(T^*M) &\to \Gamma(T^*M)\\ (X, \alpha) &\mapsto \nabla_X\alpha \end{align*}
where $(\nabla_X\alpha)(Y) := X(\alpha(Y)) - \alpha(\nabla_XY)$. With this definition, together with the definition $\nabla_Xf = Xf$ for a smooth function $f$, we see that the following identity holds:
$$\nabla_X(\alpha(Y)) = (\nabla_X\alpha)(Y) + \alpha(\nabla_XY).$$
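It is worth checking that $\nabla_X\alpha$ really is a one-form, i.e. that it is $C^\infty(M)$-linear in $Y$. A short verification, using only the Leibniz rule $\nabla_X(fY) = (Xf)Y + f\nabla_XY$ for the connection on $TM$ and a smooth function $f$:
\begin{align*} (\nabla_X\alpha)(fY) &= X(f\alpha(Y)) - \alpha(\nabla_X(fY))\\ &= (Xf)\alpha(Y) + fX(\alpha(Y)) - (Xf)\alpha(Y) - f\alpha(\nabla_XY)\\ &= f(\nabla_X\alpha)(Y). \end{align*}
As an illustration in local coordinates (the Christoffel symbols $\Gamma^k_{ij}$, defined by $\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\partial_k$, are notation I am assuming here, with summation over repeated indices), the definition gives
$$(\nabla_{\partial_i}dx^k)(\partial_j) = \partial_i(\delta^k_j) - dx^k(\Gamma^l_{ij}\partial_l) = -\Gamma^k_{ij}, \qquad \text{so} \qquad \nabla_{\partial_i}dx^k = -\Gamma^k_{ij}\,dx^j.$$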
More generally, given a $(p, q)$-tensor $T$ (here meaning a tensor that takes $p$ vector fields and $q$ one-forms as arguments), we implicitly define the covariant derivative $\nabla_XT$, which is again a $(p, q)$-tensor, by the following equation:
\begin{align*} \nabla_X(T(Y_1, \dots, Y_p, \alpha_1, \dots, \alpha_q)) =&\ (\nabla_XT)(Y_1, \dots, Y_p, \alpha_1, \dots, \alpha_q)\\ &+ \sum_{i=1}^pT(Y_1, \dots, \nabla_XY_i, \dots, Y_p, \alpha_1, \dots, \alpha_q)\\ &+ \sum_{j=1}^qT(Y_1, \dots, Y_p, \alpha_1, \dots, \nabla_X\alpha_j, \dots, \alpha_q).\qquad (\ast) \end{align*}
One could instead consider the covariant derivative of $T$ as a $(p+1, q)$-tensor $\nabla T$ given by
$$(\nabla T)(X, Y_1, \dots, Y_p, \alpha_1, \dots, \alpha_q) := (\nabla_XT)(Y_1, \dots, Y_p, \alpha_1, \dots, \alpha_q).$$
Now, $g$ is a $(2, 0)$-tensor in this convention, since it takes two vector fields as arguments. So if $Y$ and $Z$ are vector fields, $g(Y, Z)$ is a smooth function and hence
$$\nabla_X(g(Y, Z)) = X(g(Y, Z)).$$
On the other hand, by $(\ast)$,
$$\nabla_X(g(Y, Z)) = (\nabla_Xg)(Y, Z) + g(\nabla_XY, Z) + g(Y, \nabla_XZ).$$
Using these two equations, we see that
$$(\nabla g)(X, Y, Z) = (\nabla_X g)(Y, Z) = X(g(Y, Z)) - g(\nabla_XY, Z) - g(Y, \nabla_XZ).$$
So $\nabla$ is compatible with the metric $g$ if and only if $(\nabla g)(X, Y, Z) = 0$ for all vector fields $X, Y, Z$ (i.e. $\nabla g = 0$).
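To make the condition concrete, here is a sketch in local coordinates. The notation $g_{ij} = g(\partial_i, \partial_j)$ and $\nabla_{\partial_i}\partial_j = \Gamma^l_{ij}\partial_l$ is an assumption on my part (with summation over repeated indices). Taking $X = \partial_k$, $Y = \partial_i$, $Z = \partial_j$ in the formula for $\nabla g$ gives
$$(\nabla g)(\partial_k, \partial_i, \partial_j) = \partial_kg_{ij} - \Gamma^l_{ki}g_{lj} - \Gamma^l_{kj}g_{il},$$
so $\nabla g = 0$ says precisely that $\partial_kg_{ij} = \Gamma^l_{ki}g_{lj} + \Gamma^l_{kj}g_{il}$ for all $i, j, k$.

As a sanity check, take the Euclidean plane in polar coordinates $(r, \theta)$ with its Levi-Civita connection, where $g_{rr} = 1$, $g_{\theta\theta} = r^2$, $g_{r\theta} = 0$, and the only non-zero Christoffel symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}$. Two representative components are
\begin{align*} (\nabla g)(\partial_r, \partial_\theta, \partial_\theta) &= \partial_r(r^2) - 2\,\Gamma^\theta_{r\theta}\,g_{\theta\theta} = 2r - 2\cdot\tfrac{1}{r}\cdot r^2 = 0,\\ (\nabla g)(\partial_\theta, \partial_r, \partial_\theta) &= \partial_\theta(0) - \Gamma^\theta_{\theta r}\,g_{\theta\theta} - \Gamma^r_{\theta\theta}\,g_{rr} = 0 - r + r = 0, \end{align*}
and the remaining components vanish in the same way.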