Suppose that $u$ is a function on space-time. We can define a vector field with components $\nabla^a u = g^{ab}\partial_b u$.
Why have they chosen $g^{ab}$ and not $g_{ab}$ here?
The wave operator sends $u$ to
$$\square u = \nabla_a(\nabla^a u)$$
Why do they use a superscript $a$ first and then a subscript $a$ here?
In special and general relativity, whether an index is a subscript or a superscript matters. We replace the invariant infinitesimal squared length of Euclidean space, which in Cartesian coordinates is $\sum_a dx_a^2$, with a more general quadratic form, which requires us to introduce a square matrix: $ds^2=g_{ab}dx^adx^b$. This matrix $g_{ab}$ is called the metric tensor. From it we get something that satisfies all the inner product axioms except positive definiteness; in fact the positive and negative eigenvalues of $g_{ab}$ correspond to space and time (which way round depends on which of the two sign conventions you use). You always sum over two matching indices at different heights; this is called contraction, and it is why the wave operator pairs a lower $a$ with an upper $a$. You can use the metric or its inverse to change the index height, as in your first equation, where the inverse metric $g^{ab}$ raises the index on $\partial_b u$ to give a vector. Note in particular that $\partial_b u=g_{bc}\nabla^c u$.
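
As a concrete illustration, here is a sketch in flat Minkowski space. The choices below are assumptions made for the example, not something fixed by your book: coordinates $(t,x,y,z)$, units with $c=1$, and the sign convention in which time carries the minus sign. Then
$$g_{ab}=\operatorname{diag}(-1,1,1,1),\qquad g^{ab}=\operatorname{diag}(-1,1,1,1),$$
and raising the index on $\partial_b u$ gives
$$\nabla^a u=g^{ab}\partial_b u=(-\partial_t u,\ \partial_x u,\ \partial_y u,\ \partial_z u).$$
In these coordinates the metric components are constant, so the covariant derivative of a vector reduces to the ordinary partial derivative, and contracting the lower index with the upper one yields
$$\square u=\nabla_a(\nabla^a u)=\partial_a\left(g^{ab}\partial_b u\right)=-\partial_t^2 u+\partial_x^2 u+\partial_y^2 u+\partial_z^2 u.$$
With the opposite sign convention every term changes sign, but the structure is the same: the inverse metric raises one index, and the contraction then pairs it with a lower one.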