I am trying to find the gradient of $x^Ty$ with respect to $x$, where both $x$ and $y$ are column vectors in $\mathbb{R}^m$.
Is it
$\frac{\partial x^Ty}{\partial x} = \left[\frac{\partial x^Ty}{\partial x_1}, \frac{\partial x^Ty}{\partial x_2}, \dots, \frac{\partial x^Ty}{\partial x_m}\right] = [y_1, y_2, \dots, y_m] = y^T$
or is it
$\frac{\partial x^Ty}{\partial x} = \left[\frac{\partial x^Ty}{\partial x_1}, \frac{\partial x^Ty}{\partial x_2}, \dots, \frac{\partial x^Ty}{\partial x_m}\right]^T = [y_1, y_2, \dots, y_m]^T = y$?
I am quite confused since I haven't been exposed to multivariate calculus before.
There are two conventions; as long as you remain consistent, either is fine.
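From the Wikipedia page on matrix calculus: for the derivative of a scalar with respect to a column vector, the row-vector result ($y^T$ here) is known as the numerator layout, while the column-vector result ($y$ here) is known as the denominator layout.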
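Either way, the entries themselves are not in question. Since $x^Ty = \sum_{i=1}^m x_i y_i$, the partial derivative with respect to $x_k$ is just $y_k$; the two conventions differ only in whether those $m$ numbers are arranged as a row or as a column. Here is a minimal sketch in plain Python that checks this numerically. The helper names (`dot`, `numerical_partials`) and the sample vectors are made up for illustration, and the central-difference step `h` is an arbitrary choice.

```python
def dot(x, y):
    """Inner product x^T y for two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def numerical_partials(f, x, h=1e-6):
    """Central-difference estimate of each partial derivative of f at x."""
    partials = []
    for k in range(len(x)):
        x_plus = list(x)
        x_plus[k] += h
        x_minus = list(x)
        x_minus[k] -= h
        partials.append((f(x_plus) - f(x_minus)) / (2 * h))
    return partials


# Hypothetical sample vectors in R^3.
x = [1.0, -2.0, 0.5]
y = [3.0, 4.0, -1.0]

# Partial derivatives of x^T y with respect to x_1, ..., x_m.
partials = numerical_partials(lambda v: dot(v, y), x)

# Row arrangement: a single row of m entries (shape 1 x m), matching y^T.
row_form = [partials]

# Column arrangement: m rows of one entry each (shape m x 1), matching y.
column_form = [[p] for p in partials]

print(row_form)
print(column_form)
```

Both printed results hold the same numbers, approximately the entries of $y$; only the shape differs, which is why the choice is a convention rather than a matter of right or wrong.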