I am getting confused over index and matrix notation as it applies to the derivative with respect to a vector. For example, given the regularization term $\frac{\lambda}{2} \theta^T \theta$, where $\theta$ is an $n \times 1$ vector, converting this to index notation I get:
$$ \begin{align} \frac{\lambda}{2} \theta^T \theta &= \frac{\lambda}{2} \begin{bmatrix} \theta_1 & \theta_2 & \cdots & \theta_n \end{bmatrix} \begin{bmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \end{bmatrix} \\ &= \frac{\lambda}{2} \big( \theta_1^2 + \theta_2^2 + \cdots + \theta_n^2 \big) \\ &= \frac{\lambda}{2} \sum_{i=1}^{n} \theta_i^2 \end{align} $$
Now if I want to find $\frac{d}{d\theta} \frac{\lambda}{2} \theta^T \theta$ but using index notation, it seems that this would be:
$$ \begin{align} \frac{d}{d\theta} \frac{\lambda}{2} \theta^T \theta &= \frac{d}{d\theta} \frac{\lambda}{2} \sum_{i=1}^{n} \theta_i^2 \\ &= \lambda \sum_{i=1}^{n} \theta_i \end{align} $$
Where I get confused is that if I were to take the derivative using matrix notation, I believe I would get something like $\lambda \theta$. However, $\lambda \theta$ is an $n \times 1$ vector while $\lambda \sum_{i=1}^{n} \theta_i$ is a scalar.
Where am I going wrong? Is it because I am taking the dot product when I convert to index notation? I would greatly appreciate help in clarifying this.
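Your second calculation is wrong. When you write $\frac{d}{d\theta}$, that is actually $n$ separate partial derivatives, usually written in a row, i.e. $\frac{d}{d\theta}=\left(\frac{\partial}{\partial\theta_1},\dots,\frac{\partial}{\partial\theta_n}\right)$. The $j$-th component is $\frac{\partial}{\partial\theta_j}\frac{\lambda}{2}\sum_{i=1}^{n}\theta_i^2=\lambda\theta_j$, because only the $i=j$ term of the sum depends on $\theta_j$; the sum does not survive in the result. So the correct result is $\frac{d}{d\theta}\frac{\lambda}{2}\theta^T\theta=\lambda\theta^T$, which is a vector, not a scalar.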
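As a quick sanity check, the componentwise derivative can be estimated numerically. The following is only a minimal sketch in plain Python: the helper names `reg` and `numerical_gradient`, the step size `eps`, and the sample values of `lam` and `theta` are all assumptions made for illustration. It compares a central-difference estimate of each $\frac{\partial}{\partial\theta_j}$ with $\lambda\theta_j$, and also computes the scalar $\lambda\sum_i\theta_i$ from the question for contrast.

```python
def reg(theta, lam):
    """Regularization term (lam / 2) * theta^T theta, with theta as a list."""
    return 0.5 * lam * sum(t * t for t in theta)


def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference estimate of (df/dtheta_1, ..., df/dtheta_n)."""
    grad = []
    for j in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[j] += eps    # perturb only the j-th component
        minus[j] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


# Illustrative values (assumptions, not taken from the question).
lam = 0.5
theta = [1.0, -2.0, 3.0]

grad = numerical_gradient(lambda t: reg(t, lam), theta)
analytic = [lam * t for t in theta]   # lambda * theta, one entry per component
scalar_guess = lam * sum(theta)       # lambda * sum_i theta_i from the question

print("numerical gradient:", grad)
print("lambda * theta:    ", analytic)
print("lambda * sum theta:", scalar_guess)
```

The numerical gradient has $n$ entries and should agree with $\lambda\theta$ up to rounding error, whereas $\lambda\sum_i\theta_i$ is a single number and cannot be the derivative with respect to the whole vector.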
Usually the convention is to write the derivative with respect to a column vector as a row vector, and vice versa. But that may depend on the literature you are reading (physicists in particular are often not careful about distinguishing row and column vectors).
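To make the two conventions concrete for this example, here is the same derivative written both ways. The labels "row convention" and "column convention" are just descriptive names chosen here (in the matrix-calculus literature they are commonly called numerator layout and denominator layout, respectively); the entries are identical and only the shape differs:

$$ \begin{align} \text{row convention:} \quad \frac{d}{d\theta} \frac{\lambda}{2} \theta^T \theta &= \begin{bmatrix} \lambda\theta_1 & \lambda\theta_2 & \cdots & \lambda\theta_n \end{bmatrix} = \lambda \theta^T \\ \text{column convention:} \quad \frac{d}{d\theta} \frac{\lambda}{2} \theta^T \theta &= \begin{bmatrix} \lambda\theta_1 \\ \lambda\theta_2 \\ \vdots \\ \lambda\theta_n \end{bmatrix} = \lambda \theta \end{align} $$

Either way the result has $n$ components, so the $\lambda \theta$ from the question is the same answer as $\lambda \theta^T$ under the other convention.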