I am trying to differentiate the log marginal likelihood $\mathcal{L}$ arising from Bayesian linear regression, which is a real-valued function, with respect to the scalars $\alpha$ and $\beta$.
$$\mathcal{L}(\alpha,\beta)= -\frac{N}{2} \log{(2\pi)} - \frac{1}{2}\log{(\left | \mathbf{\Phi \alpha I \Phi^{T}} + \beta \mathbf I\right |)} - \frac{1}{2}\mathbf{y}^T(\mathbf{\Phi \alpha I \Phi^{T}} + \beta \mathbf I)^{-1}\mathbf y$$
where $\mathbf{\Phi} \in\mathbb{R}^{N\times M}$ and $\mathbf{y}\in\mathbb{R}^{N\times1}$.
So I am trying to find $\frac{\partial\mathcal{L}}{\partial\alpha}$ and $\frac{\partial\mathcal{L}}{\partial\beta}$.
I've been trying to derive $\frac{\partial\mathcal{L}}{\partial\alpha}$, but I keep getting nowhere, and I think it's due to my poor grounding in matrix calculus. Could someone help me so I can learn?
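For concreteness, $\mathcal{L}$ can be evaluated numerically like this (a NumPy sketch; the dimensions, the random seed, and the values of $\alpha$ and $\beta$ are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed and problem size
N, M = 20, 5
Phi = rng.standard_normal((N, M))
y = rng.standard_normal((N, 1))

def log_marginal_likelihood(alpha, beta):
    """Evaluate L(alpha, beta) directly from the formula above."""
    C = alpha * Phi @ Phi.T + beta * np.eye(N)       # alpha*Phi*Phi^T + beta*I
    _, logdet = np.linalg.slogdet(C)                 # stable log-determinant
    quad = (y.T @ np.linalg.solve(C, y)).item()      # y^T C^{-1} y without forming C^{-1}
    return -0.5 * N * np.log(2 * np.pi) - 0.5 * logdet - 0.5 * quad

print(log_marginal_likelihood(1.0, 0.5))
```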
For typing convenience, define a new symmetric matrix (and its differential) $$\eqalign{ X &= \alpha\Phi\Phi^T+\beta I \;=\; X^T \cr dX &= \Phi\Phi^T\,d\alpha + I\,d\beta \cr }$$ Also recall the definition of the trace/Frobenius product $$A:B={\rm Tr}(A^TB)$$ Then the log-likelihood can be written as $$L = L_0 - \tfrac{1}{2}yy^T:X^{-1} - \tfrac{1}{2}\log(\det(X))$$ where $L_0=-\tfrac{N}{2}\log(2\pi)$ collects the constant term.
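If the Frobenius product is unfamiliar, it can be checked numerically that $A:B={\rm Tr}(A^TB)=\sum_{ij}A_{ij}B_{ij}$ (a small sketch; the matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

trace_form = np.trace(A.T @ B)       # Tr(A^T B)
elementwise = np.sum(A * B)          # sum of elementwise products

print(np.isclose(trace_form, elementwise))   # True: both compute A:B
```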
Following the suggestion of user550103, start by computing the differential. $$\eqalign{ dL &= - \tfrac{1}{2}yy^T:dX^{-1} - \tfrac{1}{2}\,d\log(\det(X)) \cr &= \tfrac{1}{2}yy^T:X^{-1}\,dX\,X^{-1} - \tfrac{1}{2}X^{-1}:dX \cr &= \tfrac{1}{2}X^{-1}\Big(yy^T-X\Big)X^{-1}:dX \cr &= \tfrac{1}{2}X^{-1}\Big(yy^T-X\Big)X^{-1}:(\Phi\Phi^T\,d\alpha+I\,d\beta) \cr }$$ Setting $d\beta=0$ yields the gradient with respect to $\alpha$ $$ \frac{dL}{d\alpha} = \tfrac{1}{2}X^{-1}\Big(yy^T-X\Big)X^{-1}:\Phi\Phi^T $$ while setting $d\alpha=0$ yields $$ \frac{dL}{d\beta} = \tfrac{1}{2}X^{-1}\Big(yy^T-X\Big)X^{-1}:I $$
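As a sanity check, both closed-form derivatives can be compared against central finite differences of $L$. Below is a self-contained NumPy sketch; the problem size, seed, test point $(\alpha,\beta)$, and step size are arbitrary assumptions, and the function names `L` and `grads` are just labels for this demo:

```python
import numpy as np

rng = np.random.default_rng(0)                     # arbitrary seed / problem size
N, M = 20, 5
Phi = rng.standard_normal((N, M))
y = rng.standard_normal((N, 1))

def L(alpha, beta):
    """Log marginal likelihood from the question."""
    X = alpha * Phi @ Phi.T + beta * np.eye(N)
    _, logdet = np.linalg.slogdet(X)
    quad = (y.T @ np.linalg.solve(X, y)).item()
    return -0.5 * N * np.log(2 * np.pi) - 0.5 * logdet - 0.5 * quad

def grads(alpha, beta):
    """Closed-form dL/dalpha and dL/dbeta derived above."""
    X = alpha * Phi @ Phi.T + beta * np.eye(N)
    Xinv = np.linalg.inv(X)
    G = 0.5 * Xinv @ (y @ y.T - X) @ Xinv          # (1/2) X^{-1}(yy^T - X) X^{-1}
    dL_dalpha = np.sum(G * (Phi @ Phi.T))          # Frobenius product  G : Phi Phi^T
    dL_dbeta = np.trace(G)                         # Frobenius product with I is the trace
    return dL_dalpha, dL_dbeta

alpha, beta, h = 1.3, 0.7, 1e-6                    # arbitrary test point and step size
ga, gb = grads(alpha, beta)
fa = (L(alpha + h, beta) - L(alpha - h, beta)) / (2 * h)   # central difference in alpha
fb = (L(alpha, beta + h) - L(alpha, beta - h)) / (2 * h)   # central difference in beta
print(np.allclose([ga, gb], [fa, fb], rtol=1e-4))          # should print True
```

Note that because $X$ and $yy^T$ are symmetric, the Frobenius products reduce to an elementwise sum and a trace, which is what the last two lines of `grads` use.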