Does marginal likelihood on the training set always weakly increase for GPs when adding new features, irrespective of the kernel/hyperparams?


I've recently been introducing myself to Gaussian processes. In Bayesian linear regression, one would expect that adding new features weakly increases the likelihood on the training set, due to the larger number of degrees of freedom (most likely leading to overfitting, of course). I was wondering whether the same holds in general for the marginal likelihood of a GP, regardless of the choice of kernel and its specific hyperparameters. If so, what is the formal explanation? Can it be proven mathematically?
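To make the question concrete, here is a toy sketch of the comparison I have in mind (the linear kernel, the fixed noise variance, and the unit-variance weight prior are just illustrative assumptions, not part of any claimed proof): it computes the GP log marginal likelihood on the same targets before and after appending an extra feature.

```python
import numpy as np

def gp_log_marginal_likelihood(X, y, noise_var=0.1):
    """Log marginal likelihood of a zero-mean GP with a linear kernel
    K = X X^T + noise_var * I, which corresponds to Bayesian linear
    regression with a unit-variance Gaussian prior on the weights."""
    n = X.shape[0]
    K = X @ X.T + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y via two triangular solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log p(y) = -1/2 y^T K^{-1} y - 1/2 log|K| - n/2 log(2*pi)
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2 * np.pi))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))                 # 3 original features
y = X[:, 0] + 0.1 * rng.normal(size=30)      # targets depend on feature 0
X_plus = np.hstack([X, rng.normal(size=(30, 1))])  # append one extra feature

lml_small = gp_log_marginal_likelihood(X, y)
lml_large = gp_log_marginal_likelihood(X_plus, y)
print(lml_small, lml_large)
```

With the hyperparameters held fixed as above, the two values can differ in either direction, which is exactly what I am unsure about: does a weak increase only hold after re-optimizing the hyperparameters, or not at all?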

Thanks :)