Optimal design for a constrained Bayesian slope-intercept model


Here is a problem I've been stuck on for quite a while. Consider the model \begin{equation} \mathbf{y}=\mathbf{H}\pmb{ \theta }+\pmb{\epsilon }. \end{equation} The design matrix is given by: \begin{equation} \mathbf{H}=\left( \begin{array}{cc} 1&t_{1}\\ \vdots&\vdots\\ 1&t_{m}\\ \end{array} \right), \end{equation} where $0\leq t_j\leq T$ for $j=1,\dots,m$.

The parameter vector is distributed as: \begin{equation} \pmb{\theta }\sim \mathcal{N}\left(\pmb{\mu},\mathbf{C}_{\theta}\right), \end{equation} where \begin{equation} \mathbf{C}_{\theta}=\left(\begin{array}{cc} \delta_1^2&0\\ 0&\delta_2^2\\ \end{array}\right). \end{equation} Assume that the errors are distributed as follows: \begin{equation} \pmb{\epsilon }\sim \mathcal{N}\left(\pmb{0},\mathbf{C}_{\epsilon}\right), \end{equation} where \begin{equation} \mathbf{C}_{\epsilon}=\sigma^2\mathbf{I}. \end{equation}

For this model, the inverse of the between-subjects covariance matrix is given by: \begin{equation} \mathbf{C}_{\theta}^{-1}=\left(\begin{array}{cc} \frac{1}{\delta_1^2}&0\\ 0&\frac{1}{\delta_2^2}\\ \end{array}\right). \end{equation} Furthermore, the response BMSE (i.e., the Bayesian mean squared error of a model prediction at some $t_{j^*}$, which also lies between $0$ and $T$) is given by

\begin{equation} \mathbf{M}_{\hat{f}_{j^*}}=\mathbf{h}'\left(\mathbf{C}_{\theta}^{-1}+\mathbf{H}'\mathbf{C}_{\epsilon}^{-1}\mathbf{H}\right)^{-1}\mathbf{h}, \end{equation} where $\mathbf{h}$ is the covariate vector at the time at which we want to make predictions: \begin{equation} \mathbf{h}=\left(\begin{array}{c} 1\\ t_{j^*}\\ \end{array}\right), \end{equation} and $j^*$ denotes the index of a response $y_{j^*}$ at some future time $t_{j^*}$.
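
For reference, because $\mathbf{C}_{\epsilon}=\sigma^2\mathbf{I}$, the matrix being inverted is only $2\times 2$ and can be written out explicitly. Using the shorthand $S_1=\sum_{j=1}^{m}t_j$ and $S_2=\sum_{j=1}^{m}t_j^2$ (my notation, introduced here only for convenience), \begin{equation} \mathbf{C}_{\theta}^{-1}+\mathbf{H}'\mathbf{C}_{\epsilon}^{-1}\mathbf{H}=\left(\begin{array}{cc} \frac{1}{\delta_1^2}+\frac{m}{\sigma^2}&\frac{S_1}{\sigma^2}\\ \frac{S_1}{\sigma^2}&\frac{1}{\delta_2^2}+\frac{S_2}{\sigma^2}\\ \end{array}\right), \end{equation} so applying the standard $2\times 2$ inverse gives \begin{equation} \mathbf{M}_{\hat{f}_{j^*}}=\frac{\left(\frac{1}{\delta_2^2}+\frac{S_2}{\sigma^2}\right)-2t_{j^*}\frac{S_1}{\sigma^2}+t_{j^*}^2\left(\frac{1}{\delta_1^2}+\frac{m}{\sigma^2}\right)}{\left(\frac{1}{\delta_1^2}+\frac{m}{\sigma^2}\right)\left(\frac{1}{\delta_2^2}+\frac{S_2}{\sigma^2}\right)-\frac{S_1^2}{\sigma^4}}. \end{equation} In this form the design enters only through $S_1$ and $S_2$, which may make the $m$-point case easier to handle than working with the individual $t_j$.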

I would like to prove that when \begin{equation} \frac{t_{j^*}(\frac{\sigma^2}{m}+\delta_1^2)}{\delta_1^2}>T, \end{equation} then $\mathbf{M}_{\hat{f}_{j^*}}$ is minimized inside the feasible region by collecting data at times such that $t_1=\dots=t_m=T$.
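
As a numerical sanity check (not a proof), here is a minimal Python sketch that evaluates the BMSE for a given design and compares the candidate design $t_1=\dots=t_m=T$ against randomly drawn designs. The function names `bmse`, `condition_holds`, and `best_random_design`, as well as the parameter values in the demo, are my own choices for illustration and are not part of the problem statement.

```python
import random


def bmse(times, t_star, sigma2, delta1_sq, delta2_sq):
    """Response BMSE h'(C_theta^{-1} + H' C_eps^{-1} H)^{-1} h for the two-parameter model."""
    m = len(times)
    s1 = sum(times)
    s2 = sum(t * t for t in times)
    # Entries of the 2x2 matrix C_theta^{-1} + H'H / sigma^2.
    a = 1.0 / delta1_sq + m / sigma2
    b = s1 / sigma2
    c = 1.0 / delta2_sq + s2 / sigma2
    det = a * c - b * b
    # h' A^{-1} h with h = (1, t_star)', using the explicit 2x2 inverse.
    return (c - 2.0 * t_star * b + t_star ** 2 * a) / det


def condition_holds(t_star, T, m, sigma2, delta1_sq):
    """The inequality t_star * (sigma^2/m + delta_1^2) / delta_1^2 > T from the question."""
    return t_star * (sigma2 / m + delta1_sq) / delta1_sq > T


def best_random_design(m, T, t_star, sigma2, delta1_sq, delta2_sq,
                       n_trials=20000, seed=0):
    """Return (value, design) with the smallest BMSE among random designs in [0, T]^m."""
    rng = random.Random(seed)
    best_value, best_design = None, None
    for _ in range(n_trials):
        design = [rng.uniform(0.0, T) for _ in range(m)]
        value = bmse(design, t_star, sigma2, delta1_sq, delta2_sq)
        if best_value is None or value < best_value:
            best_value, best_design = value, design
    return best_value, best_design


if __name__ == "__main__":
    # Illustrative (assumed) parameter values.
    m, T, t_star = 3, 1.0, 1.0
    sigma2, delta1_sq, delta2_sq = 1.0, 1.0, 1.0
    print("condition holds:", condition_holds(t_star, T, m, sigma2, delta1_sq))
    print("BMSE with all t_j = T:", bmse([T] * m, t_star, sigma2, delta1_sq, delta2_sq))
    value, design = best_random_design(m, T, t_star, sigma2, delta1_sq, delta2_sq)
    print("best random design:", design, "BMSE:", value)
```

A random search like this can only fail to find a counterexample; it says nothing about designs it did not sample, so it is useful mainly for checking the conjecture across different values of $m$, $\sigma^2$, $\delta_1^2$, $\delta_2^2$ and $t_{j^*}$ before attempting the general proof.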

I was able to prove this result for the case of two data points, but I am having a lot of trouble generalizing it to $m$ data points.