I am looking for a proof, and ideally a good reference, for results of the following type, which I came across in a statistics paper (difficult to find online): *Second Order Minimax Estimation in Partial Linear Models* by G. Golubev and W. Härdle (2000). Unfortunately, the references therein are to older textbooks that I haven't been able to find. This is likely a straightforward result in the approximation theory literature, but I'm not very knowledgeable about the area.
Consider the class of functions $$ W(\beta, L) = \left\{f : \int_0^1 [f^{(\beta)}(x)]^2\,dx \le L,\ \int_0^1 q(x)f(x)\,dx=0 \right\}, \qquad \beta \in \mathbb{N},\ L>0, $$ where $q$ is a probability density on $[0,1]$. We want to construct a best approximating orthonormal system $\{\psi_k\}\subset L^2([0,1], q)$, i.e. one that minimizes, over all orthonormal sequences, the worst-case error
$$ \sup_{f \in W(\beta,L)} \left \| f - \sum_{k=1}^s \langle f, \psi_k \rangle \psi_k \right \|_{q} $$
simultaneously for every $s \in \mathbb{N}$.
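To make the quantity being minimized concrete, here is a small numerical sketch of my own (not from the paper) of the truncation error in the special case $\beta=1$, $q\equiv 1$, using the cosine system from the special case below and the test function $f(t)=t-\tfrac12$, which lies in $W(1,1)$ since $\int_0^1 (f')^2\,dt = 1$ and $\int_0^1 f\,dt = 0$:

```python
import numpy as np

# Projection error || f - sum_{k<=s} <f, psi_k> psi_k ||_{L^2} for
# f(t) = t - 1/2, using the cosine system psi_k(t) = sqrt(2) cos(pi k t).
# (beta = 1, q = 1; f is in W(1, 1).)
t = np.linspace(0.0, 1.0, 10001)
h = t[1] - t[0]

def inner(u, v):
    """Trapezoid-rule approximation of int_0^1 u(t) v(t) dt."""
    w = u * v
    return h * (w.sum() - 0.5 * (w[0] + w[-1]))

f = t - 0.5
errors = []
for s in range(1, 7):
    proj = sum(inner(f, np.sqrt(2) * np.cos(np.pi * k * t)) *
               np.sqrt(2) * np.cos(np.pi * k * t) for k in range(1, s + 1))
    errors.append(np.sqrt(inner(f - proj, f - proj)))

# For this system the error over W(1, L) is bounded by
# sqrt(L / lambda_{s+1}) = 1 / (pi (s + 1)) with L = 1.
assert all(e <= 1.0 / (np.pi * (s + 1)) for s, e in zip(range(1, 7), errors))
```

The bound $\sqrt{L/\lambda_{s+1}}$ used in the last line is the worst-case error over $W$ when $\{\psi_k\}$ is an eigenbasis of the boundary value problem in Claim 2, which is the connection the claims below make precise.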
Claim 1: Take $\psi_0=1$ and $\{\psi_k\}_{k=1}^{\beta-1}$ to be the orthonormal polynomials in $L^2([0,1],q)$. The remaining functions are obtained recursively, for $l=0,1,\dots$, by $$ \psi_{\beta+l} = \frac{\arg \max_{\varphi\in W,\ \varphi\perp \{ \psi_j\}_{j=1}^{\beta+l-1}} \| \varphi\|_q}{\max_{\varphi\in W,\ \varphi\perp \{ \psi_j\}_{j=1}^{\beta+l-1}} \| \varphi\|_q}, $$ i.e. each new function is the normalized maximizer of the $\|\cdot\|_q$-norm over $W$, subject to orthogonality to the previously chosen functions.
Claim 2: More generally, $\psi_s$ are the solutions to the following boundary value problem: \begin{align*} (-1)^\beta \frac{d^{2\beta}}{dx^{2\beta} } \psi_s(x) &= \lambda_s q(x) \psi_s(x)\\ \frac{d^{k}}{dx^{k} } \psi_s(x)\bigg|_{x=0} &= \frac{d^{k}}{dx^{k} } \psi_s(x)\bigg|_{x=1} = 0, \quad k=\beta, \dots, 2\beta-1. \end{align*}
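This is a Sturm–Liouville-type eigenproblem, so the claim can at least be checked numerically. Below is a minimal finite-difference sketch of my own for the $\beta=1$ case, where the boundary conditions reduce to Neumann conditions $\psi'(0)=\psi'(1)=0$ (the function name and discretization choices are mine, not from the paper):

```python
import numpy as np
from scipy.linalg import eigh

def sturm_liouville_eigs(q, n=400, n_eigs=6):
    """Approximate eigenpairs of -psi'' = lam * q * psi on [0,1] with
    Neumann conditions psi'(0) = psi'(1) = 0 (the beta = 1 case of the
    boundary value problem above), by second-order finite differences."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    # Stiffness matrix; the boundary rows carry weight 1/2, the standard
    # trick that keeps the discrete Neumann problem symmetric.
    main = np.full(n + 1, 2.0)
    main[0] = main[-1] = 1.0
    A = (np.diag(main) - np.diag(np.ones(n), 1)
         - np.diag(np.ones(n), -1)) / h**2
    # Lumped (trapezoid-weight) mass matrix times the density q.
    w = np.full(n + 1, 1.0)
    w[0] = w[-1] = 0.5
    B = np.diag(w * q(x))
    lam, psi = eigh(A, B)  # generalized symmetric-definite eigenproblem
    return lam[:n_eigs], psi[:, :n_eigs], x
```

For $q \equiv 1$ this returns eigenvalues close to $0, \pi^2, (2\pi)^2, \dots$; the zero eigenvalue corresponds to the constant function $\psi_0 = 1$, consistent with Claim 1.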
As a special case, when $\beta=1$ and $q(x)=1$ then the solution is the cosine-basis $$ \psi_k(t) = \sqrt{2} \cos(\pi k t), \qquad \lambda_k = (\pi k)^2. $$
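A quick numerical sanity check of this special case (a small script of my own, not from the paper), verifying both the orthonormality of the cosine system and the eigenvalue relation $-\psi_k'' = (\pi k)^2 \psi_k$:

```python
import numpy as np

# Cosine system psi_k(t) = sqrt(2) cos(pi k t) on a fine grid.
t = np.linspace(0.0, 1.0, 20001)
h = t[1] - t[0]

def psi(k):
    return np.sqrt(2.0) * np.cos(np.pi * k * t)

def inner(u, v):
    """Trapezoid rule for int_0^1 u(t) v(t) dt (q = 1 here)."""
    w = u * v
    return h * (w.sum() - 0.5 * (w[0] + w[-1]))

# Orthonormality: <psi_j, psi_k> = delta_{jk}.
gram = np.array([[inner(psi(j), psi(k)) for k in range(1, 5)]
                 for j in range(1, 5)])
assert np.allclose(gram, np.eye(4), atol=1e-6)

# Eigenvalue relation, checked away from the endpoints where
# np.gradient falls back to less accurate one-sided differences.
k = 3
d2 = np.gradient(np.gradient(psi(k), t), t)
assert np.allclose(d2[100:-100], -(np.pi * k)**2 * psi(k)[100:-100],
                   atol=0.05)
```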
Claim 3: The asymptotic behavior of $\lambda_s$ is important, and, as $s\to\infty$, $$ \lambda_s = [1+o(1)]\, (\pi s)^{2\beta} \left[ \int_0^1 q^{1/(2\beta)}(x)\, dx\right]^{-2\beta}. $$
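For $\beta=1$ this is the classical Weyl asymptotic $\lambda_s \sim (\pi s)^2 / \big(\int_0^1 \sqrt{q}\big)^2$ for the weighted Neumann problem, and it can be tested numerically with the same kind of finite-difference discretization as above (again my own illustration; the density $q(x)=\tfrac{2}{3}(1+x)$ is an arbitrary choice):

```python
import numpy as np
from scipy.linalg import eigh

# Numerical check of the eigenvalue asymptotics in the beta = 1 case
# for the illustrative density q(x) = (2/3)(1+x).
n = 2000
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
q = (2.0 / 3.0) * (1.0 + x)

# Finite differences for -psi'' = lam * q * psi with Neumann conditions;
# boundary rows and mass weights are halved to keep the pencil symmetric.
main = np.full(n + 1, 2.0)
main[0] = main[-1] = 1.0
A = (np.diag(main) - np.diag(np.ones(n), 1)
     - np.diag(np.ones(n), -1)) / h**2
w = np.full(n + 1, 1.0)
w[0] = w[-1] = 0.5
lam = eigh(A, np.diag(w * q), eigvals_only=True)

# Trapezoid rule for int_0^1 sqrt(q(x)) dx.
rq = np.sqrt(q)
L = h * (rq.sum() - 0.5 * (rq[0] + rq[-1]))

# Compare lam_s with the predicted (pi s / L)^2 for moderately large s.
for s in (10, 12, 15):
    pred = (np.pi * s / L) ** 2
    assert abs(lam[s] / pred - 1.0) < 0.05, (s, lam[s], pred)
```

The relative agreement tightens as $s$ grows, which is consistent with the $[1+o(1)]$ factor in the claim.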