The author writes: given a vector of inputs $X^T = (X_1, \dots, X_p)$, we can predict an output $Y$ via $$ \hat{Y} = \beta_0 + \sum_{j = 1}^p X_j \beta_j.$$
He goes on to note that if we include a constant $1$ in the vector $X$, making it a $(p+1)$-dimensional vector, we can write the regression as the inner product $\hat{Y} = X^T\beta$, and that this is a subspace; if the constant is not included, it forms an affine set.
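For concreteness, here is a quick numerical sketch (with made-up values of $\beta$ and $X$, chosen only for illustration) checking that prepending a $1$ to $X$ makes the single inner product reproduce $\beta_0 + \sum_j X_j \beta_j$:

```python
import numpy as np

# Hypothetical coefficients: beta_0 (intercept) plus beta_1, beta_2.
beta = np.array([2.0, 0.5, -1.0])   # (beta_0, beta_1, beta_2)
x = np.array([3.0, 4.0])            # raw inputs (X_1, X_2)

# Explicit form: beta_0 + sum_j X_j * beta_j
y_hat_explicit = beta[0] + x @ beta[1:]

# Augmented form: prepend a constant 1, then take one inner product X^T beta.
x_aug = np.concatenate(([1.0], x))  # (1, X_1, X_2)
y_hat_inner = x_aug @ beta

print(y_hat_explicit, y_hat_inner)  # both equal 2 + 1.5 - 4 = -0.5
```

The two forms agree for any choice of inputs, which is all the augmentation trick claims.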
I'm not convinced that the set of points $(X, X^T\beta)$ with $X \in \mathbb{R}^{p+1}$ forms a subspace, mostly because it does not contain $\vec{0}$: the first coordinate of $X$ is always $1$. Unless the author means that $X^T\beta$ is a subspace as a function of $\beta$, in which case I agree.
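To spell out my reasoning in the $p = 1$ case: the graph
$$\{(x_1, \hat{y}) : \hat{y} = \beta_0 + \beta_1 x_1\} \subset \mathbb{R}^2$$
contains the origin only when $\beta_0 = 0$, so in general it is an affine line. After augmenting, the set
$$\{((1, x_1), \hat{y}) : \hat{y} = \beta_0 + \beta_1 x_1\} \subset \mathbb{R}^3$$
lies entirely in the plane $x_0 = 1$, which still misses the origin. As far as I can tell, only the larger set $\{(z, z^T\beta) : z \in \mathbb{R}^2\}$, where the first coordinate of $z$ is free to vary, is a genuine subspace of $\mathbb{R}^3$.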
Could someone clarify this for me? The book is The Elements of Statistical Learning, page 12.