Question
Let $(Y,X,U)$ be a random vector such that $$ Y = X'\beta + U. $$ Suppose $Y$ takes values in $\{0,1\}$ and that $E[Y\mid X] = X'\beta$. Is it reasonable to assume that $Var[U\mid X]$ does not depend on $X$?
Attempt
Note that if $Y = X'\beta$ exactly, so that $U=0$ a.s., then $E[U\mid X] = 0 = E[U]$ and $Var(U\mid X) = Var(U) = 0$. In practice, however, we cannot perfectly predict $Y$ from $X$. In the case where $X'\beta$ does not perfectly predict the outcome, we see that
\begin{align*}
Var(U\mid X) &= E[U^2\mid X] - (E[U\mid X])^2 \\
&= E[(Y - X'\beta)^2\mid X] - (E[Y - X'\beta\mid X])^2\\
&= E[(Y - X'\beta)^2\mid X] - (E[Y\mid X] - E[X'\beta\mid X])^2\\
&= E[(Y - X'\beta)^2\mid X] - (X'\beta - X'\beta)^2\\
&= E[(Y - X'\beta)^2\mid X]\\
&= E[Y^2 + (X'\beta)^2 - 2 Y X' \beta\mid X]\\
&= E[Y^2\mid X] + (X'\beta)^2 - 2 X' \beta E[Y\mid X]\\
&= E[Y^2\mid X] + (X'\beta)^2 - 2 (X'\beta)^2 \\
&= E[Y^2\mid X] - (X'\beta)^2 \neq 0
\end{align*}
However, I cannot show that $E[Y^2\mid X] \neq (X'\beta)^2$. I think this would complete the proof. Is that correct, and how can I show this last step?
You cannot prove that $\operatorname{Var}(U\mid X) = \operatorname{Var}(U)$. In the setup you describe, this can only be assumed. Only when the setup is used to describe a real-world phenomenon can you argue about whether it is a reasonable assumption or not.
For example, suppose your binary dependent variable is "$Y=1$: a country declares bankruptcy" and $X$ = "level of the country's public debt". Is it reasonable to assume that the error term is independent of $X$, so that the conditional variance equals the unconditional one? Probably not, since your specification (which is problematic anyway: the linear probability model is not the best choice when the dependent variable is binary) suffers from omitted-variable bias. For example, another important factor in predicting a country's bankruptcy is its Gross Domestic Product, and GDP is usually correlated with the level of public debt. So GDP will essentially be absorbed into the error term, the error term will likely be correlated with $X$, and then $\operatorname{Var}(U\mid X) \neq \operatorname{Var}(U)$.
You can now think of an example where the specification is adequate for the real-world phenomenon under study, and so the assumption could be viewed as valid.
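As a side note on the derivation in the question, taken under its own stated assumptions: since $Y\in\{0,1\}$ we have $Y^2 = Y$, so $E[Y^2\mid X] = E[Y\mid X] = X'\beta$, and the last line becomes $Var(U\mid X) = X'\beta\,(1 - X'\beta)$. Within that exact setup, a constant conditional variance would therefore require $X'\beta\,(1-X'\beta)$ not to vary with $X$. The following is only a rough simulation sketch of this; the scalar design $p(X) = 0.2 + 0.6X$ with $X$ uniform on $[0,1]$, the sample size, and the binning are all illustrative assumptions, not part of the original problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design (an assumption for illustration): intercept plus one
# regressor, chosen so that p(X) = 0.2 + 0.6 * X stays strictly inside (0, 1).
n = 200_000
x = rng.uniform(0.0, 1.0, size=n)
p = 0.2 + 0.6 * x                            # plays the role of X'beta
y = (rng.uniform(size=n) < p).astype(float)  # Y | X ~ Bernoulli(p)
u = y - p                                    # error term U = Y - X'beta

# Split X into 5 equal-width bins and compare, bin by bin, the sample
# variance of U with the average of p(1 - p) over the same observations.
edges = np.linspace(0.0, 1.0, 6)
bin_idx = np.digitize(x, edges[1:-1])        # bin labels 0..4
empirical = np.array([u[bin_idx == k].var() for k in range(5)])
theoretical = np.array([(p * (1 - p))[bin_idx == k].mean() for k in range(5)])

for k in range(5):
    print(f"bin {k}: Var(U | X in bin) = {empirical[k]:.4f}, "
          f"mean p(1-p) = {theoretical[k]:.4f}")
```

In this sketch the two columns should roughly agree, and both should change across bins, being largest where $p(X)$ is near $1/2$.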