Background. Consider a convex program $(P)$, $$\begin{aligned} \textrm{minimize } \quad & f_0(x) \\ \textrm{subject to } \quad &f_i(x) \le 0, \quad i=1,2,\dots,m \\ & Ax=b, \end{aligned}$$
and its dual program $(Q)$,
$$\begin{aligned} \textrm{maximize} \quad & g(\lambda, \nu)\\ \textrm{subject to} \quad & \lambda \succeq 0 \end{aligned}$$
where $g(\lambda, \nu)=\underset{x\in \mathcal D}{\inf} \left(f_0(x)+\sum_{i=1}^m \lambda_i f_i(x)+\nu^T (Ax-b)\right)$, and $\mathcal D\triangleq \left(\underset{i}{\cap} \: \mathrm{dom}\:f_i\right) \subset \mathbb{R}^n$. Slater's condition implies strong duality, i.e. $p^*=d^*$, where $p^*$ and $d^*$ are the optimal values of $(P)$ and $(Q)$, respectively. (Slater's condition: there exists an $x\in \mathrm{relint}\:\mathcal D$ such that $Ax=b$ and $f_i(x)<0, \:i=1,2,\dots,m$.)
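As a sanity check of these definitions, here is a minimal Python sketch on a toy problem of my own choosing (it is not from the book): minimize $x^2$ subject to $1-x\le 0$, with no equality constraints. Slater's condition holds (e.g. $x=2$), and the inner infimum is attained at $x=\lambda/2$, so the dual function can be evaluated in closed form. The function names and grids are illustrative assumptions only.

```python
# Toy problem (an assumption for illustration): minimize x^2 subject to 1 - x <= 0.
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x); no equality constraints, so no nu.

def dual_function(lam):
    """g(lam) = inf_x L(x, lam). Setting dL/dx = 2x - lam = 0 gives x = lam / 2."""
    x = lam / 2.0
    return x**2 + lam * (1.0 - x)

def primal_value(grid):
    """Smallest objective value over the feasible points of a finite grid."""
    return min(x**2 for x in grid if 1.0 - x <= 0.0)

# Grids with step 0.001; x = 1 and lam = 2 are both grid points.
xs = [i / 1000.0 for i in range(-3000, 3001)]
lams = [i / 1000.0 for i in range(0, 5001)]

p_star = primal_value(xs)                     # grid estimate of p*
d_star = max(dual_function(l) for l in lams)  # grid estimate of d* over lam >= 0

print("p* ~", p_star)
print("d* ~", d_star)
```

On this toy problem the two estimates should agree up to rounding, which is what strong duality predicts; it says nothing about the lower-dimensional case asked about below.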
Boyd & Vandenberghe's book "Convex Optimization" proves strong duality for a simplified case, namely when $\mathrm{relint}\:\mathcal D=\mathrm{int}\:\mathcal D$. Let's call this BV's Lemma.
Question. I had some difficulty extending this to the general case, when $\mathrm{relint}\:\mathcal D\ne\mathrm{int}\:\mathcal D$, which I suppose can happen only when the affine dimension of $\mathcal D$ is less than $n$. Does someone know how to prove this, or where I can find a proof? I'll also outline my attempt below, and would appreciate it if someone could point out any flaws in the arguments, or comment on or confirm my proof:
My attempted proof. If the affine dimension of $\mathcal D$ is less than $n$, then the affine hull of $\mathcal D$ can be written as $\{x\in \mathbb R^n:Bx=c\}$ for some matrix $B$ and vector $c$. So I suppose we can add $Bx=c$ to the equality constraints of the convex program, enlarge $\mathcal D$ so that its affine dimension becomes $n$, and obtain an equivalent convex program $(\hat P)$, now with $\mathrm{relint}\:\mathcal{\hat D}=\mathrm{int}\:\mathcal{\hat D}$, where $\mathcal{\hat D}$ is the enlarged domain of $(\hat P)$. Then by BV's Lemma, strong duality holds for $(\hat P)$. As a result, letting $\hat p^*$ and $\hat d^*$ be the optimal values of $(\hat P)$ and its dual program, respectively, we have $p^*=\hat p^*=\hat d^*$.
Clearly, $p^*\ge d^*$ due to weak duality, and hence $\hat d^*\ge d^*$. But it appears that we also have $d^* \ge \hat d^*$. To see this, note that the dual function of $(\hat P)$ is:
$$\hat g(\lambda, \nu_A, \nu_B)=\underset{x\in \mathcal{\hat D}}{\inf} \left(f_0(x)+\sum_{i=1}^m \lambda_i f_i(x)+\nu_A^T (Ax-b)+\nu_B^T(Bx-c)\right).$$
Therefore, $g(\lambda, \nu) \ge \hat g(\lambda, \nu, \nu_B)$, since $\mathcal D \subset \mathcal{\hat D}$ and $Bx=c$ for any $x\in \mathcal D$. As a result,
$$d^*=\underset{\lambda \succeq 0,\,\nu}{\sup} g(\lambda, \nu) \ge \underset{\lambda \succeq 0,\,\nu_A,\,\nu_B}{\sup} \hat g(\lambda, \nu_A, \nu_B)=\hat d^*,$$
and hence $p^*=d^*$. QED.
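To make the step $g(\lambda, \nu) \ge \hat g(\lambda, \nu, \nu_B)$ explicit, here is the chain I have in mind. It assumes that each $f_i$ in $(\hat P)$ agrees with the original $f_i$ on $\mathcal D$, which is what the enlargement is supposed to guarantee but which I have not verified:

$$\begin{aligned} \hat g(\lambda, \nu, \nu_B) &= \underset{x\in \mathcal{\hat D}}{\inf} \left(f_0(x)+\sum_{i=1}^m \lambda_i f_i(x)+\nu^T (Ax-b)+\nu_B^T(Bx-c)\right) \\ &\le \underset{x\in \mathcal D}{\inf} \left(f_0(x)+\sum_{i=1}^m \lambda_i f_i(x)+\nu^T (Ax-b)+\nu_B^T(Bx-c)\right) \\ &= \underset{x\in \mathcal D}{\inf} \left(f_0(x)+\sum_{i=1}^m \lambda_i f_i(x)+\nu^T (Ax-b)\right) = g(\lambda, \nu). \end{aligned}$$

The inequality only uses $\mathcal D \subset \mathcal{\hat D}$, and the following equality uses $Bx=c$ for every $x\in \mathcal D$.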
Am I mistaken somewhere?