I just read this:
For $u \in H^1(Q)$ where $Q=\cup_{t \in (0,T)}\Omega \times \{t\}$, we have that $$\int_{\Omega}u_tv +\nabla u \cdot \nabla v = \int_{\Omega}fv\quad\text{for all $v \in H^1(\Omega).$}\tag{1}$$
Now I want a bound on $u_t$ in $L^2(0,T;L^2(\Omega))$. The issue is that we cannot take $v=u_t$ in (1), because $u_t$ is only in $L^2(0,T;L^2(\Omega))$, not in $L^2(0,T;H^1(\Omega))$. So the author gets around this by the following argument.
From (1) it follows that $$u_t -\Delta u = f\quad\text{holds a.e. in $Q$.}$$ Now we can multiply this equation by $u_t$ and integrate over space, ....
This makes no sense to me. I know that the author has integrated by parts, but $u$ is not smooth enough for $-\Delta u$ to make sense pointwise; it only exists as a functional. So I don't understand why this is a correct approach.
The source is "Variational methods in the Stefan problem" by José-Francisco Rodrigues, in the proof of Proposition 4.7.
Note that $u \in H^1(Q)$ if $u \in L^2(0,T;H^1(\Omega))$ and $u_t \in L^2(0,T;L^2(\Omega))$.
So $\Delta u$ is still only defined in a weak sense, in that there is a distribution $w$ such that
$$ \int u \Delta \phi = \int w \phi $$
for all $\phi \in C_0^\infty(\Omega)$.
So, if
$$ \int_\Omega u_t \phi + \nabla u \cdot \nabla \phi = \int_\Omega f\phi $$
for all such $\phi$, then integrating by parts gives
$$ \int_\Omega u_t \phi - u \Delta \phi = \int_\Omega f\phi $$
(there was a boundary term that vanished because $\phi$ is compactly supported), and then using the definition of the weak derivative
$$ \int_\Omega u_t \phi - w \phi = \int_\Omega(u_t - w)\phi = \int_\Omega f\phi $$ and so, as distributions, you must have $u_t - w = f$.
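Then, if $f \in L^2$, the identity $w = u_t - f$ shows that $w$ is itself represented by an $L^2$ function, because $u_t \in L^2$ as well; in particular $u_t - w$ is locally integrable. Two locally integrable functions are equal as distributions if and only if they are equal (as functions) almost everywhere, so $u_t - \Delta u = f$ holds a.e. (This is stated here, with a reference to Hörmander.)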
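To see how a weak Laplacian can be an honest function even when $u$ is not classically twice differentiable, here is a one-dimensional illustration. It is my own example, not taken from Rodrigues; the interval $(-1,1)$ and the functions $u_1, u_2$ are chosen only for illustration. For every $\phi \in C_0^\infty(-1,1)$, integrating by parts twice on each side of $0$ gives
$$ u_1(x) = |x|: \qquad \int_{-1}^{1} u_1 \phi'' \, dx = 2\phi(0), \qquad \text{so } u_1'' = 2\delta_0, $$
$$ u_2(x) = x|x|: \qquad \int_{-1}^{1} u_2 \phi'' \, dx = \int_{-1}^{1} 2\operatorname{sgn}(x)\, \phi \, dx, \qquad \text{so } u_2'' = 2\operatorname{sgn}(x) \in L^2(-1,1). $$
Neither function has a classical second derivative at $0$, but only $u_1''$ fails to be a function. For $u_2$ an equation involving $u_2''$ can be read almost everywhere. The argument above puts $\Delta u$ in the second situation: the equation itself forces $\Delta u = u_t - f$ to be an $L^2$ function.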