In the paper "Variational Learning of Inducing Variables in Sparse Gaussian Processes" (Titsias, 2009), the following statement appears after equation (5):
Here, $p(\textbf{f}|\textbf{f}_m) = p(\textbf{f}|\textbf{f}_m, \textbf{y})$ is true since $\textbf{y}$ is a noisy version of $\textbf{f}$ and because of the assumption we made that any $\textbf{z}$ is conditionally independent from $\textbf{f}$ given $\textbf{f}_m$.
This explanation refers to a footnote, which reads:
From $p(\textbf{z}|\textbf{f}_m, \textbf{y}) = \frac{\int p(\textbf{y}|\textbf{f})p(\textbf{z}, \textbf{f}_m, \textbf{f}) d\textbf{f}}{\int p(\textbf{y}|\textbf{f})p(\textbf{z}, \textbf{f}_m, \textbf{f}) d\textbf{f}d\textbf{z}}$ and by using the fact $p(\textbf{z}|\textbf{f}_m, \textbf{f}) = p(\textbf{z}|\textbf{f}_m)$, the result follows.
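For reference, here is my attempt at expanding the footnote. This is only a sketch: the factorization in the first line is my own reading of the stated assumption $p(\textbf{z}|\textbf{f}_m, \textbf{f}) = p(\textbf{z}|\textbf{f}_m)$, not text from the paper.

$$
\begin{aligned}
p(\textbf{z}, \textbf{f}_m, \textbf{f}) &= p(\textbf{z}|\textbf{f}_m, \textbf{f})\, p(\textbf{f}|\textbf{f}_m)\, p(\textbf{f}_m) = p(\textbf{z}|\textbf{f}_m)\, p(\textbf{f}|\textbf{f}_m)\, p(\textbf{f}_m), \\
p(\textbf{z}|\textbf{f}_m, \textbf{y}) &= \frac{p(\textbf{z}|\textbf{f}_m)\, p(\textbf{f}_m) \int p(\textbf{y}|\textbf{f})\, p(\textbf{f}|\textbf{f}_m)\, d\textbf{f}}{p(\textbf{f}_m) \int p(\textbf{y}|\textbf{f})\, p(\textbf{f}|\textbf{f}_m)\, d\textbf{f} \int p(\textbf{z}|\textbf{f}_m)\, d\textbf{z}} = p(\textbf{z}|\textbf{f}_m).
\end{aligned}
$$

The last equality uses $\int p(\textbf{z}|\textbf{f}_m)\, d\textbf{z} = 1$. If this expansion is right, the footnote establishes $p(\textbf{z}|\textbf{f}_m, \textbf{y}) = p(\textbf{z}|\textbf{f}_m)$, which is a statement about $\textbf{z}$ rather than about $\textbf{f}$.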
My question is: how does the footnote explain why the statement $p(\textbf{f}|\textbf{f}_m) = p(\textbf{f}|\textbf{f}_m, \textbf{y})$ is true? I cannot see how the footnote, which contains neither term ($p(\textbf{f}|\textbf{f}_m)$ or $p(\textbf{f}|\textbf{f}_m, \textbf{y})$), is connected to the original statement.