On a decomposition of a conditional distribution


I am trying to make some sense out of equation (7) in the recent paper of Peter van Leeuwen: "Representation errors and retrievals in linear and nonlinear data assimilation"

http://onlinelibrary.wiley.com/doi/10.1002/qj.2464/abstract

The argument goes as follows. Suppose $y$ denotes the observations (taken in a modeled area $i$) that will be assimilated, $x$ denotes the model state, and $p(y|x)$ is the PDF of $y$ given that the model state is $x$, i.e., the likelihood that appears in the data assimilation problem. The observation equation reads

$$y = H(x) + \epsilon$$

where $H$ is the observation operator and $\epsilon$ denotes the representation + measurement errors. $H(x)$ is that part of the model state vector that is related to observations in area $i$.

The author then defines the vector $z = H(x) + \tilde{z}$ in observation space, where $\tilde{z}$ is the high-resolution variation (in area $i$) at the location where observation $y$ was taken. Thus, $z$ is a vector whose elements are of the form $H(x) + \tilde{z}$, where $H(x)$ is the same for all elements and $\tilde{z}$ varies from element to element.

Equation (7) then reads:

$$ p(y|x) = \int p(y|x,\tilde{z}) p(\tilde{z}|x) \mathrm{d} \tilde{z} = \int p(y|z) p(\tilde{z}|x) \mathrm{d}\tilde{z} . $$

How exactly is this equation derived? My probability theory is a bit rusty...

Best answer:

First, a simpler example to highlight the use of marginal distributions: \begin{align*} p(y)=\int p(y,\tilde{z})\,d\tilde{z}=\int p(y|\tilde{z})\,p(\tilde{z})\,d\tilde{z} \end{align*} (see http://en.wikipedia.org/wiki/Marginal_distribution).
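As a sanity check, the same marginalization can be verified with a tiny discrete analogue; the probabilities below are made up purely for illustration:

```python
# Discrete check of p(y) = sum_z p(y|z) p(z) (law of total probability).
# Numbers are arbitrary, chosen only to illustrate the identity.
p_z = {0: 0.3, 1: 0.7}          # p(z)
p_y_given_z = {0: 0.9, 1: 0.2}  # p(y = 1 | z)

# Marginalize the latent variable z out of the joint p(y, z) = p(y|z) p(z).
p_y = sum(p_y_given_z[z] * p_z[z] for z in p_z)
print(p_y)  # 0.3*0.9 + 0.7*0.2 = 0.41
```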

Marginal distributions interact with conditional distributions in the following way: \begin{align*} p(y|x)=\int p(y,\tilde{z}|x)\,d\tilde{z}=\int p(y|\tilde{z},x)\,p(\tilde{z}|x)\,d\tilde{z}. \end{align*} The first equality marginalizes the joint density $p(y,\tilde{z}|x)$ over $\tilde{z}$; the second expands the joint into conditionals via the chain rule, everything conditional on $x$. Finally, since $z = H(x) + \tilde{z}$ is completely determined by $x$ and $\tilde{z}$, conditioning on the pair $(x,\tilde{z})$ is equivalent to conditioning on $z$, so $p(y|\tilde{z},x) = p(y|z)$, which yields equation (7).
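To make the continuous identity concrete, here is a small numerical sketch (not from the paper) under an assumed Gaussian model: $\tilde{z}|x \sim N(0,\sigma_z^2)$ and $y|\tilde{z},x \sim N(H(x)+\tilde{z},\sigma_\epsilon^2)$, so that $p(y|z)$ is a Gaussian centered at $z = H(x)+\tilde{z}$. The integral $\int p(y|z)\,p(\tilde{z}|x)\,d\tilde{z}$ is then a convolution of two Gaussians and should reproduce the closed-form marginal $N\!\left(y;\,H(x),\,\sigma_z^2+\sigma_\epsilon^2\right)$:

```python
import numpy as np

def gauss(u, mu, s2):
    """Univariate Gaussian density N(u; mu, s2)."""
    return np.exp(-0.5 * (u - mu) ** 2 / s2) / np.sqrt(2 * np.pi * s2)

# Arbitrary illustrative values: H(x), Var(z~|x), Var(eps), evaluation point y.
Hx, s_z2, s_e2 = 1.3, 0.5, 0.2
y = 2.0

# Left-hand side: integrate p(y|z) p(z~|x) dz~ numerically on a fine grid.
zt = np.linspace(-10.0, 10.0, 200001)
dz = zt[1] - zt[0]
integrand = gauss(y, Hx + zt, s_e2) * gauss(zt, 0.0, s_z2)
lhs = np.sum(integrand) * dz

# Right-hand side: closed-form marginal from the Gaussian convolution.
rhs = gauss(y, Hx, s_z2 + s_e2)

print(lhs, rhs)  # the two values agree to high precision
```

The agreement confirms that marginalizing $\tilde{z}$ out of $p(y|z)\,p(\tilde{z}|x)$ recovers $p(y|x)$, exactly as in equation (7).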