I've seen this notation in the SHAP paper, which extends Shapley values to Machine Learning models to give a form of local explanation.
In the paper, on page 5, the author uses the following notation:
$$z_{\bar S} \mid z_S$$
where $z$ is a vector of features for a model, $S$ is the set of features included in the model, $\bar S$ is the complement of $S$ (the features not included in the model), and $z_S$ is the feature vector that has values for the features in $S$ only, with the remaining features missing. Likewise, $z_{\bar S}$ is the feature vector that has values only for the features not in $S$, with the rest missing.
This style of notation is used to change a conditional expectation value into a different form:
$$\begin{align} E[f(z) \mid z_S] &= E_{z_{\bar S} \mid z_S}[f(z)] \\ &\approx E_{z_{\bar S}}[f(z)] \end{align}$$
where $f$ is the model, and $f(z)$ is the model's prediction for input vector $z$. The author states you can get to the second line from the first by assuming independence between the features.
What does this notation mean? To me it reads as $z_{\bar S}$ given $z_S$, but wouldn't that make the notation superfluous? How can there be a $z_{\bar S}$ without a $z_S$?
Also, I don't see how the notation allows me to make changes to the conditional probability equation.
Page 5 of the paper explains the purpose of $E[f(z) \mid z_S]$: because models generally cannot handle missing input features, the prediction using only the features in $S$ is defined as the expected value of $f(z)$ conditional on the observed feature values $z_S$.
Using this, let's look at the first line of your second equation:
this means that the conditional expectation is equal to the expected value of $f(z)$ over the distribution of the features $z_{\bar S}$ conditional on the features $z_S$.
In other words, we want the average value of $f(z)$ given our chosen feature values $z_S$ and the conditional distribution of the remaining features $z_{\bar S}$. We say that $z_{\bar S}$ is conditional on $z_S$ because its distribution can (and generally will) change depending on the chosen feature values $z_S$. The changes are due to correlations or dependencies between the features, which make some feature value combinations more likely than others. That is the meaning of the notation:
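To see why the conditioning matters, here is a small numerical sketch (my own illustration, not code from the paper). With two correlated Gaussian features and the toy model $f(z) = z_1 z_2$, the expectation of $f$ with $z_1$ fixed differs sharply depending on whether $z_2$ is drawn from its conditional or its marginal distribution:

```python
import numpy as np

# Hypothetical illustration: z = (z1, z2) bivariate normal with correlation
# rho, toy model f(z) = z1 * z2, and we condition on z_S = {z1 = 2}.
rng = np.random.default_rng(0)
n = 200_000
rho = 0.8

cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
z2_marginal = z[:, 1]

def f(z1, z2):
    return z1 * z2  # a simple model with an interaction

z1_fixed = 2.0

# E_{z2 | z1}[f]: draw z2 from its *conditional* distribution given z1 = 2.
# For a bivariate normal, z2 | z1 ~ N(rho * z1, 1 - rho^2).
z2_cond = rng.normal(rho * z1_fixed, np.sqrt(1 - rho**2), size=n)
e_cond = f(z1_fixed, z2_cond).mean()       # ≈ rho * z1_fixed**2 = 3.2

# E_{z2}[f]: draw z2 from its *marginal* distribution, ignoring z1.
e_marg = f(z1_fixed, z2_marginal).mean()   # ≈ 0, since E[z2] = 0

print(e_cond, e_marg)
```

The two answers (roughly 3.2 versus 0) differ precisely because knowing $z_S$ changes the distribution of $z_{\bar S}$.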
$$z_{\bar S} \mid z_S$$
To explain the move from the first line to the second: if we assume the features are independent, we no longer need to condition $z_{\bar S}$ on the chosen features $z_S$, because the distribution of $z_{\bar S}$ does not change with $z_S$. The $\mid z_S$ in the subscript therefore becomes unnecessary.
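The practical consequence of the independence assumption can be sketched as follows (my own illustration, with a made-up linear model and a hypothetical helper `expected_f_given_subset`): $E_{z_{\bar S}}[f(z)]$ can be estimated by fixing the features in $S$ at their chosen values inside background samples drawn from the data, and averaging the predictions:

```python
import numpy as np

# Sketch under the independence assumption: estimate E[f(z) | z_S = x_S]
# by overwriting only the features in S with their fixed values and
# averaging f over background samples for the remaining features.
rng = np.random.default_rng(1)

def f(z):
    return 3.0 * z[:, 0] + 2.0 * z[:, 1] + z[:, 2]  # toy linear model

background = rng.normal(size=(100_000, 3))  # independent N(0, 1) features

def expected_f_given_subset(x, S, background):
    """Average f over the background data with the features in S fixed."""
    z = background.copy()
    z[:, S] = x[S]
    return f(z).mean()

x = np.array([1.0, -2.0, 0.5])
est = expected_f_given_subset(x, [0], background)  # fix only feature 0
# With independent N(0,1) features, E[f | z_0 = 1] = 3*1 + 0 + 0 = 3.
print(est)
```

If the features were in fact dependent, this marginal averaging would ignore how fixing $z_S$ shifts $z_{\bar S}$, which is exactly the error the approximation accepts.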
In practice, this assumption is unlikely to hold exactly, but presumably the authors found it good enough to calculate the Shapley values with reasonable accuracy.