In an article on Selection Bias in A/B Testing, Airbnb proposes a correction for the selection bias they estimate in their own experiments. The correction subtracts a bias estimate from the aggregated effect, where the bias estimate is given by $$ \hat{\beta} = \sum_{i=1}^n W_i \phi \left( \frac{W_i b_i - X_i}{W_i} \right) $$ where
$X_1, \ldots, X_n$ are random variables defined on the same probability space, and each $X_i$ follows a distribution with finite mean $a_i$ and finite variance $\sigma_i^2$ (the distributions are not necessarily identical). We regard $a_i$ as the unknown true effect and usually estimate it by the unbiased estimate $X_i$;
$b_i$ is the cut-off from the reference distribution for significance level $\alpha_i$, usually set at $0.05$; and
$W_i$ is the estimated standard deviation of $X_i$.
In this context, what does $\phi$ stand for? My current understanding (from splitting the fraction inside the parentheses into $$ b_i - \frac{X_i}{W_i} $$) is that they're calculating the individual bias estimates as the difference between the cutoff and how many standard deviations fit into the estimate. Then they're adding these up, but I don't know what $W_i\phi$ is doing inside the sum.
According to their paper, $\phi$ is the density function of the standard normal distribution, $\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$.
Let me include some details:
They use the following result:
Assume $X_i$ follows a normal distribution $N(a_i, \sigma_i^2)$. Then
\begin{align}
\beta &= E[S_A-T_A] \\
&= \sum_{i=1}^n E\left[I\left(X_i-a_i > b_i\sigma_i-a_i\right)(X_i-a_i)\right] \\
&= \sum_{i=1}^n \sigma_i E\left[I\left(\frac{X_i-a_i}{\sigma_i}>\frac{b_i\sigma_i-a_i}{\sigma_i}\right)\frac{X_i-a_i}{\sigma_i}\right] \\
&= \sum_{i=1}^n \sigma_i P\left(\frac{X_i-a_i}{\sigma_i}>\frac{b_i\sigma_i-a_i}{\sigma_i}\right)E\left[\frac{X_i-a_i}{\sigma_i}\left|\frac{X_i-a_i}{\sigma_i}>\frac{b_i\sigma_i-a_i}{\sigma_i}\right.\right] \\
&= \sum_{i=1}^n \sigma_i P\left(\frac{X_i-a_i}{\sigma_i}>\frac{b_i\sigma_i-a_i}{\sigma_i}\right)\frac{\phi\left( \frac{b_i\sigma_i-a_i}{\sigma_i}\right)}{P\left(\frac{X_i-a_i}{\sigma_i}>\frac{b_i\sigma_i-a_i}{\sigma_i}\right)} \\
&= \sum_{i=1}^n \sigma_i \phi\left( \frac{b_i\sigma_i-a_i}{\sigma_i}\right)
\end{align}
The fifth line uses the mean of a truncated standard normal: for $Z \sim N(0,1)$, $E[Z \mid Z > c] = \phi(c) / P(Z > c)$. The probabilities then cancel, which is why only $\sigma_i \phi(\cdot)$ remains in each term.
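If it helps to see that identity numerically, here is a rough Monte Carlo sketch for a single term of the sum. It compares a simulated $E[I(X_i > b_i\sigma_i)(X_i - a_i)]$ with the closed form $\sigma_i \phi\left(\frac{b_i\sigma_i - a_i}{\sigma_i}\right)$. The function names and the values of `a`, `sigma` and `b` are made up for illustration; they are not taken from the paper.

```python
import math
import random


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def closed_form_term(a, sigma, b):
    """One summand of the derivation: sigma * phi((b*sigma - a) / sigma)."""
    return sigma * phi((b * sigma - a) / sigma)


def simulated_term(a, sigma, b, n_draws=200_000, seed=0):
    """Monte Carlo estimate of E[I(X > b*sigma) * (X - a)] for X ~ N(a, sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(a, sigma)
        if x > b * sigma:  # the experiment is declared significant
            total += x - a  # overshoot relative to the true effect
    return total / n_draws


# Illustrative values only (assumed, not from the paper).
a, sigma, b = 0.5, 2.0, 1.645
print("closed form:", closed_form_term(a, sigma, b))
print("simulation: ", simulated_term(a, sigma, b))
```

The two printed numbers should agree up to simulation noise, which shrinks as `n_draws` grows.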
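They then replace the unknown parameters with their estimates, $a_i$ by $X_i$ and $\sigma_i$ by $W_i$, which gives the $\hat{\beta}$ in your question. So $W_i$ outside $\phi$ is just the estimate of the $\sigma_i$ that scales each term.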
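To make the formula for $\hat{\beta}$ from the question concrete, here is a minimal sketch that evaluates $\sum_i W_i \phi\left(\frac{W_i b_i - X_i}{W_i}\right)$ directly. The helper name `bias_estimate` and the example numbers are hypothetical, and the cutoff $1.96$ is only an assumed choice for illustration.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def bias_estimate(x, w, b):
    """Plug-in bias estimate: sum_i W_i * phi((W_i * b_i - X_i) / W_i).

    x: observed effects X_i
    w: estimated standard deviations W_i (must be positive)
    b: cutoffs b_i, in units of standard deviations
    """
    if not (len(x) == len(w) == len(b)):
        raise ValueError("x, w and b must have the same length")
    return sum(
        w_i * phi((w_i * b_i - x_i) / w_i)
        for x_i, w_i, b_i in zip(x, w, b)
    )


# Made-up inputs for three experiments (illustration only).
x = [0.8, 2.1, 3.0]
w = [1.0, 1.0, 1.5]
b = [1.96, 1.96, 1.96]
print(bias_estimate(x, w, b))
```

Each term is largest when $X_i$ sits right at the threshold $W_i b_i$ (the argument of $\phi$ is then zero) and decays as $X_i$ moves away from it in either direction.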