I was trying to better understand the intuition behind the Probability of Improvement (PI) acquisition function in the field of Bayesian Optimization.
This acquisition function chooses the next point as the one with the highest probability of improvement over the current maximum $f(x^+)$, and we can write the selection of the next point as follows:
$$x_{t+1} = \arg\max_x \alpha_{PI}(x) = \arg\max_x P\big(f(x) \ge f(x^+)+\epsilon\big)$$
where $x^+$ is the current optimum point.
Typically, given an initial dataset of evaluated points $\mathcal{D} = \{(x_i, f(x_i))\}_{i=1}^n$, one fits a Gaussian process regression as a surrogate model for the black-box function $f$. With this surrogate, it is possible to rewrite the original equation as
$$x_{t+1} = \arg\max_{x} \Phi\left(\frac{\mu_t(x) - f(x^+)-\epsilon}{\sigma_t(x)}\right)$$
with $\Phi(\cdot)$ denoting the CDF of the standard normal distribution.
It's not yet clear to me how to get to the second equation starting from the first one.
You can get the result by a change of variables. Let's fix some notation.
At time $t+1$ you have observed $(x_1, y_1), \dots, (x_t, y_t)$, from which you build the posterior mean $\mu_t$ and standard deviation $\sigma_t$. You can also define $x^+ = \arg\max_{i=1,\dots,t} f(x_i)$. Notice that if there is no noise in your observations (which I believe to be your case), then $f(x^+)$ does not depend on the chosen surrogate model, as it coincides with an observed value $y_i$ for some $i = 1, \dots, t$.
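For concreteness, here is a minimal sketch of this setup, using scikit-learn's `GaussianProcessRegressor` as the surrogate (the toy data and kernel choice are my own, purely illustrative):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy noiseless observations (x_1, y_1), ..., (x_t, y_t)
X = np.array([[0.1], [0.4], [0.7], [0.9]])
y = np.sin(3 * X).ravel()

# Surrogate model: mu_t and sigma_t come from the GP posterior
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2)).fit(X, y)

# Incumbent: x^+ is the observed point with the largest evaluation,
# so f(x^+) is just one of the y_i regardless of the fitted model
i_best = np.argmax(y)
x_best, f_best = X[i_best], y[i_best]

# Posterior mean and standard deviation at candidate points
X_cand = np.linspace(0, 1, 5).reshape(-1, 1)
mu_t, sigma_t = gp.predict(X_cand, return_std=True)
```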
Since the right-hand side of $f(x) \geq f(x^+) + \varepsilon$ is fixed (not a random variable), and the GP posterior gives $f(x) \sim \mathcal{N}(\mu_t(x), \sigma_t(x)^2)$, $$\begin{aligned}\mathbb{P}(f(x) \geq f(x^+) + \varepsilon) &= \frac{1}{\sqrt{2 \pi\sigma_t(x)^2}} \int^{\infty}_{f(x^+) + \varepsilon}\exp\left(-\frac{(r-\mu_t(x))^2}{2\sigma_t(x)^2}\right)dr\\ &= \frac{1}{\sqrt{2 \pi}} \int^{\frac{\mu_t(x) -f(x^+) - \varepsilon}{\sigma_t(x)}}_{-\infty}\exp\left(-\frac{s^2}{2}\right)ds,\end{aligned}$$ where the change of variables is $s = -\frac{r - \mu_t(x)}{\sigma_t(x)}$, so that $dr = -\sigma_t(x)\,ds$: the factor $\sigma_t(x)$ cancels the one in the normalizing constant, and the sign flip swaps the limits of integration. The rightmost element in the chain of equalities is the CDF of the standard Gaussian evaluated at $\frac{\mu_t(x) - f(x^+) - \varepsilon}{\sigma_t(x)}$, which is exactly the argument of $\Phi$ in your second equation.
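As a numerical sanity check, you can compare the closed form $\Phi\left(\frac{\mu_t(x) - f(x^+) - \varepsilon}{\sigma_t(x)}\right)$ against a Monte Carlo estimate of $\mathbb{P}(f(x) \geq f(x^+) + \varepsilon)$ under the Gaussian posterior (the specific values of $\mu_t$, $\sigma_t$, $f(x^+)$, and $\varepsilon$ below are arbitrary):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical posterior at a candidate x, incumbent value, and margin
mu, sigma = 1.2, 0.5      # posterior mean/std of f(x) under the GP
f_best, eps = 1.0, 0.1    # f(x^+) and the improvement margin epsilon

# Closed form: alpha_PI(x) = Phi((mu - f_best - eps) / sigma)
pi_closed = norm.cdf((mu - f_best - eps) / sigma)

# Monte Carlo: draw f(x) ~ N(mu, sigma^2) and count improvements
samples = rng.normal(mu, sigma, size=1_000_000)
pi_mc = np.mean(samples >= f_best + eps)

print(pi_closed, pi_mc)  # the two estimates agree closely
```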