Can the Laplace (saddle-point) approximation be applied to integrals of the form:
$$\int_0^1 \mathrm e^{M f(\mathbf x)} \, \delta\left(\sum_i x_i - 1\right) \, d\mathbf x$$
where $M$ is a large real number?
Assume that the function $f(\mathbf x)$ is sufficiently smooth and has a non-degenerate maximum inside the simplex $0\le x_i \le 1, \sum_i x_i = 1$. Note, however, that the gradient $\nabla f$ at this constrained maximum need not vanish; it is only required to be perpendicular to the simplex.
How can I proceed here? Thanks.
Note: A restricted (and probably simpler) version occurs when $f(\mathbf x) = \sum_i f_i(x_i)$, so the exponential part of the integrand factorizes. I am considering this variant in a separate question, Laplace method on a simplex of factorized integrand.
I don't have enough reputation to comment, but I'm working with a similar problem right now and found three papers that outline approaches to this problem in various settings.
MacKay describes the softmax parametrization in the context of Laplace approximations in general, along with its advantages and potential pitfalls, and I found it easy to follow:
Big idea: in a standard Dirichlet, if any parameter is less than one, then the posterior density can diverge at the boundary where some $p_i=0$. Since the Laplace approximation is only valid at a smooth peak, this breaks the approximation. One fix is to keep all parameters above 1, but this restricts the model. The solution proposed in the paper is to parametrize the probability vector as the softmax $p_i(a) = \frac{\exp(a_i)}{\sum_{i'} \exp(a_{i'})},$ which removes any problems with the boundary, since this is a valid probability vector for every $a$. Note that $p_i(a+(k,\dots,k)^T) = p_i(a)$ for any $k\in\mathbb{R}$, which corresponds to the degree of freedom removed by the term $\delta\left(\sum_i p_i - 1\right)$ in the original coordinates.
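The two properties just mentioned (no boundary divergence at finite $a$, and the shift degree of freedom) are easy to verify numerically; a minimal sketch of my own:

```python
import numpy as np

def softmax(a):
    """Probability vector p(a): strictly positive and summing to one for any a."""
    e = np.exp(a - a.max())  # shift by max(a) for numerical stability
    return e / e.sum()

a = np.array([0.5, -1.2, 2.0])  # arbitrary unconstrained parameters
p = softmax(a)

# Positivity: the boundary p_i = 0 is never reached at finite a,
# which restores the smooth-peak condition the Laplace method needs.
assert np.all(p > 0) and abs(p.sum() - 1.0) < 1e-12

# Shift invariance: p(a + k*(1,...,1)) = p(a) for any real k -- the leftover
# degree of freedom corresponding to the delta constraint.
assert np.allclose(softmax(a + 3.7), p)
```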
By looking at Jacobians, one can prove that these different parametrizations give the same density, and the author shows that the softmax parametrization is superior in many cases (although increasing $|a|$ shrinks the tails too much).
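To connect this back to the original integral: independently of the parametrization issue, one can eliminate one coordinate with the delta function and apply the standard Laplace formula in the remaining variables. A self-contained numerical sketch of my own for $n=2$, where the reduced integral is one-dimensional (the quadratic test function and $M=200$ are arbitrary choices):

```python
import numpy as np

# Arbitrary smooth test function on the 2-simplex (my own choice):
# f(x1, x2) = -(x1 - 0.3)^2 - (x2 - 0.7)^2, maximized at (0.3, 0.7).
# The delta function sets x2 = 1 - x1, leaving the reduced 1-D function:
def g(x):
    return -(x - 0.3)**2 - ((1 - x) - 0.7)**2

M = 200.0  # large parameter

# Direct trapezoidal quadrature of the reduced integral over [0, 1].
x = np.linspace(0.0, 1.0, 200001)
vals = np.exp(M * g(x))
dx = x[1] - x[0]
direct = dx * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

# Laplace approximation: locate the interior maximum on the grid and
# estimate the curvature there with a central finite difference.
xs = x[np.argmax(g(x))]
h = 1e-4
gpp = (g(xs + h) - 2.0 * g(xs) + g(xs - h)) / h**2
laplace = np.exp(M * g(xs)) * np.sqrt(2.0 * np.pi / (M * (-gpp)))

print(direct, laplace)  # the two values agree closely for this example
```

For general $n$ the same idea applies with the $(n-1)\times(n-1)$ Hessian of the reduced function, giving the usual prefactor $(2\pi/M)^{(n-1)/2}/\sqrt{\det(-H)}$.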
The next two papers cover the same issue in the context of Dirichlet regression, in greater depth, but with more context-specific details to filter through: