Definition. A function $f:\mathbb R^d \rightarrow [0,1]$ is called a unimodal density function if
- $\int_{\mathbb R^d}f(x)dx=1$.
- $f$ is quasiconcave.
- $f$ has a unique maximum point $m_f \in \mathbb R^d$ called the mode of $f$.
Let $f_1,f_2:\mathbb R^d \rightarrow [0, 1]$ be unimodal density functions. For $v \in \mathbb R^d$, let $$ I(v) := \int_{\mathbb R^d}\min(f_1(x),f_2(x+v))dx = \int_{\mathbb R^d}\min(f_1(x-v),f_2(x))dx $$ (the two integrals agree via the substitution $x \mapsto x - v$) be the area of the region lying below both density graphs, i.e. below the graph of $f_1$ and below that of the translate $x \mapsto f_2(x+v)$.
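For concreteness, here is a quick numerical sketch of the overlap functional in 1D (SciPy assumed available; the Gaussian densities and mode locations are illustrative): a Riemann-sum approximation of $\int \min(f_1, g)\,dx$ for translates $g$ of $f_2$, whose mode moves toward $m_1$ as $v$ approaches $\Delta$.

```python
import numpy as np
from scipy.stats import norm

def overlap(f, g, lo=-30.0, hi=30.0, n=60_001):
    # Riemann-sum approximation of the overlap integral \int min(f(x), g(x)) dx.
    x = np.linspace(lo, hi, n)
    return float(np.sum(np.minimum(f(x), g(x))) * (x[1] - x[0]))

f1 = norm(loc=0.0, scale=1.0).pdf                      # density with mode m_1 = 0
f2_shift = lambda v: norm(loc=3.0 - v, scale=1.0).pdf  # f_2 (mode m_2 = 3) translated by v

# The overlap grows as the translated mode 3 - v approaches m_1 = 0,
# reaching (numerically) 1 at v = Delta = 3:
vals = [overlap(f1, f2_shift(v)) for v in (0.0, 1.5, 3.0)]
print(vals)
```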
Question 1. For $r \ge 0$, compute / describe the maximizers $v^*$ of $I(v)$ for $v \in \mathbb B_d(0; r) := \{ v \in \mathbb R^d \mid \|v\| \le r\}$.
Question 2. Same question, with the additional information that $r < \|m_{f_1}-m_{f_2}\|$.
What I'm after. Of course, such questions will certainly not have definitive answers in general, so I'm happy to hear interesting ideas on the problem (in the form of comments or full posts, which will be duly upvoted, of course). For example, anything which reduces the search for the optimizers $v^*$ to a tractable convex optimization problem (e.g. projection onto a simple convex set) would already be helpful. One may assume some moment conditions on the $f_j$'s (e.g. finite variance, sub-Gaussian tails), and then outline a solution concept / strategy in that case.
Statistical interpretation
Let $P_j$ be the probability distribution on $\mathbb R^d$ with density $f_j$, and for $v \in \mathbb R^d$, let $P_j - v$ be the probability distribution with density $f_j(\cdot + v)$. That is, if $X_j$ is a random variable with distribution $P_j$, then $P_j - v$ is the distribution of $X_j-v$. Then, the total-variation distance between $P_1$ and $P_2-v$ is given by $$ TV(P_1, P_2-v) = 1 - I(v) . $$
Thus, Questions 1 & 2 ask to find a global translation $v \in \mathbb R^d$ with norm at most $r$ such that the total variation distance between $P_1$ and $P_2 - v$ is minimized.
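As a 1D sanity check on the constrained problem (a sketch assuming SciPy; the two Gaussians and the radius are illustrative), one can brute-force $v$ over a grid of $[-r, r]$ and observe that, when $r < |\Delta|$, the maximizer of $I(v)$ sits on the boundary of the constraint interval:

```python
import numpy as np
from scipy.stats import norm

def I(v, grid=np.linspace(-30.0, 30.0, 60_001)):
    # Overlap between N(0,1) and N(2,1) translated by v (its mode moves to 2 - v).
    f1 = norm.pdf(grid, 0.0, 1.0)
    f2v = norm.pdf(grid, 2.0 - v, 1.0)
    return float(np.sum(np.minimum(f1, f2v)) * (grid[1] - grid[0]))

r = 1.0                                    # constraint radius, with r < |Delta| = 2
vs = np.linspace(-r, r, 401)
v_star = vs[np.argmax([I(v) for v in vs])]
print(v_star)                              # the optimum is pushed to the boundary v = r
```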
Special cases
Let me present a few worked examples to get the ball rolling. Write $m_j := m_{f_j}$ for the modes, and let $\Delta := m_2-m_1 \in \mathbb R^d$.
Gaussians of same shape
If $f_j$ is the density of a $d$-dimensional Gaussian with mean $m_j$ and covariance $\Sigma$, then one can show that
$$ TV(P_1,P_2-v) = TV(\mathcal N(m_1,\Sigma),\mathcal N(m_2-v,\Sigma)) = 2\Phi(\|v-\Delta\|_{\Sigma^{-1}}/2)-1, $$
where $\Phi$ is the standard Gaussian CDF and $\|u\|_{\Sigma^{-1}} := \sqrt{u^\top \Sigma^{-1} u}$.
Thus, $\arg\max_{\|v\| \le r}I(v) = \arg\min_{\|v\| \le r} \|v-\Delta\|_{\Sigma^{-1}}$, which can be identified with projection onto an ellipsoid. If in particular the covariance matrix is isotropic, $\Sigma = \sigma^2 I_d$, then things reduce to projecting onto an $L_2$ ball, and the optimal translation is given analytically by $v^* = \dfrac{r\Delta}{\max(r, \|\Delta\|)}$.
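In the isotropic case the closed form is just Euclidean projection of $\Delta$ onto the ball $\mathbb B_d(0; r)$; a minimal sketch (NumPy only, illustrative numbers):

```python
import numpy as np

def v_star(delta, r):
    # Euclidean projection of delta onto the ball B(0, r):
    # returns delta itself if ||delta|| <= r, else the boundary point r * delta / ||delta||.
    return r * delta / max(r, np.linalg.norm(delta))

delta = np.array([3.0, 4.0])   # ||delta|| = 5
print(v_star(delta, 10.0))     # delta is already feasible, so it is returned unchanged
print(v_star(delta, 2.5))      # rescaled to the boundary, norm 2.5
```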
Exponential families
A similar analysis applies. The problem can be approximated by replacing TV with the KL divergence via Pinsker's inequality; the KL divergence can then be computed analytically for exponential families, and the computation of $v^*$ again reduces to a tractable convex optimization problem.
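To illustrate the reduction, here is a hedged sketch (SciPy's SLSQP solver; a Gaussian stands in for a general exponential family, and $\Delta$, the precision matrix, and $r$ are illustrative) minimizing the quadratic KL surrogate $\frac12\|v-\Delta\|_{\Sigma^{-1}}^2$ over the ball:

```python
import numpy as np
from scipy.optimize import minimize

Delta = np.array([2.0, 0.0])       # mode separation (illustrative)
Sigma_inv = np.diag([1.0, 4.0])    # illustrative precision matrix
r = 1.0                            # constraint radius, with r < ||Delta||

# Convex objective: the KL divergence between same-shape Gaussians is,
# up to constants, a quadratic in the translation v.
obj = lambda v: 0.5 * (v - Delta) @ Sigma_inv @ (v - Delta)

res = minimize(obj, x0=np.zeros(2), method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda v: r**2 - v @ v}])
print(res.x)   # boundary point of the ball closest to Delta in the Sigma^{-1} metric
```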
Symmetric 1D unimodal distributions of the same shape and scale
If $F$ is the CDF of the centered distribution $P_1 - m_1$, then a simple symmetry argument reveals that $TV(P_1,P_2-v) = 2F(|v-\Delta|/2)-1$, and so $v^* = \arg\min_{|v| \le r}|v-\Delta|$. It's easy to compute that $v^* = \dfrac{r\Delta}{\max(r, |\Delta|)}$.
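A quick numerical check of the symmetry identity $TV = 2F(|v-\Delta|/2)-1$ (SciPy assumed; the standard Laplace plays the role of the common symmetric unimodal shape, and $\Delta$, $v$ are illustrative):

```python
import numpy as np
from scipy.stats import laplace

Delta, v = 3.0, 1.0
delta = abs(v - Delta)   # effective shift between the two densities

# Numerical TV as 0.5 * \int |f_1 - f_2(. + v)| dx, via a fine Riemann sum.
x = np.linspace(-40.0, 40.0, 400_001)
tv_numeric = 0.5 * float(np.sum(np.abs(laplace.pdf(x) - laplace.pdf(x - delta))) * (x[1] - x[0]))

# Closed form from the symmetry argument, with F the centered (Laplace) CDF.
tv_formula = 2 * laplace.cdf(delta / 2) - 1
print(tv_numeric, tv_formula)   # the two agree closely
```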
Location-scale families
Similar to the previous case...