Find a solution $\hat{\varepsilon}$ to the following minimization problem \begin{align*} &\min_{\varepsilon \in \mathbb{R}^M} \sum_{h=1}^M \varepsilon^h \hat{R}^h+\beta \sum_{h=1}^M \varepsilon^h \log \left(\frac{\varepsilon^h}{\pi^h}\right)\\ \text{s.t. }& \varepsilon^h \ge 0\\ & \sum_{h=1}^M \varepsilon^h = 1\\ & \sum_{h=1}^M \varepsilon^h \hat{B}^h \le \overline{B}, \end{align*}
where
- $\hat{B}^h = \hat{B}\left(\hat{g}^h\right)$ denotes the budget to classify one instance.
- $\hat{R}^h = \hat{R}_n\left(\hat{g}^h\right) = \dfrac{1}{n}\sum_{i=1}^n \mathbb{1}_{\hat{g}^h(X_i)\neq Y_i}$ denotes the risk on the training sample.
- $\pi = \left(\pi^1,\ldots,\pi^M\right)$ denotes the prior distribution, where $\pi^h = \dfrac{\left(\hat{B}^h\right)^{-1}}{\sum_{j=1}^M (\hat{B}^j)^{-1}}$.
- $\overline{B} = B/T$ denotes the mean budget per instance.
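Before looking for a closed form, the problem can be sanity-checked numerically. Below is a minimal sketch using `scipy.optimize.minimize` with SLSQP; the values of $\hat{R}^h$, $\hat{B}^h$, $\beta$, and $\overline{B}$ are hypothetical, chosen only for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs purely for illustration.
R = np.array([0.30, 0.20, 0.10])      # empirical risks R-hat^h
B = np.array([1.0, 2.0, 4.0])         # per-instance budgets B-hat^h
pi = (1.0 / B) / (1.0 / B).sum()      # prior: pi^h proportional to 1 / B-hat^h
beta, B_bar = 0.5, 1.5

def objective(eps):
    # Linear risk term plus beta times KL(eps || pi).
    return eps @ R + beta * np.sum(eps * np.log(eps / pi))

cons = [
    {"type": "eq",   "fun": lambda e: e.sum() - 1.0},   # simplex constraint
    {"type": "ineq", "fun": lambda e: B_bar - e @ B},   # budget constraint
]
res = minimize(objective, x0=pi, bounds=[(1e-12, 1.0)] * len(R),
               constraints=cons, method="SLSQP")
eps_hat = res.x
```

The lower bound `1e-12` keeps the $\varepsilon^h \log \varepsilon^h$ term well defined at the boundary.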
I have tried to solve this problem as follows, but have not yet found an explicit form.
Let us consider the Lagrangian of the problem \begin{align*} L(\varepsilon,\lambda,\mu,\gamma) & = \sum_{h=1}^M \varepsilon^h \hat{R}^h+\beta \sum_{h=1}^M \varepsilon^h \log \left(\frac{\varepsilon^h}{\pi^h}\right) - \sum_{h=1}^M \lambda_h \varepsilon^h + \mu \left(1-\sum_{h=1}^M \varepsilon^h \right) + \gamma\left(\sum_{h=1}^M \varepsilon^h \hat{B}^h - \overline{B}\right), \end{align*} where $\lambda_h \ge 0$ for all $h$ and $\gamma \ge 0$. The KKT conditions are \begin{align*} & \varepsilon^h \ge 0, \quad \forall h = \overline{1,M}\\ & \sum_{h=1}^M \varepsilon^h = 1\\ & \sum_{h=1}^M \varepsilon^h \hat{B}^h \le \overline{B}\\ & \lambda_h \ge 0, \quad \forall h =\overline{1,M}\\ & \gamma \ge 0\\ & \lambda_h \varepsilon^h = 0, \quad \forall h =\overline{1,M}\\ & \gamma \left(\sum_{h=1}^M\varepsilon^h \hat{B}^h - \overline{B}\right) = 0\\ & \dfrac{\partial L}{\partial \varepsilon^h} = \hat{R}^h + \beta \left(\log\left(\dfrac{\varepsilon^h}{\pi^h}\right) + 1\right) - \lambda_h - \mu + \gamma \hat{B}^h = 0.\tag{2} \end{align*} From equation (2), we have \begin{align*} &\log\left(\dfrac{\varepsilon^h}{\pi^h}\right) = \dfrac{\lambda_h + \mu - \gamma \hat{B}^h - \hat{R}^h}{\beta} -1 \\ \Leftrightarrow\ & \varepsilon^h = \pi^h \cdot \exp\left(\dfrac{\lambda_h + \mu - \gamma \hat{B}^h - \hat{R}^h - \beta}{\beta}\right). \end{align*}
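One way to push the stationarity condition a bit further, at least numerically: the exponential form makes $\varepsilon^h > 0$, so complementary slackness gives $\lambda_h = 0$, and the constant involving $\mu$ is absorbed by the normalization $\sum_h \varepsilon^h = 1$. That leaves $\varepsilon^h \propto \pi^h \exp\!\big(-(\hat{R}^h + \gamma \hat{B}^h)/\beta\big)$, a softmax, with $\gamma \ge 0$ chosen so the budget constraint holds (by complementary slackness, either $\gamma = 0$ or the constraint is active). Since the expected budget is decreasing in $\gamma$, a bisection finds it. A sketch with hypothetical inputs:

```python
import numpy as np

def eps_of_gamma(R, B, pi, beta, gamma):
    # Stationarity with lambda_h = 0 gives eps^h ∝ pi^h * exp(-(R^h + gamma*B^h)/beta);
    # the mu-dependent constant is fixed by normalizing to the simplex.
    logits = np.log(pi) - (R + gamma * B) / beta
    logits -= logits.max()          # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def solve(R, B, pi, beta, B_bar, tol=1e-10, max_iter=200):
    # If gamma = 0 already satisfies the budget constraint, it is optimal
    # (complementary slackness: gamma * (sum_h eps^h B^h - B_bar) = 0).
    eps = eps_of_gamma(R, B, pi, beta, 0.0)
    if eps @ B <= B_bar:
        return eps, 0.0
    # Otherwise the constraint is active; sum_h eps^h B^h decreases in gamma,
    # so bisect on gamma until the constraint holds with equality.
    lo, hi = 0.0, 1.0
    while eps_of_gamma(R, B, pi, beta, hi) @ B > B_bar:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if eps_of_gamma(R, B, pi, beta, mid) @ B > B_bar:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return eps_of_gamma(R, B, pi, beta, hi), hi

# Hypothetical numbers purely for illustration.
R = np.array([0.30, 0.20, 0.10])
B = np.array([1.0, 2.0, 4.0])
pi = (1.0 / B) / (1.0 / B).sum()
eps, gamma = solve(R, B, pi, beta=0.5, B_bar=1.5)
```

This is only a numerical sketch under the stated assumptions, not a proof that the closed form is the minimizer; verifying optimality still requires checking the full KKT system.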