How to get the optimal weights using a Lagrange multiplier in the weighted averaging used in ensemble learning?


In the Generalized Ensemble Method section of this paper, the authors state the following optimization problem:

\begin{equation}
\begin{aligned}
\min_{\alpha}\quad & \sum_{i=1}^N\sum_{j=1}^N\alpha_i\alpha_jC_{ij} \\
\text{s.t.}\quad & \sum_{i=1}^N \alpha_i=1
\end{aligned}
\end{equation}

By using a Lagrange multiplier and setting the derivatives to zero, we get

\begin{equation}
\begin{aligned}
\sum_{j=1}^N\alpha_jC_{ij}&=\lambda, \qquad i=1,2,\dots,N \\
\sum_{i=1}^N \alpha_i&=1
\end{aligned}
\end{equation}

My question is: how do I manipulate the above equations to obtain the solution $\alpha_{i}=\frac{\sum_{j} C_{i j}^{-1}}{\sum_{k} \sum_{j} C_{k j}^{-1}}$, where $C_{ij}^{-1}$ denotes the $(i,j)$ entry of $C^{-1}$?

Best answer:

Calling $\mathbf{1} = (1,\cdots, 1)^{\dagger}$ the all-ones vector, the Lagrangian can be stated as

$$ L = \alpha^{\dagger}\cdot C\cdot \alpha + \lambda\,(\mathbf{1}^{\dagger}\cdot\alpha-1) $$

and, assuming $C$ is symmetric, the stationary points solve

$$ \nabla_{\alpha} L = 2\, C\cdot \alpha + \lambda\,\mathbf{1} = 0 $$

Now, assuming $C$ is invertible,

$$ \alpha^* = -\frac{\lambda}{2}\, C^{-1}\cdot \mathbf{1} $$

but the constraint requires

$$ (\alpha^*)^{\dagger}\cdot \mathbf{1} = 1 = -\frac{\lambda}{2}\, \mathbf{1}^{\dagger}\cdot C^{-1}\cdot \mathbf{1} $$

then

$$ \lambda^* = -\left(\frac 12\,\mathbf{1}^{\dagger}\cdot C^{-1}\cdot \mathbf{1}\right)^{-1} $$

and finally

$$ \alpha^* = \frac{C^{-1}\cdot \mathbf{1}}{\mathbf{1}^{\dagger}\cdot C^{-1}\cdot \mathbf{1}} $$

Componentwise, $(C^{-1}\cdot \mathbf{1})_i = \sum_j C^{-1}_{ij}$ and $\mathbf{1}^{\dagger}\cdot C^{-1}\cdot \mathbf{1} = \sum_k \sum_j C^{-1}_{kj}$, which is exactly the formula in the question.
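As a quick numerical sanity check (not from the paper), the closed form can be verified with NumPy on a random symmetric positive-definite $C$: the resulting weights sum to one, make $C\cdot\alpha$ a constant vector (the stationarity condition), and do no worse than uniform weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: a random symmetric positive-definite matrix C
# standing in for the error correlation matrix, with N = 5 ensemble members.
N = 5
A = rng.standard_normal((N, N))
C = A @ A.T + N * np.eye(N)  # symmetric and well-conditioned

ones = np.ones(N)
Cinv_ones = np.linalg.solve(C, ones)      # C^{-1} . 1 (without forming C^{-1})
alpha = Cinv_ones / (ones @ Cinv_ones)    # closed-form optimal weights

# The weights sum to one ...
assert np.isclose(alpha.sum(), 1.0)

# ... satisfy the stationarity condition sum_j alpha_j C_ij = lambda
# for every i, i.e. C . alpha is a constant vector ...
lam = C @ alpha
assert np.allclose(lam, lam[0])

# ... and achieve an objective no worse than uniform weights 1/N.
uniform = ones / N
assert alpha @ C @ alpha <= uniform @ C @ uniform + 1e-12
```

The `np.linalg.solve` call avoids explicitly inverting $C$, which is the numerically preferred way to evaluate $C^{-1}\cdot\mathbf{1}$.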