Linear regression with $n+m$ parameters


I have the following optimisation problem: $$\min_{\substack{ \theta_1,\ldots,\theta_n \\ \beta_1,\ldots,\beta_m}}\quad \sum_{t=1}^T \frac{1}{2}(\boldsymbol{\theta}_t^\top \boldsymbol{x}_t + \boldsymbol{\beta}_t^\top \boldsymbol{z}_t-r_t)^2 +\frac{1}{2}\sum_i \|\boldsymbol{\theta}_i\|^2+\frac{1}{2}\sum_j \|\boldsymbol{\beta}_j\|^2$$

where $\boldsymbol{\theta}_t\in \{\boldsymbol{\theta}_1,\ldots,\boldsymbol{\theta}_n\}$ and $\boldsymbol{\beta}_t \in \{\boldsymbol{\beta}_1,\ldots,\boldsymbol{\beta}_m\}$, i.e. each time step $t$ uses one of the $n$ parameter vectors $\boldsymbol{\theta}_i$ and one of the $m$ parameter vectors $\boldsymbol{\beta}_j$.

I would like to solve this problem in the primal form; however, the parameters depend on each other, so substituting one into the other does not help. I have also tried stacking the parameters in different ways, but it is either not possible or very costly in terms of memory. What do you suggest I try?

Best answer

You can use Alternating Minimisation to optimise this objective function. You are optimising two sets of vectors, the $\boldsymbol{\theta}_i$ and the $\boldsymbol{\beta}_j$, so you can alternate between solving for one set while keeping the other fixed.

Let $f(\theta, \beta)$ be your objective function, writing $\theta$ and $\beta$ for the full collections of parameters, and let $k$ index the iterations (to avoid a clash with the time index $t$ in the objective). You can follow these steps to optimise it:

  1. Initialise $\theta^{(0)}$ and $\beta^{(0)}$ with random values, say drawn uniformly from $[0, 1]$.

  2. Fixing $\beta^{(k-1)}$, solve for $\theta^{(k)}$ as $$\theta^{(k)} = \arg \min_{\theta} f(\theta, \beta^{(k-1)}).$$You can use gradient descent for this optimisation; in fact, since each subproblem here is a ridge regression, it also has a closed-form solution.

  3. Similarly, fixing $\theta^{(k)}$, solve for $\beta^{(k)}$ as $$\beta^{(k)} = \arg \min_{\beta} f(\theta^{(k)}, \beta).$$

  4. Repeat steps 2 and 3 until the objective function decreases by less than some tolerance $\epsilon$. As a sanity check, the objective should never increase between consecutive iterations.
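The steps above can be sketched in Python. This is a minimal sketch with a single $\theta$ and a single $\beta$ on synthetic data (in the grouped setting of the question you would solve one such system per parameter group); the dimensions and data are illustrative assumptions. Because each subproblem is a ridge regression, the inner minimisations are done in closed form rather than by gradient descent:

```python
import numpy as np

# Alternating Minimisation sketch for
#   min_{theta, beta}  0.5*||X theta + Z beta - r||^2
#                      + 0.5*||theta||^2 + 0.5*||beta||^2
# Synthetic, illustrative problem sizes and data.
rng = np.random.default_rng(0)
T, p, q = 200, 5, 3
X = rng.standard_normal((T, p))
Z = rng.standard_normal((T, q))
r = rng.standard_normal(T)

def objective(theta, beta):
    resid = X @ theta + Z @ beta - r
    return 0.5 * resid @ resid + 0.5 * theta @ theta + 0.5 * beta @ beta

# Step 1: random initialisation, uniform on [0, 1].
theta = rng.uniform(0, 1, p)
beta = rng.uniform(0, 1, q)

prev = objective(theta, beta)
for k in range(100):
    # Step 2: fix beta, solve the ridge system for theta in closed form.
    theta = np.linalg.solve(X.T @ X + np.eye(p), X.T @ (r - Z @ beta))
    # Step 3: fix theta, solve for beta.
    beta = np.linalg.solve(Z.T @ Z + np.eye(q), Z.T @ (r - X @ theta))
    cur = objective(theta, beta)
    # Sanity check: the objective must never increase.
    assert cur <= prev + 1e-10
    # Step 4: stop once the improvement falls below a tolerance.
    if prev - cur < 1e-8:
        break
    prev = cur

print(objective(theta, beta))
```

Each update is an exact minimisation of a jointly convex objective, so the loop decreases the objective monotonically and typically converges in a handful of iterations here.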

Alternating Minimisation (AM) is a popular method for objective functions in which the dependencies between variables are tight. Matrix completion is one example where AM is efficient; see Algorithm 1 in this paper for an idea of how to solve such objective functions.