How to implement the following problem in Matrix form? (Linear Algebra and equations on Matrix form)



I am trying to solve the following optimization problem (coming from another question, if more details are needed):

Suppose we have $N$ random variables $x_i$, each with mean $\mu_i$ and variance $\sigma_i^2$, and with correlation coefficient $\rho_{ij}$ between $x_i$ and $x_j$ for $i\neq j$.

Let us also introduce indicator variables $a_i \in \{0;\ 1\}$ telling whether the random variable $x_i$ is included in the mix or not, so the total number of variables selected is $M = \sum\limits_{i=1}^N a_i$.

I want to find the subset of random variables that maximizes the following:

$$\underset{\vec{a}}{\text{max}} \left\{ \nu(\vec{a},\vec{x}) := \frac{1}{M}\sum_{i=1}^{N} a_i\left(\mu_i-\frac{\sigma_i^2}{2}\right)-\frac{1}{M^2}\left(\sum_{i=1}^{N} a_i\sigma_i^2 + 2\sum_{1\leq i < j \leq N}a_ia_j\sigma_i\sigma_j\rho_{ij}\right)\right\} \\ \text{s.t.} \quad a_i \in \{0;\ 1\} \tag{Eq. 1}\label{Eq. 1}$$

Since I have to check every one of the $2^N$ alternatives, I could make an index $k$ that starts at zero and increases by one up to $2^N - 1$, and use as the vector $\vec{a}$ a vector of size $N$ whose components are given by the binary representation of the number $k$. The problem with this idea is that, if I try to implement it in Excel, I will need at least $2^N$ rows for the vectors, and even for a small number of random variables like $N=20$ that requires $2^{20} = 1{,}048{,}576$ rows: the full height of an Excel spreadsheet tab!

So, looking for a less memory-intensive approach, and noting that the parameters of the $x_i$ are given data, I think a matrix description would be more useful: each instance $k$ could then be checked in a "for" loop (e.g., in Matlab/Octave, which I can use for free and which is fast with matrix operations).

But I am a bit rusty on linear algebra. The second term is $\text{Var}\left[\sum_{i=1}^N a_i x_i\right]$; now, if $\mathbb{\Sigma}$ is the covariance matrix of $\vec{x}$, then if I am not mistaken $\text{Var}\left[\sum_{i=1}^N a_i x_i\right] \equiv \vec{a}^T\mathbb{\Sigma}\vec{a}$ (using here that since $a_i \in \{0;\ 1\}$, we have $a_i^2 \equiv a_i$). Also, the first sum is a dot product $\vec{a}\cdot\vec{b}$, where I think it is simpler to just build each component $b_i = \mu_i-\frac{\sigma_i^2}{2}$ directly (I don't know how to implement it as a matrix operator), and $M := \sum_{i=1}^N a_i = \vec{a}\cdot\vec{a}$.
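These identities are easy to sanity-check numerically. Below is a sketch (NumPy here for illustration; the same quadratic form is written `a'*Sigma*a` in Octave) that compares $\vec{a}^T\mathbb{\Sigma}\vec{a}$ against the explicit double sum from Eq. 1, using made-up parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
sigma = rng.uniform(0.1, 1.0, N)                        # standard deviations (made-up data)
rho = np.full((N, N), 0.3)                              # constant correlation, also made up
np.fill_diagonal(rho, 1.0)
Sigma = np.outer(sigma, sigma) * rho                    # covariance matrix: Sigma_ij = sigma_i sigma_j rho_ij
a = np.array([1, 0, 1, 1, 0])                           # an example binary selection vector

# Quadratic form a^T Sigma a ...
quad = a @ Sigma @ a

# ... versus the explicit sum: sum_i a_i sigma_i^2 + 2 sum_{i<j} a_i a_j sigma_i sigma_j rho_ij
explicit = sum(a[i] * sigma[i]**2 for i in range(N)) \
         + 2 * sum(a[i] * a[j] * sigma[i] * sigma[j] * rho[i, j]
                   for i in range(N) for j in range(i + 1, N))

print(np.isclose(quad, explicit))   # the two expressions agree for binary a
print(a @ a == a.sum())             # M = a.a, since a_i^2 = a_i for binary a
```

The agreement relies precisely on $a_i^2 = a_i$: the diagonal of the quadratic form contributes $a_i^2\sigma_i^2 = a_i\sigma_i^2$.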

I hope you can check whether this matrix implementation is right and/or give a better/proper expression. Please avoid fancy things like Hadamard products: just simple matrix operations that can be implemented with dot products, matrix multiplication ("*" in Octave), and fast built-in functions in the free version of Online Octave (since I aim to solve the maximization using a "for" loop).


Best answer:

Let's try breaking this down. Given the following:

  • $\vec{a}$ is a binary vector of size $N$ representing the selection of variables.
  • $\vec{\mu}$ is the vector of means of size $N$ ($\mu_i$ for $i = 1$ to $N$).
  • $\vec{\sigma^2}$ is the vector of variances of size $N$ ($\sigma_i^2$ for $i = 1$ to $N$).
  • $\mathbb{\Sigma}$ is the covariance matrix of size $N \times N$ for the random variables $\vec{x}$.

The dot ($\cdot$) denotes the dot product operation.

To find the optimal $\vec{a}$ that maximizes this objective function, you can iterate through all possible binary vectors $\vec{a}$ (indexing them by $k$ from $1$ to $2^N - 1$; $k = 0$ is the empty selection, which would give $M = 0$ and a division by zero), compute the objective function value for each $\vec{a}$, and track the maximum value and the corresponding $\vec{a}$ that yields it.
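One way to generate each candidate $\vec{a}$ from the index $k$ is to read off the bits of $k$. A sketch of that conversion (NumPy for illustration; Octave's `bitget(k, 1:N)` does the same thing — the function name here is just a suggestion):

```python
import numpy as np

def index_to_selection(k, N):
    """Return the length-N binary vector whose entries are the base-2 digits
    of k, least significant bit first."""
    return (k >> np.arange(N)) & 1

N = 4
for k in range(1, 2 ** N):          # k = 1 .. 2^N - 1 enumerates every nonempty subset once
    a = index_to_selection(k, N)    # ... evaluate the objective for this a ...

print(index_to_selection(5, 4))     # 5 = 0b0101 -> [1, 0, 1, 0]
```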

  • $\vec{a} \cdot \vec{\mu}$ computes the dot product between the binary vector $\vec{a}$ and the vector of means $\vec{\mu}$.
  • $\vec{a} \cdot \vec{\sigma^2}$ computes the dot product between $\vec{a}$ and the vector of variances $\vec{\sigma^2}$.
  • $\vec{a}^T \mathbb{\Sigma}\, \vec{a}$ computes the quadratic form with the covariance matrix $\mathbb{\Sigma}$, which equals $\text{Var}\left[\sum_{i=1}^N a_i x_i\right]$.

The term $M$ represents the total number of variables considered, which is the sum of elements in $\vec{a}$, i.e., $M = \sum_{i=1}^{N} a_i$.

As for what you asked in the comments, about obtaining the vector of variances through matrix operations: yes, you can.

Given $\mathbb{\Sigma}$, which is the covariance matrix of size $N \times N$ for random variables $\vec{x}$:

The diagonal elements of the covariance matrix $\mathbb{\Sigma}$ contain the variances of the individual random variables. Thus, the vector of variances $\vec{\sigma^2}$ is essentially the diagonal elements of $\mathbb{\Sigma}$.

In mathematical terms:

If $\mathbb{\Sigma} = \begin{bmatrix} \sigma_{1}^2 & \text{cov}(x_1, x_2) & \ldots & \text{cov}(x_1, x_N) \\ \text{cov}(x_2, x_1) & \sigma_{2}^2 & \ldots & \text{cov}(x_2, x_N) \\ \vdots & \vdots & \ddots & \vdots \\ \text{cov}(x_N, x_1) & \text{cov}(x_N, x_2) & \ldots & \sigma_{N}^2 \end{bmatrix}$

Then, the vector $\vec{\sigma^2}$ can be obtained from the diagonal elements of $\mathbb{\Sigma}$:

$\vec{\sigma^2} = \begin{bmatrix} \sigma_{1}^2 \\ \sigma_{2}^2 \\ \vdots \\ \sigma_{N}^2 \end{bmatrix}$
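In code this is just the `diag` of the covariance matrix. A minimal illustration with a made-up $3 \times 3$ covariance matrix (NumPy; Octave's `diag(Sigma)` behaves the same way):

```python
import numpy as np

Sigma = np.array([[4.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 9.0]])   # an example covariance matrix

sigma2 = np.diag(Sigma)               # vector of variances, read off the diagonal
print(sigma2)                         # [4. 1. 9.]
```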

The representation of the objective function in matrix form is then:

$$ \nu = \frac{1}{M} \left( \vec{a}^T \left[\vec{\mu} - \frac{1}{2} \vec{\sigma^2}\right] \right) - \frac{1}{M^2} \left( \vec{a}^T \mathbb{\Sigma} \vec{a} \right) $$ with $M = \vec{a}^T\vec{a}$
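Putting it all together, the whole search is a loop over $k$ evaluating this matrix expression, skipping $k = 0$ to avoid dividing by $M = 0$. A sketch (NumPy for illustration, with made-up parameters; translating to Octave replaces `a @ Sigma @ a` by `a'*Sigma*a` and `np.diag` by `diag`):

```python
import numpy as np

def best_subset(mu, Sigma):
    """Brute-force maximization of Eq. 1 over all nonempty binary vectors a."""
    N = len(mu)
    sigma2 = np.diag(Sigma)                   # variances from the diagonal
    b = mu - sigma2 / 2.0                     # b_i = mu_i - sigma_i^2 / 2
    best_val, best_a = -np.inf, None
    for k in range(1, 2 ** N):                # k = 0 (empty selection) is skipped
        a = (k >> np.arange(N)) & 1           # binary digits of k as the vector a
        M = a.sum()                           # M = a.a for binary a
        val = (a @ b) / M - (a @ Sigma @ a) / M**2
        if val > best_val:
            best_val, best_a = val, a
    return best_a, best_val

# Tiny made-up example: two good low-variance variables and one bad one,
# uncorrelated for simplicity (so Sigma is diagonal).
mu = np.array([0.10, 0.12, -0.50])
Sigma = np.diag([0.04, 0.05, 0.30])
a_opt, v_opt = best_subset(mu, Sigma)
print(a_opt, v_opt)                           # selects the two low-variance variables
```

Note the loop still costs $O(2^N)$ objective evaluations, so this stays practical only for moderate $N$ (around 20-25); beyond that, some smarter combinatorial scheme would be needed.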