An optimization problem involving a probability density function


I have three time series $\mathbf{x}_{1}, \mathbf{x}_{2}, \mathbf{x}_{3}$. I would like to find a linear combination of the time series, that is, some scalars $a_{1},a_{2},a_{3}$ such that the sum $$\sum_{i=1}^{3} a_{i}\cdot\mathbf{x}_{i}$$ has some desired properties. Specifically, I have a probability density function $f_{X}(x)$, and I would like to find $a_{i}$ such that the empirical distribution of the linear combination is as close to $f_{X}$ as possible.

How should I go about this? I guess I could calculate the histogram of the linear combination and take the (squared) difference to $f_{X}$ at some number of predefined bins/points. I would then minimize that difference, while restricting the $a_{i}$'s from going to infinity; I guess I could set a constraint such as $\sum a_{i} = 1$.
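The histogram approach can be sketched numerically. Below is a minimal example using SciPy's `minimize` with the constraint $\sum a_i = 1$; the three time series and the standard-normal target density are hypothetical stand-ins for the $\mathbf{x}_i$ and $f_X$ in the question.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical data: three time series with different scales.
x = rng.normal(size=(3, 1000)) * np.array([[1.0], [2.0], [0.5]])

def target_pdf(t):
    # Assumed target density f_X: standard normal.
    return np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

bins = np.linspace(-5, 5, 41)            # predefined bin edges
centers = 0.5 * (bins[:-1] + bins[1:])

def objective(a):
    combo = a @ x                        # linear combination sum_i a_i * x_i
    hist, _ = np.histogram(combo, bins=bins, density=True)
    # Squared difference between the empirical histogram and f_X at bin centers.
    return np.sum((hist - target_pdf(centers))**2)

# Constrain sum(a_i) = 1 to rule out the trivial scaling degeneracy.
res = minimize(objective, x0=np.array([1/3, 1/3, 1/3]),
               constraints={"type": "eq", "fun": lambda a: a.sum() - 1})
print(res.x, res.fun)
```

Note that the histogram makes the objective piecewise constant in $a$, so a gradient-based solver can stall; a smoothed density estimate (e.g. a kernel density estimate) would give a better-behaved objective.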

Would there be a better way to formulate this? I would like to get a relatively simple expression for the optimization problem (which the histogram approach does not produce), so that I could try to analyze its behaviour.

You may have already crossposted this to Cross Validated, but if not...

For reference, there is a standard measure of 'distance' between probability distributions: the Kullback-Leibler divergence, a.k.a. the relative entropy, between two distributions $P$ and $Q$ is defined as $$D(P\|Q) := E_P\left[\log \frac{P(X)}{Q(X)}\right] = \sum_x P(x)\log \frac{P(x)}{Q(x)}.$$ As you may notice, this is just the expected log-likelihood ratio between the two distributions. (It is not a true metric, since it is neither symmetric nor satisfies the triangle inequality, but it is the usual tool for this job.) Most things you would want to do can probably be formulated using it.
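As a sketch of how this applies here: discretize both the empirical distribution of the combination and the target $f_X$ on the same bins, then compute $D(P\|Q)$. The sample data and normal target below are hypothetical; `scipy.stats.entropy(p, q)` computes exactly the sum $\sum_x p(x)\log\frac{p(x)}{q(x)}$.

```python
import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(1)

# Hypothetical samples from a candidate linear combination.
samples = rng.normal(size=5000)

bins = np.linspace(-5, 5, 41)
centers = 0.5 * (bins[:-1] + bins[1:])

# Empirical distribution P: histogram probabilities, with a small floor
# so the log ratio stays finite on empty bins.
p, _ = np.histogram(samples, bins=bins)
p = (p + 1e-9) / (p + 1e-9).sum()

# Target distribution Q: the assumed density f_X discretized on the same bins.
q = np.exp(-centers**2 / 2)
q /= q.sum()

# D(P || Q) = sum_x P(x) log(P(x)/Q(x))
kl = entropy(p, q)
print(kl)
```

Minimizing this KL divergence over the $a_i$ (instead of the squared histogram difference) gives an objective with a clean information-theoretic interpretation, though the discretization still makes it non-smooth.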

It is not entirely clear whether you want to scale samples of an acquired sample sequence so that it mimics $f_X(x)$, or whether you want to construct a new distribution. I can edit the answer (if the question isn't out of date) to clarify further.