I came across a problem and wanted some relevant references. Any pointer is highly appreciated. Simply put, I want to approximate a complicated density function $f^{\ast} (x)$ by a convex combination of a given set of easy density functions $f_1(x)$, $f_2(x)$, ..., $f_n(x)$. I thought about a non-negative least-squares method, but I'm not sure how good it is (or whether it is even a valid method). The problem seems related to density approximation, regression, and (maybe) kernel estimation.
The problem is as follows:
- There is a density function $f^{\ast} (x)$ that I wish to generate some random samples from. But $f^{\ast}(x)$ is not trivial: I can calculate the value $f^{\ast}(x)$ for any $x$ but I can't generate random samples from it. As a compromise, I wish to generate samples from some "good" approximation of $f^{\ast}(x)$.
- There is a set of density functions, i.e., $f_1(x)$, $f_2(x)$, ..., $f_n(x)$, from which I know how to generate random samples. For example, they can be $n$ normal pdfs with different (but fixed) means and variances.
Based on the above, I feel that I can find a "good" approximation of $f^{\ast} (x)$ in the form of a convex combination of the $f_i(x)$. Specifically, I want to approximate $f^{\ast} (x)$ by $$ f_{\beta}(x) = \sum_{i=1}^{n} \beta_i f_i(x) $$ where the weights satisfy $\beta_i \geq 0$ and $\sum_{i=1}^{n} \beta_i = 1$.
I thought of the following method to estimate $\beta_i$ using non-negative least squares (NNLS):
- Generate $x_1$, $x_2$, $x_3$, ..., $x_k$ from the equally-weighted mixture $\bar{f} = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$. I can generate samples from this mixture distribution because I know how to sample from each $f_i(x)$.
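The first step can be sketched as follows. This is a minimal illustration, not the exact setup in the question: I assume the components are normal pdfs with unit variance and hypothetical means `means`, so sampling from the equally-weighted mixture amounts to picking a component uniformly and then drawing from it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical components: n normal pdfs with fixed means and unit variance.
means = np.array([-2.0, 0.0, 2.0])
n = len(means)

def sample_uniform_mixture(k):
    """Draw k samples from the equally-weighted mixture (1/n) * sum_i f_i."""
    # Pick a component index uniformly for each sample, then draw from it.
    idx = rng.integers(0, n, size=k)
    return rng.normal(loc=means[idx], scale=1.0)

x = sample_uniform_mixture(1000)
```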
- Perform a non-negative least squares regression to find $$\beta^{\ast} = \arg\min_{\beta_i \geq 0, \sum_{i=1}^{n} \beta_i = 1} \sum_{j=1}^{k} \left(f^{\ast} (x_j) - \sum_{i=1}^{n} \beta_i f_i(x_j) \right)^2. $$ There are standard libraries (say in R or Python) to solve the above NNLS problem. (Note the sum-to-one constraint is not part of plain NNLS, so it has to be handled separately.)
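A sketch of this step in Python, assuming unit-variance normal components with hypothetical means and an illustrative target density `f_star` (itself a two-component normal mixture, chosen only so the example is self-contained). Since `scipy.optimize.nnls` enforces only $\beta_i \geq 0$, I approximate the sum-to-one constraint with the standard trick of appending a heavily weighted row to the design matrix, then renormalize:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)

# Hypothetical components: normal pdfs with unit variance and these means.
means = np.array([-2.0, 0.0, 2.0])

def f_star(x):
    # Illustrative target density: a mixture of two normals,
    # standing in for the "complicated" f*(x) we can only evaluate.
    return (0.7 * np.exp(-0.5 * (x - 0.5) ** 2)
            + 0.3 * np.exp(-0.5 * (x + 1.5) ** 2)) / np.sqrt(2 * np.pi)

# Design points x_1, ..., x_k drawn from the equally-weighted mixture.
k = 2000
idx = rng.integers(0, len(means), size=k)
xs = rng.normal(loc=means[idx], scale=1.0)

# Design matrix: F[j, i] = f_i(x_j); response: y[j] = f*(x_j).
F = np.exp(-0.5 * (xs[:, None] - means[None, :]) ** 2) / np.sqrt(2 * np.pi)
y = f_star(xs)

# Approximate the sum-to-one constraint with a heavily weighted extra row
# (w * sum(beta) ~ w), then solve the non-negative least-squares problem.
w = 1e3
F_aug = np.vstack([F, w * np.ones((1, len(means)))])
y_aug = np.concatenate([y, [w]])
beta, _ = nnls(F_aug, y_aug)
beta = beta / beta.sum()  # clean up any residual normalization error
```

An alternative would be a proper simplex-constrained quadratic program (e.g. via `scipy.optimize.minimize` with method `"SLSQP"`), which enforces the constraint exactly instead of through a penalty row.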
It seems then $f_{\beta^{\ast}}(x) = \sum_{i=1}^{n} \beta_i^{\ast} f_i(x)$ is a "good" approximation of $f^{\ast}(x)$. Moreover, I can actually generate random samples from $f_{\beta^{\ast}}(x)$ even if I can't generate samples from $f^{\ast}(x)$.
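Sampling from the fitted mixture is then straightforward: choose a component with probabilities $\beta_i^{\ast}$ and draw from it. A minimal sketch, again assuming unit-variance normal components and a hypothetical weight vector `beta` standing in for the NNLS output:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical components (normal, unit variance) and fitted weights.
means = np.array([-2.0, 0.0, 2.0])
beta = np.array([0.2, 0.5, 0.3])  # assumed output of the NNLS step

def sample_fitted_mixture(k):
    """Draw k samples from f_beta(x) = sum_i beta_i * f_i(x)."""
    # Choose a component with probability beta_i, then draw from it.
    idx = rng.choice(len(means), size=k, p=beta)
    return rng.normal(loc=means[idx], scale=1.0)

samples = sample_fitted_mixture(5000)
```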
I want to know if there are any known properties for the above approximation scheme. For example:
- Is the above scheme actually valid? Can we make any optimality statement about $\beta_i^{\ast}$ and $f_{\beta^{\ast}}(x)$?
- For a given set of component densities $f_i(x)$, $i=1,\ldots,n$, is there a limiting (as $k \to \infty$) approximation error between $f_{\beta^{\ast}}(x)$ and $f^{\ast}(x)$, measured in some norm?
- If the component functions $f_i(x)$ are from a certain family (say normal pdfs with unit variance and arbitrary means), does this limiting approximation error converge to zero as the number of component functions increases?