If I have a function $f : \Omega \subset \mathbb{R}^d \rightarrow \mathbb{R}$ which I want to approximate by a function of the form $F_k(x) = \sum_{i=1}^k a_i N(\mu_i,\sigma_i)(x)$, for some constants $a_i \in \mathbb{R}$, where $N(\mu, \sigma)$ is the density of the $d$-dimensional normal distribution with mean $\mu \in \mathbb{R}^d$ and covariance matrix $\sigma^2 \in \mathbb{R}^{d\times d}$, how would I go about doing this?
In other words, how do I find $(a_1,\dots,a_k,\mu_1,\dots,\mu_k,\sigma_1,\dots,\sigma_k)$ that minimizes $\|f - F_k\|$ in some appropriate norm? I guess the general question is how to project a function onto a nonlinear subspace.
You can use, for example, K-means clustering or, if you want something more powerful, Expectation Maximization (EM).
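
To make the EM suggestion concrete, here is a minimal 1-D sketch. It rests on assumptions that are not in the question: $f \ge 0$, $f$ is sampled on a uniform grid, and each grid point is treated as a sample carrying mass $f(x)\,\Delta x$. The names `gauss` and `weighted_em` and the initialisation are illustrative choices, not a standard API. Note that EM of this kind maximizes a weighted likelihood, so it does not directly minimize an $L^2$ distance.

```python
import numpy as np

def gauss(x, mu, sigma):
    """1-D normal density with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def weighted_em(x, fx, k, n_iter=500):
    """EM for F_k = sum_i a_i N(mu_i, sigma_i), with grid points weighted by fx >= 0."""
    w = fx * (x[1] - x[0])              # mass carried by each grid point
    total = w.sum()
    # Assumed initialisation: means at mass quantiles, widths from the overall spread.
    cdf = np.cumsum(w) / total
    mu = np.interp((np.arange(k) + 0.5) / k, cdf, x)
    mean = (w * x).sum() / total
    std = np.sqrt((w * (x - mean) ** 2).sum() / total)
    sigma = np.full(k, std / k)
    a = np.full(k, total / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each grid point, shape (n, k).
        comp = a * gauss(x[:, None], mu, sigma)
        resp = comp / np.maximum(comp.sum(axis=1, keepdims=True), 1e-300)
        # M-step: mass-weighted moments of each component.
        wr = w[:, None] * resp
        m = np.maximum(wr.sum(axis=0), 1e-300)
        mu = (wr * x[:, None]).sum(axis=0) / m
        var = (wr * (x[:, None] - mu) ** 2).sum(axis=0) / m
        sigma = np.sqrt(np.maximum(var, 1e-12))
        a = m                           # amplitude = mass assigned to the component
    return a, mu, sigma

# Usage on a synthetic target that is itself a sum of two Gaussians.
x = np.linspace(-8.0, 8.0, 801)
fx = 0.7 * gauss(x, -2.0, 0.5) + 1.3 * gauss(x, 1.5, 0.8)
a, mu, sigma = weighted_em(x, fx, k=2)
```

Like any EM scheme, this can stall in a local optimum, so the result depends on the initialisation.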
You can also homebrew your own iteratively reweighted linear least squares solution, which mimics either of these two depending on the choice of reweighting functions.
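
One building block of such a scheme can be sketched as follows. For fixed $\mu_i$ and $\sigma_i$, $F_k$ is linear in the amplitudes $a_i$, so the best amplitudes in a (weighted) discrete $L^2$ sense come from a linear least squares solve. The sketch is 1-D, and `normal_pdf`, `fit_amplitudes` and the `weights` argument are hypothetical names introduced here for illustration. The reweighting rule and the update of $\mu_i$, $\sigma_i$ between solves are left open, since they depend on which method you want to mimic.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """1-D normal density with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_amplitudes(x, fx, mu, sigma, weights=None):
    """Minimize sum_j weights_j * (fx_j - sum_i a_i N(mu_i, sigma_i)(x_j))^2 over a."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Design matrix: column i holds N(mu_i, sigma_i) evaluated on the grid.
    phi = normal_pdf(x[:, None], mu, sigma)
    if weights is None:
        weights = np.ones_like(x)
    # Weighted least squares: scale each row by the square root of its weight.
    root_w = np.sqrt(weights)
    a, *_ = np.linalg.lstsq(phi * root_w[:, None], fx * root_w, rcond=None)
    return a

# Usage: with the true centres and widths, the amplitudes are recovered.
x = np.linspace(-8.0, 8.0, 801)
fx = 0.7 * normal_pdf(x, -2.0, 0.5) + 1.3 * normal_pdf(x, 1.5, 0.8)
a = fit_amplitudes(x, fx, mu=[-2.0, 1.5], sigma=[0.5, 0.8])
```

In an iterated version you would recompute `weights` from the current fit, re-solve, and alternate with an update of the centres and widths.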