Given
$G(s)=\frac{\displaystyle\sum\limits_{k=0}^{m-1} b_ks^k}{\displaystyle\sum\limits_{j=0}^{n-1} a_js^j}$
with known positive integers $m,n$ and known real coefficients $b_k, a_j$, and moreover
$H(s)=\frac{\displaystyle\sum\limits_{k=0}^{p-1} c_ke^{ksT}}{\displaystyle\sum\limits_{j=0}^{q-1} d_je^{jsT}}$
with known positive integers $p,q$, a known real constant $T$, and unknown real coefficients $c_k, d_j$. Is it possible to find a least-squares approximation of $G(s)$ by $H(s)$ on a compact interval of the imaginary axis ($s=i\omega$),
$\int\limits_{\omega_1}^{\omega_2} |G(i\omega)-H(i\omega)|^2 \, d\omega \;\to\; \min,$
with a finite algorithm (i.e., does it reduce to a linear system of equations by means of some clever trick), or is it a genuinely nonlinear problem, with all the iterative solutions, ambiguity, and so on that this entails?
More specifically: if $G(s)$ is sampled on an interval of the imaginary axis (a finite Fourier series, from measurements) and the least-squares approximation is demanded only at the samples, does this simplify anything?
As far as I can see, the latter could also be stated slightly differently: is it possible to approximate a Fourier series by the quotient of two Fourier series of (presumably) much lower order, by means of some finite algorithm?
After thinking about this a little deeper, I have found a solution - not of the original problem, to be fair, but of a problem that is similar with respect to the underlying approximation goal.
If $G(s)$ is approximated by
$H(s)=\frac{U(s)}{V(s)}$
(on the imaginary axis), then in many practical cases (I wish I had the time and mathematical discipline to clarify exactly which cases these are...) $V(s)\cdot G(s)$ will also be approximated by $U(s)$. So if I modify the original problem to finding a minimum of the functional
$S[U,V]:=\displaystyle\int_{-\omega_0}^{\omega_0} |V(i\omega)G(i\omega)-U(i\omega)|^2 d\omega$
then setting the partial derivatives of $S$ with respect to the coefficients to zero reduces to an ordinary linear least-squares problem, whose solution is equivalent to an inhomogeneous matrix equation. The only thing needed to guarantee uniqueness of the solution is to normalize the fraction $H$ by fixing one coefficient to a known constant, for example $d_0=1$.
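For the sampled variant of the problem, this linearized fit is easy to set up directly over the samples. Here is a minimal numerical sketch with NumPy; the function name `levy_fit` and all parameter names are mine, and this is the equation-error formulation described above, not a minimizer of the original $|G-H|^2$ criterion:

```python
import numpy as np

def levy_fit(G_samples, omega, T, p, q):
    """Equation-error fit of H(s) = sum_k c_k e^{k s T} / sum_j d_j e^{j s T}
    to samples G(i*omega), with the normalization d_0 = 1.
    Minimizes sum |V(iw)*G(iw) - U(iw)|^2 over the samples, which is
    linear in the unknown coefficients.  Returns (c, d)."""
    z = np.exp(1j * omega * T)                        # e^{i w T} at each sample
    cols = [z**k for k in range(p)]                   # +c_k z^k   (numerator)
    cols += [-G_samples * z**j for j in range(1, q)]  # -d_j G z^j (denominator)
    A = np.column_stack(cols)
    b = G_samples                                     # right-hand side: d_0 * G = G
    # Stack real and imaginary parts so the least-squares solution is real.
    A_ri = np.vstack([A.real, A.imag])
    b_ri = np.concatenate([b.real, b.imag])
    x, *_ = np.linalg.lstsq(A_ri, b_ri, rcond=None)
    return x[:p], np.concatenate([[1.0], x[p:]])
```

If the samples actually come from a quotient of the assumed orders, the fit recovers the coefficients exactly (zero residual); for measured data it returns the equation-error optimum, which is only a proxy for the original approximation goal.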
It turns out that the entries of the matrix equation are just discrete Fourier transforms, of one sort or another, of $G(i\omega)$ and $|G(i\omega)|^2$. These amount to samples of the impulse response of the "transfer function" $G(s)$ and of its autocorrelation function.
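For completeness, here is a sketch of how those entries arise (assuming real coefficients; the integrals are over the same interval as in $S$):

$\displaystyle \frac{\partial S}{\partial c_k}=0 \;\Rightarrow\; \sum_l c_l \operatorname{Re}\!\int_{-\omega_0}^{\omega_0} e^{i(k-l)\omega T}\,d\omega = \sum_j d_j \operatorname{Re}\!\int_{-\omega_0}^{\omega_0} \overline{G(i\omega)}\, e^{i(k-j)\omega T}\,d\omega$

$\displaystyle \frac{\partial S}{\partial d_j}=0 \;\Rightarrow\; \sum_l c_l \operatorname{Re}\!\int_{-\omega_0}^{\omega_0} G(i\omega)\, e^{i(j-l)\omega T}\,d\omega = \sum_{j'} d_{j'} \operatorname{Re}\!\int_{-\omega_0}^{\omega_0} |G(i\omega)|^2\, e^{i(j-j')\omega T}\,d\omega$

The integrals involving $G$ are Fourier coefficients of $G$ on the interval, i.e. samples of its (band-limited) impulse response, and those involving $|G|^2$ are samples of its autocorrelation; when $\omega_0 T=\pi$, the left-hand integrals in the first line even collapse to $2\omega_0\,\delta_{kl}$.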
I have already done some numerical testing, and the method (I don't believe I am the first to discover it) seems pretty stable for the usual filter-theory transfer functions (like Butterworth) and physical transfer functions (like a single mass-spring-damper).
The only problem I have identified is that the polynomials in $H(s)$ sometimes contain roots that make the digital filter $H(s)$ unstable, although the analog filter $G(s)$ is stable. That is the general problem of the sensitivity of roots to the polynomial coefficients. Since the least-squares method constrains the polynomial coefficients and not the roots, the roots can be pretty far off, numerically.
It was impressive to see that relative coefficient changes of $10^{-12}$ can change root magnitudes by several percent (from slightly below 1 to slightly above 1). For the general topic of root sensitivity, see https://en.wikipedia.org/wiki/Wilkinson%27s_polynomial
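The effect is easy to reproduce. Here is a tiny NumPy sketch with a constructed example (a cluster of coincident roots near the unit circle, not the filter polynomials themselves) showing a relative coefficient change of $10^{-12}$ moving roots by several percent:

```python
import numpy as np

# Ten coincident roots at 0.99, then a relative change of 1e-12 in the
# constant term.  The perturbed roots of (z - 0.99)**10 + delta lie on a
# circle of radius delta**(1/10) = (1e-12 * 0.99**10)**0.1, about 0.06,
# around 0.99 -- so some root magnitudes jump from 0.99 to above 1.
p = np.poly([0.99] * 10)          # coefficients of (z - 0.99)**10
p_pert = p.copy()
p_pert[-1] *= 1.0 + 1e-12         # relative change of 1e-12 in one coefficient
r = np.roots(p_pert)
print(np.max(np.abs(r - 0.99)))   # several percent, despite the tiny change
```

A repeated (or merely clustered) root of multiplicity $n$ reacts to a coefficient perturbation $\delta$ roughly like $\delta^{1/n}$, which is why the tenth root of $10^{-12}$ lands in the percent range.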
Sorry for my colloquial style, but I just wanted to quickly share the principle with those who are interested in the original question. The mathematics behind all that is really trivial for the most part.