Given a finite sequence $h = (h_0,h_1,h_2,\ldots,h_{2n-2})$, one can form the Hankel matrix
$$ H = \begin{bmatrix} h_0 & h_1 & h_2 & \cdots & h_{n-1} \\ h_1 & h_2 & h_3 & \cdots & h_n \\ h_2 & h_3 & h_4 & \cdots & h_{n+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ h_{n-1} & h_n & h_{n+1} & \cdots & h_{2n-2} \end{bmatrix}. $$
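For concreteness, this rearrangement is a one-liner in numpy/scipy; here is a small sketch (the sequence is illustrative, and `scipy.linalg.hankel` takes the first column and last row):

```python
import numpy as np
from scipy.linalg import hankel

# Build the n x n Hankel matrix of a length-(2n-1) sequence h,
# so that H[i, j] = h[i + j] (constant along anti-diagonals).
h = np.arange(9)              # h_0, ..., h_{2n-2} with n = 5 (illustrative)
n = (len(h) + 1) // 2
H = hankel(h[:n], h[n - 1:])  # first column h_0..h_{n-1}, last row h_{n-1}..h_{2n-2}
```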
I know by example that there are interesting things one can learn about the sequence by forming this matrix. For instance, one can use a rank-revealing "Vandermonde decomposition" of this matrix to represent the sequence as a sum of exponentials
$$ h_j = \sum_{i=1}^k c_i z_i^j. $$
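One concrete route to such a decomposition is a matrix pencil built from two shifted Hankel submatrices; here is a minimal sketch for exact data with known rank $k=2$ (the sample values are illustrative, and this is only one of several ways to compute the decomposition):

```python
import numpy as np
from scipy.linalg import hankel

# Illustrative exact data: h_j = sum_i c_i z_i^j with known k = 2.
z_true = np.array([0.5, 2.0])
c_true = np.array([1.0, 3.0])
h = np.array([np.sum(c_true * z_true**j) for j in range(5)])  # h_0, ..., h_4

H0 = hankel(h[:2], h[1:3])   # entries h_{i+j}
H1 = hankel(h[1:3], h[2:4])  # entries h_{i+j+1}, the shifted copy
# The nodes z_i are the eigenvalues of the pencil (H1, H0).
nodes = np.sort(np.linalg.eigvals(np.linalg.solve(H0, H1)).real)

# Recover the weights c_i from the first two samples (a Vandermonde solve).
V = np.vander(nodes, 2, increasing=True).T   # V[j, i] = z_i^j
weights = np.linalg.solve(V, h[:2])
```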
I also know of natural examples of Hankel matrices that appear in applications (e.g., the Hilbert matrix, the Gram matrix of the monomials $1,x,\ldots,x^{n-1}$ in some weighted $L^2$ inner product space, etc.).
Despite these examples, I still feel like I don't have an answer in my bones to the following question:
Given a sequence $h$, why is it natural to form the Hankel matrix $H$? Phrased differently, in what way is it natural to identify a sequence $h$ with the operator $H$?
For closely related ways of converting a sequence to a matrix, I do have such answers. If I rearrange $h$ into a Toeplitz matrix $T$, the linear transformation $T$ represents convolution with $h$. If I arrange $h$ into a circulant matrix $C$, the linear transformation $C$ represents circular convolution with $h$.
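Both identifications can be checked numerically; here is a small sketch (the vectors are random and illustrative, and the circulant case is verified against the FFT identity):

```python
import numpy as np
from scipy.linalg import toeplitz, circulant

rng = np.random.default_rng(0)
h = rng.standard_normal(4)
x = rng.standard_normal(6)

# Zero-padded Toeplitz matrix of h: T @ x is the full convolution h * x.
T = toeplitz(np.r_[h, np.zeros(len(x) - 1)],
             np.r_[h[0], np.zeros(len(x) - 1)])
conv_matrix = T @ x

# Circulant matrix of c: C @ x is the circular convolution of c and x.
c = rng.standard_normal(6)
C = circulant(c)                      # C[i, j] = c[(i - j) mod 6]
circ_matrix = C @ x
circ_fft = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))
```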
I don't have a similar understanding of $H$ as a linear transformation. Multiplication by $H$ is a convolution operation of sorts, but it is "backwards". One answer to my question might be an explanation of why this "backwards" convolution is a natural operation, which would make it natural to move from $h$ to $H$.
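To make the "backwards" convolution concrete: $(Hx)_i = \sum_j h_{i+j}\,x_j$, which is exactly a correlation. A small numerical sketch (random illustrative data):

```python
import numpy as np
from scipy.linalg import hankel

rng = np.random.default_rng(1)
h = rng.standard_normal(9)        # h_0, ..., h_{2n-2} with n = 5
x = rng.standard_normal(5)

H = hankel(h[:5], h[4:])          # H[i, j] = h_{i+j}
via_matrix = H @ x                # (H x)_i = sum_j h_{i+j} x_j
# numpy's correlate computes exactly this "backwards" convolution.
via_correlation = np.correlate(h, x, mode='valid')
```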
$\newcommand\L{L^2(\mathbb S^1)} \newcommand\M{\mathscr M_f} \newcommand\T{\mathscr T_f} \newcommand\Ha{\mathscr H_f} \newcommand\H{H^2} \newcommand\Hp{(\H)^\perp}$
I may be mistaken, but I believe the finite Hankel matrices first considered by the German mathematician Hermann Hankel are best understood in light of Hankel operators, which are defined on an infinite-dimensional Hilbert space. Infinite Hankel matrices have the form $A=(a_{n,m})_{n, m=0}^\infty $, where $a_{n, m}$ is a function of $n+m$.
One way to introduce Hankel matrices in a very natural way is to consider the multiplication operators on $\L $ associated to a measurable, bounded, complex valued function $f$ on the unit circle. That is, for every $\xi $ in $\L $ we let $$ \M (\xi )|_z = f(z)\xi (z), \quad \forall z\in \mathbb S^1, $$ so that $\M $ becomes a bounded operator on $\L $.
The matrix of $\M $ relative to the "Fourier basis" $\{e_n\}_{n\in {\mathbb Z}}$ of $\L $ (where each $e_n$ is defined by $e_n(z)=z^n$) is given by $$ a_{n,m}=\big \langle \M (e_m),e_n\big \rangle = \int_{\mathbb S^1} f(z)z^mz^{-n}\, dz = \int_{\mathbb S^1} f(z)z^{-(n-m)}\, dz = \widehat f(n-m). $$ Notice that the entries of this matrix depend on $n-m$, rather than $n+m$ as in the Hankel matrix case.
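One can check this computation numerically (here I assume the normalized measure $d\theta/2\pi$ on the circle, and use the hypothetical symbol $f(z) = z + z^{-2}$, for which $\widehat f(1) = \widehat f(-2) = 1$ and all other coefficients vanish, so $a_{n,m}$ should equal $1$ exactly when $n-m\in\{1,-2\}$):

```python
import numpy as np

# Compute a_{n,m} = <M_f e_m, e_n> by quadrature on equispaced points,
# which is exact for trigonometric polynomials of low degree.
theta = np.linspace(0, 2 * np.pi, 2048, endpoint=False)
z = np.exp(1j * theta)
f = z + z**(-2)                                   # hypothetical symbol
a = np.array([[np.mean(f * z**m * z**(-n)) for m in range(5)]
              for n in range(5)])
expected = np.array([[1.0 if n - m in (1, -2) else 0.0
                      for m in range(5)] for n in range(5)])
```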
Well, matrices indexed by ${\mathbb Z}\times {\mathbb Z}$, whose entries $a_{n, m}$ depend on $n-m$, as above, are called Laurent matrices.
It is a well known result that for any Laurent matrix $A=(a_{n,m})_{n, m\in {\mathbb Z}}$ corresponding to a bounded operator on $\L $, there is a measurable, bounded function $f$ such that $$ a_{n,m}=\widehat f(n-m).\qquad\qquad (*) $$
Another important class of operators is formed by the Toeplitz operators $\T $, also associated to measurable, bounded functions $f$, but this time $\T $ acts on the Hardy space $\H$, namely the subspace of $\L $ defined by $$ \H = \overline{\text{span}}\{e_n:n\geq 0\}. $$ Denoting by $P$ the orthogonal projection from $\L$ onto $\H$, one may define $\T $ as $$ \T =P\M |_{\H}. $$ The matrix of $\T $ is also given by the expression $(*)$, except that now the indices $n$ and $m$ range over ${\mathbb N}$, rather than ${\mathbb Z}$. One may therefore see a Toeplitz matrix as the bottom right corner of a Laurent matrix (as long as the rows are labeled with indices increasing downwards).
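The "corner" picture is easy to verify on a finite truncation; here is a sketch with a hypothetical coefficient function `fhat`:

```python
import numpy as np
from scipy.linalg import toeplitz

# Hypothetical Fourier coefficients of a symbol f (illustrative only).
def fhat(k):
    return {-1: 2.0, 0: 1.0, 1: 3.0}.get(k, 0.0)

N = 4
idx = np.arange(-N, N + 1)                          # finite truncation -N..N
laurent = np.array([[fhat(n - m) for m in idx] for n in idx])
corner = laurent[N:, N:]                            # rows/columns n, m >= 0

# The same matrix, built directly from (*) with n, m >= 0.
direct = toeplitz([fhat(n) for n in range(N + 1)],
                  [fhat(-m) for m in range(N + 1)])
```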
An important result states that any matrix indexed by ${\mathbb N}\times {\mathbb N}$, whose entries $a_{n, m}$ depend on $n-m$, and which corresponds to a bounded operator on $\H$ is necessarily given by $(*)$, where $f$ is a measurable, bounded function.
Yet another operator "embedded" in the multiplication operator $\M $ is the so-called Hankel operator $$ \Ha:\Hp \to \H $$ given by $\Ha=P\M |_{\Hp}$. Its matrix is again given by $(*)$, with the difference that its columns are now indexed by the negative integers, corresponding to the basis $$ \{e_n:n<0\} $$ of its domain $\Hp$, while the rows are indexed by the natural numbers.
The matrix of a Hankel operator is therefore the bottom left corner of the corresponding Laurent matrix.
Contrary to the Laurent and Toeplitz cases, a Hankel operator is not an operator from a Hilbert space to itself, as its domain is orthogonal to its range. Nevertheless, $\H$ is obviously isomorphic to $\Hp $, e.g. by the map sending each $e_m$ to $e_{-m-1}$, for every $m<0$. Identifying $\H $ and $\Hp$ via this isomorphism, $\Ha$ becomes an operator on $\H$, whose matrix is now indexed by ${\mathbb N}\times {\mathbb N}$, and each entry, suitably modified by the change of coordinates $m\rightarrow -m-1$, becomes $$ b_{n,m}= a_{n, -m-1} = \widehat f(n+m+1). \qquad\qquad (\dagger) $$
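The reindexing $m \to -m-1$ can also be checked on a finite truncation: relabeling the columns of the bottom-left corner of the Laurent matrix yields exactly the entries $\widehat f(n+m+1)$. A sketch with a hypothetical coefficient sequence:

```python
import numpy as np

# Hypothetical Fourier coefficients of a symbol f (illustrative only).
def fhat(k):
    return 1.0 / (1 + k * k)

N = 4
# Bottom-left corner of the Laurent matrix: rows n >= 0, columns m < 0.
corner = np.array([[fhat(n - m) for m in range(-N, 0)]
                   for n in range(N)])
# Relabel columns by m -> -m-1, i.e. put the column m = -1 first.
b = corner[:, ::-1]
expected = np.array([[fhat(n + m + 1) for m in range(N)] for n in range(N)])
```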
We therefore finally arrive at what I believe is the most natural introduction of the notion of Hankel matrices!
Observe that, while the entries of Laurent and Toeplitz matrices include (with many repetitions) all of the Fourier coefficients of the associated measurable function $f$, the same is no longer true of $(\dagger)$, as only positively indexed Fourier coefficients are used. This makes the characterization of bounded Hankel matrices a bit trickier, but it is still formally the same as before: Nehari's Theorem asserts that a Hankel matrix represents a bounded operator on $\H$ if and only if there exists a measurable, bounded function $f$ such that $(\dagger)$ holds.
Nonzero Laurent and Toeplitz operators are never compact, but of course Hankel operators can be. E.g., if $f$ is a trigonometric polynomial, it is clear that $\Ha$ is a finite rank operator.
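This finite-rank phenomenon is easy to see numerically; in the hypothetical example below the symbol's highest positive frequency is $3$, and the truncated Hankel matrix $b_{n,m}=\widehat f(n+m+1)$ has rank $3$, however large the truncation:

```python
import numpy as np

# Hypothetical trigonometric polynomial: fhat nonzero only at k = 1 and k = 3.
def fhat(k):
    return {1: 2.0, 3: -1.0}.get(k, 0.0)

N = 10
b = np.array([[fhat(n + m + 1) for m in range(N)] for n in range(N)])
rank = np.linalg.matrix_rank(b)   # stays 3 no matter how large N is
```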
Compact Hankel operators were characterized by Hartman in a result that states that a Hankel matrix represents a compact operator on $\H$ if and only if there exists a continuous function $f$ such that $(\dagger)$ holds.
I may be admonished by Stack Exchange users for expressing my personal opinion, but let me nevertheless say that the process of arbitrarily modifying a linear transformation between two different Hilbert spaces (for which no spectral theory is meaningful) into an operator from a Hilbert space into itself must have raised the wrath of the gods, leading to the terrible behaviour of the spectral properties of Hankel matrices!