Intuition behind the Eigenvectors of different matrices, when applied to compression and reconstruction.


In spectral mesh processing we often transform a 3D mesh from coordinate space (i.e., x, y, z) into an eigenbasis. The original mesh can then be reconstructed by a simple linear transformation using the eigenvectors.

You can find the relevant information, if needed, in section 1.3 of the following conference course: https://www.cse.iitb.ac.in/~cs749/spr2017/handouts/levy_spectral.pdf

The eigenvectors can come from any symmetric matrix that describes the mesh, such as its adjacency matrix. If it helps, you can think of the mesh as just a graph with vertices and edges; the same theory applies.

Just as in image compression, this helps because you can reconstruct a good approximation of the original mesh using only a few of the components with the lowest eigenvalues. For a 3D mesh, the eigenvectors used are those of the mesh Laplacian; the Laplace-Beltrami operator (or matrix) is a refined variant of the raw Laplacian. Fundamentally, any positive semi-definite matrix describing the mesh can be used to obtain an eigenbasis for compression/reconstruction.
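To make this concrete, here is a minimal sketch of the compress/reconstruct step on a made-up toy graph, using the combinatorial Laplacian L = D - A (not the Laplace-Beltrami operator); the graph and the vertex coordinates are illustrative assumptions, not anything from the course notes:

```python
import numpy as np

# Illustrative sketch: spectral compression of vertex coordinates using the
# combinatorial graph Laplacian L = D - A. All data here is made up.

# A tiny "mesh" treated as a graph: 6 vertices, edges of a triangulated strip.
n = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 2), (1, 3), (2, 4), (3, 5)]
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A  # combinatorial Laplacian (symmetric, PSD)

# eigh returns eigenvalues in ascending order, so the first columns of U
# are the "low-frequency" basis vectors.
eigvals, U = np.linalg.eigh(L)

# Vertex coordinates (n x 3): each column (x, y, z) is a signal on the graph.
X = np.random.default_rng(0).normal(size=(n, 3))

k = 3                       # keep only the k lowest-frequency components
coeffs = U[:, :k].T @ X     # "compressed" spectral coefficients (k x 3)
X_rec = U[:, :k] @ coeffs   # smooth low-frequency reconstruction

print("reconstruction error:", np.linalg.norm(X - X_rec))
```

With k = n the reconstruction is exact (U is orthogonal); shrinking k discards the high-frequency detail first.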

However, different matrices perform differently. For example, you will typically need fewer components to reconstruct the mesh with the Laplace-Beltrami matrix than with the plain Laplacian, and a simple adjacency matrix may perform even worse, requiring more components. Another type of matrix used is the Hessian of some energy function computed over the mesh.

My question is: what is the geometric intuition behind some operators/matrices being better suited than others? The same can be asked for image reconstruction.

I'd like to know of any resources or examples that explain this: what, intuitively, is the difference between the eigenbases constructed from different image or mesh operators?

There is 1 answer below.

I'm sorry for sweeping some things under the rug in what follows, and for a not-so-intuitive "intuitive explanation". I only really know anything about compression, so I'll focus my attention there. I also think there's about a 5% chance I'm not addressing what you asked about, but here goes nothing!

Compression (generally) deals with random signals. Consider the problem of trying to compress many IID (jointly) Gaussian random vectors with zero mean and some covariance matrix. In lossy compression, you generally try to compress to as few bits as possible subject to a distortion constraint. Intuitively, allowing more distortion generally allows you to compress to fewer bits. The study of lossy compression is also known as "rate-distortion theory" because of the focus on this tradeoff.

In the Gaussian compression setting, the canonical distortion measure is mean squared error (the expected squared Euclidean distance between your source (input) and reconstruction (decompressed output)). It turns out that the optimal tradeoff is (relatively) easy to describe if the covariance matrix of the Gaussian is diagonal (meaning the components are independent). More or less, the math tells you that the optimal compressor for our problem should "drop" all components with variance below some computable constant and have the decompressor output "0" for those components. For the remaining components, the optimal compressor allocates more bits to the components with higher variance.
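For what it's worth, that "computable constant" is the water level in the reverse water-filling solution for independent Gaussian components (Cover & Thomas, ch. 10). Here is a rough numerical sketch; the variances and distortion budget are made-up numbers, and `reverse_waterfill` is just an illustrative helper, not a standard API:

```python
import numpy as np

# Sketch of reverse water-filling for independent Gaussian components.
# Components with variance below the water level are dropped (reconstructed
# as 0); the rest get rate proportional to log(variance / water level).

def reverse_waterfill(variances, total_distortion):
    """Find the water level lam with sum(min(lam, var_i)) = total_distortion."""
    lo, hi = 0.0, max(variances)
    for _ in range(100):  # bisection: the sum is increasing in lam
        lam = 0.5 * (lo + hi)
        if sum(min(lam, v) for v in variances) < total_distortion:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    dists = [min(lam, v) for v in variances]
    rates = [0.5 * np.log2(v / d) if v > lam else 0.0
             for v, d in zip(variances, dists)]
    return lam, dists, rates

variances = [4.0, 1.0, 0.25, 0.01]
lam, dists, rates = reverse_waterfill(variances, total_distortion=0.5)
# The lowest-variance component falls below the water level and gets zero
# rate; higher-variance components get strictly more bits.
```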

Now assume you're trying to compress IID Gaussian random vectors with a general covariance matrix $\mathbf{K}$. Since the covariance is PSD, we can always write $\mathbf{K} = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^{T}$, where $\mathbf{U}$ is an orthogonal matrix and $\mathbf{\Lambda}$ is diagonal. If $\mathbf{x}_t \sim \mathcal{N}(\mathbf{0},\mathbf{K})$ and $\mathbf{y}_t = \mathbf{U}^T\mathbf{x}_t$, then $\mathbf{y}_t\sim\mathcal{N}(\mathbf{0},\mathbf{\Lambda})$. So if our compressor first transforms the $\mathbf{x}_t$s into the basis where the covariance is diagonal (by multiplying by $\mathbf{U}^T$), we can then focus on compressing the $\mathbf{y}_t$s, which is easier since they have a diagonal covariance matrix. If $\mathbf{\hat{y}}_t$ denotes the reconstructed $\mathbf{y}_t$, we can compute the reconstructed $\mathbf{x}_t$ via $\mathbf{\hat{x}}_t = \mathbf{U}\mathbf{\hat{y}}_t$.
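A quick numerical sanity check of this decorrelation step, with a randomly generated $\mathbf{K}$ (all numbers here are made up for illustration): after multiplying samples by $\mathbf{U}^T$, the sample covariance is approximately diagonal, with the eigenvalues of $\mathbf{K}$ on the diagonal.

```python
import numpy as np

# Sketch of the decorrelating (Karhunen-Loeve) transform described above.
# K is an arbitrary randomly generated PSD covariance matrix.

rng = np.random.default_rng(1)
B = rng.normal(size=(3, 3))
K = B @ B.T                              # random PSD covariance

eigvals, U = np.linalg.eigh(K)           # K = U diag(eigvals) U^T
X = rng.multivariate_normal(np.zeros(3), K, size=50000)  # rows are x_t
Y = X @ U                                # each row is y_t = U^T x_t

sample_cov = np.cov(Y, rowvar=False)
# Off-diagonal entries of sample_cov are close to zero, and the diagonal
# approaches the eigenvalues of K as the sample size grows.
```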

Often in compression, the general technique of describing a signal in some spectral basis is essentially an attempt to make the problem look more like compressing Gaussian random vectors with a diagonal covariance matrix. In general we don't know the covariance matrix of our source, so finding a basis where the covariance is diagonal (or, if you like, sparse) is hard. For images, it turns out that transforms like the discrete cosine transform work well for this purpose; for different classes of images, different transforms may work better or worse.

Another (probably more important) reason spectral decompositions are used in compression is that the mean-squared-error distortion metric isn't that useful when dealing with things like multimedia signals. For me the audio example is easiest to grasp: we can't hear above and below certain frequencies, so simply "deleting" the spectral content at those frequencies doesn't actually increase the perceptible "distortion". For images, I'd imagine this is even more complicated.

If you have any knowledge of information theory, you should check out chapter 10 in Elements of Information Theory by Cover and Thomas. If you've never done any info theory before, I guess you might be able to figure out the general idea of chapter 10 if you read chapters 2, 8, 9 and 10. The book "Digital Signal Compression" by Pearlman is probably quicker to the punch here.