Weighting functions in the local polar coordinate system


While I was reading the paper "Geometric deep learning on graphs and manifolds using mixture model CNNs", I didn't understand the figure of "patch operator weighting functions". Can someone explain to me clearly how these red curves relate to graphs and manifolds? Thank you. Here's the figure:

Best answer:

Any differentiable manifold is locally homeomorphic to Euclidean space. In other words, if we select a point on the manifold, then over very small distances the manifold around that point can be approximated by Euclidean space. It is then possible to parameterise the manifold with local polar coordinates $(\rho,\theta)$, which behave like ordinary polar coordinates in an infinitesimal region around the selected point.
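As a rough illustration (not taken from the paper), here is a minimal sketch of how local polar coordinates could be computed for a point on a surface by projecting its neighbours onto the tangent plane at that point. The function name and the tangent-plane projection are my own simplification of the geodesic construction used in these models.

```python
import numpy as np

def local_polar_coords(x_i, neighbors, normal):
    """Approximate local polar coordinates (rho, theta) of neighbouring points
    around x_i by projecting them onto the tangent plane at x_i.
    This is a crude stand-in for geodesic polar coordinates."""
    # Build an orthonormal basis (e1, e2) of the tangent plane at x_i.
    normal = normal / np.linalg.norm(normal)
    e1 = np.cross(normal, [1.0, 0.0, 0.0])
    if np.linalg.norm(e1) < 1e-8:          # the normal was parallel to the x-axis
        e1 = np.cross(normal, [0.0, 1.0, 0.0])
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(normal, e1)

    d = neighbors - x_i                     # displacement vectors to the neighbours
    u = d @ e1                              # tangent-plane coordinates
    v = d @ e2
    rho = np.hypot(u, v)                    # radial coordinate
    theta = np.arctan2(v, u) % (2 * np.pi)  # angular coordinate in [0, 2*pi)
    return rho, theta

# Example: a flat patch around the origin with the z-axis as the normal.
rho, theta = local_polar_coords(np.zeros(3),
                                np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
                                np.array([0.0, 0.0, 1.0]))
print(rho, theta)
```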

The models GCNN, ACNN and MoNet each use a differentiable manifold parameterised by local polar coordinates. Each has a family of weighting functions, called the patch operator weighting functions $w_i(\rho,\theta)$. Table $1$ in the paper gives $w_i(\rho,\theta)$ for ACNN and GCNN.

The red curves are $0.5$ level sets. That is to say, $w_i(\rho,\theta)=0.5$ along the red curves.
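To make the level-set picture concrete, here is a small sketch (my own, not from the paper) that evaluates a Gaussian-style weighting function on a local polar grid and finds where it crosses $0.5$; the particular kernel and its parameters are illustrative placeholders, not the exact GCNN/ACNN kernels from Table 1.

```python
import numpy as np

# Illustrative Gaussian weight centred at (rho_0, theta_0) in local polar coordinates.
def w(rho, theta, rho_0=0.5, theta_0=np.pi, sigma_rho=0.1, sigma_theta=0.5):
    return np.exp(-(rho - rho_0) ** 2 / (2 * sigma_rho ** 2)
                  - (theta - theta_0) ** 2 / (2 * sigma_theta ** 2))

rho = np.linspace(0.0, 1.0, 200)
theta = np.linspace(0.0, 2 * np.pi, 200)
R, T = np.meshgrid(rho, theta)
W = w(R, T)

# Grid points where w is close to 0.5: these trace out the "red curve" in the figure.
level = np.isclose(W, 0.5, atol=1e-2)
print(f"{level.sum()} grid points lie close to the 0.5 level set")
```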


Edit: The OP asked about the definition of MoNet

In section 4, the paper uses a weighting function of the form $w_j(\mathbf{u})=\exp\left(-\frac{1}{2}(\mathbf{u}-\boldsymbol{\mu}_j)^T\boldsymbol{\Sigma}_j^{-1}(\mathbf{u}-\boldsymbol{\mu}_j)\right)$ with $\boldsymbol{\Sigma}_j$ and $\boldsymbol{\mu}_j$ learnable (formula (11) in the paper). $\boldsymbol{\Sigma}_j$ is restricted to being a diagonal matrix.
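Here is a minimal sketch of formula (11), assuming pseudo-coordinates $\mathbf{u}=(\rho,\theta)$ and diagonal covariances; the function name, array shapes and random initialisation below are my own choices for illustration, not the authors' code.

```python
import numpy as np

def monet_weights(u, mu, sigma_diag):
    """Gaussian patch-operator weights w_j(u), as in formula (11) of the paper.

    u          : (N, d) pseudo-coordinates, e.g. d = 2 for (rho, theta)
    mu         : (J, d) learnable kernel means mu_j
    sigma_diag : (J, d) learnable diagonal entries (variances) of Sigma_j
    returns    : (N, J) matrix of weights w_j(u_i)
    """
    diff = u[:, None, :] - mu[None, :, :]                        # (N, J, d)
    # With Sigma_j diagonal, Sigma_j^{-1} is just a per-dimension scaling.
    mahalanobis = np.sum(diff ** 2 / sigma_diag[None, :, :], axis=-1)
    return np.exp(-0.5 * mahalanobis)

# Example: 5 pseudo-coordinates, 25 Gaussian kernels with random parameters.
rng = np.random.default_rng(0)
u = rng.uniform(size=(5, 2))
mu = rng.uniform(size=(25, 2))
sigma = rng.uniform(0.1, 1.0, size=(25, 2))
print(monet_weights(u, mu, sigma).shape)  # (5, 25)
```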

The paper then describes the neural network used to learn $\boldsymbol{\Sigma}_j$ and $\boldsymbol{\mu}_j$ and the procedure used to train it. The Adam optimisation method is described in the following paper: https://arxiv.org/abs/1412.6980

The relevant experimental details, quoted from the paper, are:

"LeNet used 2×2 max-pooling; in ChebNet and MoNet we used three convolutional layers, interleaved with pooling layers based on the Graclus method [16] to coarsen the graph by a factor of four."

"For MoNet, we used polar coordinates $\mathbf{u} = (\rho,\theta)$ of pixels (respectively, of superpixel barycenters) to produce the patch operator; as the weighting functions of the patch operator, 25 Gaussian kernels (initialized with random means and variances) were used. Training was done with 350K iterations of Adam method [25], initial learning rate $10^{-4}$, regularization factor $10^{-4}$, dropout probability 0.5, and batch size of 10."
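Putting the quoted setup together, here is a rough sketch (my own reconstruction, not the authors' code) of how polar pseudo-coordinates of pixel offsets could feed the 25 Gaussian kernels of the patch operator; the neighbourhood construction and all names below are illustrative assumptions.

```python
import numpy as np

def polar_pseudo_coords(offsets):
    """Map 2D pixel offsets (dx, dy) to polar pseudo-coordinates u = (rho, theta)."""
    dx, dy = offsets[:, 0], offsets[:, 1]
    return np.stack([np.hypot(dx, dy), np.arctan2(dy, dx) % (2 * np.pi)], axis=-1)

# 3x3 neighbourhood offsets around a pixel (excluding the centre pixel itself).
offsets = np.array([(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)], dtype=float)
u = polar_pseudo_coords(offsets)                                  # (8, 2)

# 25 Gaussian kernels with random means and variances, as in the quoted setup.
# (Training would then tune these with Adam: lr 1e-4, weight decay 1e-4,
#  dropout 0.5, batch size 10, per the quoted hyperparameters.)
rng = np.random.default_rng(0)
mu = rng.uniform(size=(25, 2))
sigma = rng.uniform(0.1, 1.0, size=(25, 2))
weights = np.exp(-0.5 * np.sum((u[:, None, :] - mu) ** 2 / sigma, axis=-1))  # (8, 25)
print(weights.shape)
```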