Is normalized RBF always better than RBF?


The question is as in the title. Mathematically, I want to know whether the following inequality always holds for every vector $\mathbf b$:

$\mathbf b^T \mathbf B \mathbf B^+ \mathbf b \, \ge \, \mathbf b^T \mathbf H \mathbf H^+ \mathbf b$

Here $\mathbf H$ is an $N \times M$ matrix ($M < N$) with entries $h_{i j} = \exp\!\left(-\frac{\|\mathbf x_i - \mathbf c_j\|^2}{2 \sigma^2}\right)$, where $\mathbf x_i, \mathbf c_j \in \mathbb R^n$. The matrix $\mathbf B$ is the row-normalized version of $\mathbf H$, i.e. $b_{i j} = \frac{h_{i j}}{\sum_k h_{i k}}$. $\mathbf B^+$ and $\mathbf H^+$ are the Moore-Penrose pseudoinverses of $\mathbf B$ and $\mathbf H$ respectively, and the vector $\mathbf b$ has length $N$.
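For concreteness, here is a minimal numerical sketch of the setup; the data points, centers, sizes and $\sigma$ are arbitrary choices made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, n, sigma = 20, 5, 3, 1.0      # assumed sizes and kernel width

X = rng.standard_normal((N, n))     # data points x_i
C = rng.standard_normal((M, n))     # centers c_j

# h_ij = exp(-||x_i - c_j||^2 / (2 sigma^2))
sq_dists = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
H = np.exp(-sq_dists / (2 * sigma ** 2))

# b_ij = h_ij / sum_k h_ik  (row-normalized H)
B = H / H.sum(axis=1, keepdims=True)

b = rng.standard_normal(N)

lhs = b @ B @ np.linalg.pinv(B) @ b     # b^T B B^+ b
rhs = b @ H @ np.linalg.pinv(H) @ b     # b^T H H^+ b
print(lhs >= rhs, lhs, rhs)
```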

This question arises in the study of normalized RBF networks, where $\mathbf H$ denotes the output matrix of a standard RBF network and $\mathbf B$ denotes the output matrix of a normalized RBF network (NRBF). The wiki page gives a justification that NRBF makes sense; however, neither the wiki page nor the original paper proves that NRBF is always better than RBF. The original paper is "Normalized Gaussian Radial Basis Function Networks".

Denote the observed outputs on the training samples by $\mathbf b$. The best least-squares estimates produced by RBF and NRBF are $\mathbf H \mathbf H^+ \mathbf b$ and $\mathbf B \mathbf B^+ \mathbf b$ respectively, so we want the residual squared errors $\|\mathbf b - \mathbf H \mathbf H^+ \mathbf b\|^2$ and $\|\mathbf b - \mathbf B \mathbf B^+ \mathbf b\|^2$ to be as small as possible. Comparing these two errors reduces to the inequality stated at the top, as sketched below.
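The reduction uses the standard fact that $\mathbf A \mathbf A^+$ is the orthogonal projector onto the column space of $\mathbf A$, so $\mathbf I - \mathbf A \mathbf A^+$ is symmetric and idempotent. Hence, for $\mathbf A$ equal to either $\mathbf H$ or $\mathbf B$,

$\|\mathbf b - \mathbf A \mathbf A^+ \mathbf b\|^2 \,=\, \mathbf b^T (\mathbf I - \mathbf A \mathbf A^+) \mathbf b \,=\, \mathbf b^T \mathbf b - \mathbf b^T \mathbf A \mathbf A^+ \mathbf b,$

so the NRBF error is no larger than the RBF error exactly when $\mathbf b^T \mathbf B \mathbf B^+ \mathbf b \ge \mathbf b^T \mathbf H \mathbf H^+ \mathbf b$.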

I now know that the inequality does not always hold, as some counterexamples have been found. But it would be better to have an analytic proof, and it would be useful to know when RBF is better than normalized RBF and vice versa.
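In case it helps, below is a rough random-search sketch of the kind of numerical experiment that can turn up such counterexamples; the sizes, dimensions and the range of $\sigma$ are assumptions chosen only for illustration:

```python
import numpy as np

def proj_quad_form(A, b):
    # b^T (A A^+) b: squared norm of the orthogonal projection of b onto col(A)
    return float(b @ (A @ np.linalg.pinv(A) @ b))

rng = np.random.default_rng(1)
N, M, n = 10, 4, 2                          # assumed problem sizes
for trial in range(2000):
    sigma = rng.uniform(0.2, 3.0)           # assumed kernel-width range
    X = rng.standard_normal((N, n))         # data points x_i
    C = rng.standard_normal((M, n))         # centers c_j
    sq = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    H = np.exp(-sq / (2 * sigma ** 2))
    B = H / H.sum(axis=1, keepdims=True)    # row-normalized H
    b = rng.standard_normal(N)
    if proj_quad_form(B, b) < proj_quad_form(H, b) - 1e-10:  # tolerance for round-off
        print(f"counterexample at trial {trial}, sigma = {sigma:.3f}")
        break
```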