Suppose that $K_1(x,y)$ and $K_2(x,y)$ are valid kernels prove that $K_1(x,y)K_2(x,y)$ is a valid kernel.

214 Views Asked by At

Suppose that $K_1(x,y)$ and $K_2(x,y)$ are valid kernels meaning that the can be expressed in the form of \begin{align*} K_1(x,y) = \phi_1(x)^\top \phi_1(y) &= \sum_{i = 1}^n\phi_1(x)_i \phi_1(y)_i\\ K_2(x,y) = \phi_2(x)^\top \phi_2(y) \end{align*} (source: Wikipedia: Kernel Method, paragraph "the kernel trick")

Prove that $K_1(x,y)K_2(x,y)$ is a valid kernel.

The following derivation is given: \begin{align*} K_1(x,y) = \phi_1(x)^\top \phi_1(y) &= \sum_{i = 1}^n\phi_1(x)_i \phi_1(y)_i\\ K_2(x,y) = \phi_2(x)^\top \phi_2(y) &= \sum_{j = 1}^m\phi_2(x)_j \phi_2(y)_j\\ K_1(x,y)K_2(x,y) &= \sum_{i = 1}^n\phi_1(x)_i \phi_1(y)_i\sum_{j = 1}^m\phi_2(x)_j \phi_2(y)_j\\ &= \sum_{i = 1}^n\sum_{j = 1}^m\big[\phi_1(x)_i \phi_1(y)_i\big]\big[\phi_2(x)_j \phi_2(y)_j\big]\\ &= \sum_{i = 1}^n\sum_{j = 1}^m\big[\phi_1(x)_i \phi_2(x)_j\big]\big[\phi_1(y)_i \phi_2(y)_j\big]\\ &= \sum_{k = 1}^{mn}\phi_{12}(x)_k \phi_{12}(y)_k\\ &= \phi_{12}(x)^\top \phi_{12}(y)\\ \end{align*}

Edit (This chunk is probably wrong but left as originally posted in order to be able to understand the comments... in case...)

However, I think there are 2 issues with this notation

  • to me, as $n = m$, we can replace $m$ by $n$
  • moreover, to me $$ \phi_{12}(x) = \sum_{i = 1}^n\phi_1(x)_i \phi_2(y)_i = \phi_1(x)^\top \phi_2(y) $$

hence the whole derivation should be written as \begin{align*} K_1(x,y) = \phi_1(x)^\top \phi_1(y) &= \sum_{i = 1}^n\phi_1(x)_i \phi_1(y)_i\\ K_2(x,y) = \phi_2(x)^\top \phi_2(y) &= \sum_{j = 1}^{\color{red}n}\phi_2(x)_j \phi_2(y)_j\\ K_1(x,y)K_2(x,y) &= \sum_{i = 1}^n\phi_1(x)_i \phi_1(y)_i\sum_{j = 1}^m\phi_2(x)_j \phi_2(y)_j\\ &= \sum_{i = 1}^n\sum_{j = 1}^{\color{red}n}\big[\phi_1(x)_i \phi_1(y)_i\big]\big[\phi_2(x)_j \phi_2(y)_j\big]\\ &= \sum_{i = 1}^n\sum_{j = 1}^{\color{red}n}\big[\phi_1(x)_i \phi_2(x)_j\big]\big[\phi_1(y)_i \phi_2(y)_j\big]\\ &= \sum_{k = 1}^{\color{blue}n}\phi_{12}(x)_k \phi_{12}(y)_k\\ &= \phi_{12}(x)^\top \phi_{12}(y)\\ \end{align*} What's correct?

Edit (and Answer, thanks to @GReyes)

where

  • the definition of $\phi_{12}$ is: \begin{align*} \phi_{12}(x)_k &= \phi_1(x)_i\phi_2(x)_j\\ \phi_{12}(x)&= \begin{bmatrix} \phi_{12}(x)_1, ..., \phi_{12}(x)_k \end{bmatrix} \end{align*} where \begin{align*} &\phi_{12}(x)_s = \phi_{1}(x)_i\phi_{2}(x)_j\\ &i \in \{1, ..., n\}\\ &j \in \{1, ..., m\}\\ &s \in \{1, ..., k\}\\ &k = nm \end{align*}

  • As $\phi_1$ and $\phi_2$ can map to different spaces $\mathbb{R}^n$ and $\mathbb{R}^m$. Hence $n=m$ and $n\ne m$ are both possible. Hence, $n\ne m$, $\phi_{12}$ itself cannot be expressed as a dot product