Conceptual Understanding of Kernels


In the previous thread (Difference between kernel and function?), the question of how a kernel differs from a function came, to my mind, to an unclear conclusion.

Am I right in thinking that a kernel is the property of certain functions to map from one space to another? Or am I grossly missing the point? My current professors tend to throw the term around, but I've never clearly understood their meaning.

If someone has an explanation to help my understanding I would greatly appreciate the help.

-Drew


There are 3 best solutions below

2
On BEST ANSWER

In mathematics the word kernel is used for two completely different purposes:

In algebra: Given a homomorphism $f:\>G\to H$ between groups, the kernel of $f$ is the set of all $x\in G$ that map to the identity element $e\in H$. In particular, if $f:\> V\to W$ is a linear map between vector spaces, the kernel of $f$ is the subspace $K\subset V$ consisting of the vectors $x\in V$ that are mapped to $0\in W$.
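As a small illustration of the algebraic meaning, here is a minimal sketch in plain Python. The matrix `A`, the vectors `u`, `v`, `w`, and the helper names `apply_map` and `in_kernel` are made up for this example; `A` represents a linear map from $\mathbb R^3$ to $\mathbb R^2$. The sketch checks that some vectors are sent to $0$, and that a linear combination of kernel vectors stays in the kernel, as it must for a subspace.

```python
def apply_map(matrix, x):
    """Apply the linear map given by `matrix` (a list of rows) to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]


def in_kernel(matrix, x, tol=1e-12):
    """True if x is mapped to the zero vector (up to a small tolerance)."""
    return all(abs(c) <= tol for c in apply_map(matrix, x))


# Hypothetical example: a linear map f: R^3 -> R^2 (second row = 2 * first row).
A = [[1, 2, 3],
     [2, 4, 6]]

u = [-2, 1, 0]   # A u = 0
v = [-3, 0, 1]   # A v = 0
w = [3 * ui - 5 * vi for ui, vi in zip(u, v)]   # a linear combination of u and v

print(in_kernel(A, u), in_kernel(A, v), in_kernel(A, w))
print(in_kernel(A, [1, 0, 0]))   # this vector is mapped to (1, 2), not to 0
```

Note that the kernel here is a set of inputs (a subspace of $\mathbb R^3$), not a function, which is what separates this meaning from the analytic one.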

In analysis or mathematical physics we often deal with situations of the following kind: We are given a function $$k:\quad \>A\times B\to{\mathbb R},\qquad (x,y)\mapsto k(x,y)\ ,$$ defined on some Cartesian product $A\times B$. Integrating over $A$ then assigns to any function $$f:\quad A\to{\mathbb R},\qquad x\mapsto f(x)$$ a new function $f^T$ defined on $B$ in the following way: $$f^T(y):=\int_A k(x,y) f(x)\ dx\qquad(y\in B)\ .\tag{1}$$ When the function $k$ is used in this way it is called the kernel of this transformation. Examples are the kernel $k(t,s):=e^{-st}$ used in the Laplace transform, or kernels of the form $k(x,y):={1\over|x-y|^\alpha}$ occurring in differential geometry or potential theory. A last example: When the first factor $g$ in a convolution $g*f$ is fixed once and for all and only the second factor $f$ is considered "variable", then the function $k(t,x):=g(x-t)$ can be considered as a "kernel" in the above sense.
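To make formula $(1)$ concrete, here is a rough numerical sketch in plain Python. The helper name `integral_transform`, the midpoint rule, the step count, and the truncation of $[0,\infty)$ to $[0,40]$ are all choices made for this illustration, not part of the definition. It applies the Laplace kernel $k(t,s)=e^{-st}$ to $f(t)=e^{-t}$, whose exact Laplace transform is $1/(s+1)$.

```python
import math


def integral_transform(kernel, f, a, b, n=100_000):
    """Return the function y -> integral over [a, b] of kernel(x, y) * f(x) dx,
    approximated with the midpoint rule on n subintervals."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]

    def transformed(y):
        return h * sum(kernel(x, y) * f(x) for x in nodes)

    return transformed


def laplace_kernel(t, s):
    """The kernel k(t, s) = exp(-s t) of the Laplace transform."""
    return math.exp(-s * t)


def f(t):
    """Input function; its exact Laplace transform is 1 / (s + 1)."""
    return math.exp(-t)


# Assumption: the tail beyond t = 40 is negligible for this f and s > 0.
F = integral_transform(laplace_kernel, f, 0.0, 40.0)

for s in (1.0, 3.0):
    print(s, F(s), 1.0 / (s + 1.0))   # numerical value next to the closed form
```

The same `integral_transform` works with any other kernel function of two arguments. Only `kernel` changes, which is exactly the sense in which $k$ "is" the transformation.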

The function $k(\cdot,\cdot)$ in these examples works like a "continuous version" of a matrix: Given an $m\times n$ matrix $[t_{ik}]$ we obtain for any input vector $x=(x_1,\ldots,x_n)$ (which can be thought of as a function on $[n]$) an output vector $x^T=(y_1,\ldots,y_m)$ (a function on $[m]$; here $x^T$ denotes the transformed vector, in the same way as $f^T$ above, not a transpose) by means of the following formula, which is completely analogous to $(1)$: $$y_i=\sum_{k=1}^n t_{ik}\>x_k\qquad(1\leq i\leq m)\ .$$
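The analogy can be made explicit with a Riemann sum. As a sketch, assume $A$ is a bounded interval sampled at points $x_1,\ldots,x_n$ with spacing $\Delta x$, and pick output points $y_1,\ldots,y_m\in B$. The sample points and the summation index $j$ (used to avoid a clash with the kernel $k$) are notation introduced here. Then $(1)$ becomes, approximately, $$f^T(y_i)\approx\sum_{j=1}^n k(x_j,y_i)\,f(x_j)\,\Delta x=\sum_{j=1}^n t_{ij}\,f(x_j)\ ,\qquad t_{ij}:=k(x_j,y_i)\,\Delta x\qquad(1\leq i\leq m)\ .$$ So sampling the kernel on a grid produces an $m\times n$ matrix, and the integral transform turns into an ordinary matrix-vector product applied to the sampled values of $f$.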

3
On

The word kernel has several meanings in mathematics. This was already remarked in the different answers to the question Difference between kernel and function?. If you say the conclusion was unclear, you should clarify which kernel you mean.
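For me, a conceptual understanding of the kernel comes from the categorical kernel: in a category with zero morphisms, the kernel of a morphism $f\colon X\rightarrow Y$ is itself a morphism, namely the "most general" morphism $k\colon K\rightarrow X$ that yields zero when composed with $f$ (i.e., with $f\circ k=0$). Here "most general" means that every other morphism with this property factors uniquely through $k$.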

0
On

Another field where kernels are used is optimization theory (or machine learning). There, kernels are functions defined on some set $D\times D$, where $D$ is typically required to be merely a non-empty set. Kernels are usually used as similarity measures. We use them when a mapping from some vector space (say $n$-dimensional) into another, often richer, $d$-dimensional space is needed. Usually $d$ is greater than $n$, typically $d\gg n$, and the target space may even be infinite-dimensional. Kernels need to be positive-definite; each such kernel then determines a unique reproducing kernel Hilbert space. For instance, if $D=\Bbb{R}^n$, then we can define the Radial Basis Function (RBF) kernel $k\colon\Bbb{R}^n\times\Bbb{R}^n\to\Bbb{R}$ by $$ k(x,x')=\exp(-\gamma\|x-x'\|^2), $$ where $\gamma$ is some positive parameter. This function measures the similarity of $x$ and $x'$, but essentially represents the inner product $\langle\phi(x),\phi(x')\rangle$, where $\phi$ maps $\Bbb{R}^n$ to an infinite-dimensional space. The trick is that, whenever a dot product appears (i.e., $x\cdot x'$), we can replace it by $\langle\phi(x),\phi(x')\rangle$, and thus by $k(x,x')$, without even knowing the mapping $\phi$ explicitly.
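Here is a minimal sketch of this idea in plain Python. The feature map of the RBF kernel is infinite-dimensional and cannot be written out, so the identity $k(x,x')=\langle\phi(x),\phi(x')\rangle$ is checked on a different, simpler kernel: the degree-2 polynomial kernel $k(x,x')=(x\cdot x')^2$ on $\Bbb{R}^2$, whose feature map $\phi(x)=(x_1^2,\sqrt2\,x_1x_2,x_2^2)$ is known explicitly. The function names (`rbf_kernel`, `poly2_kernel`, `poly2_features`), the sample points, and the value $\gamma=0.5$ are all chosen for this illustration.

```python
import math


def dot(x, y):
    """Ordinary dot product of two equally long sequences."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel k(x, x') = exp(-gamma * ||x - x'||^2); gamma is an arbitrary choice."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def poly2_kernel(x, y):
    """Degree-2 polynomial kernel k(x, x') = (x . x')^2 (stand-in for the RBF kernel)."""
    return dot(x, y) ** 2


def poly2_features(x):
    """Explicit feature map R^2 -> R^3 belonging to poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x = (1.0, 2.0)
y = (3.0, -1.0)

# Kernel trick: the kernel value in R^2 equals an inner product in the richer space R^3.
print(poly2_kernel(x, y), dot(poly2_features(x), poly2_features(y)))

# The RBF kernel behaves like a similarity: equal to 1 for identical points,
# and smaller the farther apart the points are.
print(rbf_kernel(x, x), rbf_kernel(x, y), rbf_kernel(x, (10.0, 10.0)))
```

An algorithm that only ever uses dot products can therefore call `poly2_kernel` (or `rbf_kernel`) directly and never needs to evaluate the feature map itself.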