Is there any relationship between the two different kernels, since they are both called 'kernel'?
Given a kernel $k:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$, the MMD distance between two distributions $P$ and $Q$ is $$\mathrm{MMD}^2(P,Q):=\mathbb{E}_{X,X'\sim P}[k(X,X')]+\mathbb{E}_{Y,Y'\sim Q}[k(Y,Y')]-2\,\mathbb{E}_{X\sim P,\,Y\sim Q}[k(X,Y)],$$ which equals $\|\mu_P-\mu_Q\|_H^2$, where $\mu$ denotes the mean embedding and $H$ is the RKHS corresponding to $k$.
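To make this concrete, here is a minimal sketch of the plug-in (biased) estimator of $\mathrm{MMD}^2$ from samples, which replaces each expectation with an empirical mean; the Gaussian kernel and the bandwidth `sigma` are my own choices for illustration:

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix: K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2)).
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    # Biased estimate of MMD^2(P, Q) from samples X ~ P and Y ~ Q:
    # mean of k(X, X') + mean of k(Y, Y') - 2 * mean of k(X, Y).
    Kxx = gaussian_kernel(X, X, sigma)
    Kyy = gaussian_kernel(Y, Y, sigma)
    Kxy = gaussian_kernel(X, Y, sigma)
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()
```

Because this plug-in version is exactly $\|\hat\mu_P-\hat\mu_Q\|_H^2$ for the empirical mean embeddings, it is always nonnegative, and it is zero when the two samples coincide.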
As for the kernel method, we usually mean finding a regression function (or classifier). Given a dataset $\{(X_i,y_i)\}_i$, let $\mathbb{K}_{i,j}=k(X_i,X_j)$ be the Gram matrix; then the regression function is $f(x)=k(x,X)\mathbb{K}^{-1}y$, where $[k(x,X)]_i:=k(x,X_i)$, $y=(y_1,y_2,\dots)$, and $\mathbb{K}^{-1}$ denotes the pseudo-inverse.
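The formula above can be sketched directly (a toy implementation, not a production one; the Gaussian kernel, the bandwidth `sigma`, and the function names are my own assumptions). With the pseudo-inverse and no regularization, $f$ interpolates the training targets whenever $\mathbb{K}$ is invertible:

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of A and rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def fit_predict(X_train, y_train, X_test, sigma=1.0):
    # f(x) = k(x, X) K^+ y, with K^+ the Moore-Penrose pseudo-inverse of the Gram matrix.
    K = gaussian_kernel(X_train, X_train, sigma)
    alpha = np.linalg.pinv(K) @ y_train
    return gaussian_kernel(X_test, X_train, sigma) @ alpha
```

In practice one usually replaces $\mathbb{K}^{-1}$ by $(\mathbb{K}+\lambda I)^{-1}$ (kernel ridge regression), since the Gram matrix is often ill-conditioned.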