Hi all, I have been learning about kernel methods for a while, but I am still not entirely sure how they work. In my understanding, the procedure is as follows: say $f(x) = \sum_i\alpha_ik(x_i, x)$.
First, we need to decide which kernel to use. A common choice is the RBF kernel.
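For reference, here is a minimal sketch of the RBF kernel (the function name and the choice $\gamma = 0.5$ are just for illustration):

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """RBF (Gaussian) kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))

# k(x, x) is always 1, and the value decays with distance:
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))   # identical points
print(rbf_kernel([0.0], [2.0]))             # distance 2
```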
Then we compute all the $\alpha_i$'s from the training data. For this step, I would compute the kernel matrix on the training set and then solve for the $\alpha_i$'s.
At test time, for each test point $t$, we compute $k(x_i, t)$, where $x_i$ is the $i^{th}$ training point. Then, using the learned $\alpha_i$'s, we obtain the prediction $f(t)$.
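To make my understanding concrete, here is a sketch of the whole procedure. I am assuming kernel ridge regression here, where the $\alpha_i$'s have the closed form $\alpha = (K + \lambda I)^{-1} y$; the function names and the values of `gamma` and `lam` are my own choices for illustration:

```python
import numpy as np

def rbf_kernel_matrix(X, Z, gamma=0.5):
    """Pairwise RBF kernel values between rows of X and rows of Z."""
    # ||x - z||^2 = ||x||^2 + ||z||^2 - 2 x.z, computed for all pairs
    sq_dist = ((X**2).sum(axis=1)[:, None]
               + (Z**2).sum(axis=1)[None, :]
               - 2.0 * X @ Z.T)
    return np.exp(-gamma * sq_dist)

def fit_alpha(X, y, gamma=0.5, lam=1e-3):
    """Kernel ridge regression: solve (K + lam*I) alpha = y."""
    K = rbf_kernel_matrix(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, T, gamma=0.5):
    """f(t) = sum_i alpha_i k(x_i, t) for each test point t."""
    return rbf_kernel_matrix(T, X_train, gamma) @ alpha

# Toy 1-D regression problem
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 4.0])
alpha = fit_alpha(X, y)
print(predict(X, alpha, np.array([[1.5]])))
```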
If I am wrong somewhere, please let me know.
Your overall procedure can be correct (depending on the perspective from which we look at the problem), but many details are missing.
First of all, you should understand why we use kernels: a kernel implicitly projects the data points into a higher-dimensional space where either a linear hyperplane can separate the different classes, or we obtain richer features for regression.
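You can verify this implicit projection numerically for a kernel whose feature map is known in closed form. For the degree-2 polynomial kernel $k(x, z) = (x \cdot z)^2$ on 2-D inputs, the explicit map is $\phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$, and the kernel value equals the inner product in feature space (the function name `phi` is just for this illustration):

```python
import numpy as np

def phi(x):
    """Explicit feature map for k(x, z) = (x . z)^2 on 2-D inputs."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2.0) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 1.0])

k_val = (x @ z) ** 2        # kernel trick: O(d) work
feat  = phi(x) @ phi(z)     # explicit map: O(d^2) features
print(k_val, feat)          # both equal 25.0
```

This is the point of the kernel trick: you get the inner product in the high-dimensional space without ever materializing $\phi(x)$.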
One good starting point for understanding kernels is the general class of kernel SVMs. There are many good tutorials that explain the details of kernel SVMs and how they are derived from the dual of the primal SVM optimization problem (here is one: http://www.robots.ox.ac.uk/~az/lectures/ml/lect3.pdf).
The main detail you are missing is how we learn the $\alpha_i$'s. Once you have generated the kernel matrix and formed the optimization problem, you can use any quadratic programming solver to learn the $\alpha_i$'s. Currently, most state-of-the-art solvers use the SMO (Sequential Minimal Optimization) algorithm.
The LibSVM library is a great tool for kernel SVMs (especially if you don't want to implement SMO yourself).
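If you work in Python, scikit-learn's `SVC` is built on top of LibSVM, so you can try a kernel SVM without touching SMO directly. A minimal sketch (the toy data here is made up for illustration; after fitting, `clf.dual_coef_` holds the nonzero $\alpha_i y_i$ values for the support vectors):

```python
import numpy as np
from sklearn.svm import SVC

# Two tiny, well-separated clusters
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 2.0], [2.0, 3.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel='rbf', C=1.0, gamma='scale')  # RBF kernel, LibSVM backend
clf.fit(X, y)

print(clf.predict([[0.0, 0.5], [2.0, 2.5]]))
print(clf.dual_coef_)  # alpha_i * y_i for the support vectors
```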
I hope this helps.