From page 9 of these course notes,
A function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a kernel if
$k$ is symmetric: $k(x,y) = k(y,x)$
$k$ gives rise to a positive semi-definite "Gram matrix," i.e., for any $m \in \mathbb{N}$ and any $x_1,\dots,x_m$ chosen from $\mathcal{X}$, the Gram matrix $\mathbf{K}$ defined by $K_{ij} = k(x_i,x_j)$ is positive semi-definite.
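To make the definition concrete, here is a small sketch (my own illustration, not from the notes) using the Gaussian/RBF kernel as an example: we pick a few points from $\mathcal{X} = \mathbb{R}$, build the Gram matrix $K_{ij} = k(x_i, x_j)$, and check symmetry and positive semi-definiteness numerically.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Gaussian (RBF) kernel: k(x, y) = exp(-gamma * (x - y)^2)
    return np.exp(-gamma * (x - y) ** 2)

# Sample points x_1, ..., x_m from X = R (an arbitrary choice for illustration)
xs = np.array([-1.3, 0.0, 0.7, 2.1])

# Gram matrix K with K_ij = k(x_i, x_j), built via broadcasting
K = rbf_kernel(xs[:, None], xs[None, :])

# Symmetry: K = K^T
assert np.allclose(K, K.T)

# Positive semi-definiteness: all eigenvalues >= 0
# (up to floating-point tolerance)
assert np.all(np.linalg.eigvalsh(K) >= -1e-10)
```

Any choice of points and any $m$ must pass these checks for $k$ to qualify as a kernel; a single failing Gram matrix disqualifies it.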
From this definition, it seems that a kernel can be thought of as a "continuous" matrix $\mathbf{A}$ with infinitely many rows and columns, indexed by $\mathcal{X}$, where $k(x,y)$ is the entry of $\mathbf{A}$ in "row" $x$ and "column" $y$. Under this reading, $\mathbf{A}$ is symmetric and positive semi-definite. Furthermore, on page 14 of the same course notes, the author writes
Mercer's Theorem
The inspiration for the name "kernel" comes from the study of integral operators, developed by Hilbert and others. A function $k$ that gives rise to an operator $T_k$ via $$ (T_k f)(x) = \int_\mathcal{X} k(x,x') f(x') \ \text{d}x' $$ is called the kernel of $T_k$.
The expression for $(T_k f)(x)$ looks like an infinite matrix-vector product: the infinitely many "columns" of $\mathbf{A}$ (the functions $k(\cdot, x')$) are linearly combined with weights given by the values $f(x')$.
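The matrix-vector analogy can be tested numerically (again my own sketch, not from the notes): discretize $\mathcal{X} = [0,1]$ on a grid, so the "continuous matrix" $\mathbf{A}$ becomes a finite Gram matrix on the grid points, $f$ becomes a vector of samples, and the integral becomes a Riemann sum, i.e. an ordinary matrix-vector product scaled by the grid spacing. The choice of kernel and of $f$ here is arbitrary.

```python
import numpy as np

def k(x, y):
    # RBF kernel, chosen only as a concrete example
    return np.exp(-(x - y) ** 2)

# Discretize X = [0, 1]; A is the "continuous matrix" restricted to the grid
m = 200
grid = np.linspace(0.0, 1.0, m)
dx = grid[1] - grid[0]
A = k(grid[:, None], grid[None, :])

# The "vector" f: samples of a function on the grid
f = np.sin(2 * np.pi * grid)

# (T_k f)(x_i) = \int k(x_i, x') f(x') dx'  ≈  sum_j k(x_i, x_j) f(x_j) dx,
# which is exactly a matrix-vector product scaled by dx
Tf = A @ f * dx

# Row i of the product agrees with the Riemann sum at x_i
i = 50
riemann = np.sum(k(grid[i], grid) * f) * dx
assert np.isclose(Tf[i], riemann)
```

As the grid is refined ($m \to \infty$, $dx \to 0$), the scaled matrix-vector product converges to the integral, which is the precise sense in which $T_k$ acts like an "infinite matrix" applied to $f$.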
Is this way of thinking about a kernel correct?