The mutual information is defined on random variables. That is, $I(X;Y)$ denotes the mutual information between random variables $X$ and $Y$.
On the other hand, the Kullback-Leibler divergence is defined on probability distributions; i.e., $D_{\mathrm{KL}}(P\|Q)$ denotes the KL divergence of the probability distribution $P$ with respect to the probability distribution $Q$.
Why aren't $I$ and $D_{\mathrm{KL}}$ both defined on random variables (or both defined on probability distributions)?
I think it is actually natural to use different notation in this case. Mutual information is defined on the joint distribution of $X$ and $Y$, and if it were written as a function of the joint, say $I(p(X,Y))$, there could be confusion in higher dimensions. Suppose $X = [x_1, \ldots, x_n]$ and $Y = [x_{n+1}, \ldots, x_{n+m}]$. Then writing $I(p(x_1, \ldots, x_{n+m}))$ would be ambiguous without specifying which variables form $X$ and which form $Y$.
On the other hand, the Kullback-Leibler divergence compares two distributions of $X$ and $Y$ taken separately; any joint dependence between them is ignored. Hence writing $D_{\mathrm{KL}}(X\|Y)$ would be confusing if the variables are in fact dependent, since the notation would suggest the joint matters when it does not. In this case writing the distributions themselves makes more sense.
And of course, you know that mutual information is just the Kullback-Leibler divergence of the joint distribution with respect to the product of marginals:
$$I(X;Y) = D_{\mathrm{KL}}\big(p(X,Y)\,\|\,p(X)\,p(Y)\big).$$
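That identity is easy to check numerically. Below is a minimal sketch, assuming a small made-up joint distribution over two binary variables (the distribution and the helper function names are illustrative, not from the question):

```python
import numpy as np

# Hypothetical joint distribution p(X, Y) over two binary variables.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

# Marginals: sum out the other variable.
p_x = p_xy.sum(axis=1)  # p(X)
p_y = p_xy.sum(axis=0)  # p(Y)

# Product of marginals p(X)p(Y), the distribution X and Y would
# have if they were independent.
p_prod = np.outer(p_x, p_y)

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i P(i) * log(P(i) / Q(i)), in nats."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    mask = p > 0  # 0 * log(0/q) = 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Mutual information as the KL divergence of the joint from the
# product of marginals.
mi = kl_divergence(p_xy, p_prod)
```

With a joint of this kind (off-diagonal mass smaller than diagonal mass), the variables are dependent and `mi` comes out strictly positive; for an independent joint it would be exactly zero.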