I am learning about the cross entropy, defined by Wikipedia as $$H(P,Q)=-\text{E}_P[\log Q]$$ for distributions $P,Q$.
I'm not happy with that notation, for several reasons: it suggests symmetry in $P$ and $Q$, the notation $H(X,Y)$ is commonly used for the joint entropy, and lastly, I want a notation which is consistent with the notation for entropy: $$H(X)=-\text{E}_P[\log P(X)]$$
When dealing with multiple distributions, I like to write $H_P(X)$ so it's clear with respect to which distribution I'm taking the entropy. When dealing with multiple random variables, I think it's sensible to make explicit the random variable with respect to which the expectation is taken by using the subscript $_{X\sim P}$. My notation for entropy thus becomes $$H_{X\sim P}(X)=-\text{E}_{X\sim P}[\log P(X)]$$
Now comes the point I don't understand about the definition of cross entropy: why doesn't it reference a random variable $X$? Applying the same reasoning as above, I would expect cross entropy to have the form \begin{equation}H_{X\sim P}(Q(X))=-\text{E}_{X\sim P}[\log Q(X)]\tag{1}\end{equation} However, the Wikipedia article on cross entropy makes no mention of any such random variable $X$. It speaks of
the cross-entropy between two probability distributions $p$ and $q$
which, like the notation $H(P,Q)$, implies a function whose argument is a pair of distributions, whereas entropy $H(X)$ is said to be a function of a random variable. In any case, to take an expected value I need a (function of a) random variable, which $P$ and $Q$ are not.
Comparing the definitions for the discrete case: $$H(p,q)=-\sum_{x\in\mathcal{X}}p(x)\log q(x)$$ and $$H(X)=-\sum_{i=1}^n P(x_i)\log P(x_i)$$
where $\mathcal{X}$ is the support of $P$ and $Q$, there would only be a qualitative difference if the events $x_i$ didn't cover the whole support (though I could just choose an $X$ which does).
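To make the discrete definitions (and the asymmetry of cross entropy) concrete, here is a small numerical sketch; the pmfs `p` and `q` are made-up examples, not anything from the article:

```python
import numpy as np

# Two made-up pmfs on the same support {0, 1, 2}
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.3, 0.6])

def entropy(p):
    """H(P) = -sum_x P(x) log P(x)."""
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    """H(P, Q) = -sum_x p(x) log q(x): expectation of -log q under p."""
    return -np.sum(p * np.log(q))

print(entropy(p))
print(cross_entropy(p, q))  # in general != cross_entropy(q, p)
print(cross_entropy(q, p))
```

Note that `cross_entropy(p, p)` recovers `entropy(p)`, and by Gibbs' inequality `cross_entropy(p, q) >= entropy(p)`, so the notation $H(P,Q)$ is indeed not symmetric in its arguments.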
My questions boil down to the following:
Where is the random variable that is needed to take the expected value used to define the cross entropy $H(P,Q)=-\text{E}_{P}[\log Q]$?
If I am correct in my assumption that one needs to choose a random variable $X$ to compute the cross entropy, is the notation I used in (1) free of ambiguity?
Your notation $H(X)=-\text{E}_P[\log P(X)]$ is really redundant. In general, if $X$ is a random variable and $g$ is any function, then $E[g(X)]$ is defined without ambiguity; it's not necessary (actually, it makes no sense) to specify "with respect to which variable the expectation is taken".
The confusion might arise here because we are dealing with two kinds of things, random variables and distributions, and using different letters for them. But, in essence, a random variable determines its distribution.
If we understand that $X$ is a rv with distribution $p$, and the same for $Y$ and $q$, then we can write, unambiguously (let me use $\tilde H$ for the cross entropy to distinguish it from the joint entropy):
$$\tilde H(p,q) = - E[ \log q(X)] = - \sum_x p(x) \log q(x) $$
If the above causes some confusion, consider that $\log (q(\cdot))$ is just a function, like $\sin(\cdot)$ or $\sqrt{\cdot}$
It might be clearer and more consistent (but also more verbose) to use $P_X(\cdot)$ and $P_Y(\cdot)$ to denote the distributions of the random variables $X,Y$.
$$\tilde H(X,Y) = - E[ \log P_Y(X)] = - \sum_x P_X(x) \log P_Y(x) $$
Notice that the lowercase $x$, used in the sums, is a dummy variable (not a random variable!) and we could also use $u$ instead or any other letter.
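One way to see that $E[\log q(X)]$ needs no subscript is to estimate it by actually sampling $X$: since $X$ has law $p$, averaging $-\log q(X)$ over draws of $X$ converges to the cross entropy. A quick Monte Carlo sketch (the pmfs are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up pmfs p and q on the support {0, 1, 2}
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.3, 0.6])

# Exact value: -sum_x p(x) log q(x)
exact = -np.sum(p * np.log(q))

# Monte Carlo: draw X ~ p and average -log q(X).
# No subscript needed on E[.]: the law of X (namely p) is already fixed.
samples = rng.choice(3, size=200_000, p=p)
estimate = -np.mean(np.log(q[samples]))

print(exact, estimate)  # agree up to sampling noise
```

The dummy variable $x$ in the sum never appears here: the estimator only ever evaluates $\log q$ at realizations of the random variable $X$.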
Edit: An attempt to clarify.
First: if $X$ is a given random variable, then its probability distribution is given, and the expression $E[X]$ is perfectly well defined, with zero ambiguity: e.g., in the discrete case, $E[X] = \sum_x x P(x)$ where $P(\cdot)$ is the pmf of $X$. Period. It would be nonsensical to specify "with respect to which variable or distribution" the expectation should be taken.
This is also true when some function is applied to the rv (which just produces another rv), as in $E[g(X)]$, or when the variable is multidimensional, as in $E[g(X,Y)]$.
Hence, the notations $E_Q[X]$, $E_Q[g(X)]$ , $E_Q[g(X,Y)]$ (where $Q$ is some distribution) are wrong, they make no sense.
Having said that: sometimes it's not feasible or practical to stick with that notation, where each random variable has a letter like $X$. For example, suppose we want to consider the family of zero-mean Gaussian distributions, parametrized by the standard deviation $\sigma$, and we want to denote by $r(\sigma)$ the respective differential entropy. Letting $\phi(x;\sigma)$ be the Gaussian density, we'd write
$$r(\sigma) = - \int \phi(x;\sigma) \log \phi(x;\sigma) dx$$
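This integral can be checked numerically against the known closed form $r(\sigma) = \tfrac12\log(2\pi e\sigma^2)$; a minimal sketch using a plain Riemann sum (the truncation to $[-10\sigma, 10\sigma]$ and the grid size are my own choices):

```python
import numpy as np

def phi(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def r(sigma, n=200_001):
    """r(sigma) = -integral of phi(x; sigma) log phi(x; sigma) dx,
    approximated by a Riemann sum on [-10 sigma, 10 sigma]
    (the tails beyond that are negligible)."""
    x = np.linspace(-10 * sigma, 10 * sigma, n)
    dx = x[1] - x[0]
    return -np.sum(phi(x, sigma) * np.log(phi(x, sigma))) * dx

sigma = 2.0
print(r(sigma), 0.5 * np.log(2 * np.pi * np.e * sigma**2))
```

Note that the grid variable `x` never survives into the result, mirroring the point below that $x$ is only a dummy integration variable.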
Notice that here $x$ is not a random variable; it is just a dummy integration variable, and we could replace it with any other letter. Now, surely, this is an expectation, and so we might want to write something like
$$r(\sigma) = - E[\log \phi(x;\sigma)] \tag2$$
... but this is not right, because the argument of $E[\cdot]$ is not a random variable! We might instead write
$$r(\sigma) = - E[\log \phi(X_{\sigma};\sigma)] \tag3$$
adding the definition: "$X_{\sigma}$ is a rv that follows the distribution $N(0,\sigma^2)$", but this is rather ugly. Hence we often abuse the notation in the following way:
$$r(\sigma) = - E_{\phi(x;\sigma)}[\log \phi(x;\sigma)] \tag4$$
But, be careful: here the $E[\cdot]$ notation must be understood not in the probabilistic setting but merely as a functional operator; here random variables are not involved, only functions (some of them densities or distributions). That is
$$E_{f(x)}[g(x)] \triangleq \int f(x) g(x) dx \tag 5 $$
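Read this way, $E_{f(x)}[\cdot]$ is just a higher-order function, and $H(P,Q)=-E_P[\log Q]$ becomes an ordinary (well-defined) expression in it. A sketch, applying the operator to two zero-mean Gaussian densities and comparing with the known closed form $H = \tfrac12\log(2\pi\sigma_q^2) + \sigma_p^2/(2\sigma_q^2)$ (the grid and truncation are my own illustrative choices):

```python
import numpy as np

def E(f, g, grid):
    """Eq. (5): E_{f(x)}[g(x)] = integral of f(x) g(x) dx, treated as a
    plain functional operator approximated on a grid; no random
    variables are involved."""
    dx = grid[1] - grid[0]
    return np.sum(f(grid) * g(grid)) * dx

def gauss(sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return lambda x: np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

p, q = gauss(1.0), gauss(2.0)
grid = np.linspace(-15.0, 15.0, 300_001)

# H(P, Q) = -E_P[log Q]: now a perfectly well-defined expression
H_pq = -E(p, lambda x: np.log(q(x)), grid)

# Closed form for two zero-mean Gaussians
closed = 0.5 * np.log(2 * np.pi * 2.0**2) + 1.0**2 / (2 * 2.0**2)
print(H_pq, closed)
```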
In this setting, the original notation $H(P,Q)=-\text{E}_P[\log Q]$ is, indeed, totally correct.
And as for the plain entropy, it should either be
$$H(X) = -E [\log P(X)] \tag 6$$
(the entropy of a random variable $X$, where $X$ is a rv with distribution $P$) or
$$H(P) = -E_P [\log P] \tag 7$$
(the entropy of a distribution $P$, using the functional expectation operator; no random variables appear). You are mixing both.