Recently I asked a question and used the shorthand $P(A)$ for a random variable $A$ to quickly reason about conditional probabilities. However, I was informed that this must be written $P(A=a)$ for random variables and the notational shorthand does not make sense, and is only for events. Is this true?
In particular, I am interested in Bayesian Networks and want to write a factorisation such as $$P(A,B,C) = P(A)P(B\mid A)P(C\mid A,B) = P(A)P(B)P(C\mid A,B)$$ to say that the structure of the Bayesian Network is a graph with two root nodes corresponding to $A$ and $B$ which are not dependent on any other random variable and a third node corresponding to $C$ which is dependent on $A$ and $B$ (represented by directed edges).
Must I instead write
$$P(A=a,B=b,C=c) = P(A=a)P(B=b\mid A=a)P(C=c\mid A=a,B=b) = P(A=a)P(B=b)P(C=c\mid A=a,B=b)$$
to be mathematically coherent? Would I need to say this is for each $a,b,c$?
Edit: An example I am particularly interested in is on page 3 of this document, or equivalently the Wikipedia entry for the chain rule of probability. Is it valid to write the chain rule of probability like this?
Edit: I am particularly confused as so much of the literature of Bayesian Networks seems to use this shorthand, such as this, this, this, this, this, and this.
Well, technically, no, $P(A)$ is not the same thing as $P(A=a)$. However, it is convenient to use a shorthand notation.
The expansions can get cumbersome, so reducing clutter is quite useful. The logic becomes easier to follow, and less typesetting means fewer typographical errors.
So long as it is clear what the abbreviation represents, you may use it.
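One way to make the abbreviation explicit is sketched here, assuming $A$, $B$, $C$ are discrete random variables (an assumption for illustration, not something stated in the question). Read $P(A)$ as the function $a \mapsto P(A=a)$, so that an identity written in shorthand is understood to hold for every combination of values for which the conditional probabilities are defined:

$$P(A,B,C) = P(A)P(B)P(C\mid A,B) \quad\text{means}\quad P(A=a,B=b,C=c) = P(A=a)P(B=b)P(C=c\mid A=a,B=b) \ \text{ for all } a,b,c \text{ with } P(A=a,B=b)>0.$$

Stating this convention once, near the first use of the shorthand, is usually enough for the rest of the factorisations to be read unambiguously.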