Let $X$ be a random variable realized as the event $(X=x)$. The corresponding likelihood function is given by
$$\mathcal{L}_x:\Theta\rightarrow[0,1]$$ $$\theta\mapsto P(X=x|\theta)$$
for a space $\Theta$ of parameter configurations $\theta$.
In the literature, $\mathcal{L}_x(\theta)$ is sometimes written as $\mathcal{L}(\theta|X=x)$. I assume this is done to emphasize that the event $(X=x)$ is 'given'. However, this notation leads to confusion, since it suggests that $\mathcal{L}_x$ is a probability density ('conditioning' on the event $X=x$), which appears to not be true in general (cf. second answer in this thread on math.overflow).
So my questions are:
- Is $\mathcal{L}(\theta|X=x)$ just 'overloading' the notation $f(\cdot|\cdot)$, or is there some hidden meaning/analogy to conditional probability $P(\cdot|\cdot)$ which I am missing?
- Are there other areas in mathematics where $f(\cdot|\cdot)$ is used? Could you provide an example?
Currently, I think $\mathcal{L}(\theta|X=x)$ is a bad notational choice because it caused confusion for me when trying to understand the likelihood function, especially since at any point $\theta$ one has $\mathcal{L}(\theta|X=x)=P(X=x|\theta)$.


In classical (frequentist) statistics, $\theta$ is an unknown constant, so there is no sense in viewing $L(\theta|X)$ in any probabilistic manner. As you can see in the linked thread, $L(\theta|X)$ does not even need to integrate to one with respect to $\theta$. Hence, the more common notation is $L(\theta; X=x)$, or its shorthand $L(\theta; X)$ or $L(\theta; x)$, which simply indicates that we view it as a function of $\theta$ over the parameter space $\Theta$ and regard $X$ as fixed at its observed value $X=x$.
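To make the point about integration concrete, here is a minimal sketch, assuming a binomial model with $n=10$ trials and $x=3$ observed successes (illustrative values chosen here, not taken from the thread). The helper names `likelihood` and `integrate` are hypothetical. For fixed $\theta$, summing $P(X=k|\theta)$ over all outcomes $k$ gives one, but for fixed $x$, integrating over $\theta\in[0,1]$ gives $1/(n+1)$ rather than one.

```python
from math import comb


def likelihood(theta, x, n):
    """Binomial likelihood L(theta; x) = P(X = x | theta) for fixed x."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Illustrative values (an assumption, not from the original post).
n, x = 10, 3

# Fixed theta, sum over outcomes: a genuine probability distribution in x.
total_over_x = sum(likelihood(0.4, k, n) for k in range(n + 1))

# Fixed x, integrate over theta: not a probability density in theta.
area_over_theta = integrate(lambda t: likelihood(t, x, n), 0.0, 1.0)

print(total_over_x)
print(area_over_theta)
```

The second number is about $1/11$, which is why reading $L(\theta|X=x)$ as a conditional density of $\theta$ is misleading in this setting.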
If you are a Bayesian, you usually denote the posterior distribution of $\theta$ by $f(\theta|...)$ or $p(\theta|...)$, so as not to confuse it with the classical likelihood function. But for this notation to make sense, you must assume a prior distribution $f(\theta)$ for $\theta$. That is, from the very beginning you regard $\theta$ as a random variable and not as a constant.
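As a rough illustration of the Bayesian case, the sketch below assumes a binomial model with $n=10$, $x=3$ and a uniform prior $f(\theta)=1$ on $[0,1]$ (all of these are assumptions made for the example, and the function names are hypothetical). Once a prior is fixed, the posterior is the likelihood times the prior divided by a normalizing constant, and it does integrate to one over $\theta$.

```python
from math import comb


def likelihood(theta, x, n):
    """Binomial likelihood P(X = x | theta), viewed as a function of theta."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def prior(theta):
    """Uniform prior density on [0, 1] (an assumption for this example)."""
    return 1.0


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Illustrative values (an assumption, not from the original post).
n, x = 10, 3

# Normalizing constant: the integral of likelihood * prior over theta.
evidence = integrate(lambda t: likelihood(t, x, n) * prior(t), 0.0, 1.0)


def posterior(theta):
    """Posterior density f(theta | X = x) obtained from Bayes' rule."""
    return likelihood(theta, x, n) * prior(theta) / evidence


posterior_area = integrate(posterior, 0.0, 1.0)

print(evidence)
print(posterior_area)
```

Here the conditioning bar is justified: $f(\theta|X=x)$ is a density in $\theta$, whereas the unnormalized likelihood on its own is not.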