This question is about how to make sense of a probabilistic model I'm reading about:
You have three random variables $A$, $B$ and $C$ that are all real-valued. You want to model an input-output relationship with these random variables by assuming the following relationships between them:
It is assumed that $A$ is the constant random variable that only takes the value $a\in \mathbb{R}$, where $a$ is thought of as input.
Furthermore, $P(B=b | A=a)=\mathcal{N}(b;3\cdot e^a,1)$ and $P(C=c | B=b)=\mathcal{N}(c;2\cdot b^2,1)$, where $\mathcal{N}(x;\mu,1)$ denotes the probability density of a normal distribution with mean $\mu$ and variance $1$.
My questions are:
Does an equation like $P(B=b | A=a)=\mathcal{N}(b;3\cdot e^a,1)$ even make sense?
If we take samples, on the left we have a number between $0$ and $1$, but on the right we have a number that can be larger, as the pdf does not need to stay within $[0,1]$, right?
What is the output?
In the text I'm reading (not available online), the above is all that is mentioned. I would assume that to arrive at the output, one needs to proceed in the following way:
Given $a$, draw a number $b$ from a normal distribution with mean $3\cdot e^a$ and unit variance, and then similarly draw $c$, the output, from a normal distribution with mean $2\cdot b^2$ and unit variance. But I'm not sure if that's correct; also, if this interpretation is indeed correct, it is not clear to me why conditional probabilities are needed in the model formulation.
Firstly, the assumption that $A$ is a constant random variable taking only the value $a$ does not really make sense. If something is assumed to be constant, then there is no need to model it as a random variable. Moreover, if $A$ is going to model the input, it must vary from time to time. So let us assume that this is rather poorly written: $A$ is a random variable with some (maybe unknown) non-constant distribution. Maybe it should be discrete, or take only two values? That might make sense.
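Next, we have the equations $P(B=b | A=a)=\mathcal{N}(b;3\cdot e^a,1)$ and $P(C=c | B=b)=\mathcal{N}(c;2\cdot b^2,1)$.
These are also oddly written and make little sense as stated. You have noted that the left-hand side would be a probability, while the right-hand side is a density value that need not stay within $[0,1]$.
You are right: we should not use the $\mathbb{P}$ sign when the probability density function is really what we are referring to.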
What the author probably wanted to say was that, conditionally on $A$, the distribution of $B$ is normal with the given ($A$-dependent) parameters. Later on, after $(A, B)$ have been sampled, the distribution of $C$ is also normal with the assigned ($B$- and $A$-dependent) parameters.
The above is only my educated guess, as in the cited text it is written
$$\mathbb{P}(B=b|A=a)=\mathcal{N}(b;3\cdot e^a,1),$$
\begin{equation} \mathbb{P}(C=c|B=b)=\mathcal{N}(c;2\cdot b^2,1), \tag{1} \end{equation}
and I assume that it should have been
$$\mathbb{P}(B=b|A=a)=\mathcal{N}(b;3\cdot e^a,1),$$
\begin{equation} \mathbb{P}(C=c|B=b, A=a)=\mathcal{N}(c;2\cdot b^2,1). \tag{2} \end{equation}
My guess is based on the fact that, as you say, we are actually going to model some process. While $(1)$ might be a valuable piece of information, it does not define a unique model, since it leaves open how $C$ depends on $A$ once $B$ is known.
Moreover, this could be written more clearly. For example:
$$B|A \sim \mathcal{N}(3\cdot e^A,1),$$ and $$C|B,A \sim \mathcal{N} (2\cdot B^2,1).$$
Note that this is exactly what you have written in your interpretation.
I think it is OK. Saying it once again in different words:
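The two-step sampling procedure described in the question can be sketched in code. This is only a minimal illustration, assuming the reading in which $C$ depends on $A$ only through $B$; the function names `sample_output` and `mean_output` are hypothetical and do not come from the cited text.

```python
import math
import random


def sample_output(a, rng):
    """Draw one (b, c) pair for a fixed input a by sampling B first, then C."""
    # B | A=a ~ N(3 * exp(a), 1)
    b = rng.gauss(3.0 * math.exp(a), 1.0)
    # C | B=b, A=a ~ N(2 * b^2, 1); a enters only through b (assumed reading)
    c = rng.gauss(2.0 * b * b, 1.0)
    return b, c


def mean_output(a, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[C | A=a] from repeated draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        _, c = sample_output(a, rng)
        total += c
    return total / n_samples


if __name__ == "__main__":
    a = 0.0
    # Monte Carlo estimate of the mean output for this input
    print(mean_output(a))
    # Closed form for comparison: E[C | A=a] = 2 * (9 * exp(2a) + 1)
    print(2.0 * (9.0 * math.exp(2.0 * a) + 1.0))
```

The output $c$ is random even for a fixed input $a$, so a single run of `sample_output` gives one realization; the conditional distributions are what specify how each draw depends on the previous one.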
Such models, known as hierarchical models, are very important in modern statistics. Basically, Bayesian statistics is devoted almost entirely to the study of such models.
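For the model in this question, the hierarchical structure can be made explicit as a factorization of densities. The following is a sketch under the assumption that $C$ depends on $A$ only through $B$, writing $p$ for a (conditional) density rather than a probability:

$$p(b, c \mid a) = p(b \mid a)\, p(c \mid b, a) = \mathcal{N}(b;3\cdot e^a,1)\,\mathcal{N}(c;2\cdot b^2,1),$$

$$p(c \mid a) = \int_{\mathbb{R}} \mathcal{N}(b;3\cdot e^a,1)\,\mathcal{N}(c;2\cdot b^2,1)\, db.$$

In particular, since $\mathbb{E}[B^2 \mid A=a] = \operatorname{Var}(B \mid A=a) + \mathbb{E}[B \mid A=a]^2 = 1 + 9e^{2a}$, the mean output is

$$\mathbb{E}[C \mid A=a] = 2\,\mathbb{E}[B^2 \mid A=a] = 18\,e^{2a} + 2.$$

This is where the conditional distributions are needed: they specify each factor, and the distribution of the output given the input is obtained by integrating out the intermediate variable $b$.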