Probabilistic modelling

This question is about how to make sense of a probabilistic model I'm reading about:

You have three random variables $A$, $B$ and $C$ that are all real-valued. You want to model an input-output relationship with these random variables by assuming the following relationships between them:

It is assumed that $A$ is the constant random variable that only takes the value $a\in \mathbb {R}$, where $a$ is thought of as input.
Furthermore, $P(B=b | A=a)=\mathcal{N}(b;3\cdot e^a,1)$ and $P(C=c | B=b)=\mathcal{N}(c;2\cdot b^2,1)$, where $\mathcal{N}(x;\mu,1)$ denotes the probability density of a normal distribution with mean $\mu$ and variance $1$.

My questions are:

  1. Does an equation like $P(B=b | A=a)=\mathcal{N}(b;3\cdot e^a,1)$ even make sense?
    If we take samples, on the left we have a number between $0$ and $1$, but on the right we have a number that can be larger, as the pdf does not need to stay within $[0,1]$, right?

  2. What is the output?
    In the text I'm reading (not available online), the above is all that is mentioned. I would assume that to arrive at the output, one needs to proceed in the following way:
    Given $a$, draw a number $b$ from a normal distribution with mean $3\cdot e^a$ and unit variance, and then similarly draw $c$, the output, from a normal distribution with mean $2\cdot b^2$ and unit variance. But I'm not sure if that's correct. Also, if this interpretation is indeed correct, it is not clear to me why conditional probabilities are needed in the model formulation.

There are 2 best solutions below

8
On BEST ANSWER

Firstly, the following

It is assumed that $A$ is the constant random variable that only takes the value $a\in \mathbb{R}$, where $a$ is thought of as input.

does not really make sense. If something is assumed to be constant, then there is no need to model it as a random variable. Moreover, if $A$ is going to model the input, it must vary from time to time. So let us assume that this is rather poorly written and that $A$ is a random variable with some (maybe unknown) non-constant distribution. Maybe it should be discrete, or take only two values? That might make sense.

Next, we have

Furthermore, $\mathbb{P}(B=b|A=a)=\mathcal{N}(b;3\cdot e^a,1)$ and $\mathbb{P}(C=c|B=b)=\mathcal{N}(c;2\cdot b^2,1)$, where $\mathcal{N}(x;\mu,1)$ denotes the probability density of a normal distribution with mean $\mu$ and variance $1$.

This is also weirdly written, and makes little sense. You have written

...on the left we have a number between 0 and 1 but on the right we have a number that can be larger...

You are right: we should not use the $\mathbb{P}$ sign when the probability density function is what we are really referring to.
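
To make the distinction concrete, here is a minimal sketch that evaluates a normal density at its peak. The helper `normal_pdf` and the standard deviation of `0.1` are my own illustrative choices, not part of the cited text. With unit variance, as in the model, the peak is $1/\sqrt{2\pi}$, which happens to stay below $1$, but a narrower normal density exceeds $1$, so a density value cannot be read as a probability.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Peak of the density for variance 1, as in the model under discussion.
peak_unit = normal_pdf(0.0, 0.0, 1.0)

# Peak of the density for a hypothetical, much narrower normal (std = 0.1).
peak_narrow = normal_pdf(0.0, 0.0, 0.1)

print(peak_unit, peak_narrow)
```

The second value is larger than $1$, which is perfectly fine for a density: only its integral over an interval is a probability.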

What the author probably wanted to say was that, conditionally on $A$, the distribution of $B$ is normal with the given ($A$-dependent) parameters. Later on, after $(A, B)$ have been sampled, the distribution of $C$ is also normal with the assigned ($B$- and $A$-dependent) parameters.

The above is only my educated guess, as in the cited text it is written

$$\mathbb{P}(B=b|A=a)=\mathcal{N}(b;3\cdot e^a,1),$$
\begin{equation} \mathbb{P}(C=c|B=b)=\mathcal{N}(c;2\cdot b^2,1), \tag{1} \end{equation}

and I assume that it should have been

$$\mathbb{P}(B=b|A=a)=\mathcal{N}(b;3\cdot e^a,1),$$
\begin{equation} \mathbb{P}(C=c|B=b, A=a)=\mathcal{N}(c;2\cdot b^2,1). \tag{2} \end{equation}

My guess is based on the fact that, as you say, we are actually going to model some process. While $(1)$ might be a valuable piece of information, it does not define a unique model, since it leaves open how $C$ depends on $A$ once $B$ is known.

Moreover, this should be written more clearly. For example:

$$B|A \sim \mathcal{N}(3\cdot e^A,1),$$ and $$C|B,A \sim \mathcal{N} (2\cdot B^2,1).$$

Note that this is exactly what you have written in your interpretation:

Given $a$, draw a number $b$ from a normal distribution with mean $3\cdot e^a$ and unit variance, and then similarly draw $c$, the output, from a normal distribution with mean $2\cdot b^2$ and unit variance.

I think it is OK. Saying it once again in different words:

  1. we start by sampling $A$,
  2. given $A$, we sample $B$ from $\mathcal{N}(3\cdot e^A,1)$,
  3. given $B, A$, we sample $C$ from $\mathcal{N} (2\cdot B^2,1).$
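
The three steps above can be sketched in code. This is only an illustration under assumptions of my own: the function name `sample_model` is hypothetical, and since the distribution of $A$ is not specified, the input is simply fixed to a value `a` (here `0.0`) rather than sampled.

```python
import numpy as np

def sample_model(a, n_samples, seed=0):
    """Draw (b, c) pairs for a fixed input a by sampling B first, then C given B."""
    rng = np.random.default_rng(seed)
    # Step 2: given A = a, sample B from N(3 * e^a, 1).
    b = rng.normal(loc=3.0 * np.exp(a), scale=1.0, size=n_samples)
    # Step 3: given B = b, sample C from N(2 * b^2, 1), one draw per sampled b.
    c = rng.normal(loc=2.0 * b**2, scale=1.0)
    return b, c

b, c = sample_model(a=0.0, n_samples=100_000)
print(b.mean(), c.mean())
```

For $a = 0$ the sample mean of $B$ should come out near $3$, and the sample mean of $C$ near $2\,(\operatorname{Var}(B) + \mathbb{E}[B]^2) = 2\,(1 + 9) = 20$.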

Such models - known as hierarchical models - are very important in modern statistics. Basically, Bayesian Statistics is devoted almost entirely to the study of such models.

0
On

I think the question is almost correct, except for a few notational issues.

The first:

Note that $B|A \sim \mathcal{N}(3e^a,1)$. Therefore $\mathbf{P}(B=b|A) = 0$, which follows from the basic definition of a continuous random variable. What is meaningful instead is the density function, that is, $f_{B|A}(b) = \frac{1}{\sqrt{2\pi}}e^{-\frac{(b - 3e^a)^2}{2}}$.

The second: the output is of course $C$. It is a probabilistic output, that is, a random variable whose density function conditioned on the input is

$$\begin{aligned} f_{C|A}(c) &= \int_\mathbf{R} f_{C|B=x}(c)\, f_{B|A}(x)\,dx \\ &= \int_\mathbf{R}\frac{1}{\sqrt{2\pi}}e^{-\frac{(c - 2x^2)^2}{2}}\,\frac{1}{\sqrt{2\pi}}e^{-\frac{(x - 3e^a)^2}{2}}\,dx \\ &= \frac{1}{2\pi} \int_\mathbf{R}e^{-\frac{(c - 2x^2)^2 + (x - 3e^a)^2}{2}}\,dx. \end{aligned}$$
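
The integral has no obvious closed form, but it can be approximated numerically. Below is a rough sketch, assuming $a = 0$ and using a plain Riemann sum on a uniform grid; the function names and the grid ranges are my own choices for illustration. As a sanity check, the resulting density should integrate to roughly $1$ over $c$.

```python
import numpy as np

def normal_pdf(x, mean):
    """Density of a normal distribution with the given mean and unit variance."""
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2.0 * np.pi)

def density_c_given_a(c, a, x_grid):
    """Approximate f_{C|A}(c) by a Riemann sum over x on a uniform grid."""
    c_col = np.atleast_1d(c)[:, None]   # shape (n_c, 1)
    x_row = x_grid[None, :]             # shape (1, n_x)
    dx = x_grid[1] - x_grid[0]
    # Integrand: f_{C|B=x}(c) * f_{B|A}(x), evaluated for every (c, x) pair.
    integrand = normal_pdf(c_col, 2.0 * x_row**2) * normal_pdf(x_row, 3.0 * np.exp(a))
    return integrand.sum(axis=1) * dx

a = 0.0
# B is concentrated around 3 * e^0 = 3, so this range covers about 8 standard deviations.
x_grid = np.linspace(-5.0, 11.0, 1601)
c_grid = np.linspace(-10.0, 200.0, 2101)

f_c = density_c_given_a(c_grid, a, x_grid)
total_mass = f_c.sum() * (c_grid[1] - c_grid[0])
print(total_mass)
```

For other values of $a$ the grids would need to be shifted and widened, since the mean of $B$ is $3e^a$ and the mean of $C$ grows like $2b^2$.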

EDIT: Since you mentioned credibility, I really don't know what to say. Do we have to show our academic transcripts or something?