Problem with statistics notation for a density function


I'm reading a paper about partitioning driving data and producing synthetic driving profiles, and I'm unable to understand some of its equations.

To give an example: let $n$ be the length of a speed-time trace, and let $y_i$, $i = 1, \ldots, n$, be the modal events (i.e., acceleration/deceleration) contained within it. Let $G$ denote the number of clusters; for each cluster $g = 1, \ldots, G$ there is a set of associated parameters, $\theta_g$, which can be any parameters of interest, such as the mean and variance of the observation (i.e., acceleration/deceleration). Then they state that the density function of an observation $y_i$, $f(y_i)$, can be written as:

$$ f(y_i \mid \theta) = \sum_{g = 1}^{G} \pi_g f(y_i \mid \theta_g) $$

where $\pi_g$ is the probability of $y_i$ being in cluster $g$ and $\sum\limits_{g = 1}^G {\pi _g}=1$.

My problem is that I don't understand what they mean by the $\pi_g f(y_i \mid \theta_g)$ part. Is it a simple multiplication? Or do they mean that the probability $\pi_g$ follows the density function of $y_i$ with the subset of parameters $\theta_g$, or something like that?
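To make the first reading concrete, here is a minimal sketch of what the formula would compute if $\pi_g f(y_i \mid \theta_g)$ is an ordinary product, i.e. a weighted sum of component densities. It assumes each component is a normal density with $\theta_g = (\text{mean}, \text{variance})$; that choice, the function names, and the numbers are my own illustration and are not taken from the paper.

```python
import math


def normal_pdf(y, mean, variance):
    """Density of a normal distribution evaluated at y (assumed component form)."""
    return math.exp(-((y - mean) ** 2) / (2 * variance)) / math.sqrt(
        2 * math.pi * variance
    )


def mixture_density(y, weights, params):
    """f(y | theta) = sum_g pi_g * f(y | theta_g), read as a plain weighted sum.

    weights: the pi_g, which must sum to 1
    params:  one (mean, variance) pair per cluster, i.e. theta_g
    """
    if len(weights) != len(params):
        raise ValueError("need one weight per cluster")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * normal_pdf(y, m, v) for w, (m, v) in zip(weights, params))


# Hypothetical two-cluster example (G = 2): made-up weights and parameters.
weights = [0.3, 0.7]
params = [(-1.0, 0.5), (1.5, 1.0)]
density_at_zero = mixture_density(0.0, weights, params)
```

Under this reading, the constraint $\sum_g \pi_g = 1$ is what would make the weighted sum integrate to 1, so the result is itself a density.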

I would be very thankful if someone could shed some light on my silly notation doubts. ^^

Just in case it is needed, the paper is: [1] J. Lin and D. A. Niemeier, “Estimating Regional Air Quality Vehicle Emission Inventories: Constructing Robust Driving Cycles,” Transp. Sci., pp. 330–346, 2003.