I am trying to get a grip on importance sampling in Monte Carlo integration.
I understand importance sampling as a variance-reduction technique: samples are drawn from a probability distribution that places more density where the integrand contributes more to the result, and each sample is balanced by dividing by its (correspondingly larger) pdf, as expressed in the following estimator:
$$F_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(X_i)}{p(X_i)}$$
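To make sure my understanding of the estimator itself is right, here is a minimal sketch I wrote (my own toy example, not from any paper) that estimates $\int_0^1 x^2\,dx = 1/3$, once with uniform sampling and once with the pdf $p(x) = 3x^2$, which is exactly proportional to the integrand:

```python
import random

def f(x):
    return x * x  # integrand; the true value of the integral over [0, 1] is 1/3

random.seed(0)
N = 100_000

# Plain Monte Carlo: X ~ Uniform(0, 1), so p(x) = 1.
est_uniform = sum(f(random.random()) for _ in range(N)) / N

# Importance sampling with p(x) = 3x^2, proportional to f:
# inverse-CDF sampling gives X = U^(1/3), and then
# f(X) / p(X) = X^2 / (3 X^2) = 1/3 for every sample,
# so the estimator has (essentially) zero variance.
def sample_is():
    u = 1.0 - random.random()     # in (0, 1], avoids p(X) = 0 at u = 0
    x = u ** (1.0 / 3.0)
    return f(x) / (3.0 * x * x)

est_is = sum(sample_is() for _ in range(N)) / N

print(est_uniform)  # close to 1/3, but noisy
print(est_is)       # 1/3 up to floating-point rounding
```

With the uniform estimator the samples scatter around 1/3; with the proportional pdf every sample contributes the same value, which I believe is the ideal case of importance sampling.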
What I don't understand is the distinction some papers make between sample weights and PDFs. More specifically, a paper I am reading samples a spherical direction $(\theta, \phi)$ in two steps. First, the longitudinal angle $\theta$ is sampled by perfect importance sampling, with a sample weight of 1 and a PDF whose value depends on $\theta$. Second, the azimuthal angle $\phi$ is sampled with a sample weight $\le 1$ and a certain PDF.
Where does this sample weight come from, and what exactly is a sample weight in the context of importance sampling?
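For what it's worth, my current guess (purely my own assumption, not something the paper states in these words) is that the weight is the $f/p$ ratio from the estimator above: under perfect importance sampling the pdf is exactly proportional to the function and the ratio is constant (weight 1), while an approximate pdf that bounds the function from above gives a sample-dependent ratio $\le 1$. A toy sketch of that interpretation:

```python
import math
import random

random.seed(1)
N = 100_000

# Perfect importance sampling of theta in [0, pi/2] from t(theta) = cos(theta):
# the CDF sin(theta) inverts in closed form, so the function-to-pdf ratio is
# constant for every sample -- a "sample weight" of 1.
def sample_theta():
    return math.asin(random.random()), 1.0   # (theta, weight)

# Approximate sampling of phi: integrand g(phi) = (1 + cos(phi)) / (4*pi)
# (true integral over [0, 2*pi) is 1/2), drawn from the uniform pdf
# p(phi) = 1 / (2*pi), which bounds g from above. The per-sample weight
# g(phi) / p(phi) = (1 + cos(phi)) / 2 then lies in [0, 1].
def sample_phi():
    phi = 2.0 * math.pi * random.random()
    return phi, (1.0 + math.cos(phi)) / 2.0  # (phi, weight)

# Averaging the weights is exactly the estimator F_N applied to g:
est = sum(sample_phi()[1] for _ in range(N)) / N
print(est)  # close to 0.5
```

Is that the right way to read it, or does "sample weight" mean something else here?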
The paper I am talking about is: "Importance Sampling for Physically-Based Hair Fiber Models", by d'Eon et al. (2013).