I'm studying generative adversarial networks and I've come across the following formula for the inception score:
$$\text{IS}(G) \approx \exp \left(\frac{1}{N}\sum_{i=1}^ND_{KL} \left(p \left(y\vert x^{(i)} \Vert \hat{p} \left(y \right) \right) \right) \right).$$
What do the $\Vert$ (double vertical bars) mean in that formula?
2026-04-17 17:47:06.1776448026
What does double vertical bars notation mean in probability?
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail), 2.6k views
There are 2 best solutions below
I don't know what just bumped this question to the front page, but the correct answer is given in the comments: a parenthesis is missing in the original question, and the double bars are the traditional separator between the two arguments of the KL divergence (presumably so that conditional distributions, which already use a single bar, can appear as arguments).
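With the parenthesis restored, each term in the sum reads $D_{KL}\left(p(y\vert x^{(i)}) \,\Vert\, \hat{p}(y)\right)$, where for discrete distributions $D_{KL}(P\Vert Q)=\sum_y P(y)\log\frac{P(y)}{Q(y)}$. The following is a minimal sketch of that reading, not code from any particular GAN library. The function names `kl_divergence` and `inception_score` and the toy inputs are made up for illustration, and it assumes $\hat{p}(y)$ is the average of the $p(y\vert x^{(i)})$ over the $N$ samples.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_y p(y) * log(p(y) / q(y)) for discrete p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(y) = 0 contribute 0 by convention
    # eps guards against division by zero; it is an illustrative choice
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def inception_score(cond_probs):
    """cond_probs[i, k] is assumed to hold p(y = k | x^(i)); rows sum to 1."""
    cond_probs = np.asarray(cond_probs, dtype=float)
    marginal = cond_probs.mean(axis=0)  # assumed estimate of p_hat(y)
    kls = [kl_divergence(row, marginal) for row in cond_probs]
    return float(np.exp(np.mean(kls)))

# Toy inputs (hypothetical): 3 samples, 3 classes.
confident = np.eye(3)            # each sample assigned to a distinct class
uniform = np.full((3, 3), 1 / 3) # every sample spread evenly over classes

print(inception_score(confident))  # close to 3, the number of classes
print(inception_score(uniform))    # close to 1, the minimum possible score
```

The two bars play no role beyond separating the first argument (the conditional distribution of one sample) from the second (the marginal estimate).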
I do not know the details of adversarial networks; however, I can offer a general answer from probability theory which might be close to what you are after.
In a measure-theoretic setting, $P(A||\mathscr{G})$ is sometimes written to denote the conditional probability of the event $A$ with respect to the $\sigma$-field $\mathscr{G}$, where $P$ is a probability measure on the measurable space $(\Omega,\mathscr{F})$ and $\mathscr{F}$ is a larger $\sigma$-field satisfying $\mathscr{G}\subseteq\mathscr{F}$. Random variables $Y$ and $X$ can generate such $\sigma$-fields: say $X$ generates $\mathscr{G}$ and $Y$ generates $\mathscr{F}$; then $P(A||\mathscr{G})=P(Y\in A||X)$. The specific relationship satisfied is
$$\int_{G}P(Y\in A||X)dP=P(\{Y\in A\}\cap\{X\in G\}) \hspace{10pt}\text{for all}\hspace{10pt} G\in\mathscr{G}\hspace{10pt}(1)$$
The $\hat{p}(y)$ in your equation probably (I am guessing here) denotes an estimate using a sample of random data $Y$ observed at $Y=y$. This estimate $\hat{p}(y)$ will be a random variable so perhaps all of the above will apply and the $||$ notation simply hints at the measure-theoretic machinery I allude to.
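In the special case where $P(Y\in A||X)$ is constant on $G$, so that
$$\begin{aligned}\int_{G}P(Y\in A||X)\,dP&=P(Y\in A||X)\int_{G}dP\\&=P(Y\in A||X)\,P(X\in G),\end{aligned}$$
equation (1) reduces, provided $P(X\in G)>0$, to
$$\begin{aligned}P(Y\in A||X)&=P(\{Y\in A\}\cap\{X\in G\})/P(X\in G)\\&=P([\{Y\in A\}\cap\{X\in G\}]|X\in G)\end{aligned}$$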
using the traditional $|$ notation, which signifies the definition $P(A|B)=P(A,B)/P(B)$. In general the two definitions do not coincide; I believe that $\mathscr{G}$ being generated by a countable class $\mathcal{A}$, that is $\mathscr{G}=\sigma(\mathcal{A})$ with $|\mathcal{A}|=\aleph_{0}$, might be a sufficient condition for them to agree.
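For discrete random variables, relation (1) can be checked numerically. The sketch below is only an illustration under assumptions: the joint distribution `joint` and the helper names are hypothetical, $X$ takes values in $\{0,1,2\}$ and $Y$ in $\{0,1\}$, and the integral over $\{X\in G\}$ becomes a finite sum of $P(Y\in A\vert X=x)\,P(X=x)$ over $x\in G$.

```python
import itertools
import numpy as np

# Hypothetical joint pmf: joint[x, y] = P(X = x, Y = y); entries sum to 1.
joint = np.array([[0.10, 0.20],
                  [0.25, 0.05],
                  [0.15, 0.25]])

def cond_prob_given_x(joint, A):
    """Array indexed by x holding P(Y in A | X = x)."""
    p_x = joint.sum(axis=1)
    return joint[:, sorted(A)].sum(axis=1) / p_x

def lhs(joint, A, G):
    """Integral of P(Y in A || X) over {X in G}, written as a finite sum."""
    p_x = joint.sum(axis=1)
    g = cond_prob_given_x(joint, A)
    idx = sorted(G)
    return float(np.sum(g[idx] * p_x[idx]))

def rhs(joint, A, G):
    """P({Y in A} and {X in G}), read directly off the joint pmf."""
    return float(joint[np.ix_(sorted(G), sorted(A))].sum())

# Check relation (1) for A = {1} and every non-empty G in the
# sigma-field generated by X (all non-empty subsets of {0, 1, 2}).
A = {1}
for r in range(1, 4):
    for G in itertools.combinations(range(3), r):
        assert abs(lhs(joint, A, set(G)) - rhs(joint, A, set(G))) < 1e-12
print("relation (1) holds for all non-empty G")
```

In this discrete setting every atom $\{X=x\}$ has positive probability, so the conditional probability with respect to the $\sigma$-field agrees with the elementary ratio definition on each atom.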