Given $F=[f_1;f_2;\dots;f_d]\in \Bbb R^{d \times n}$, define the mutual information matrix $G \in \Bbb R^{d \times d}$ with entries $G_{ij}=I(f_i, f_j)$.
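For reference, here are the two standard definitions involved. This assumes each $f_i$ is treated as a discrete random variable with empirical marginal $p_i$ and pairwise joint $p_{ij}$; that is my assumption, since the setup above does not say how $I$ is estimated.

$$I(f_i, f_j)=\sum_{a}\sum_{b} p_{ij}(a,b)\,\log\frac{p_{ij}(a,b)}{p_i(a)\,p_j(b)}, \qquad G \succeq 0 \iff x^{\top} G x=\sum_{i,j} x_i x_j\, I(f_i,f_j)\ge 0 \ \text{ for all } x\in\Bbb R^{d}.$$

Under this reading $G$ is symmetric and $G_{ii}=I(f_i,f_i)=H(f_i)$, the entropy of $f_i$.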
In Section 3.3 of "Multi-label Feature Selection via Global Relevance and Redundancy Optimization", the authors state that $G$ is positive semidefinite. Why does this hold?
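To probe the claim numerically, here is a minimal sketch that builds $G$ from a plug-in (empirical) estimate of mutual information and inspects its eigenvalues. It assumes the features are discrete. The helper names `mutual_information` and `mutual_information_matrix` and the random test data are my own, not from the paper. A numerical check like this can only suggest or refute the claim on particular data; it proves nothing.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(x; y) in nats for two discrete sequences."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    # Map observed values to integer codes 0..k-1.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    # Empirical joint distribution as a contingency table.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    # Marginals and their outer product p(x) p(y).
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    outer = px @ py
    mask = joint > 0  # skip zero cells, using the convention 0 * log 0 = 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / outer[mask])))

def mutual_information_matrix(F):
    """G[i, j] = I(f_i, f_j), where the rows of F are the features."""
    d = F.shape[0]
    G = np.empty((d, d))
    for i in range(d):
        for j in range(i, d):
            G[i, j] = G[j, i] = mutual_information(F[i], F[j])
    return G

# Hypothetical data: d = 5 discrete features, n = 200 samples.
rng = np.random.default_rng(0)
F = rng.integers(0, 3, size=(5, 200))
G = mutual_information_matrix(F)
eigvals = np.linalg.eigvalsh(G)  # G is symmetric, so eigvalsh applies
print("smallest eigenvalue of G:", eigvals.min())
```

A negative smallest eigenvalue (beyond round-off) on some dataset would be a counterexample for that estimator, while a nonnegative one is merely consistent with the claim.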