Assume I have $N$ datapoints with $d$ features each, and I want to draw $n$ samples from them through a sampling function $s$, where $n \ll N$ and $d$ is the same for every datapoint.
Further assume that $s$ selects datapoints such that each new sample has the maximum distance, in feature space, from the previous sample. If my goal is to maximally reduce entropy, i.e. to learn the most about the $N$ datapoints from $n$ samples, is the sampling function $s$ the best way to guarantee that? Or are there cases where choosing the sample farthest from the previous one in feature space is not a good idea?
If my question is not well-defined, I appreciate any suggestion for corrections.
As you described it, the sampling function can end up being deterministic. For example, suppose your $N$ datapoints form a subset of $\{0, 1\}^d$ containing the vectors $u=(0, 0, \dots, 0)^T$ and $v=(1, 1, \dots, 1)^T$. If your first sample is $u$ and the distance is the Hamming distance, then $s$ will sample $u, v, u, v, \dots$: since $u$ and $v$ are each other's farthest points, the procedure is deterministic and never explores the rest of the data.
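A minimal sketch of this failure mode (the sampler `farthest_from_last` and the test setup are hypothetical, just for illustration): each pick maximizes Hamming distance to the previous pick only, so with $u$ and $v$ in the pool the greedy choice oscillates between them no matter what the other points look like.

```python
import numpy as np

def farthest_from_last(points, first_idx, n):
    """Greedy sampler: each new pick maximizes Hamming distance
    to the *previous* pick only (no memory of earlier picks)."""
    chosen = [first_idx]
    for _ in range(n - 1):
        last = points[chosen[-1]]
        dists = (points != last).sum(axis=1)  # Hamming distance to last sample
        chosen.append(int(np.argmax(dists)))
    return chosen

d = 8
rng = np.random.default_rng(0)
pts = rng.integers(0, 2, size=(20, d))
pts[0] = 0  # u = (0, ..., 0)
pts[1] = 1  # v = (1, ..., 1)

print(farthest_from_last(pts, first_idx=0, n=6))  # → [0, 1, 0, 1, 0, 1]
```

Because $u$ and $v$ attain the maximum possible distance $d$ from each other, the sampler alternates between indices 0 and 1 forever; the 18 random points are never visited.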
Also, I am not exactly sure what you are trying to do. Are you sampling to estimate something? If not, what are you trying to learn from the $N$ points? If you have no prior information about the $N$ elements, sampling uniformly at random may well be the best choice, but I am not entirely sure that matches your goal.