Consider the following quote from the text An Introduction to Statistical Learning:
In practice, we often encounter data sets that contain many more than two variables. In this case, we cannot easily plot the observations. For instance, if there are p variables in our data set, then p(p − 1)/2 distinct scatterplots can be made, and visual inspection is simply not a viable way to identify clusters.
What exactly do the authors mean when they say that $p(p-1)/2$ distinct scatterplots can be made? The quote does not refer to any specific data or example, so this is the only context.
I understand that this question would be a better post for the Cross Validated Stack Exchange; however, this site is more popular and more active, so I thought I would post it here. Nevertheless, it is still math.
Thanks in advance!
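$p(p-1)/2$ is the number of ways to choose two variables from $p$ without regard to order, i.e. the binomial coefficient $\binom{p}{2}$. There are $p$ choices for the first variable and $p-1$ for the second, and dividing by $2$ accounts for each pair being counted once in each order.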
So if you had four variables $(a,b,c,d)$ you could have $(4 \times 3)/2 = 6$ scatterplots: $a-b$, $a-c$, $a-d$, $b-c$, $b-d$ and $c-d$. Choosing the variables in the reverse order only swaps the axes and shows the same information, so $a-c$ counts as the same scatterplot as $c-a$.
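If it helps to see the count concretely, here is a minimal Python sketch that enumerates the unordered pairs for the four-variable example and compares the result with $p(p-1)/2$. The function name `scatterplot_pairs` and the variable labels are just illustrative choices, not anything from the book.

```python
from itertools import combinations
from math import comb  # available in Python 3.8+


def scatterplot_pairs(variables):
    """Return every unordered pair of distinct variables (one pair per scatterplot)."""
    return list(combinations(variables, 2))


# Hypothetical variable names matching the four-variable example above.
variables = ["a", "b", "c", "d"]
pairs = scatterplot_pairs(variables)
p = len(variables)

print(pairs)
# Three ways of getting the same count: enumeration, the formula, and "p choose 2".
print(len(pairs), p * (p - 1) // 2, comb(p, 2))
```

The count grows quadratically: with $p = 10$ variables there are already $45$ pairs, which is the book's point about visual inspection not scaling.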