I know that you can approximate the normal distribution with the binomial distribution.
One sample would be: flip $N$ coins, count the number of heads, and increment the bucket of a histogram corresponding to that count. If you collect enough samples, and as $N$ approaches infinity, the histogram approaches a normal distribution.
My question is: What happens when, instead of using a two-sided coin, you use an $M$-sided die?
Does this still approach a normal distribution? Or is it a different type of distribution?
Edit: From observation, it seems that adding more sides to the die (increasing $M$) gives a taller, "triangular" distribution, while adding more rolls (increasing $N$) makes the distribution more normal/Gaussian.
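A quick simulation makes this observation concrete. This is a sketch (not from the original post) using NumPy; the helper name `sum_of_rolls` is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sum_of_rolls(n_rolls, m_sides, n_samples=100_000):
    """Sample many sums of n_rolls independent m_sides-die rolls."""
    rolls = rng.integers(1, m_sides + 1, size=(n_samples, n_rolls))
    return rolls.sum(axis=1)

# N = 2 rolls of a 6-sided die: the histogram of sums is triangular.
sums2 = sum_of_rolls(2, 6)

# N = 30 rolls: the histogram is already close to a bell curve.
sums30 = sum_of_rolls(30, 6)

# Mean of one roll is (M+1)/2, so E[sum] = N*(M+1)/2.
print(sums2.mean())   # near 7
print(sums30.mean())  # near 105
```

Plotting `np.bincount(sums2)` and `np.bincount(sums30)` (e.g. with matplotlib) shows the triangle for $N=2$ and the bell shape for $N=30$.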
Rolling an $M$-sided die would give you a multinoulli or categorical distribution, specified by a probability vector $\vec{p}$ of length $M$ whose $i$-th element is the probability of getting side $i$. The number of times you get each side in $n$ i.i.d. rolls follows a multinomial distribution with parameters $n, \vec{p}$.
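As a minimal sketch of the multinomial setup (assuming a fair six-sided die, using NumPy's `Generator.multinomial`):

```python
import numpy as np

rng = np.random.default_rng(0)

M, n = 6, 1000
p = np.full(M, 1 / M)        # fair die: uniform probability vector of length M

# One draw from Multinomial(n, p): how many times each side came up in n rolls.
counts = rng.multinomial(n, p)
print(counts)                # a length-M vector of counts; always sums to n
```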
Central limit theorems are statements that say that suitably normalized sums of random variables converge in distribution to a Gaussian distribution. The basic statement is that if $\{X_i\}$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, then $\frac{\sum_{i=1}^n (X_i - \mu)}{\sigma\sqrt{n}} \to N(0,1)$ in distribution, i.e. $\lim_{n \to \infty} P\left(\frac{\sum_{i=1}^n (X_i - \mu)}{\sigma \sqrt{n}} \leq c\right) = P(N(0,1) \leq c)$, where $N(0,1)$ denotes a Gaussian with mean zero and variance $1$. This extends to vectors. Central limit theorems hold in a fairly wide range of settings and are a workhorse in probability and statistics (without them, and laws of large numbers, nothing would get done).
A Binomial$(n,p)$ random variable has the distribution of a sum of $n$ i.i.d. Bernoulli$(p)$ random variables. So, with $\{X_i\}$ i.i.d. Bernoulli$(p)$, the usual basic central limit theorem says $\frac{\sum_{i=1}^n (X_i-p) }{\sqrt{np(1-p)}}$ converges in distribution to a Gaussian with mean $0$ and variance $1$. That is, $\lim_{n \to \infty} P\left( \frac{\sum_{i=1}^n (X_i-p) }{\sqrt{np(1-p)}} \leq c\right) = P(N(0,1) \leq c)$.
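You can check this limit numerically. The sketch below (my own, with arbitrarily chosen $n$, $p$, and $c$) compares the empirical probability $P(Z \leq c)$ for the standardized binomial against the standard normal CDF:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

n, p, c = 2000, 0.3, 1.0
samples = rng.binomial(n, p, size=200_000)   # each sample is a sum of n Bernoulli(p)

# Standardize: subtract the mean np, divide by the std sqrt(np(1-p)).
z = (samples - n * p) / np.sqrt(n * p * (1 - p))

empirical = (z <= c).mean()
gaussian = 0.5 * (1 + erf(c / sqrt(2)))      # P(N(0,1) <= c), via the error function
print(empirical, gaussian)                   # both near 0.84
```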
Note that things don't always converge to Gaussian distributions -- for example, if you have a sequence of Binomial$(n,p_n)$ distributions and take the limit as $n\to \infty$ with $n p_n = \lambda$ held fixed, the convergence in distribution is to a Poisson$(\lambda)$ distribution (the so-called law of rare events).
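A small sketch of the law of rare events (my own example values, not from the source): for large $n$ with $np_n = \lambda$ fixed, the Binomial$(n, p_n)$ pmf is nearly the Poisson$(\lambda)$ pmf:

```python
from math import comb, exp, factorial

lam, n, k = 3.0, 10_000, 2
p_n = lam / n                 # success probability shrinks so that n * p_n = lambda

# Exact Binomial(n, p_n) pmf at k versus the Poisson(lambda) pmf at k.
binom_pmf = comb(n, k) * p_n**k * (1 - p_n) ** (n - k)
poisson_pmf = exp(-lam) * lam**k / factorial(k)
print(binom_pmf, poisson_pmf)   # nearly equal for large n
```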
For more details, you can look at papers like Morris, Carl. Central Limit Theorems for Multinomial Sums. Ann. Statist. 3 (1975), no. 1, 165--188, which considers the case of averaging multi-sided die rolls.
The interplay between the number of sides (the length of $\vec{p}$), the side probabilities ($\vec{p}$), and the number of rolls (samples, $n$) can change the limiting distribution, just as in the binomial case above.