So I understand that the central limit theorem says that the distribution of sample averages approaches a normal distribution as the sample size increases, and I'm fine with that. But when is it appropriate (under what conditions and for which distributions) to use the normal approximation for a series of i.i.d. variables? I've seen it used with at least the exponential, the binomial (a series of Bernoulli trials), and the Poisson. My understanding is that these distributions have independence baked in from the beginning, but I'm assuming there's more to it.
I'm wondering if using the normal approximation in this case is another type of application of the central limit theorem or something else entirely. What's the justification?
To begin with, the distributions you refer to do not have independence baked into them; one can easily construct non-independent exponentially distributed variables. The only things you actually need are that your sample comes from i.i.d. variables and that the distribution they come from has finite variance. Take care: this does not hold for every distribution. For instance, the well-known Cauchy distribution does not have finite variance, and it is not the only one (a Pareto distribution with shape parameter at most 2 is another). The normal approximations you have seen are applications of the CLT: a binomial variable, for example, is a sum of i.i.d. Bernoulli trials. So to sum up: if the draws are independent, all come from the same distribution, and that distribution has finite variance, then you are free to use the CLT.
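If it helps to see the finite-variance condition in action, here is a minimal simulation sketch using only the Python standard library. The helper names (`sample_mean`, `draw_exponential`, `draw_cauchy`, `iqr`), the seed, and the sample sizes are my own arbitrary choices for illustration, not anything canonical. It standardizes means of exponential draws and checks how often they land inside the normal 95% band, then looks at the spread of means of Cauchy draws for a small and a large sample size.

```python
import math
import random
import statistics

def sample_mean(draw, n, rng):
    """Average of n independent draws from the sampler `draw`."""
    return sum(draw(rng) for _ in range(n)) / n

def draw_exponential(rng):
    # Exponential with rate 1: mean 1, standard deviation 1 (finite variance).
    return rng.expovariate(1.0)

def draw_cauchy(rng):
    # Standard Cauchy via the inverse CDF; it has no mean and no finite variance.
    return math.tan(math.pi * (rng.random() - 0.5))

def iqr(values):
    """Interquartile range, a spread measure that is defined even for Cauchy data."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

rng = random.Random(0)  # arbitrary seed, assumed only for reproducibility
reps = 2000             # number of simulated samples per experiment

# Finite variance: standardize each sample mean as sqrt(n) * (mean - mu) / sigma
# and count how often it falls inside the normal 95% band [-1.96, 1.96].
n = 200
z = [math.sqrt(n) * (sample_mean(draw_exponential, n, rng) - 1.0) / 1.0
     for _ in range(reps)]
exp_coverage = sum(abs(v) <= 1.96 for v in z) / reps

# Infinite variance: compare the spread of sample means for n = 10 and n = 500.
cauchy_iqr_small = iqr([sample_mean(draw_cauchy, 10, rng) for _ in range(reps)])
cauchy_iqr_large = iqr([sample_mean(draw_cauchy, 500, rng) for _ in range(reps)])

print("exponential, fraction of standardized means within 1.96:", exp_coverage)
print("Cauchy, IQR of sample means with n = 10: ", cauchy_iqr_small)
print("Cauchy, IQR of sample means with n = 500:", cauchy_iqr_large)
```

In the exponential case the fraction should come out close to 0.95, which is what the normal approximation predicts. In the Cauchy case the spread of the sample means should not shrink as n grows, because the mean of n standard Cauchy draws is itself standard Cauchy (interquartile range 2), so averaging buys you nothing and the CLT does not apply.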