I know three rules for checking normality and for verifying standard-deviation estimates for sample means and sample proportions.
I. $n > 30 $
II. $np > 10; n(1-p) > 10$
III. $10\%$ rule ($n \le 10\%$ of $N$)
What are the uses of each one? What if I picked 9 squirrels from a population with known standard deviation (but unknown shape)? Can I assess normality and the standard deviation of the sampling distribution of the sample mean?
Rule I (the infamous 'rule of 30') is sometimes given as a way to determine whether the critical value for a two-sided t test is near 1.96 (which would be the critical value for the corresponding z test).
For example, suppose you have data from a normal or nearly-normal population with sample size $n = 31,$ mean $\bar X = 11.2,$ and SD $S = 3.25.$ The population mean $\mu$ and population SD $\sigma$ are both unknown. You want to test $H_0: \mu = 10$ against $H_a: \mu \ne 10.$ Because $\sigma$ is unknown and estimated by $S,$ this is properly a t test. The t statistic is $T = \frac{\bar X - \mu_0}{S/\sqrt{n}} = \frac{11.2 - 10}{3.25/\sqrt{31}} = 2.06.$ The critical value from a t table for a test at the 5% level, based on $DF = 31 - 1 = 30,$ is $c = 2.042.$ Because $|T| = 2.06 > 2.042,$ you can (just barely) reject $H_0.$
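If it helps, the computation above can be checked directly in R, using the numbers from this example (`qt` gives the t critical value):

```r
# Worked t test from the example above
n <- 31; xbar <- 11.2; s <- 3.25; mu0 <- 10
t_stat <- (xbar - mu0) / (s / sqrt(n))   # t statistic
crit   <- qt(0.975, df = n - 1)          # two-sided 5% critical value, DF = 30
round(c(t = t_stat, crit = crit), 3)     # t is about 2.056, crit about 2.042
abs(t_stat) > crit                       # TRUE: reject H0 at the 5% level
```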
Some elementary texts say that it is OK to treat this t test as a z test because $n = 31 \ge 30.$ If it were a z test, then the critical value would be $c = 1.96$ and you would still reject. It just happens that the critical values for $\mathsf{T}(30)$ and $\mathsf{Norm}(0,1)$ are about the same for tests at the 5% level, so the approximation works. But that does not mean that the test is really a z test. There are many valid objections to ever using this so-called 'rule of 30':
The rule does not work at all for significance levels 1% [$n > 100$?] and 10% [$n > 15$?], nor really for any level other than 5%.
If you are using software for a z test, you will be asked for $\sigma,$ which is unknown. And it is a lie to enter the sample SD $S$ instead.
If you go on to find P-values or to find the power of the test, there are more fundamental differences between t tests and z tests.
Students who go beyond the basic course have to unlearn this potentially misleading rule.
The one correct way to tell a t test from a z test is quite simple: If the population SD $\sigma$ is known it is a z test; if $\sigma$ is unknown it is a t test; and the sample size $n$ has nothing to do with the distinction.
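The first objection is easy to verify in R: critical values of $\mathsf{T}(30)$ and $\mathsf{Norm}(0,1)$ are close only near the 5% level. A quick sketch of that comparison:

```r
# Two-sided critical values: t with 30 DF vs standard normal
alpha  <- c(0.10, 0.05, 0.01)
t_crit <- qt(1 - alpha/2, df = 30)
z_crit <- qnorm(1 - alpha/2)
round(rbind(alpha, t_crit, z_crit), 3)
# t: 1.697, 2.042, 2.750; z: 1.645, 1.960, 2.576
```

The gap between the two grows as the significance level moves away from 5%.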
Rule II is usually OK. It is ordinarily used to decide whether it is acceptable to approximate binomial probabilities by the normal distribution. The specific statement varies from text to text, depending on fussiness. Some say it is OK to use a normal approximation if both $np > 5$ and $n(1-p) > 5.$ Your version with $10$ instead of $5$ is more cautious. Both rules work better if $p \approx 1/2.$ No such rule of thumb works all of the time, but this one is pretty good. In any case, you should not expect more than two-place accuracy from a normal approximation to the binomial.
For example, in R statistical software, the code

    n = 3;  p = .5
    sum(dbinom(1:2, n, p))

returns the exact binomial probability 0.75, while the code

    diff(pnorm(c(.5, 2.5), n*p, sqrt(n*p*(1-p))))

returns the very good normal approximation 0.7517869.

Rule III does not have a direct connection to normality. It indicates when it is OK to use the binomial distribution (which assumes sampling with replacement) in place of the hypergeometric distribution (which models sampling without replacement and is exactly correct). (The only connection with normality is that, in some circumstances, both distributions might be approximated by a normal. But this rule does not directly bear on whether the normal approximation is accurate.)
For example, suppose you want to know the probability of drawing at most three Aces in five cards drawn from a 52-card deck. This is sampling without replacement, and the exact hypergeometric probability is 0.9999815; but five is less than ten percent of 52, so the approximate binomial probability 0.9998357 is very close.
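For instance, the probability of at most three Aces among the five cards can be computed both ways in R, exactly (hypergeometric) and approximately (binomial):

```r
# At most three aces in five cards from a 52-card deck
phyper(3, m = 4, n = 48, k = 5)   # exact, sampling without replacement: 0.9999815
pbinom(3, size = 5, prob = 4/52)  # binomial approximation:              0.9998357
```

The two values agree to about three decimal places, as the 10% rule suggests they should.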
In summary: I would never recommend Rule I for any purpose; Rule II is usually a reasonable way of guessing whether it is safe to use a normal approximation for a binomial probability; and Rule III has to do with the practical distinction between the binomial and hypergeometric distributions, with no direct connection to normality.
Checking for normality: There are several ways to check whether a reasonably large sample might have come from a normal population. One is a normal probability plot (also called a 'normal quantile plot' or 'normal Q-Q plot').
If a sample is randomly drawn from a normal population, its points on such a plot will fall nearly along a straight line (with the understanding that straggling extreme points may fall quite a bit off the line). Here are such plots for three samples of size $n = 100:$ neither sample $X$ nor sample $Y$ comes from a normal population, but sample $Z$ does.
Another method is to use one of several tests for goodness-of-fit to a normal distribution. One of the best ones is the Shapiro-Wilk test. For the same three samples this test gave P-values 0.0000, 0.0004, and 0.9826, respectively (to four places). P-values below 0.1 are pretty good evidence for a non-normal population.
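A minimal sketch of both checks in R, using simulated samples in place of $X,$ $Y,$ $Z$ (the seed and the choice of distributions here are my own assumptions, not the data behind those samples):

```r
set.seed(2024)              # assumed seed, for reproducibility
x <- rexp(100)              # skewed (exponential) sample: not from a normal population
z <- rnorm(100)             # sample from a standard normal population
shapiro.test(x)$p.value     # tiny P-value: strong evidence of non-normality
shapiro.test(z)$p.value     # usually a large P-value: consistent with normality
qqnorm(z); qqline(z)        # normal Q-Q plot with a reference line
```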
For a sample as large as $n = 100,$ a large Shapiro-Wilk P-value is usually good evidence that the population is very nearly normal. With a sample as small as your suggested $n = 9$ squirrels, it would be very difficult to say for sure whether or not the population is normal. (As for the standard deviation: because your population SD $\sigma$ is known, the SD of the sampling distribution of $\bar X$ is $\sigma/\sqrt{9} = \sigma/3$ regardless of the population's shape; it is only the normality of that sampling distribution that $n = 9$ observations cannot establish.)