I'm confused about what exactly $Z_{\alpha}$ is. Does there exist a formula for it in terms of $\alpha$? If so, is there also one for $Z_{\alpha/2}$?
Confidence Intervals; $Z_{\alpha}$ & $Z_{\alpha/2}$
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 best solutions below.
These are critical values for a hypothesis test conducted at a significance level $\alpha$.
A "critical value" is a boundary, or criterion, on the test statistic: the point at which the decision transitions from failing to reject the null hypothesis to rejecting it.
The idea behind hypothesis testing is that it is a procedure by which we take observed data, calculate a test statistic from that data, and compare it to a critical value. Based on that comparison, a decision is made regarding that hypothesis--either the data furnishes enough evidence to reject the null (in which case, the test statistic has fallen within a critical region), or there is insufficient evidence to reject the null and the test is inconclusive.
If we use the notation $z_\alpha$, this refers specifically to a test statistic that is normally distributed with mean $0$ and variance $1$. The subscript $\alpha$, rather than $\alpha/2$, implies a one-tailed test. So $z_\alpha$ will be a quantile or z-score of a standard normal distribution, such that $$\Pr[Z > z_\alpha] = \alpha.$$ It is the value of a standard normal variable for which the probability of observing any value greater than this, is $\alpha$.
For example, $z_{0.05} \approx 1.645$, because for $Z \sim \operatorname{Normal}(0,1)$, $$\Pr[Z > 1.645] = 0.05.$$ Similarly, $z_{0.025} \approx 1.96$, since $\Pr[Z > 1.96] = 0.025.$ These are commonly-encountered critical values for significance levels of $5\%$ and $2.5\%$, respectively.
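These critical values can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` (one of many tools that could do this); the CDF verifies the tail probabilities quoted above, and the inverse CDF recovers the critical values.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Upper-tail probabilities Pr[Z > z] for the critical values above
print(1 - Z.cdf(1.645))  # ≈ 0.05,  so z_{0.05}  ≈ 1.645
print(1 - Z.cdf(1.96))   # ≈ 0.025, so z_{0.025} ≈ 1.96

# Going the other way: the quantile (inverse CDF) at 1 - alpha
print(Z.inv_cdf(1 - 0.05))   # ≈ 1.6449
print(Z.inv_cdf(1 - 0.025))  # ≈ 1.9600
```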
What is $z_{\alpha/2}$? This is the same thing, really, but the division of $\alpha$ by $2$ implies that this critical value is being used in a two-sided or two-tailed hypothesis test, in which the alternative hypothesis is that the parameter of interest is not equal to (different from) some hypothesized value.
How does this relate to confidence intervals? Certain confidence intervals arise as the "inversion" of a corresponding hypothesis test. A detailed discussion is beyond the scope of this answer. Typically, though, we would say that a two-sided $100(1-\alpha)\%$ confidence interval for the mean of a population whose observations are normally distributed with some known standard deviation $\sigma$ can be expressed as $$\left[\bar x - \frac{\sigma}{\sqrt{n}} z_{\alpha/2}, \bar x + \frac{\sigma}{\sqrt{n}} z_{\alpha/2}\right],$$ where $\bar x$ is the sample mean of the observations, $n$ is the size of the sample, and $z_{\alpha/2}$ is the corresponding critical value.
If I assume that the weight of a particular breed of puppy at 6 months is normally distributed with standard deviation $\sigma = 3$ pounds, and I take a random sample of $n = 5$ puppies and observe that their weights, in pounds, are $$\{8.5, 9.0, 11.5, 10.0, 13.0\},$$ then we calculate $$\bar x = 10.4$$ pounds. This is the mean of the sample, but not necessarily the true mean $\mu$ of all such puppies. I could calculate a $95\%$ confidence interval for the true mean $\mu$ as $$\left[ 10.4 - \frac{3}{\sqrt{5}} z_{0.025}, 10.4 + \frac{3}{\sqrt{5}} z_{0.025}\right] = [7.77, 13.03].$$ This interval represents an interval estimate of $\mu$, which remains unknown to us. The width of this interval reflects a certain degree of uncertainty due to three factors:
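The puppy-weight calculation above can be reproduced in a few lines (again a sketch using the standard-library `statistics` module; the data, $\sigma$, and $\alpha$ are exactly the values from the example):

```python
from statistics import NormalDist, mean
from math import sqrt

weights = [8.5, 9.0, 11.5, 10.0, 13.0]  # sample from the example
sigma = 3.0   # assumed known population standard deviation
alpha = 0.05  # for a 95% confidence interval

xbar = mean(weights)                     # sample mean: 10.4
z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.025} ≈ 1.96
margin = z * sigma / sqrt(len(weights))  # half-width of the interval

lo, hi = xbar - margin, xbar + margin
print(f"[{lo:.2f}, {hi:.2f}]")  # prints [7.77, 13.03]
```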
$\sigma$ represents the intrinsic variability of puppy weights in the population. Even if we know what this variability is, it contributes to the variation in the sample we observed.
$n$ represents the amount of information we get from our data: the more observations we make, the more precise our estimate of where the true mean lies.
$z_{\alpha/2}$ in a sense reflects the extent to which we are willing to be wrong. It is the only factor that ties into the confidence level. If we set a $95\%$ confidence level, that means that on average, 95 out of 100 confidence intervals we construct from a random sample of $5$ puppies will contain the true mean weight $\mu$. This means any given $95\%$ confidence interval we calculate has an $\alpha = 5\%$ probability of not actually containing the true mean. So, if we wanted to be $99\%$ sure, we would need to calculate $z_{0.005} \approx 2.5758$ and use this value instead; but the penalty is that the resulting interval is wider, which of course is necessary to reduce the chance we miss the correct value.
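The confidence-versus-width trade-off is easy to see numerically. This sketch (the `z_interval` helper is my own name, not from the answer) reuses the puppy example's $\sigma$, $n$, and $\bar x$ to compare the half-widths at the two confidence levels:

```python
from statistics import NormalDist
from math import sqrt

sigma, n, xbar = 3.0, 5, 10.4  # values from the puppy example

def z_interval(confidence):
    """Half-width of a two-sided z-interval for a mean with known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)

m95, m99 = z_interval(0.95), z_interval(0.99)
print(f"95%: {xbar} ± {m95:.2f}")  # 95%: 10.4 ± 2.63
print(f"99%: {xbar} ± {m99:.2f}")  # 99%: 10.4 ± 3.46
```

Demanding $99\%$ confidence widens the interval by roughly a third for the same data.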
Commonly, $z_\beta$ refers to the point (number) such that $\Pr(Z\gt z_\beta)=\beta$. Here $Z$ is a random variable with the standard normal distribution. So for instance $z_{\beta}$, where $\beta=0.005$, refers to the point $z$ such that $\Pr(Z\gt z)=0.005$. It turns out that in this case $z_{\beta}\approx 2.57$. So the probability that a random variable with standard normal distribution is bigger than $2.57$ is $0.005$. Thus the area under the standard normal curve in the right tail past $2.57$ is $0.005$.
This kind of information is needed when we calculate confidence intervals when the normal distribution provides a reasonable fit, and also in hypothesis testing in similar situations.
There is no closed-form formula for $z_\beta$ in terms of $\beta$ using elementary functions; $z_\beta$ is the inverse of the standard normal CDF evaluated at $1-\beta$, which must be computed numerically. But many pieces of software, including standard spreadsheets, will compute it for you. There are also online calculators that will do it.
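For instance, a spreadsheet's NORM.S.INV function, or Python's standard-library inverse CDF, will do the computation (shown here for the $\beta=0.005$ value from this answer):

```python
from statistics import NormalDist

beta = 0.005
z_beta = NormalDist().inv_cdf(1 - beta)  # numeric inverse of the standard normal CDF
print(round(z_beta, 4))  # 2.5758
```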
Before computers were everywhere, people used tables of the standard normal distribution to locate $z_\beta$ for the $\beta$ they were interested in, typically values like $\beta=0.05$, $0.025$, or $0.01$.
Anyone who has done statistics probably remembers forever that $z_{0.025}\approx 1.96$.