I read a book about statistics where the author wrote:
If a type-I error is costly, meaning your belief that your theory is correct when it is not could be problematic, then you should choose a low value for $\alpha$ to avoid that error [41]. For example, that the blood pressures of a group that took a new drug are significantly lower than those of a group who took a placebo would be considered “costly” (risky for patients), and a low $\alpha$ should be adopted.
My understanding is that a lower $\alpha$ reduces the chance of a Type I error, so when the situation is risky, we should lower $\alpha$ to make that error less likely. Am I right?
Why did the author write: "meaning your belief that your theory is correct when it is not could be problematic, then you should choose a low value for α to avoid that error?" I think there is a typo in "when it is not could be problematic". I think it should be "it could be problematic". Am I right?
The significance level $\alpha$ in a hypothesis test controls the tolerance for Type I error--it represents the maximum probability of a specific kind of error one is willing to tolerate in order to make a statistical inference. That specific error occurs when the test rejects the null hypothesis even though the null hypothesis is actually true.
The alternative hypothesis is often also called the research hypothesis--it is the statement that requires substantial evidence before we accept it. That does not mean it is always true: any meaningful hypothesis test must allow some possibility of erroneously rejecting the null in favor of the alternative (Type I error), and we control this possibility by choosing a suitable $\alpha$. But this only serves as an upper limit on the error--once the data is observed, the true Type I error could be quite a bit smaller.
As for the phrasing of the quoted statement, it is correct. The syntax is this:

> ...meaning your belief [that your theory is correct when it is not] could be problematic...

I added brackets to show that the clause is "that your theory is correct when it is not." This clause modifies "your belief," which is the subject of "could be problematic"--so there is no typo, and replacing "when it is not" with "it" would change the meaning.
While you are correct that choosing a smaller $\alpha$ will decrease your Type I error, this choice (nearly always) comes at a cost. Either:

- your test becomes less likely to reject the null when the null really is false (you lose power), or
- you must collect more data to keep the same power.
I will illustrate this with an example. Suppose I have a coin. I know whether it is fair or not, but I don't tell you. I let you borrow the coin, but only for a minute. You decide to perform an experiment so that you may make an inference about whether the coin is fair, but ultimately, there is no way to know for sure unless I tell you.
In the limited time that you have, you are capable of flipping the coin $n = 10$ times, and you count the number of heads you get. If the coin is fair, you'd expect roughly the same number of heads as tails, but of course, if you got $6$ heads and $4$ tails, that wouldn't be a deal-breaker. In fact, even a fair coin would have a $1$ in $512$ chance of flipping $10$ out of $10$ heads, or $10$ out of $10$ tails. It's not likely, but it's certainly possible, even by pure random chance. In fact, if I lent the coin to $1000$ people, the chance that at least one person gets $10$ in a row of the same outcome (i.e. all heads, or all tails), is nearly $86\%$.
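The two probabilities above are quick to verify. A minimal Python sketch (the variable names are mine, not from any reference):

```python
# Probability that a fair coin gives 10 heads or 10 tails in 10 flips:
# two "all-same" sequences out of 2**10 equally likely sequences.
p_extreme = 2 / 2**10          # = 1/512 ≈ 0.001953

# Chance that at least one of 1000 borrowers sees such a streak
# (complement of nobody seeing one).
p_at_least_one = 1 - (1 - p_extreme) ** 1000

print(p_extreme)        # ≈ 0.001953
print(p_at_least_one)   # ≈ 0.858, i.e. nearly 86%
```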
But back to you. You decide that, given your limited chance to flip the coin, you are going to make the following rule about when to conclude the coin is unfair: you will say, "the coin is probably unfair if I get $0$, $1$, $9$, or $10$ heads out of $10$ tries." After all, this seems quite reasonable--many people might even say that $2$ or $8$ heads is already quite extreme and suggestive of an unfair coin.
Yet, as I have already stated, the chance that even a fair coin could produce an extreme result is not zero. What is the exact chance that the coin, if actually fair, could cause you to mistakenly conclude that it is not, based on your rule? Well, the probability of such an outcome is $$\frac{1}{2^{10}} \left(\binom{10}{0} + \binom{10}{1} + \binom{10}{9} + \binom{10}{10} \right) = \frac{1 + 10 + 10 + 1}{2^{10}} = \frac{11}{512} \approx 0.02148.$$ That is, just over $2\%$ of the time, a fair coin would give you a result that causes you to claim it is unfair. This is the Type I error of your test.
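The same calculation in a few lines of Python (a sketch; the list of rejection counts mirrors the rule stated above):

```python
from math import comb

n = 10
reject = [0, 1, 9, 10]  # head counts under which you declare the coin unfair

# Each head count k occurs with probability C(n, k) / 2**n under a fair coin.
alpha = sum(comb(n, k) for k in reject) / 2**n
print(alpha)  # 0.021484375, i.e. 11/512
```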
Well, now here's a twist. You make a wager with your friend. If you make a Type I error--you say the coin is unfair when it is actually fair--then you have to pay your friend 1000 dollars. But if you correctly conclude the coin is unfair, your friend pays you only 10 dollars. Now you have a strong incentive not to be so hasty to call the coin unfair. If you do, you want to be very confident in such a conclusion. Would you be willing to be wrong $2\%$ of the time when the reward is so low and the cost so high?
So you go back and you rethink your rule. Instead of allowing $2$ heads or $8$ heads to cause you to call the coin unfair, you decide you will only call the coin unfair if it shows all heads, or all tails. Now you've reduced your Type I error to $\alpha = \frac{1}{512} \approx 0.001953$, just under $0.2\%$. It could still happen that you'd be terribly unlucky and end up having to fork over 1000 dollars. But now, you're also less likely to win 10 dollars, too, because maybe the coin is unfair, but not so much so that it will always show one side.
What if you wanted to really reduce your chance of Type I error? Perhaps the consequences of such an error are extremely serious (e.g., someone could die). So you say that even $\alpha = 0.001953$ is too big a chance to take. Instead, you want to set $\alpha = 0.0001$--that is, to have at most $1$ chance in $10000$ to call the coin unfair when it is actually fair. Could you make a rule to achieve this?
Well, the most extreme results that you can possibly observe are $0$ heads or $10$ heads, and you already saw that $\alpha = 0.001953$ for that rule. If your rule were made any more strict, you would have to say "I'm never going to conclude the coin is unfair." But then your experiment becomes meaningless because it never admits the possibility of detecting that the coin is unfair.
Clearly, then, in order to devise a test that has the desired Type I error tolerance, you need to flip the coin more than $10$ times. If you work fast, you might get in $30$ flips. Now can you devise a test with the desired $\alpha$? Yes. If you work out the math, you will find that a rule of "fewer than 5 heads or more than 25 heads out of 30 tosses = unfair coin" will have a Type I error lower than $0.0001$. What is the $\alpha$ if you also include outcomes of $5$ or $25$ heads as reason for rejection?
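Here is how you might work out that math yourself (a Python sketch; the helper function name is mine). Extending the rejection region to include $5$ and $25$ heads is a one-line change, which you can try to answer the closing question:

```python
from math import comb

def type_i_error(n, reject):
    """Type I error of a rule that calls the coin unfair when the head
    count falls in `reject`, assuming the coin is actually fair."""
    return sum(comb(n, k) for k in reject) / 2**n

n = 30
# "fewer than 5 heads or more than 25 heads" = {0,...,4} ∪ {26,...,30}
strict_rule = list(range(0, 5)) + list(range(26, 31))
print(type_i_error(n, strict_rule))  # ≈ 5.9e-05, below the 0.0001 target
```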