Statistical hypothesis testing and P-value

Exercise :

A company produces construction cables whose breaking point has mean $100\,\text{kg}$ and standard deviation $6\,\text{kg}$. A new manufacturing method is believed to increase the mean breaking point. The company will adopt the new method only if it does increase the mean breaking point, so the method was used to produce $25$ cables, which had a mean breaking point of $102.796\,\text{kg}$.

(a) Carry out the appropriate hypothesis test at significance level $\alpha=1\%$.

(b) What is the p-value of the test? What assumptions must be made for the test in (a)?

(c) Two scientists carry out exactly the same statistical test; one of them rejects the null hypothesis and the other does not. How can that be explained?

(d) Find a 99% confidence interval for the mean breaking point under the new method.

Attempt :

Part (a) is similar to the other problem I posted, so I am not rewriting the full solution. The fact is, I am totally clueless about (b), (c), and (d), which are also similar to the other problem. I really want to see how an exercise like this can be solved, because it's an important part of our semester exams.

My other post, which was not answered, can be found here.

Please, I would really appreciate any help or thorough explanation because I am trying to understand such exercises for my exam, which is tomorrow.

Thanks in advance for your time.

Best answer :

Since you're preparing for a test, I'd like to address your questions more generally. When I was first learning statistics, I too felt I was missing the forest for the trees with all these various quantities. Here is the key concept for basic (stats-101) hypothesis testing:

Everything you need to know can be found by assuming the null hypothesis is true

The basic skill tested in intro stats hypothesis testing is your ability to derive the distribution of your test statistic assuming the null hypothesis is true. For Stats-101 courses, this invariably means approximating that distribution by a normal distribution. Read up on the central limit theorem to see how to do this: it's the theorem that justifies using the normal distribution and tells you how to estimate its parameters.

So, for example, (a) is asking whether the observed sample mean (102.796) is a "rare" outcome, assuming the data came from a distribution with mean 100 and standard deviation 6. So you need to know the distribution of the sample mean assuming the null hypothesis is true (mean = 100, sd = 6), given that your sample size is 25. Statisticians call this the "null distribution", since it is the distribution of your statistic assuming the null hypothesis is true.

Hint: central limit theorem.
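As a rough numerical sketch of that hint, here is how the null distribution and the test statistic for (a) could be computed. It assumes the new cables keep the old standard deviation of 6 kg, which the exercise does not state explicitly. It uses only Python's standard-library `statistics.NormalDist`, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 100.0, 6.0, 25        # null mean, assumed known sd, sample size
xbar = 102.796                        # observed sample mean

se = sigma / sqrt(n)                  # sd of the sample mean under the CLT approximation
null_dist = NormalDist(mu0, se)       # null distribution of the sample mean

z = (xbar - mu0) / se                 # standardized test statistic
z_crit = NormalDist().inv_cdf(0.99)   # upper 1% critical value (one-sided test)
percentile = null_dist.cdf(xbar)      # where the observed mean falls under the null

print(f"se = {se:.3f}, z = {z:.3f}, critical value = {z_crit:.3f}")
print(f"percentile under the null = {percentile:.4f}")
print("reject H0" if z > z_crit else "do not reject H0")
```

The statistic lands very close to the critical value, so the choice of significance level and of a one-sided or two-sided alternative matters a great deal in this exercise.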

If you've solved (a) by getting the null distribution, then (b), (c), and (d) are almost done.

For (b), make sure you know the definition of a p-value: for an upper-tailed test it is one minus the percentile of your observed test statistic, where the percentile is calculated using (you guessed it) the null distribution. One caveat: for a two-sided alternative, people report a "two-tailed" p-value, which doubles that value. Why? Because we assume our null is a normal distribution, which is symmetric about its mean, so the lower tail area equals the upper tail area. Check which alternative your exercise states; here the claim is that the mean is greater, which points to the upper-tailed version.
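A minimal sketch of that p-value calculation, again assuming a known standard deviation of 6 kg and a normal null distribution; both the upper-tail value and its doubled, two-tailed counterpart are computed so they can be compared.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 100.0, 6.0, 25, 102.796
null_dist = NormalDist(mu0, sigma / sqrt(n))   # null distribution of the sample mean

p_upper = 1.0 - null_dist.cdf(xbar)   # one minus the percentile of the observed mean
p_two_sided = 2.0 * p_upper           # doubling relies on the symmetry of the normal

print(f"upper-tail p = {p_upper:.4f}, two-tailed p = {p_two_sided:.4f}")
```

Comparing each of the two values with $\alpha = 0.01$ is a useful exercise before thinking about part (c).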

If none of this makes sense, please read up on p-values, upper and lower tailed tests, and rejection regions before your exam -- you will miss a lot of points if you don't know these fundamental concepts.

For (c) -- this is a conceptual question: if the data and calculations are the same, what is left? Think about the steps required to reject/not-reject a hypothesis...what hasn't been spelled out here?

For (d) -- there are two ways to approach this. One is to memorize the confidence interval formula for the mean (shallow knowledge). The other is to recognize that a confidence interval for the mean is the set of null hypothesis means that would not be rejected by this test. So, for example, if the null hypothesis were mean = 101, sd = 6, would you reject it at, say, $\alpha=0.05$?
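The "interval of non-rejected null means" idea can be checked by brute force. The sketch below assumes a two-sided z-test with known sd, since that is the version matching the usual two-sided confidence interval. The function name `rejects` and the grid of candidate means are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def rejects(mu_null, xbar=102.796, sigma=6.0, n=25, alpha=0.05):
    """Two-sided z-test: True if the null mean mu_null is rejected at level alpha."""
    z = (xbar - mu_null) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

# Scan candidate null means from 99.00 to 107.00 in steps of 0.01
# and keep the ones the test does NOT reject.
grid = [99 + 0.01 * k for k in range(801)]
kept = [m for m in grid if not rejects(m)]

print("mean = 101 rejected at 5%?", rejects(101.0))
print(f"non-rejected means run from {min(kept):.2f} to {max(kept):.2f}")
```

Up to the grid resolution, the endpoints of the surviving range should agree with the 95% interval obtained from the formula; passing `alpha=0.01` to `rejects` gives the 99% version.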

Yet another way to get the confidence interval is to derive it from your knowledge of the null distribution. Thanks to the central limit theorem, we can approximate our null distribution as follows (where $\mu,\sigma$ are the mean and standard deviation of our null hypothesis, and the second argument of $\mathcal{N}$ denotes the standard deviation rather than the variance):

$$\bar{x}_N \sim \mathcal{N}\left(\mu,\frac{\sigma}{\sqrt{N}}\right)$$

What is special about the normal distribution (and a few others you may learn about if you continue to pursue statistics) is that we can "invert" this formula to get a statement about an interval. This requires that you understand how adding a constant to a random variable, or multiplying it by a constant, affects its mean and variance.

$$Z := \frac{\sqrt{N}(\bar{x}_N - \mu)}{\sigma} \sim \mathcal{N}\left(0,1\right)$$
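To spell out the step behind this standardization, here is a short sketch using the standard rules $E[aX+b] = aE[X]+b$ and $\operatorname{sd}(aX+b) = |a|\operatorname{sd}(X)$, together with the fact that a linear transformation of a normal variable is again normal:

$$E[Z] = \frac{\sqrt{N}\,(E[\bar{x}_N] - \mu)}{\sigma} = \frac{\sqrt{N}\,(\mu - \mu)}{\sigma} = 0, \qquad \operatorname{sd}(Z) = \frac{\sqrt{N}}{\sigma}\cdot\operatorname{sd}(\bar{x}_N) = \frac{\sqrt{N}}{\sigma}\cdot\frac{\sigma}{\sqrt{N}} = 1$$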

So our transformed variable $Z$ follows a standard normal distribution, whose percentiles you can look up in tables. So, what is the central 95% interval of values for $Z$? It's just $[-1.96,1.96]$. So,

$$P(Z \in [-1.96,1.96]) = 0.95 \implies P\left(-1.96 \leq \frac{\sqrt{N}(\bar{x}_N - \mu)}{\sigma} \leq 1.96\right) = 0.95$$

Some work for you

Now, can you use some basic algebra to rearrange $-1.96 \leq \frac{\sqrt{N}(\bar{x}_N - \mu)}{\sigma} \leq 1.96$ into something of the form $?? \leq \mu \leq ??$ If you can, you've just derived the confidence interval for the mean for data from a normal distribution (with known variance). Congrats...you've mastered a major portion of intro stats!
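Once you have done the rearrangement yourself, a numerical check along the following lines may help. It is only a sketch: it assumes the known-sd interval has the familiar "sample mean plus or minus a normal quantile times the standard error" form, and it uses the 99% level requested in part (d) rather than the 95% used in the illustration above.

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n = 102.796, 6.0, 25
conf_level = 0.99                      # level asked for in part (d)

se = sigma / sqrt(n)                   # standard error of the sample mean
z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)   # about 2.576 for 99%
lower, upper = xbar - z_star * se, xbar + z_star * se

print(f"{conf_level:.0%} CI for the mean: [{lower:.3f}, {upper:.3f}]")
```

Setting `conf_level = 0.95` reproduces the $1.96$ multiplier used in the derivation.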

Wishing you the best in your studies! Keep at it.