I'm faced with a strange problem and I can't find a way out. Here it is:
We observe data $X(t)$ that depend on the time $t$. We make these observations at given times $t_0, \cdots, t_n$ for some $n \in \mathbb{N}$. At each of these times, we check whether the data are normal using a normality test (Anderson-Darling, Shapiro-Wilk, D'Agostino) in order to carry out a capability study, i.e. to compute a Ppk (process performance index). The problem is that we observe normality for $X(t_0), \cdots, X(t_{k-1})$, no normality for $X(t_k)$, and again normality for $X(t_{k+1}),\cdots, X(t_n)$, for some $0 < k < n$. Let $\mu_i$ and $\sigma_i$ denote the mean and the standard deviation, respectively, of $X(t_i)$ for $i \in \{0,\cdots,n\}$. I observed $\mu_{k+1} < \mu_k < \mu_{k-1}$ and $\sigma_{k+1} < \sigma_k < \sigma_{k-1}$, which leads me to assume that the Ppk at time $t_k$ should lie somewhere between the one at $t_{k-1}$ and the one at $t_{k+1}$.
I was also thinking of showing that $(X(t))_{t \in \mathbb{R}}$ is a Gaussian stochastic process, but I suppose that if I don't have normality at every time step, I cannot assume that.
Does anyone know how to get around this problem?
P.S.: The definition I used for the Ppk is the following: $$ Ppk = \min\left( \frac{\mu-LSL}{3\sigma},\frac{USL - \mu}{3\sigma}\right), $$ where $USL$ and $LSL$ denote the upper and lower specification limits, respectively, and are given.
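For concreteness, the definition above translates directly into code; here is a minimal Python sketch (the function name `ppk` and the example numbers are my own, not from the original study):

```python
def ppk(mu, sigma, lsl, usl):
    """Process performance index: the smaller distance from the mean to a
    specification limit, measured in units of 3 standard deviations."""
    return min((mu - lsl) / (3 * sigma), (usl - mu) / (3 * sigma))

# Hypothetical example: mean 10, standard deviation 1,
# specification limits LSL = 4 and USL = 13.
print(ppk(10.0, 1.0, 4.0, 13.0))  # min(6/3, 3/3) = 1.0
```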
How confident are you that $X(t_k)$ really is that unusual?
If you run a normality test on $100$ different samples that really do come from normal distributions, the resulting $p$-values will be $100$ independent draws from the uniform distribution on $[0,1]$. As a result, you expect that, on average, $5$ of them will randomly happen to be significant at the $0.05$ level, and $1$ of them will randomly happen to be significant at the $0.01$ level. This does not mean that one of the distributions is not normal; it is a natural consequence of doing $100$ null hypothesis tests.
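You can see this effect with a quick simulation. Since the $p$-values are uniform on $[0,1]$ under the null hypothesis, this sketch just draws uniform $p$-values directly (the trial counts are arbitrary choices of mine):

```python
import random

random.seed(0)

# Under the null hypothesis, p-values are uniform on [0, 1]. Simulate many
# batches of 100 tests and count how many come out "significant" at the
# 0.05 level purely by chance.
trials, tests_per_trial = 10_000, 100
counts = []
for _ in range(trials):
    pvals = [random.random() for _ in range(tests_per_trial)]
    counts.append(sum(p < 0.05 for p in pvals))

print(sum(counts) / trials)  # on average about 5 false positives per 100 tests
```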
So if the only reason you have of being suspicious of $X(t_k)$ is that it has a fairly low $p$-value, then you should probably stop being suspicious of it. The likeliest explanation is that you have a normal distribution at every time step, and at time $t_k$ it just happened to produce not-very-normal-looking data.
If your response is, "But $X(t_k)$ had a really really low $p$-value!!" then we can try to do some meta-analysis of the $p$-values to see how unusual it is that you'd get one that low over $n$ time steps. This is an entire statistical field that can't be summarized in a single answer.
But one conservative test is the following: if you are testing at, say, the $0.01$ significance level, but your only reason to suspect $X(t_k)$ is that it had the lowest $p$-value out of $n$ time steps, then you should compare that $p$-value to $\frac{0.01}{n}$ instead (this is the Bonferroni correction). The probability that the most extreme of your $n$ $p$-values is less than $\frac{0.01}{n}$ is equal to $$1 - \left(1 - \frac{0.01}{n}\right)^n,$$ which for any value of $n$ lies between $0.00995$ and $0.01$. So if your most extreme $p$-value is less than $0.01$ but bigger than $\frac{0.01}{n}$, that's not really a reason to suspect $X(t_k)$ of not being normal. If, on the other hand, you got a $p$-value of less than $\frac{0.01}{n}$, then that's suspiciously low even for the most extreme of your $p$-values.
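The bound above is easy to verify numerically; a short sketch (the particular values of $n$ are arbitrary):

```python
# Probability that the minimum of n uniform p-values falls below the
# corrected threshold 0.01/n: 1 - (1 - 0.01/n)**n. As n grows, this
# decreases from 0.01 toward 1 - exp(-0.01), which is about 0.00995.
for n in (1, 10, 100, 1000, 10_000):
    prob = 1 - (1 - 0.01 / n) ** n
    print(n, prob)
```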