What is the formal definition of the breakdown value of a statistic?

On page 482 of Statistical Inference (Second Edition) by Casella & Berger, the authors define the breakdown value as follows:

Definition 10.2.2 Let $X_{(1)} < \dots < X_{(n)}$ be an ordered sample of size $n$, and let $T_n$ be a statistic based on the sample. $T_n$ has a breakdown value $b$, $0 \leq b \leq 1$, if, for every $\epsilon > 0$,

$\lim_{X_{(\{(1-b)n\})} \rightarrow \infty} T_n < \infty$ and $\lim_{X_{(\{(1-(b+\epsilon))n\})} \rightarrow \infty} T_n = \infty$

where the curly braces $\{\cdot \}$ indicate rounding to the closest integer.

On the next page, Casella & Berger state that the breakdown value of the mean is $0$, which I think is generally accepted. But if I apply the definition with $b = 0$, both limits would go to infinity, would they not?

I would appreciate it if anybody could point out the error in my understanding or provide a different formal definition. I am aware that, very generally speaking, the breakdown value is the proportion of the sample that can be changed arbitrarily without driving the statistic to infinity.

Best answer

I've never seen this notion of breakdown point before. Perhaps the definition should be extended so that $b=0$ if there is no $b>0$ for which the first limit is finite.

Another notion of breakdown point, which is (as far as I know) more common in the literature, was proposed by Donoho here. Let $X=(x_1,\dots, x_n)$ denote a fixed sample of $n$ points and let $X'$ denote an $\epsilon$-corrupted sample, obtained by replacing a proportion $\epsilon$ of the points of $X$ with arbitrary values. Let $T$ be a statistic and define the largest bias caused by $\epsilon$-corruption by

$$ b(\epsilon; X,T) = \sup |T(X') - T(X)| $$

where the supremum is taken over all possible $\epsilon$-corrupted samples $X'$, and the breakdown point is then

$$ \epsilon^*(X, T) = \inf \{ \epsilon: b(\epsilon; X, T) = \infty \} $$
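
As a worked check of this definition, take $T$ to be the sample mean $\bar{X}$ and corrupt a single point by replacing $x_1$ with $x_1 + c$. The constant $c$ is introduced here only for illustration, and I am assuming that an $\epsilon$-corrupted sample replaces a whole number $m = \epsilon n$ of points. Then

$$ |T(X') - T(X)| = \left| \frac{(x_1 + c) + x_2 + \dots + x_n}{n} - \frac{x_1 + x_2 + \dots + x_n}{n} \right| = \frac{|c|}{n} \rightarrow \infty \quad \text{as } |c| \rightarrow \infty $$

so $b(1/n; X, \bar{X}) = \infty$, while replacing no points at all gives zero bias. Hence

$$ \epsilon^*(X, \bar{X}) = \frac{1}{n} \rightarrow 0 \quad \text{as } n \rightarrow \infty $$

which agrees with the breakdown value of $0$ for the mean quoted in the question.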

The definition here can be generalized by looking at other distances between $T(X)$ and $T(X')$, see for example the quantity defined in (2.2) here.
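
To see the finite-sample breakdown point numerically, here is a minimal sketch in Python. It is a heuristic, not an exact computation: the true supremum ranges over all corrupted samples, whereas this code only replaces the first $m$ points with one very large value and checks whether the statistic moves by more than a threshold. The function name `empirical_breakdown` and the parameters `big`, `threshold` and `dist` are my own hypothetical choices; `dist` stands in for the distance between $T(X)$ and $T(X')$ and can be swapped for another one.

```python
import numpy as np

def empirical_breakdown(stat, x, big=1e12, threshold=1e6,
                        dist=lambda a, b: abs(a - b)):
    """Heuristic finite-sample breakdown point of `stat` on sample `x`.

    Replaces the first m points with the large value `big` for
    m = 1, 2, ..., n and returns the smallest fraction m/n at which
    the statistic moves by more than `threshold` (a stand-in for
    "the bias is infinite"). Returns None if that never happens.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    base = stat(x)                      # T(X) on the clean sample
    for m in range(1, n + 1):
        corrupted = x.copy()
        corrupted[:m] = big             # one particular corrupted sample X'
        if dist(stat(corrupted), base) > threshold:
            return m / n
    return None

# Deterministic sample of n = 11 points: 0, 1, ..., 10
sample = np.arange(11.0)

eps_mean = empirical_breakdown(np.mean, sample)      # 1/11: one point suffices
eps_median = empirical_breakdown(np.median, sample)  # 6/11: a majority is needed
print(eps_mean, eps_median)
```

For the mean a single corrupted point is enough, so the result is $1/n$. For the median with $n = 11$, the statistic stays bounded until 6 of the 11 points are replaced, which is consistent with a breakdown point near $1/2$.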