Discrete mathematics vs. non-parametric statistics


Is there any meaningful connection between non-parametric statistics and discrete mathematics?

I am reading this book: http://www.amazon.com/Discrete-Mathematics-Technology-Rowan-Garnier/dp/075030135X

I wonder whether it is applicable in some way to the chi-square test or the Kendall tau coefficient.


3
On BEST ANSWER

There are certainly many possible connections. The challenge is that discrete mathematics works with discrete objects, while statistics more often works with continuous quantities.

Let me share one possible idea where discrete mathematics and non-parametric statistics can meet.

One of the best-known areas of non-parametric statistics is local regression. Based on a data set $(x_i,y_i)_{i=1}^n$ we want to predict $y$ for a given $x\in X$. Typically, a weight $w_i$ is calculated for each record, and a weighted model fitted with these weights is used to predict $y$ at $x$. The weights are usually calculated with a kernel function that reflects the Euclidean proximity of $x$ and $x_i$. However, it is possible to use another proximity relation defined on $X$. If $X$ is discrete, it is possible to define proximity in terms of graph theory, which belongs to discrete mathematics; for example, the shortest-path distance between $x$ and $x_i$ in a graph on $X$.
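To make the idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the graph is unweighted, proximity is the hop distance found by breadth-first search, the kernel is Gaussian in that distance, and the "weighted model" is just a weighted average of the $y_i$ (a Nadaraya-Watson style estimate). The function names, the `bandwidth` parameter, and the toy path graph are all made up for the example.

```python
from collections import deque
import math


def bfs_distances(graph, source):
    """Hop distance from `source` to every node reachable in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def graph_kernel_predict(graph, observations, x, bandwidth=1.0):
    """Weighted average of the y_i with weights w_i = exp(-(d(x, x_i) / bandwidth)^2).

    `observations` is a list of (x_i, y_i) pairs whose x_i are nodes of `graph`.
    The Gaussian kernel and the bandwidth are illustrative assumptions.
    """
    dist = bfs_distances(graph, x)
    numerator = 0.0
    denominator = 0.0
    for x_i, y_i in observations:
        if x_i not in dist:
            continue  # unreachable nodes get weight zero
        weight = math.exp(-((dist[x_i] / bandwidth) ** 2))
        numerator += weight * y_i
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no observation is reachable from x")
    return numerator / denominator


# Toy example: the path graph a - b - c - d with observations at the two ends.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
observations = [("a", 1.0), ("d", 4.0)]

prediction_b = graph_kernel_predict(graph, observations, "b")
prediction_c = graph_kernel_predict(graph, observations, "c")
```

Node `b` is one hop from `a` and two hops from `d`, so its prediction is pulled towards the value observed at `a`; for `c` it is the other way around.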

2
On

The key area of discrete mathematics that is used extensively in nonparametric statistics is combinatorics. Often, you are counting the number of possible ways to order a set of objects, such as ranks, subject to constraints. The Mann-Whitney U test, the Wilcoxon W test, permutation tests, sign tests... all rely on essentially combinatorial methods.
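As a small illustration of the counting involved, the sketch below enumerates every way of assigning the ranks $1,\dots,m+n$ to two groups of sizes $m$ and $n$ and tabulates the resulting Mann-Whitney $U$ statistic. It assumes no ties, and the function name is my own; it is only feasible for small samples, because the number of assignments is $\binom{m+n}{m}$.

```python
from collections import Counter
from itertools import combinations


def mann_whitney_null_distribution(m, n):
    """Exact null distribution of U for group sizes m and n, assuming no ties.

    Under the null hypothesis every choice of m ranks for the first group
    is equally likely, so probabilities are counts divided by C(m + n, m).
    """
    counts = Counter()
    for ranks in combinations(range(1, m + n + 1), m):
        # U = rank sum of the first group minus its smallest possible value.
        u = sum(ranks) - m * (m + 1) // 2
        counts[u] += 1
    n_assignments = sum(counts.values())
    return {u: c / n_assignments for u, c in sorted(counts.items())}


null_dist = mann_whitney_null_distribution(3, 3)
p_u_zero = null_dist[0]  # only the ranks {1, 2, 3} give U = 0, so this is 1/20
```

With $m=n=3$ there are $\binom{6}{3}=20$ equally likely assignments, and exactly one of them gives $U=0$.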

Per OP Comment

Many non-parametric tests assume exchangeability under the null hypothesis. Thus, each rearrangement of the observed data has the same probability. As a simple example, permutation tests look at the value of the test statistic under all $n!$ permutations of the samples. A slightly more sophisticated use of combinatorics is the Wilcoxon signed-rank test of a median, where you assume that the sign of a given absolute deviation from the hypothesized median is equally likely to be positive or negative, so all $2^n$ sign assignments are equally likely.
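The sign-flipping argument can be sketched in a few lines of Python. This is a rough illustration rather than a reference implementation: it assumes no deviation is exactly zero and no two absolute deviations are tied, it enumerates all $2^n$ sign patterns (so it only works for small $n$), and the function name and sample data are invented for the example.

```python
from itertools import product


def signed_rank_exact_p(data, hypothesized_median):
    """Exact two-sided p-value for the signed-rank statistic W+.

    Assumes no zero deviations and no tied absolute deviations.
    Returns (observed W+, p-value).
    """
    deviations = [x - hypothesized_median for x in data]
    # Rank the absolute deviations from smallest (rank 1) to largest (rank n).
    order = sorted(range(len(deviations)), key=lambda i: abs(deviations[i]))
    ranks = [0] * len(deviations)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    observed = sum(r for r, d in zip(ranks, deviations) if d > 0)
    expected = sum(ranks) / 2  # mean of W+ when each sign is a fair coin flip

    extreme = 0
    total = 0
    # Under the null hypothesis, every pattern of signs is equally likely.
    for signs in product((0, 1), repeat=len(ranks)):
        w_plus = sum(r for r, s in zip(ranks, signs) if s)
        total += 1
        if abs(w_plus - expected) >= abs(observed - expected):
            extreme += 1
    return observed, extreme / total


data = [1.2, 2.5, 3.1, 4.8, 6.0]
w_plus, p_value = signed_rank_exact_p(data, 0.0)
```

Here every observation lies above the hypothesized median of 0, so $W^+$ takes its largest possible value, 15, and only 2 of the 32 sign patterns (all positive or all negative) are at least as extreme.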