Does kernel regression preserve monotonicity?


Consider the Kernel regression estimator:

$$\hat{y}(x)=\frac{\sum_{i=1}^n{K(x-x_i)y_i}}{\sum_{i=1}^n{K(x-x_i)}},$$

where $x,x_1,\dots,x_n\in\mathbb{R}^d$ and $y_1,\dots,y_n\in\mathbb{R}$, and where $K:\mathbb{R}^d\rightarrow(0,\infty)$ is a strictly positive, differentiable kernel function with a unique maximum at $0$.
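For concreteness, here is a minimal NumPy sketch of the estimator above (the Nadaraya–Watson form; the Gaussian kernel is just one admissible choice of $K$):

```python
import numpy as np

def nadaraya_watson(x, xs, ys, K):
    """Kernel regression estimate at a query point x.

    x  : (d,) query point
    xs : (n, d) sample inputs
    ys : (n,) sample outputs
    K  : kernel mapping a (d,) difference to a positive weight
    """
    w = np.array([K(x - xi) for xi in xs])  # weights K(x - x_i), all > 0
    return w @ ys / w.sum()

# Gaussian kernel: strictly positive, differentiable, unique maximum at 0
gauss = lambda u: np.exp(-0.5 * np.dot(u, u))
```

For instance, with $d=1$, samples $x_i=y_i=i$ for $i=0,1,2$, and the Gaussian kernel, the symmetric weights give $\hat y(1)=1$ exactly.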

Suppose further that for all $i,j\in\{1,\dots,n\}$, if $x_i\le x_j$ then $y_i \le y_j$.

Is it the case that for all $x\in\mathbb{R}^d$:

$$\frac{\partial\hat{y}(x)}{\partial x} \ge 0?$$

It seems obvious in the $d=1$ case, but even there I haven't been able to prove it. It's unclear to me whether it holds for $d>1$. If it holds only under additional assumptions on $K$, I'd be interested in what they are.


Notation:

$\frac{\partial \hat{y}(x)}{\partial x}$ is the column vector of partial derivatives of $\hat{y}(x)$, i.e. the gradient (the transposed Jacobian).

For vectors $a=[a_1,\dots,a_d]^\top$ and $b=[b_1,\dots,b_d]^\top$, $a\le b$ if and only if $a_i \le b_i$ for all $i\in\{1,\dots,d\}$.


The following is taken from Iosif Pinelis's answer here: https://mathoverflow.net/a/358137/96912


The result is false in general even for $d=1$. E.g., let $$K=f_r+f_s,$$ where $f_t$ is the density of $N(0,t^2)$. Then for $x_i=y_i=i$ (for $i=1,\dots,n$) and $$(n,r,s,x_*)=\Big(3,\frac{427}{215},\frac{1}{1547},\frac{472}{473}\Big)$$ we have $$\hat y'(x_*)=-527.1\ldots<0.$$
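The counterexample is easy to check numerically. The sketch below (my own, not part of the quoted answer) builds $K=f_r+f_s$ with the stated parameters and takes a central-difference derivative of $\hat y$ at $x_*$:

```python
import numpy as np

def f(t, u):
    """Density of N(0, t^2) at u."""
    return np.exp(-u * u / (2 * t * t)) / (t * np.sqrt(2 * np.pi))

r, s = 427 / 215, 1 / 1547
K = lambda u: f(r, u) + f(s, u)  # mixture of two Gaussian densities

xs = ys = np.array([1.0, 2.0, 3.0])  # x_i = y_i = i, n = 3

def yhat(x):
    w = K(x - xs)
    return w @ ys / w.sum()

x_star, h = 472 / 473, 1e-7
deriv = (yhat(x_star + h) - yhat(x_star - h)) / (2 * h)
print(deriv)  # roughly -527, matching the value reported above
```

The very small bandwidth $s$ is what makes $\hat y$ vary so sharply near $x_*$ despite the data being monotone.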

Here is the graph $\{(x,\hat y(x))\colon\frac{471}{473}\le x\le\frac{475}{473}\}$ (image not reproduced here): it shows a very narrow dip near $x_*$.


However, for $d=1$, $\hat y'\ge0$ whenever $K$ is log-concave. Indeed, letting $$k_i:=K(x-x_i)\quad\text{and}\quad k'_i:=K'(x-x_i),$$ we have $$ \begin{aligned} 2\Big(\sum_{i=1}^n k_i\Big)^2\hat y'(x) &=\sum_{i,j=1}^n(k'_i y_i k_j-k_i y_i k'_j+k'_j y_j k_i-k_j y_j k'_i) \\ &=\sum_{i,j=1}^n(y_i-y_j)\Big(\frac{k'_i}{k_i}-\frac{k'_j}{k_j}\Big)k_ik_j\ge0, \end{aligned}\tag{1} $$ because $y_i$ is increasing in $i$ and $$\frac{k'_i}{k_i}=(\ln K(x-x_i))' \tag{2}$$ is increasing in $i$; the latter holds because $x_i$ is increasing in $i$ and $(\ln K)'$ is decreasing (since $K$ is log-concave). (In particular, any normal density is log-concave.)
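As a sanity check on the log-concave case (again my own sketch, not part of the quoted answer), the following uses a Gaussian kernel with randomly generated monotone data and verifies that the finite-difference derivative of $\hat y$ is nonnegative over a grid:

```python
import numpy as np

rng = np.random.default_rng(0)
K = lambda u: np.exp(-0.5 * u * u)  # Gaussian kernel: log-concave

xs = np.sort(rng.uniform(-3, 3, size=8))
ys = np.sort(rng.uniform(-1, 1, size=8))  # x_i <= x_j implies y_i <= y_j

def yhat(x):
    w = K(x - xs)
    return w @ ys / w.sum()

grid = np.linspace(-5, 5, 1001)
h = 1e-6
derivs = np.array([(yhat(x + h) - yhat(x - h)) / (2 * h) for x in grid])
print(derivs.min())  # nonnegative up to floating-point noise, as (1) predicts
```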



In the case $d>1$, the desired monotonicity fails to hold in general even when $K$ is log-concave. E.g., let $$K(u_1,\dots,u_d):=\exp\{u_1u_2-u_1^2-\cdots-u_d^2\},$$ $n=2$, $x_1=(0,\dots,0)$, $x_2=(0,1,0,\dots,0)$, $y_1=0$, and $y_2=1$. Then $(\partial_1\ln K)(u_1,\dots,u_d)=u_2-2u_1$, where $\partial_1$ denotes the partial derivative with respect to the first coordinate. On the other hand (cf. (1) and (2)), $(\partial_1\hat y)(0,\dots,0)$ has the same sign as $l'_2-l'_1$, where $$l'_1:=(\partial_1\ln K)(0,\dots,0)=0$$ and $$l'_2:=(\partial_1\ln K)(0,-1,0,\dots,0)=-1<0=l'_1.$$ So, $(\partial_1\hat y)(0,\dots,0)<0$, as claimed.
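This multivariate counterexample can also be checked numerically. The sketch below (my own) takes $d=2$ and evaluates $\partial_1\hat y$ at the origin by central differences:

```python
import numpy as np

# Log-concave kernel from the counterexample, with d = 2
K = lambda u: np.exp(u[0] * u[1] - u[0] ** 2 - u[1] ** 2)

xs = [np.array([0.0, 0.0]), np.array([0.0, 1.0])]
ys = np.array([0.0, 1.0])  # x_1 <= x_2 (componentwise) and y_1 <= y_2

def yhat(x):
    w = np.array([K(x - xi) for xi in xs])
    return w @ ys / w.sum()

h = 1e-6
e1 = np.array([1.0, 0.0])
d1 = (yhat(h * e1) - yhat(-h * e1)) / (2 * h)  # partial_1 yhat at the origin
print(d1)  # about -0.197: negative, despite K being log-concave
```

With $n=2$ one can even solve this in closed form: $\hat y(\varepsilon,0)=\sigma(-1-\varepsilon)$ for the logistic function $\sigma$, so $\partial_1\hat y(0,0)=-\sigma(-1)(1-\sigma(-1))\approx-0.1966$.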