What is the correlation coefficient between $x$ and $y$ for the probability function $f(x,y)=c$, $y=1,2,..,x$; $x=1,2,..,n$?

The probability function of the random vector $(x, y)$ is as follows:

$f(x,y)=c$, $y=1,2,..,x$; $x=1,2,..,n$

What is the correlation coefficient between $x$, $y$?

Note: $n$ is a constant number and $c$ is a constant that must be obtained in terms of $n$.

Fig. 1 : the case $n=7$.

  1. The first thing to get is the value of $c$.

Since the points are arranged in a triangle, there are $n(n+1)/2$ of them, each with "mass" $c$; therefore we must have

$$c n(n+1)/2=1 \implies c=\dfrac{2}{n(n+1)}\tag{1}$$

Remark: you had obtained a "similar" formula $c x(x+1)/2=1$, but it cannot involve $x$, because $x$ denotes a variable; $c$ must be expressed as a function of $n$, as stated in the question.
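The count behind (1) can be checked by brute force. The following is only a small sketch; the helper names `support` and `normalizing_constant` are made up for this illustration and are not part of the question.

```python
from fractions import Fraction

def support(n):
    """All points (x, y) with 1 <= y <= x <= n (hypothetical helper)."""
    return [(x, y) for x in range(1, n + 1) for y in range(1, x + 1)]

def normalizing_constant(n):
    """The value c for which the masses of all points sum to 1."""
    return Fraction(1, len(support(n)))

for n in (1, 2, 7):
    c = normalizing_constant(n)
    # Compare with formula (1): c = 2 / (n (n + 1))
    assert c == Fraction(2, n * (n + 1))
    print(n, len(support(n)), c)
```

For $n=7$ (the case of Fig. 1) this counts $28$ points, so $c=1/28$.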

  2. Next, we compute the center of mass:

$$(E(X),E(Y))=(\overline{x},\overline{y})$$

As we have $1$ point with abscissa $1$, $2$ points with abscissa $2$, ... $n$ points with abscissa $n$, each point with weight $c$, we conclude that:

$$\overline{x}=\sum_{x=1}^n x^2 c$$

Using a classical formula:

$$\overline{x}= c \ \frac16 n(n+1)(2n+1)$$

Taking (1) into account:

$$\overline{x}= \frac{2n+1}{3}\tag{2}$$

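Similarly, there are $n$ points with ordinate $1$, $n-1$ points with ordinate $2$, ... $1$ point with ordinate $n$, so that:

$$\overline{y} = c(n \times 1 + (n-1) \times 2 + \cdots + 1 \times n)$$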

$$\overline{y} = c \sum_{k=1}^n (n-(k-1))k$$

$$\overline{y} = c (n+1) \sum_{k=1}^n k - c \sum_{k=1}^n k^2$$

$$\overline{y} = c (n+1) \frac{n(n+1)}{2} - c \frac{n(n+1)(2n+1)}{6}$$

$$\overline{y} = c \frac{n(n+1)(n+2)}{6} $$

Taking (1) into account:

$$\overline{y}= \frac{n+2}{3}\tag{3}$$

Summarizing, we have

$$(\overline{x},\overline{y})= (\frac{2n+1}{3},\frac{n+2}{3})\tag{4}$$

Remark: For large values of $n$, (4) is approximately $$(\overline{x},\overline{y})\approx \left(\frac{2n}{3},\frac{n}{3}\right)$$

which provides a first-level check, because these are the coordinates of the center of gravity in the continuous analogue: a triangular domain with vertices $(0,0)$, $(n,0)$ and $(n,n)$.
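Formulas (2) and (3) can also be verified exactly by summing over all points. This is a minimal sketch using exact rational arithmetic; the function name `means` is an assumption made for the illustration.

```python
from fractions import Fraction

def means(n):
    """Exact (E(X), E(Y)) for equal masses on {(x, y): 1 <= y <= x <= n}."""
    points = [(x, y) for x in range(1, n + 1) for y in range(1, x + 1)]
    c = Fraction(1, len(points))        # same value as formula (1)
    ex = sum(c * x for x, _ in points)  # E(X)
    ey = sum(c * y for _, y in points)  # E(Y)
    return ex, ey

for n in range(1, 30):
    # Compare with (4): ((2n + 1) / 3, (n + 2) / 3)
    assert means(n) == (Fraction(2 * n + 1, 3), Fraction(n + 2, 3))

print(means(7))  # (Fraction(5, 1), Fraction(3, 1))
```

For $n=7$ the center of mass is $(5,3)$, which can be compared with the figure.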

  3. Now we are ready to apply the formula for the covariance, a step before the correlation coefficient:

$$cov(X,Y)=E(XY)-E(X)E(Y)$$

We now compute $E(XY)$:

$$E(XY)=\sum c\,xy = \sum_{x=1}^n cx(1+2+...+x)$$

$$E(XY)=c\sum_{x=1}^n x\frac12 x(x+1)$$

$$E(XY)=\frac12c(\sum_{x=1}^n x^3+\sum_{x=1}^n x^2)$$

$$E(XY)=\frac12c\left(\left(\frac{n(n+1)}{2}\right)^2+\frac{n(n+1)(2n+1)}{6}\right)$$

$$E(XY)=\frac{c}{24}n(n+1)(n+2)(3n+1)=\frac{1}{12}(n+2)(3n+1)$$

giving

$$cov(X,Y)=\frac{1}{12}(n+2)(3n+1)-\frac{2n+1}{3}\,\frac{n+2}{3}=\frac{(n-1)(n+2)}{36}$$
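Closed forms for $E(XY)$ are easy to get wrong, so a direct summation is a useful cross-check: any closed form should reproduce the values it returns. The sketch below uses exact fractions; the function name `exy_and_cov` is hypothetical.

```python
from fractions import Fraction

def exy_and_cov(n):
    """Exact (E(XY), cov(X, Y)) by direct summation over the support."""
    points = [(x, y) for x in range(1, n + 1) for y in range(1, x + 1)]
    c = Fraction(1, len(points))
    ex = sum(c * x for x, _ in points)
    ey = sum(c * y for _, y in points)
    exy = sum(c * x * y for x, y in points)
    # cov(X, Y) = E(XY) - E(X) E(Y)
    return exy, exy - ex * ey

print(exy_and_cov(2))  # (Fraction(7, 3), Fraction(1, 9))
```

The case $n=2$ is small enough to do by hand: the three points $(1,1)$, $(2,1)$, $(2,2)$ each have mass $1/3$, so $E(XY)=(1+2+4)/3=7/3$ and $cov(X,Y)=7/3-(5/3)(4/3)=1/9$.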

I stop here. The correlation coefficient requires a few more computations (the variances of $X$ and $Y$). This exercise is very computational, so this last round is up to you.
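For readers who want to check their final result, here is a hedged numerical sketch that evaluates the correlation coefficient $cov(X,Y)/\sqrt{Var(X)\,Var(Y)}$ by brute force. The function name `correlation` is an assumption for this illustration, and the result is a floating-point value for one given $n$, not a derivation.

```python
from fractions import Fraction
from math import sqrt

def correlation(n):
    """Correlation of X and Y for equal masses on {(x, y): 1 <= y <= x <= n}."""
    if n < 2:
        # With a single point, both variances are zero and the
        # correlation coefficient is undefined.
        raise ValueError("n must be at least 2")
    points = [(x, y) for x in range(1, n + 1) for y in range(1, x + 1)]
    c = Fraction(1, len(points))
    ex = sum(c * x for x, _ in points)
    ey = sum(c * y for _, y in points)
    exy = sum(c * x * y for x, y in points)
    var_x = sum(c * x * x for x, _ in points) - ex ** 2
    var_y = sum(c * y * y for _, y in points) - ey ** 2
    cov = exy - ex * ey
    # Only the final square root leaves exact arithmetic.
    return float(cov) / sqrt(float(var_x * var_y))

for n in (2, 7, 50):
    print(n, correlation(n))
```

Running it for several values of $n$ shows how the coefficient depends on $n$, which can then be compared with the closed form obtained by hand.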