Probabilities of two correlated random variables and their probable magnitude

1) Let's say we have a process defined by two random variables X and Y. The two variables are perfectly correlated (correlation coefficient +1), and the vectors are defined as follows:

x = [1 2 3 4 5 6 7 8 9]
y = [100 200 300 400 500 600 700 800 900]

If one were to calculate the correlation matrix for the above, it would be:

[1 1]
[1 1]

Fitting a linear regression to see the relationship between X and Y shows that the two processes are correlated, and that one changes as a function of the other by a factor (slope m) of 100.

Y = mX + b, which here gives:
Y = 100X
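
The correlation matrix and the fitted line for these two vectors can be checked numerically. The following is only a minimal sketch using NumPy; the variable names (`corr`, `slope`, `intercept`, `r_squared`) are chosen here for illustration and are not part of the question.

```python
import numpy as np

# The two vectors from the question
x = np.arange(1, 10)      # 1, 2, ..., 9
y = 100 * x               # 100, 200, ..., 900

# 2x2 Pearson correlation matrix of x and y
corr = np.corrcoef(x, y)

# Least-squares straight line y = slope * x + intercept
slope, intercept = np.polyfit(x, y, 1)

# For a straight-line fit with an intercept, R^2 is the squared correlation
r_squared = corr[0, 1] ** 2

print(corr)
print(slope, intercept, r_squared)
```

Up to floating-point rounding, every entry of the correlation matrix should come out as 1, with a slope of 100 and an intercept of 0.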

So one can say that the time series random variable Y is a function of random variable X.

Therefore, if one needs to describe the above process using probabilities, we can say that random variable Y will vary by 100 per unit change in X, 100% of the time (the coefficient of determination in this case happens to be 1).

So here's the fun part...

2) Assume we have two new random variables X2 and Y2. The only thing we know about them is that their correlation coefficient changes through time and can be modeled by:

C = sin(t), where t is in seconds.

Making their correlation matrix look something like:

[   1  sin(t)]
[sin(t)   1  ]

And their linear relationship is defined by Y2 = m*X2 + b, where m = 2*sin(t+1), with a coefficient of determination R^2 = 1/2*sin(t) + 1/2.

Given an interval between t=0 and t=30, if we were to sum ALL elements of X2(t) and Y2(t) over that interval, what would be the probability that sum(X2) > sum(Y2)?

My approach so far:

Given t = 0 and t = 30 (angles in degrees, not radians):
Correlation coefficient: sin(0) = 0, sin(30) = 0.5
Slope of the linear relationship: 2*sin(0+1) = 0.034, 2*sin(30+1) ≈ 1.03 (roughly 1)
R^2 (coefficient of determination): 1/2*sin(0) + 1/2 = 1/2, 1/2*sin(30) + 1/2 = 0.75
  • So at point t=0, since the correlation is 0 (their linear relationship is completely random), there is a 50% probability that X2(t0) > Y2(t0) or vice versa, and there is a relatively low probability of Y2 varying as 0.034*X2(t) because the coefficient of determination is 0.5, i.e. a 50% chance, thus random.
  • At point t = 30, since the correlation coefficient is 0.5, there's a higher probability of X2(t30) and Y2(t30) INCREASING or DECREASING at the same time with respect to X2(t0) and Y2(t0). The probability is 75% (because 0.5 is 75% of the way from -1 to +1). AND there is a probability of 87% that the relationship Y2(t) = 1 * X2(t) will hold because of the R^2 coefficient of determination of 0.75 (also 75% chance).
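
The three formulas from the question can be evaluated at any t with a few lines of Python. This is just a sketch: the helper name `model_at` is hypothetical, and it assumes, as in the approach above, that t is read as an angle in degrees.

```python
import math

def model_at(t_deg):
    """Evaluate the question's formulas at time t (t read in degrees).

    Returns (correlation C, slope m, coefficient of determination R^2).
    """
    corr = math.sin(math.radians(t_deg))                   # C = sin(t)
    slope = 2 * math.sin(math.radians(t_deg + 1))          # m = 2*sin(t+1)
    r_squared = 0.5 * math.sin(math.radians(t_deg)) + 0.5  # R^2 = 1/2*sin(t) + 1/2
    return corr, slope, r_squared

for t in (0, 30):
    c, m, r2 = model_at(t)
    print(f"t={t}: C={c:.3f}, m={m:.3f}, R^2={r2:.3f}")
```

Reading t in radians instead would only require dropping the `math.radians` conversions.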

So the probability at t = 30 of the two variables being BOTH negative OR BOTH positive is 75%, AND the probability of X2(30) and Y2(30) being EQUAL is 75%. However, I do not know how X2(t0) and Y2(t0) behave, since they are completely random...

I feel there should be an integral somewhere, with the probabilities calculated from a summation over t=0 to t=30 (as a definite integral), but I am not sure how this would work. Is there any way to model this process mathematically using probabilities?

Any ideas?

Thanks in advance.

There is 1 best solution below

Without knowing the distribution of the variables, it is impossible to infer figures from them, such as the probability that $\text{sum}(x2)>\text{sum}(y2)$.

Besides, the indicated sum is actually a (discrete) time sum $\sum_{t_i}^{t_f}P_{x2}(x)\,dx>\sum_{t_i}^{t_f}P_{y2}(x)\,dx$.

This explicitly shows that the probabilities of occurrence of each given value are required. Those probabilities need to be computed from the proper densities $p_{x2}(x)$ and $p_{y2}(x)$.

This is a sum rather than an integral, which for a discrete-time stochastic process requires computing the probabilities $P(t)$ at each discrete instant $t$.
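
To make this concrete, one can pick a distribution and estimate the probability by simulation. The sketch below rests entirely on assumptions that are NOT given in the question: at each integer instant t = 0, ..., 30 (read in degrees), the pair is drawn from a bivariate normal with zero means, unit variances and correlation sin(t), independently across instants. The function name `estimate_prob` and its parameters are hypothetical.

```python
import numpy as np

def estimate_prob(n_trials=20_000, t_max=30, seed=0):
    """Monte Carlo estimate of P(sum X2 > sum Y2) under ASSUMED distributions.

    Assumption (not from the question): at each instant, (X2, Y2) is
    bivariate normal with zero means, unit variances, correlation sin(t).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(t_max + 1)
    rho = np.sin(np.radians(t))            # correlation at each instant

    # Correlated standard normals: Y = rho*X + sqrt(1 - rho^2)*Z
    x = rng.standard_normal((n_trials, t.size))
    z = rng.standard_normal((n_trials, t.size))
    y = rho * x + np.sqrt(1.0 - rho**2) * z

    # Fraction of simulated paths whose X2 total exceeds the Y2 total
    return np.mean(x.sum(axis=1) > y.sum(axis=1))

p = estimate_prob()
print(f"Estimated P(sum X2 > sum Y2) = {p:.3f}")
```

Under these symmetric assumptions the estimate should land near 0.5, since the two sums are exchangeable. Giving the variables different means or scales would move the result, which is why the distributions have to be specified before any figure can be computed.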

If your application supports it, you could define a continuous-time stochastic process. In that case the probability is actually a differential term and the sum turns into an integral, though it would require some stochastic differential relation to define the process, as for a Wiener process.
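
As a rough illustration of the continuous-time view, a Wiener process can be approximated on a fine grid by accumulating independent Gaussian increments with variance dt, and a time integral then becomes the limit of a Riemann sum. The helper `wiener_path`, the step count and the interval [0, 30] are assumptions made for this sketch only.

```python
import numpy as np

def wiener_path(t_end=30.0, n_steps=3000, seed=0):
    """Approximate one Wiener process path on [0, t_end].

    Increments over a step of length dt are independent N(0, dt).
    """
    rng = np.random.default_rng(seed)
    dt = t_end / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    w = np.concatenate(([0.0], np.cumsum(increments)))  # W(0) = 0
    t = np.linspace(0.0, t_end, n_steps + 1)
    return t, w

t, w = wiener_path()

# Left Riemann sum approximating the time integral of W over [0, t_end]
integral = np.sum(w[:-1]) * (t[1] - t[0])
print(integral)
```

Refining the grid (a larger `n_steps`) is what turns the discrete sum into the integral mentioned above.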