A conditional probability problem where the next day depends on the last 3 days


For many years, meteorologists have made long visits (5 days each) to Bigtown. They have observed that whenever EXACTLY two of three consecutive days are sunny, the next day is sunny*, while half of all cloudy days are followed by another cloudy day. Assuming every day is either cloudy or sunny, estimate how many days were cloudy during their last ten visits. The visits do not necessarily follow one another.

"Half of all the cloudy days are followed by another cloudy day" means $P(C\mid C)=50\%$.

Is this possible? If not why not?

*i.e. SSC -> S

SCS -> S

CSS -> S

where -> means followed by


There are 3 best solutions below

11
On BEST ANSWER

Note: edited since the question was clarified.

First, we can never have two or more consecutive sunny days. If we do, then the next cloudy day (which may not come for some time, but will certainly be preceded by two sunny days) must be followed by two sunny days (SSC->S gives SSCS, and then SCS->S gives SSCSS). That again puts two sunny days before the following cloudy day, so the same holds for it, and so on. In particular, every future cloudy day is followed by a sunny day, contradicting the assumption that only $50\%$ of cloudy days are followed by sunny ones.

Second, we cannot ever have two sunny days separated by a single cloudy day, since then the next day will be sunny, contradicting the previous point.

Now we know that between every two consecutive sunny days there are at least two cloudy days. Each such run of cloudy days contains exactly one day that is followed by a sunny day (the last one) and at least one day that is followed by another cloudy day. Since these two things are equally common, every run must contain exactly two cloudy days.

Thus the unique sequence satisfying these properties is ...SCCSCCSCC..., in which exactly $2/3$ of the days are cloudy.
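As a quick numerical check of this argument, here is a minimal R sketch that encodes the repeating pattern and tests both observations on it. The 0/1 encoding, the wrap-around helper `ahead`, and the idea of averaging over the three possible starting phases of a 5-day visit are assumptions made for illustration, not part of the original argument.

```r
# One way to encode the pattern: 1 = sunny, 0 = cloudy, repeating S, C, C.
days <- rep(c(1, 0, 0), times = 20)

# Hypothetical helper: the day k positions later, wrapping around so that
# the finite vector behaves like the endless periodic sequence.
ahead <- function(x, k) x[((seq_along(x) - 1 + k) %% length(x)) + 1]

# Rule 1: three days with exactly two sunny days are followed by a sunny day.
# No such window occurs in this pattern, so the rule holds vacuously.
window_sum <- days + ahead(days, 1) + ahead(days, 2)
rule1_ok <- all(ahead(days, 3)[window_sum == 2] == 1)

# Rule 2: share of cloudy days that are followed by another cloudy day.
frac_cc <- mean(ahead(days, 1)[days == 0] == 0)

# Overall share of cloudy days.
cloudy_share <- mean(days == 0)

# Cloudy days in a 5-day visit, for each of the three possible starting phases.
per_visit <- sapply(1:3, function(s) sum(days[s:(s + 4)] == 0))

# ASSUMPTION: each visit starts at an unknown phase, so average over the phases.
estimate <- 10 * mean(per_visit)
```

Under these assumptions a 5-day visit contains 3 or 4 cloudy days depending on where it starts, so ten visits would see roughly 33 cloudy days.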

0
On

Until a mathematical answer turns up, the following simulation suggests that the answer is around 2.22 cloudy days out of ten simulated days.

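# Days are coded 1 = sunny, 0 = cloudy.
nclo = 0
for (k in 1:100) {              # 100 batches
  ncloud = 0
  for (j in 1:10000) {          # 10000 ten-day chains per batch
    chain = rep(0, 10)
    for (i in 1:10) {
      if (i == 3) {
        # two sunny days at the start: treat the whole chain as sunny
        if (chain[1] + chain[2] == 2) {
          chain = rep(1, 10)
          break
        }
      }
      if (i > 3) {
        # at least two sunny days among the previous three force a sunny day
        if (chain[i-1] + chain[i-2] + chain[i-3] >= 2) {
          chain[i] = 1
        }
      }
      # independently, a fair coin flip can also make the day sunny
      if (runif(1) > .5) {
        chain[i] = 1
      }
    }
    ncloud = ncloud + (10 - sum(chain))   # cloudy days in this chain
  }
  nclo = nclo + ncloud / 10000            # batch average
}
nclo = nclo / 100                         # average over the batches

> nclo
[1] 2.221082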
2
On

Here is an attempt to solve the problem:

Let's model the problem as a Markov chain whose state is the weather on the 3 most recent consecutive days observed at any point in time (e.g., the state CCC can transition to one of two states: itself, CCC, or CCS). The state transition diagram then looks like the following:

[Image: state transition diagram of the Markov chain]

Notice that not all of the transition probabilities are given to us (the missing ones are circled in the above figure); some of them can be inferred and some cannot. For example, we are given P(CCC|CCC)=1/2, so consequently P(CCS|CCC)=1-1/2=1/2. But we know neither P(SSS|SSS) nor P(SSC|SSS); these are marked by x in the above figure.

For the sake of completeness (and simplicity), let's assume uniform probabilities for the unknown ones, e.g., let P(SSS|SSS) = P(SSC|SSS) = 1/2.

Then the Markov transition matrix becomes the one as shown below:

[Image: Markov transition matrix]

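Now, the states SSC, SCS, CSS and SSS form a closed class that every other state eventually enters, and this class is aperiodic (SSS can stay at SSS). So, by the Perron-Frobenius theorem applied to this class, the chain converges to a unique stationary distribution in the long run.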

Let's compute the stationary distribution, using the following R code with the Markov transition matrix $T$:

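# T is the 8x8 transition matrix (note: this name masks R's shorthand for TRUE)
S <- T
for (i in 1:100)
  S <- S %*% T   # after the loop, each row of S approximates the stationary distribution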

[Image: the resulting matrix S]

This shows that in the long run the Markov chain will be in the states CSS, SSC, SCS and SSS with probabilities 0.2, 0.2, 0.2 and 0.4, respectively. The proportions of sunny days in these states are 2/3, 2/3, 2/3 and 1, respectively.
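These probabilities can also be checked by hand. Under the uniform assumption for SSS, the only transitions among these four states are SSC -> SCS, SCS -> CSS, CSS -> SSS, and SSS -> SSS or SSC with probability 1/2 each, so a stationary distribution $\pi$ supported on them must satisfy

$$\pi_{SCS} = \pi_{SSC}, \qquad \pi_{CSS} = \pi_{SCS}, \qquad \pi_{SSC} = \tfrac{1}{2}\,\pi_{SSS}, \qquad \pi_{SSC} + \pi_{SCS} + \pi_{CSS} + \pi_{SSS} = 1,$$

which gives $\pi_{SSC} = \pi_{SCS} = \pi_{CSS} = 0.2$ and $\pi_{SSS} = 0.4$.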

Hence, the expected proportion of sunny days is $0.2\cdot\frac{2}{3} + 0.2\cdot\frac{2}{3} + 0.2\cdot\frac{2}{3} + 0.4\cdot 1 = 0.8$.
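Since the transition matrix is only available as an image in this answer, here is a minimal sketch of how the calculation could be reproduced in R. It is a reconstruction under stated assumptions, not the original matrix: states list the three days oldest first, a state with exactly two sunny days always moves to a sunny day, and every other state moves to sunny or cloudy with probability 1/2 (given when the last day is cloudy, assumed when it is sunny). The names `states`, `pi_hat` and `sunny_share` are illustrative.

```r
# States are the last three days, oldest first (S = sunny, C = cloudy).
states <- c("CCC", "CCS", "CSC", "CSS", "SCC", "SCS", "SSC", "SSS")

# Named T to match the text; note that this masks R's shorthand T for TRUE.
T <- matrix(0, 8, 8, dimnames = list(states, states))
for (s in states) {
  d <- strsplit(s, "")[[1]]
  to_S <- paste0(d[2], d[3], "S")   # next state if the new day is sunny
  to_C <- paste0(d[2], d[3], "C")   # next state if the new day is cloudy
  if (sum(d == "S") == 2) {
    T[s, to_S] <- 1                 # given: exactly two sunny days -> sunny
  } else {
    # 1/2 is given when the last day is cloudy (P(C|C) = 1/2);
    # it is an ASSUMPTION when the last day is sunny (SSS, CCS).
    T[s, to_S] <- 0.5
    T[s, to_C] <- 0.5
  }
}
stopifnot(all(abs(rowSums(T) - 1) < 1e-12))   # every row is a distribution

# Solve pi %*% T = pi with sum(pi) = 1: take the balance equations and
# replace the last one (which is redundant) by the normalisation.
A <- t(T) - diag(8)
A[8, ] <- 1
pi_hat <- solve(A, c(rep(0, 7), 1))
names(pi_hat) <- states

# Long-run share of sunny days = probability that the most recent day is S.
sunny_share <- sum(pi_hat[substr(states, 3, 3) == "S"])

round(pi_hat, 3)
sunny_share
```

Solving the linear system directly avoids choosing an iteration count; under the assumptions above it should reproduce the probabilities 0.2, 0.2, 0.2 and 0.4 and the sunny share of 0.8 up to floating-point error.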