Understanding memoryless property

Question

135 Views Asked by Bumbble Comm At 06 Apr 2026 - 5:45

(the old post grew too long so I made a separate question)

For an exponential distribution, $\Pr(X>s+t \mid X>s) = \Pr(X>t)$

where $X$ is a waiting time for some event.
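As a quick sanity check of the identity above, here is a minimal sketch using the closed-form survival function $\Pr(X>t)=e^{-\lambda t}$. The rate and the values of `s` and `t` are arbitrary illustrative choices, not taken from the question.

```python
import math

def survival(t, rate=1.0):
    """P(X > t) for X ~ Exponential(rate)."""
    return math.exp(-rate * t)

def conditional_survival(s, t, rate=1.0):
    """P(X > s + t | X > s) = P(X > s + t) / P(X > s)."""
    return survival(s + t, rate) / survival(s, rate)

rate, t = 0.5, 2.0  # illustrative values (assumptions)
for s in (0.0, 1.0, 5.0, 20.0):
    # The two printed probabilities agree for every s, up to rounding.
    print(s, conditional_survival(s, t, rate), survival(t, rate))
```

However long you have already waited (`s`), the chance of waiting at least `t` more stays the same.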

Suppose that at time $t_0$ you estimate $p_1 = \Pr(X>s)$. When you reach time $s$ and the event has not happened yet, you update your estimate and again get $p_1 = \Pr(X>s)$, since $X$ is memoryless.

Now your past self could have foreseen the future self's reasoning.

If I don't see $X$ happening for a period $s$, then I'd have to wait for $X$ for another period $s$ with the same probability $p_1$, that is, $p_1 = \Pr(X>s+s \mid X>s)$.

Oh, then I realize I can apply this reasoning more times. $\Pr(X<s+t \mid X>s) = \Pr(X<t)$, which can be derived from $1-\Pr(X>s+t \mid X>s) = 1- \Pr(X>t)$.

Since the events $0<X<\mathrm{d}t$, $\mathrm{d}t<X<2\mathrm{d}t$, $2\mathrm{d}t<X<3\mathrm{d}t$ are disjoint, I can add the probabilities to get the probability of the union $0<X<3\mathrm{d}t$:

$p_2 = \Pr(X<\mathrm{d}t)+\Pr(X<2\mathrm{d}t|X>\mathrm{d}t)+\Pr(X<3\mathrm{d}t|X>2\mathrm{d}t) = 3\cdot \Pr(X<\mathrm{d}t) = \Pr(X<3\mathrm{d}t)=\Pr(X<s)$ where $3\mathrm{d}t=s$.

Since the pdf is thicker in front, I guess $\Pr(X>s) = 1-p_2 < p_1$.

(This can't be. The reasoning must be flawed somewhere.)

Once you have used up some time, you have less time left, but $\displaystyle\int_0^{\infty}$ assumes otherwise.
It is as if your world's clock is reborn fresh, with the same probabilistic structure as far as the event is concerned. (OK, this doesn't sound like a mathematical statement.)

I am trying to understand. $\Pr(a<X<2a)$ seen from the current time is different from $\Pr(0<X<a)$ seen from time $a$, although both refer to the same time frame. So the difference seems to spring from where you are standing, not from what you are looking at.

I can compute my future self's $\Pr(X<a)$ as my current self's $\Pr(a<X<2a \mid X>a)$ because $X$ is memoryless.
But asserting $\Pr(X<a) = \Pr(a<X<2a)$ (which would mean that $\Pr(a<X<2a)$ doesn't depend on whether $X$ happened before $a$, because it's memoryless) is false and is not what memoryless means.
$\Pr(a<X<2a)$ must mean the weight of your belief that $X$ happens for the first time in $a<X<2a$.
So we are describing the memoryless property with non-memoryless thinking (the pdf).
The way the idea is expressed is subtle, and so confusing.

I feel I must be able to construct some contradiction out of it.
I can see the practicality of reasoning memorylessly, and it may well be a non-contradictory assumption. Still, I feel there must be an alternative perspective on the assumption, or on the interpretation of the assumptions we are making. I want to know whether there is indeed another perspective, or whether my thinking can be proven false.

Bumbble Comm On 25 Apr 2023 - 4:18

I think Ian's answer is correct. I just want to dig into where the misconception occurs, or to discover the underlying assumptions that I feel are hidden somewhere.
(I guess that's my way of saying I feel uneasy about something.)

So mathematically (and equivalently to what Ian said):

The following events are disjoint and can be unioned: $$0<X<a,\quad a<X<2a,\quad 2a<X<3a$$ The following events are disjoint, but their probabilities cannot be added to get the probability of a union: $$0<X<a,\quad a<X<2a \mid X>a,\quad 2a<X<3a \mid X>2a$$

$P(X<a) + P(a<X<2a) + P(2a<X<3a) = P(X<3a)$ and $P(a<X<2a) = P(a<X<2a \mid X>a) \cdot P(X>a) = (1-P(X>a)) \cdot P(X>a)$
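To spell out how the unconditional pieces add up, here is a short worked derivation. The shorthand $p = P(X<a)$ and $q = 1-p = P(X>a)$ is introduced only for this sketch. It uses $P(X>2a) = P(X>2a \mid X>a)\,P(X>a) = q^2$, which follows from memorylessness.

$$
\begin{aligned}
P(a<X<2a) &= P(a<X<2a \mid X>a)\,P(X>a) = p\,q,\\
P(2a<X<3a) &= P(2a<X<3a \mid X>2a)\,P(X>2a) = p\,q^2,\\
P(X<3a) &= p + p\,q + p\,q^2 = p\,\frac{1-q^3}{1-q} = 1-q^3 .
\end{aligned}
$$

Each conditional probability equals $p$, but each is weighted by the chance of still waiting at the start of its interval, so the total is $1-q^3$ rather than $3p$.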

So what it's saying is that, we have a model where

The probability of the waiting time falling in $a<X<2a$ is given by the probability of the waiting time satisfying $X<a$ times the probability of the waiting time satisfying $X>a$.

So I interpret it as: the probability of waiting (a, 2a), estimated now (in the time frame from now to infinity), is equal to
the probability of waiting (0, a) (in any time frame) times the probability that I believe I will be in the (a, infinity) time frame.

And the person in the (a, infinity) time frame would estimate his (0, a) event just as I do in (0, infinity), and the estimate would be the same.

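The event $a<X<2a$ is the intersection of $X>a$ and $X-a<a$, and by memorylessness the remaining wait $X-a$, given $X>a$, is distributed like $X$ itself: $P(a<X<2a) = P(X-a<a \mid X>a) \cdot P(X>a) = P(X<a) \cdot P(X>a)$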

So what is memoryless here? When we talk about a coin being memoryless, we can attribute the memorylessness to the coin.
But I'm not sure what I can attribute the memoryless property to here. (Our memory? Time?)

$$P(a<x<2a \mid x>a) \neq P(a<x<2a) \neq P(a<x<2a \mid \neg(x>a))$$ One could think that how long I waited doesn't matter and conclude that these are all equal, but they are not.

The idea can be expressed by $$P(a+a<x<a+2a \mid x>a) = P(a<x-a<2a \mid x-a>0) = P(a<x<2a) = P(a<x<2a \mid x>0)$$

(An important lesson for me was that when we say conditional, we have to imagine a new sample space.)
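The "new sample space" idea can be made concrete with a small simulation. This is only a sketch: the rate, the value of `a`, the sample size and the seed are arbitrary assumptions. Conditioning on $x>a$ is done literally, by throwing away every sample with $x \le a$ and computing frequencies among the survivors.

```python
import math
import random

random.seed(0)        # fixed seed so the run is repeatable
rate, a = 1.0, 0.7    # illustrative values (assumptions)
n = 200_000
samples = [random.expovariate(rate) for _ in range(n)]

# New sample space: only the outcomes where we are still waiting at time a.
survivors = [x for x in samples if x > a]

# P(a < x - a < 2a | x > a), estimated inside the restricted sample space.
cond = sum(a < x - a < 2 * a for x in survivors) / len(survivors)

# P(a < x < 2a), estimated in the original sample space.
uncond = sum(a < x < 2 * a for x in samples) / n

# Closed form for P(a < x < 2a) under Exponential(rate).
exact = math.exp(-rate * a) - math.exp(-2 * rate * a)

print(cond, uncond, exact)
```

The three numbers should agree up to sampling noise, which matches the chain of equalities above.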

OK, it checks out. But the question still remains.

I feel I have just found a plausible explanation for one part, but some uneasiness is left.
It feels like I can extract an alternative interpretation of how we view the world, because something fishy is going on with infinity.

To start, take $P(X>a)$ at time 0 and $P(Y>0)$ at time $a$. I am mostly inclined to think they describe the same event and have to be equal.

Then to express this, person in time a would need to express $$Y = X+a$$

Then the person (at time $a$) would calculate probabilities according to an objective timeline which started at time 0.
And I guess this method would not lead to any conflict with the definition of the pdf of Exp.

But then a question arises: for him, $Y<0$ is not relevant, and he can think of the space where $Y>0$ only.
And $P(Z<a)$ would not lead to any conflict and would be equal to $P(Z<a) = P(Y<2a) = P(X+a<2a)$.

So far so good.

Then person 1 starts waiting at time 0, will wait for $2a$ at most, and estimates his probability of success (the end of waiting) before $2a$ as $P(X<2a)$.

Person 2 does nothing from time 0 until time $a$, then starts waiting for the event, and estimates his probability of success before $2a$ as $P(Y<a)$.

Now person 1, who started from time 0, reaches time $a$ and recalculates the probability of the wait ending before $2a$, which he originally expressed as $P(X<2a)$:

$$P(X<2a|X>a) = P(X<a) = P(Y<a)$$

or he might wonder why he can't do $$P(X<2a) - P(X<a) = P(a<X<2a)$$

A person-Y comes along and says

$$P(X<2a|X>a) = P(X-a<a|X-a>0) = P(Y<a|Y>0) = P(Y<a)$$

and

$$P(X<2a) - P(X<a) = P(X-a<a) - P(X-a<0) = P(Y<a) - P(Y<0) $$

and for someone at a time $>a$, $P(Y<0) = 0$, but for someone at a time $<a$, $P(Y<0) > 0$.

Or alternatively:
with $P(X<2a) - P(X<a)$ you are expressing your belief as the past person at time 0,
and with $P(X<2a \mid X>a)$ you are expressing your belief as the current person at time $a$.
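The two viewpoints can be put side by side numerically. This is a minimal sketch using the exponential CDF $1-e^{-\lambda t}$. The rate and the value of `a` are arbitrary assumptions, and the variable names are just labels for this example.

```python
import math

def cdf(t, rate=1.0):
    """P(X < t) for X ~ Exponential(rate)."""
    return 1.0 - math.exp(-rate * t)

rate, a = 1.0, 1.0  # illustrative values (assumptions)

# Belief held at time 0 that the event lands in (a, 2a).
belief_at_0 = cdf(2 * a, rate) - cdf(a, rate)

# Belief held at time a, after seeing no event so far: P(X < 2a | X > a).
belief_at_a = belief_at_0 / (1.0 - cdf(a, rate))

print(belief_at_0, belief_at_a, cdf(a, rate))
```

The belief at time $a$ equals $P(X<a)$, while the belief at time 0 is smaller by exactly the factor $P(X>a)$, the chance of still waiting at time $a$.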

So in a sense, you have to forget what your belief $P(X<2a)$ was, and just look ahead at what you are expecting.
You have to be memoryless, like a coin that does not remember the past.
(X thinks: probably what Y means is that the thing you are waiting for is memoryless, not your memory.)

So then, a way to interpret it is..

I get it: I have to forget what my past expectation was; I just have to recalculate as the formula says to express my expectation now.

$$P(X<2a) - P(X<a) = P(a<X<2a)$$

was my expectation at time 0 for the same event happening (where $a<X<2a$ on the objective timeline starting at 0).

I would be a fool to use the past belief to express my current belief.
But then, if someone keeps his feet planted at time 0 on the timeline and watches events happen as time goes along,

he may well use $P(X<2a) - P(X<a)$ to express his belief.
From his perspective, $P(X<2a) - P(X<a)$ is happening only to people who did not see $X<a$ happening.

So the uneasiness might come from this:

I'm just more accustomed to constructing one objective world view and staying there than to constructing a new world view. (When event $A$ with probability $p_1$, or $A \cup B$ where $A \cap B = A$ with probability $p_2$, can happen, then if $A$ doesn't happen, it's easy to expect $B \setminus A$ with $p_2 - p_1$. It is harder to think of $B \setminus A \mid \neg A$.)
And the other (objective) world view, that the coin has no memory, is also valid.
And the thing I'd like to express (my belief of something happening) can be different.
But we usually want to express the belief of something happening at the moment of forming the belief. (This is the easy part.)

It is harder to realize that $P(X<2a) - P(X<a)$ amounts to expressing the belief not of current me, but of past me with a different picture.

Also, it is easier to accept $P(X<a) = P(X<2a \mid X>a)$.
I can also accept $P(X>a) = P(X>2a \mid X>a)$ as a mathematical model, but it's hard to accept it as something applicable to concrete reality. (I guess I can see it being applied as a model, but I'm not sure whether anything can manifest the model in reality. I can't say more than "nothing seems infinite", and I can't say even that for sure; my math knowledge is too short to say more.)

**Bumbble Comm** · Accepted Answer

There are some concepts being blurred together here.

Let's start with this point about estimation. Without getting into Bayesian thinking, there is no way to use the information that a single sample from an exponential distribution exceeds $t$ in order to get some kind of useful estimate for the rate or mean. Either way the probability that $X>t$ increases as the rate of events occurring decreases, so your best estimate for the rate is just zero, regardless of how long you have been waiting. (With Bayesian thinking you can start from a guess that the rate is something and gradually adjust your guess as you wait.)
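To illustrate the Bayesian remark, here is a minimal sketch. It assumes a Gamma prior on the rate; the prior family, its parameters and the function name are choices made for this example, not something stated in the answer. The likelihood of "no event by time $t$" is $e^{-\lambda t}$, which on its own is maximized at $\lambda = 0$. Combined with a Gamma prior with shape $\alpha$ and rate $\beta$, it gives a Gamma posterior with shape $\alpha$ and rate $\beta + t$.

```python
def posterior_rate_mean(alpha, beta, waited):
    """Posterior mean of the event rate after observing only X > waited.

    Assumed prior: rate ~ Gamma(shape=alpha, rate=beta).
    The likelihood exp(-rate * waited) turns it into Gamma(alpha, beta + waited),
    whose mean is alpha / (beta + waited).
    """
    return alpha / (beta + waited)

alpha, beta = 2.0, 1.0  # hypothetical prior with mean rate alpha / beta
for waited in (0.0, 1.0, 5.0, 50.0):
    # The estimated rate shrinks gradually as the wait gets longer.
    print(waited, posterior_rate_mean(alpha, beta, waited))
```

The prior keeps the estimate away from zero at first, and the estimate drifts toward zero the longer nothing happens.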

Second, you have this observation that $P(X<t)=P(X<(k+1)t \mid X>kt)$ for all $k$. This is true. The place you have an error is $3P(X<dt)=P(X<3dt)$. Everything you wrote in that line before that is true, but that is not; in fact $3P(X<dt)$ needn't even be less than $1$. You can't convert from a sum of probabilities of disjoint events to a probability of the union in this situation because the conditions aren't the same between the different events.
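A quick numeric check makes the failed step visible. This is only a sketch: the rate and the interval length `dt` are arbitrary assumptions, chosen so that the naive sum exceeds 1.

```python
import math

def cdf(t, rate=1.0):
    """P(X < t) for X ~ Exponential(rate)."""
    return 1.0 - math.exp(-rate * t)

rate, dt = 1.0, 1.0  # illustrative values (assumptions)
p = cdf(dt, rate)    # P(X < dt), also each conditional probability

naive = 3 * p               # sum of the three conditional probabilities
exact = cdf(3 * dt, rate)   # the actual P(X < 3 dt)

# Weight each conditional probability by the chance of reaching its interval.
chain = p + (1 - p) * p + (1 - p) ** 2 * p

print(naive, exact, chain)
```

With these values `naive` is larger than 1, so it cannot be a probability, while `chain` reproduces `exact`.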

The last thing is about what's going on with the conditioning. What is going on intuitively is that if you have already been waiting for an amount of time $t$ (mathematically represented by the condition $X>t$ in a conditional probability), you have the same prediction for how long you'll continue to wait. Mathematically this new prediction is $P(X>t+s \mid X>t)$ and the old prediction that it matches up with is $P(X>s)$. However:

- You still can't do any estimation here without a data point.
- Poisson processes are just a model. In reality $f(s)=P(X>t+s \mid X>s)$ generally decreases with $s$, at least to some extent.
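As an illustration of the last point, here is a sketch comparing the exponential case with a waiting time that is not memoryless. The Weibull distribution with shape 2 is used purely as a stand-in for a wait that "wears out"; that choice and all the numbers are assumptions of this example. Its survival function is $e^{-(x/\text{scale})^{\text{shape}}}$, and shape 1 recovers the exponential.

```python
import math

def weibull_survival(x, shape, scale=1.0):
    """P(X > x) for a Weibull waiting time; shape=1 is the exponential case."""
    return math.exp(-((x / scale) ** shape))

def cond_survival(s, t, shape, scale=1.0):
    """f(s) = P(X > t + s | X > s)."""
    return weibull_survival(s + t, shape, scale) / weibull_survival(s, shape, scale)

t = 1.0  # illustrative value (assumption)
for s in (0.0, 1.0, 2.0, 4.0):
    # shape=1.0 stays constant in s; shape=2.0 decreases as s grows.
    print(s, cond_survival(s, t, shape=1.0), cond_survival(s, t, shape=2.0))
```

Only the shape 1 column is flat in `s`; for shape 2, having already waited longer makes a further wait of `t` less likely.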
