Let $X_i \sim \mathrm{Exp}(\lambda_i)$, $i = 1,2,3$, be independent. Find $\mathsf E(\max(X_i) \mid X_1<X_2<X_3)$.
I have found two solutions, as follows:
I am wondering if someone could explain solution 1 intuitively in terms of the memoryless property of the exponential distribution. I am not able to connect it with the memoryless property.
For solution 2, can anyone explain why $\mathsf E(X_2 - X_1 \mid X_1 \lt X_2 \lt X_3)$ reduces to $\mathsf E(X_2 \mid X_2 \lt X_3)$, i.e. why we can drop the $X_1$? Also, why does $\mathsf E(X_3 - X_2 \mid X_1 \lt X_2 \lt X_3)$ reduce to $\mathsf E(X_3)$, i.e. why can we drop the condition $X_1 \lt X_2 \lt X_3$?
I know both solutions are connected to the memoryless property of the exponential distribution, but I couldn't work through them. I hope someone can make this clear for me. Any help will be appreciated. Thanks in advance :)


The quantity $\lambda_1+\lambda_2+\lambda_3$ is the "rate", i.e. the average number of arrivals per unit of time, when three processes are running, with rates $\lambda_1$, $\lambda_2$, and $\lambda_3$. After the first arrival, only two more arrivals can happen, and those have rates $\lambda_2$ and $\lambda_3$. After the second arrival, only one more can happen, with rate $\lambda_3$.
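As a sketch of how these rates turn into the answer: on the event $X_1<X_2<X_3$ the maximum is $X_3$, which splits into three successive waiting times. This assumes the rate parametrization, so that $\operatorname{E}(X_i) = 1/\lambda_i$, and uses the standard fact that the expected wait for the first arrival among independent exponential processes is the reciprocal of their total rate.

$$
\begin{aligned}
\operatorname{E}(X_3 \mid X_1<X_2<X_3)
&= \operatorname{E}(X_1 \mid X_1<X_2<X_3) + \operatorname{E}(X_2-X_1 \mid X_1<X_2<X_3) + \operatorname{E}(X_3-X_2 \mid X_1<X_2<X_3) \\
&= \frac{1}{\lambda_1+\lambda_2+\lambda_3} + \frac{1}{\lambda_2+\lambda_3} + \frac{1}{\lambda_3}.
\end{aligned}
$$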
The expectation $\operatorname{E}(X_2-X_1\mid X_1<X_2<X_3)$ is the same as $\operatorname{E}(X_2\mid X_2<X_3)$ because once the first arrival has happened, memorylessness says that the probability distribution of the remaining time until the second arrival does not depend on how long one has already waited for it. So it's the same as if you had started the whole thing running only the second and third processes. By the same argument, once the second arrival has happened only the third process is still running, so $\operatorname{E}(X_3-X_2\mid X_1<X_2<X_3)$ is just $\operatorname{E}(X_3)$.
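If it helps to see this numerically, here is a minimal Monte Carlo sketch, not a proof. It draws the three exponentials independently, keeps only the samples with $X_1<X_2<X_3$, and compares the average of each successive waiting time with the reciprocal of the total rate of the processes still running. The function names `conditional_means` and `predicted_means` and the rates $(3, 2, 1)$ are illustrative choices, not part of the original problem.

```python
import random


def conditional_means(rates, n_samples=200_000, seed=0):
    """Estimate the means of X1, X2 - X1 and X3 - X2 given X1 < X2 < X3.

    Uses plain rejection sampling: draw all three variables and discard
    every sample that does not satisfy the ordering.
    """
    rng = random.Random(seed)
    l1, l2, l3 = rates
    sums = [0.0, 0.0, 0.0]
    kept = 0
    for _ in range(n_samples):
        # expovariate takes the rate (lambda), not the mean.
        x1 = rng.expovariate(l1)
        x2 = rng.expovariate(l2)
        x3 = rng.expovariate(l3)
        if x1 < x2 < x3:
            kept += 1
            sums[0] += x1        # wait until the first arrival
            sums[1] += x2 - x1   # extra wait until the second arrival
            sums[2] += x3 - x2   # extra wait until the third arrival
    return [s / kept for s in sums]


def predicted_means(rates):
    """Reciprocal of the total rate of the processes still running."""
    l1, l2, l3 = rates
    return [1 / (l1 + l2 + l3), 1 / (l2 + l3), 1 / l3]


if __name__ == "__main__":
    rates = (3.0, 2.0, 1.0)  # illustrative rates, chosen arbitrarily
    estimated = conditional_means(rates)
    predicted = predicted_means(rates)
    for name, est, pred in zip(("X1", "X2 - X1", "X3 - X2"), estimated, predicted):
        print(f"{name}: simulated {est:.3f}, predicted {pred:.3f}")
    print(f"max: simulated {sum(estimated):.3f}, predicted {sum(predicted):.3f}")
```

The simulated and predicted values should agree up to sampling noise. The sum of the three pieces is the estimate of $\operatorname{E}(\max(X_i)\mid X_1<X_2<X_3)$.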