I went through the following formula and I want to check whether I'm missing something. For a random variable $X$ with a vector of related covariates $Y$ and a high threshold $u$, the following holds: $$E\left(\log\left(\frac{X}{u}\right) 1_{X>u}\mid Y=y\right)=E\left(\log\left(\frac{X}{u}\right)\mid X>u,Y=y\right)P(X>u\mid Y=y)$$
where $1_{X>u}$ is an indicator function that takes the value $1$ if $X>u$ and $0$ otherwise. I don't understand why the probability appears as a factor on the right-hand side.
Consider a simpler example. Suppose $X \sim \operatorname{Exponential}(\lambda = 1)$, with density $$f_X(x) = e^{-x} \mathbb 1 (x > 0).$$ Then for $u > 0$, $$\operatorname{E}[X \mathbb 1(X > u)] = \int_{x=0}^u 0 \cdot e^{-x} \, dx + \int_{x=u}^\infty x e^{-x} \, dx = (u+1)e^{-u},$$ but $$\operatorname{E}[X \mid X > u] = \frac{1}{\Pr[X > u]} \int_{x=u}^\infty x f_X(x) \, dx = \frac{(u+1) e^{-u}}{e^{-u}} = u + 1.$$ The two differ because $$X \mathbb 1 (X > u) = \begin{cases} 0, & X \le u \\ X, & X > u \end{cases}$$ takes the value $0$ whenever $X \le u$, and those outcomes still count toward the unconditional expectation, whereas $X \mid X > u$ excludes $X \le u$ altogether because it is given that $X > u$. Multiplying the conditional expectation by $\Pr[X > u] = e^{-u}$ recovers the first result, $(u+1)e^{-u}$, which is exactly the role of the factor $P(X>u\mid Y=y)$ in your formula.
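If it helps to see the numbers, here is a minimal Monte Carlo sketch of the exponential example. The function name `check_identity`, the threshold `u = 1`, the sample size, and the seed are all illustrative choices, not part of the original problem. It estimates $\operatorname{E}[X \mathbb 1(X > u)]$, $\operatorname{E}[X \mid X > u]$, and $\Pr[X > u]$ from the same sample and compares them with the closed forms above.

```python
import math
import random


def check_identity(u=1.0, n=200_000, seed=0):
    """Estimate E[X 1(X>u)], E[X | X>u] and P(X>u) for X ~ Exponential(1)."""
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(n)]

    # Keep only the draws that exceed the threshold.
    exceed = [x for x in xs if x > u]

    # Indicator version: draws with X <= u contribute 0 but still count in n.
    truncated_mean = sum(exceed) / n
    # Conditional version: average over the exceedances only.
    conditional_mean = sum(exceed) / len(exceed)
    # Empirical tail probability.
    tail_prob = len(exceed) / n
    return truncated_mean, conditional_mean, tail_prob


u = 1.0
truncated_mean, conditional_mean, tail_prob = check_identity(u=u)

print("E[X 1(X>u)] estimate:", truncated_mean, "exact:", (u + 1) * math.exp(-u))
print("E[X | X>u]  estimate:", conditional_mean, "exact:", u + 1)
print("P(X>u)      estimate:", tail_prob, "exact:", math.exp(-u))
print("conditional mean * tail prob:", conditional_mean * tail_prob)
```

In the sample, the product of the last two estimates equals the first up to floating-point rounding, because dividing the same sum by $n$ is the same as dividing it by the number of exceedances and then multiplying by the fraction of exceedances.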