We consider $P$, the space of probability measures on $\mathbb{R}^{d}$, endowed with the topology of weak convergence. (Note that this topology makes $P$ a Polish space.)
For each $y \in \mathbb{R}^{d}$ we define the "translation by $y$" operator as the operator $\tau_y : P \rightarrow P$ such that, for all $\mu \in P$, $\int \limits_{x \in \mathbb{R}^{d} } f(x) d\tau_y \mu (x) = \int \limits_{x \in \mathbb{R}^{d} } f(x+y) d\mu (x)$ for all bounded measurable functions $f:\mathbb{R}^{d} \rightarrow \mathbb{R}$.
Why is it called a "translation operator"? Could someone give me an example that illustrates the choice of this name?
Also, how can we prove that $\tau_y$ is continuous?
It is easy to see why this operator is called a translation operator once you understand what the measure $\tau_y \mu$ actually is. By taking $f$ in the definition to be $1_A$ for some measurable set $A \subseteq \mathbb{R}^d$, we get $$\tau_y \mu(A) = \int_{\mathbb{R}^d} 1_A(x+y) d \mu(x) = \int_{\mathbb{R}^d} 1_{A-y}(x) d \mu(x) = \mu(A-y)$$ and hence $\tau_y \mu$ is really the translation of the measure $\mu$ by $y$ in the natural sense.
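As a quick sanity check of the name (an illustration that is not part of the argument above), take $\mu = \delta_a$, the Dirac mass at a point $a \in \mathbb{R}^d$. The defining identity gives, for every bounded measurable $f$, $$\int_{\mathbb{R}^d} f \, d\tau_y \delta_a = \int_{\mathbb{R}^d} f(x+y) \, d\delta_a(x) = f(a+y) = \int_{\mathbb{R}^d} f \, d\delta_{a+y},$$ so $\tau_y \delta_a = \delta_{a+y}$: the point mass has simply been moved from $a$ to $a+y$. More generally, if $X$ is a random variable with law $\mu$, then $\mathbb{E}[f(X+y)] = \int_{\mathbb{R}^d} f(x+y) \, d\mu(x) = \int_{\mathbb{R}^d} f \, d\tau_y \mu$, so $\tau_y \mu$ is the law of $X + y$.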
To show continuity, it is convenient to work with the sequential characterisation, which suffices here because the topology of weak convergence on $P$ is metrisable. So we want to show that for every bounded, continuous function $f$ and every sequence of probability measures with $\mu_n \to \mu$ in the topology of weak convergence, we have $$\int_{\mathbb{R}^d} f \, d \tau_y \mu_n \to \int_{\mathbb{R}^d} f \, d \tau_y \mu.$$ The result is then immediate by writing out the definition of integration against $\tau_y \nu$ for a measure $\nu$ on both sides, since if $f$ is a continuous and bounded function then so is $f_y(\cdot) := f(\cdot + y)$.
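Spelled out, with $f_y(x) := f(x+y)$, the computation is the following chain, where the two equalities are the definition of $\tau_y$ and the limit is weak convergence $\mu_n \to \mu$ tested against the bounded continuous function $f_y$: $$\int_{\mathbb{R}^d} f \, d\tau_y \mu_n = \int_{\mathbb{R}^d} f_y \, d\mu_n \longrightarrow \int_{\mathbb{R}^d} f_y \, d\mu = \int_{\mathbb{R}^d} f \, d\tau_y \mu.$$ Since $f$ was an arbitrary bounded continuous function, this says exactly that $\tau_y \mu_n \to \tau_y \mu$ weakly.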