Note: We defined the Fourier transform as $$ \mathcal{F} f(\xi) := (2 \pi)^{-n/2} \int_{\mathbb{R}^n} f(x) e^{-i \langle x, \xi \rangle} dx $$
A theorem in our lecture notes states
Let $T \in \mathcal{S}'(\mathbb{R}^n)$ be a tempered distribution. For $h \in \mathbb{R}^n$ and $f \in \mathcal{S}(\mathbb{R}^n)$, the translation of $T$ is $(\tau_h T)(f) := T(f(\cdot + h))$ and $\mathcal{F}(\tau_h T) = e^{- i\langle h, \cdot \rangle} \mathcal{F} T$ holds.
The modulation of $T$ is $(M_h T)(f) := T(e^{i \langle h, \cdot \rangle} f)$ and $\mathcal{F}(M_h T) = \tau_h(\mathcal{F} T)$ holds.
I think the signs are incorrect. Since the statement isn't proven in the notes, I lay out my own proof below and would be grateful if someone could tell me whether it is correct (so that the theorem contains a typo) or whether my proof is wrong.
Proof. For all $f \in \mathcal S(\mathbb R^n)$ and $x \in \mathbb R^n$ we have \begin{align*} \mathcal{F}(\tau_h T)(f(x)) & = (\mathcal{F} T)(f(x + h)) = T((\mathcal{F} f)(x + h)) \\ & = T(e^{i \langle h, x \rangle} \mathcal{F} f(x)) = e^{i \langle h, x \rangle} (T\mathcal{F}) f(x) \\ & = e^{i \langle h, x \rangle} (\mathcal{F} T)f(x) \end{align*} and \begin{align*} \mathcal{F}(M_h T)(f(x)) & = \mathcal{F}(T(e^{i \langle h, x \rangle} f(x))) = T(\mathcal{F}(e^{i \langle h, x \rangle} f(x))) \\ & = T(\mathcal{F} f(x - h)) = (\mathcal{F} T)(f (x - h)) \\ & = \tau_{-h}(\mathcal{F} T)(f(x)) \end{align*}
Edit. The theorem also states: For $\varepsilon > 0$ the dilation of $T$ is $T_{\varepsilon}(f) := T(\varepsilon^{-n} f(\varepsilon^{-1} \cdot))$ for $f \in \mathcal{S}(\mathbb{R}^n)$ and $\mathcal F T_{\varepsilon} = \varepsilon^{-n} \mathcal F(T)(\varepsilon^{-1} \cdot)$ holds.
Does this mean $$ (\mathcal F T_{\varepsilon})(f(x)) = \varepsilon^{-n} \mathcal F(T) f(\varepsilon^{-1} x) \quad \text{or} \quad (\mathcal F T_{\varepsilon})(f) = \varepsilon^{-n} \mathcal F(T) (\varepsilon^{-1} f) $$ Can this then be proven like this? \begin{align} (\mathcal F T_{\varepsilon})(f(x)) & = T_{\varepsilon}(\mathcal{F} f)(x) = T(\varepsilon^{-n} (\mathcal F f)(\varepsilon^{-1} x)) = \varepsilon^{-n} T((\mathcal F f)(\varepsilon^{-1} x)) = \varepsilon^{-n} \mathcal F(T) f(\varepsilon^{-1} x) \end{align}
Be careful with the order in which you move operators from the distribution to the test function and back! The outermost operator is $\mathcal{F},$ not $\tau_h,$ so $\mathcal{F}$ should be moved first: $$ \begin{align*} \mathcal{F}(\tau_h T)(f(x)) &= (\tau_h T)(\mathcal{F}f(x)) = T(\tau_{-h}(\mathcal{F}f)) = T((\mathcal{F}f)(\xi+h)) = T(\mathcal{F}(e^{-i\langle h, x \rangle}f(x))) \\ &= \mathcal{F}T(e^{-i\langle h, x \rangle}f(x)) = e^{-i\langle h, x \rangle} \mathcal{F}T(f(x)) \end{align*} $$
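The one step in this chain that is not just a definition is the identity $(\mathcal{F}f)(\xi+h) = \mathcal{F}(e^{-i\langle h, \cdot \rangle}f)(\xi)$. As a sketch, assuming the normalization of $\mathcal{F}$ stated in the question, it can be read off directly from the integral:
$$ \begin{aligned} (\mathcal{F}f)(\xi+h) &= (2\pi)^{-n/2} \int_{\mathbb{R}^n} f(x) \, e^{-i\langle x, \xi+h \rangle} \, dx \\ &= (2\pi)^{-n/2} \int_{\mathbb{R}^n} \big( e^{-i\langle h, x \rangle} f(x) \big) \, e^{-i\langle x, \xi \rangle} \, dx = \mathcal{F}\big(e^{-i\langle h, \cdot \rangle} f\big)(\xi). \end{aligned} $$
With these conventions the translation formula should therefore carry the minus sign, as in the theorem.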
Note:
For an ordinary function $f$ we define $\tau_h f$ by $(\tau_h f)(x)=f(x-h).$ For a distribution $T$ we define $\tau_h T$ by $(\tau_h T)(f)=T(\tau_{-h}f).$ Note that I omit $(x)$ after $f$ here, since we apply a distribution to a function, not to its value at a point.
Note:
For ordinary functions we define $\tau_h$ so that $\tau_h f$ is $f$ translated to the right by a distance $h$. For distributions we do the same. But since distributions are not defined pointwise, we cannot just say $(\tau_h T)(x) = T(x-h).$ Instead we have to do the best we can: we transform the test function in such a way that we get the correct result when $T$ is an ordinary function and $T(f)$ is defined as $\int T(x) \, f(x) \, dx:$ $$\int T(x-h) \, f(x) \, dx = \{ y = x-h \} = \int T(y) \, f(y+h) \, dy,$$ i.e. $(\tau_h T)(f) = T(\tau_{-h} f).$
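The modulation formula from the question can be checked in the same way. This is only a sketch, assuming the definition $(M_h T)(f) = T(e^{i\langle h, \cdot \rangle} f)$ from the question and the convention $(\tau_h T)(f) = T(\tau_{-h} f)$ used here. The substitution $y = x - h$ is the only real computation:
$$ \begin{aligned} e^{i\langle h, \xi \rangle} (\mathcal{F}f)(\xi) &= (2\pi)^{-n/2} \int f(x) \, e^{-i\langle x-h, \xi \rangle} \, dx = \{ y = x-h \} \\ &= (2\pi)^{-n/2} \int f(y+h) \, e^{-i\langle y, \xi \rangle} \, dy = \mathcal{F}(\tau_{-h}f)(\xi), \\ \mathcal{F}(M_h T)(f) &= (M_h T)(\mathcal{F}f) = T(e^{i\langle h, \cdot \rangle} \mathcal{F}f) = T(\mathcal{F}(\tau_{-h}f)) = \mathcal{F}T(\tau_{-h}f) = (\tau_h \mathcal{F}T)(f). \end{aligned} $$
Under these conventions this gives $\mathcal{F}(M_h T) = \tau_h(\mathcal{F}T)$, in agreement with the theorem.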
Note:
Often calculations become clearer if one introduces an operator instead of writing for example $\mathcal{F}(f(\varepsilon x)).$ Therefore, define the operator $\sigma_\varepsilon$ by $(\sigma_\varepsilon f)(x) = f(\varepsilon x)$ when $f$ is an ordinary function. If $T$ is an ordinary function we have $$ (\sigma_\varepsilon T)(f) = \int \sigma_\varepsilon T(x) \, f(x) \, dx = \int T(\varepsilon x) \, f(x) \, dx = \{ y = \varepsilon x \} \\ = \varepsilon^{-n} \int T(y) \, f(\varepsilon^{-1}y) \, dy = \varepsilon^{-n} \int T(y) \, \sigma_{\varepsilon^{-1}} f(y) \, dy = \varepsilon^{-n} T(\sigma_{\varepsilon^{-1}} f). $$
For the Fourier transform of a function we get $$ \sigma_\varepsilon \mathcal{F}f(\xi) = \mathcal{F}f(\varepsilon \xi) = (2\pi)^{-n/2} \int f(x) \, e^{-i\langle \varepsilon\xi, x\rangle} dx = (2\pi)^{-n/2} \int f(x) \, e^{-i\langle \xi, \varepsilon x\rangle} dx = \{ y = \varepsilon x \} \\ = (2\pi)^{-n/2} \int f(\varepsilon^{-1}y) \, e^{-i\langle \xi, y \rangle} \varepsilon^{-n} \, dy = \varepsilon^{-n} (2\pi)^{-n/2} \int \sigma_{\varepsilon^{-1}}f(y) \, e^{-i\langle \xi, y \rangle} dy = \varepsilon^{-n} \mathcal{F}(\sigma_{\varepsilon^{-1}}f)(\xi). $$
Therefore, $$ \mathcal{F}(\sigma_\varepsilon T)(f) = \sigma_\varepsilon T(\mathcal{F}f) = \varepsilon^{-n} T(\sigma_{\varepsilon^{-1}}(\mathcal{F}f)) = T(\mathcal{F}(\sigma_{\varepsilon}f)) = \mathcal{F}T(\sigma_{\varepsilon}f) = \varepsilon^{-n} \sigma_{\varepsilon^{-1}}\mathcal{F}T(f), $$ i.e. $\mathcal{F}(\sigma_\varepsilon T) = \varepsilon^{-n} \sigma_{\varepsilon^{-1}}\mathcal{F}T.$
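To relate this to the notation in the question: comparing the definition of $T_\varepsilon$ with the formula for $\sigma_\varepsilon T$ suggests $T_\varepsilon = \sigma_\varepsilon T$. Assuming that identification, here is a sketch of what the theorem's statement means when applied to a test function $f$:
$$ \begin{aligned} T_\varepsilon(f) &= T(\varepsilon^{-n} f(\varepsilon^{-1}\cdot)) = \varepsilon^{-n} T(\sigma_{\varepsilon^{-1}} f) = (\sigma_\varepsilon T)(f), \\ (\mathcal{F}T_\varepsilon)(f) &= \varepsilon^{-n} (\sigma_{\varepsilon^{-1}} \mathcal{F}T)(f) = \varepsilon^{-n} \, \varepsilon^{n} \, \mathcal{F}T(\sigma_\varepsilon f) = \mathcal{F}T(f(\varepsilon \, \cdot)). \end{aligned} $$
So the factor $\varepsilon^{-1}$ in the theorem acts on the argument of $\mathcal{F}T$ viewed as a generalized function. Once it is moved over to the test function it becomes the dilation $f \mapsto f(\varepsilon \, \cdot)$, and the prefactors cancel.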