In Tao's notes on time-frequency analysis, the following theorem is stated (Theorem 5.5 of Part 1 of https://www.math.ucla.edu/~tao/254a.1.01w/):
Fix $\xi_0 \in \mathbf{R}$, and positive values $\delta_1,\delta_2$ with $\delta_1 \cdot \delta_2 \geq 1$. Fix a bump function $\psi \in C_c^\infty(\mathbf{R})$ supported on $[-1,1]$. Then define the kernel
$$ K(x) = \int e^{2 \pi i \xi \cdot x} \psi((\xi - \xi_0)/\delta_2)\; d\xi $$
and consider the convolution operator $T$ defined by the formula
$$ Tf(x) = \int K(x-y) \psi(y/\delta_1) f(y)\; dy. $$
One can view this operator as a composition: first a spatial cutoff by a smooth function supported on $[-\delta_1,\delta_1]$, followed by a smooth frequency cutoff to $[\xi_0-\delta_2,\xi_0+\delta_2]$. The result stated is that for $f \in L^2(\mathbf{R})$ and any $n > 0$,
$$ |Tf(x)| \lesssim_n \| f \|_{L^2(\mathbf{R})} \delta_1^{1/2} \left( \delta_1 d(x,[\xi_0-\delta_2,\xi_0 + \delta_2]) \right)^{-n}. $$
The proof idea that Tao gives is relatively simple: assume that $\delta_1 = 1$, then apply Cauchy-Schwarz to conclude that
$$ |Tf(x)| \lesssim \| f \|_{L^2(\mathbf{R})} \left( \int |K(x-y)|^2 |\psi(y)|^2\; dy \right)^{1/2}. $$
I assume the idea then is to use decay estimates for $K$ to obtain the required bound, but I'm not sure how to find these. Via this method, I was able to show that
$$ |Tf(x)| \lesssim_n \delta_1^{1/2} \delta_2^{1-n} \| f \|_{L^2(\mathbf{R})} d(x,[-\delta_1,\delta_1])^{-n}. $$
Is there a typo that makes what is meant to be proven equivalent to this bound, or am I missing something?
I will write $\delta=\delta_2$ and assume $\xi_0=0$. Write $K(x)=\delta \phi(\delta x)$, where $\phi$ is the inverse Fourier transform of $\psi$. Heuristically, we may treat $K(x)$ as the indicator function
$$ \delta 1_{[-1,1]}(\delta x)=\delta 1_{[-\delta^{-1},\delta^{-1}]}(x). $$
Treat $\psi$ as the indicator function $1_{[-1,1]}(x)$. Then we see that
$$ \int |K(x-y)|^2 |\psi(y)|^2\; dy $$
is heuristically
$$ \delta^{2}\delta^{-1}1_{[-1,1]}(x), $$
since $\delta \geq 1$ when $\delta_1 = 1$, so the interval $[x-\delta^{-1},x+\delta^{-1}]$ on which $K(x-\cdot)$ lives has length about $\delta^{-1}$ and meets $[-1,1]$ only when $|x| \lesssim 1$. More rigorously, we can replace all indicator functions by weight functions of the form
$$ \mathrm{dist}(x,[-1,1])^{-n}. $$
You may also refer to Section 5 of this note:
http://www.math.ubc.ca/~toyang/Decoupling.pdf
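To make the "more rigorously" step concrete, here is a sketch under the same normalizations ($\xi_0 = 0$, $\delta_1 = 1$, $\delta = \delta_2 \geq 1$); the shorthand $D = \mathrm{dist}(x,[-1,1])$ is introduced here only for brevity. Since $\psi$ is smooth and compactly supported, $\phi$ is a Schwartz function, so for every $n \geq 0$,
$$ |K(x)| = \delta |\phi(\delta x)| \lesssim_n \delta (1 + \delta |x|)^{-n}. $$
Because $\psi$ is supported on $[-1,1]$ and $|x-y| \geq D$ for every $y$ in that interval,
$$ \int |K(x-y)|^2 |\psi(y)|^2\; dy \lesssim_n \delta^2 \int_{-1}^{1} (1 + \delta |x-y|)^{-2n}\; dy \leq 2 \delta^2 (1 + \delta D)^{-2n}. $$
Taking square roots in the Cauchy-Schwarz bound then gives, for $D > 0$,
$$ |Tf(x)| \lesssim_n \| f \|_{L^2(\mathbf{R})}\, \delta\, (1 + \delta D)^{-n} \leq \| f \|_{L^2(\mathbf{R})}\, \delta^{1-n} D^{-n}. $$
If I have not slipped, this agrees with the bound obtained in the question when $\delta_1 = 1$: the decay is in the distance from $x$ to the spatial interval $[-1,1]$, not to the frequency interval.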