In "A Smooth Introduction to the Wavefront Set", the product of two distributions is defined as follows.
Let $u, v \in D'(\mathbb R^n)$. We say that $w \in D'(\mathbb R^n)$ is the product of $u$ and $v$, which we denote $w=uv$, if and only if, for any $x \in \mathbb R^n$ there exists $f \in C_0^\infty(\mathbb R^n)$ with $f=1$ in a neighbourhood of $x$ and, for any $\xi \in \mathbb R^n$, \begin{equation} \widehat{f^2w}(\xi)=\left(\widehat{fufv}\right)(\xi)=\frac{1}{(2\pi)^n}\left(\widehat{fu}*\widehat{fv}\right)(\xi)=\frac{1}{(2\pi)^n}\int\widehat{fu}(y)\widehat{fv}(\xi-y)dy \end{equation} is absolutely convergent,
where * denotes the convolution. They go on to show that, according to this definition, the product of the Heaviside step function $H$ with itself is well defined. They also claim that $H^n=H$ for any $n\in\mathbb N$, just like when we look at $H$ as a locally integrable function. They offer however no proof of this fact. How can I show it? Do I have to directly use the explicit form of the Fourier transform of $H$, $\hat H= -i\text{pv}\left(\frac{1}{\xi}\right)+\pi\delta$, or is there a quicker and easier way?
[Revised:] Your source says that such products "can be defined" ... but "will lose some properties". Like Leibniz' rule.
(The questioner already knew that the proper notion of "wavefront set" says that if the wavefront sets of two distributions are disjoint, then they have a "multiplication" (with good properties). The wavefront set of the step function is $\{0\}$ with both tangent cone directions... so, as usual, we cannot "square it". But the source suggests quite pointedly that this is too restrictive.)
As an example computation, with $H$ the step function and $f$ a test function identically $1$ on $[-1,1]$ and identically $0$ outside $[-2,2]$, we have $(fH)'=\delta + g$ where $g$ is a test function. Also, as the questioner noted, $fH$ is compactly supported, so has Fourier transform extending to an entire function. In particular, it is smooth on $\mathbb R$, and has a growth rate expressed by Paley-Wiener (-Schwartz). Then for $\xi$ away from $0$, the Fourier transform can be estimated by integration by parts: (suppressing constants... and abusing notation: this should really be written as a pairing of compactly-supported distributions against smooth functions...) $$ \widehat{fH}(\xi) \;=\; - \int {e^{-i\pi \xi x}\over -i\xi} (\delta + g(x))\;dx \;=\; {1\over i\xi} + \hbox{Schwartz} $$ Thus, the Fourier transform is asymptotic to $1/i\xi$.
Thus, the convolution $\widehat{fH}*\widehat{fH}$ in the attempted definition of $H\cdot H$, in the definition in the question, would certainly be absolutely convergent.
Thus, apparently, $H\cdot H=H$ has a chance to make sense?
It is not explicit in the question, but I wonder whether defining $H^2$ by $\widehat{f^2H^2}=\widehat{fH}*\widehat{fH}$ really succeeds, even if the convolution is absolutely convergent, because we do not know that the inverse Fourier transform of that convolution is divisible by $f^2$ (in any sense).
But if the only issue is convergence of the convolution, then, yes, $H^2=H$. And if we believe that associativity will still work, then $H^n=H$ in this sense. (I'm uneasy about it, though, ...)
EDIT: In response to a question/comment... in fact I am not at all confident that in any (useful distributional) sense $H^2=H$. My previous remarks show that the indicated convolution integral is indeed absolutely convergent, so at least the discussion can proceed. As already noted, it is not at all clear (to me, anyway) why the division by $f^2$ would be possible.
Still, if we grant that (somehow) $H^2$ is a distribution, then its smoothness away from $0$ shows that away from $0$ it is $H$. Classification of distributions supported at $0$ further would imply (it seems to me) that $H^2=H$... IF $H^2$ is a distribution at all (which I don't think is the case).
It occurs to me that another sort of "multiplication" of generalized functions is in the context of "trace theorems" (for Sobolev spaces). One aspect is that, sure, restrictions of continuous functions are always continuous, but/and restrictions (e.g., from $\mathbb R^n$ to $\mathbb R^{n-1}$) of $L^2$ functions need not even be defined everywhere. But, for example, the restriction of a $H^s(\mathbb R^n)$ functions with $s>1/2$ to $\mathbb R^{n-1}$ is in $H^{s-1/2+\epsilon}(\mathbb R^{n-1})$ for every $\epsilon>0$. In particular, it is in $L^2$.
So, thinking that multiplication of two (generalized...) functions $f,g$ on $\mathbb R^n$, should be restriction of $f\otimes g$ to the diagonal, if they are both in $H^{{n\over 2}+\epsilon}(\mathbb R^n)$, then that restriction is indeed in $L^2$, at least. Things like that.
But on $\mathbb R$, the smoothly truncated step function is in $H^{{1\over 2}-\epsilon}$ for every $\epsilon>0$ (as is visible from the asymptotics of its Fourier transform), but not in $H^{1/2}$, etc., so this does not provide a distributional multiplication for this example.