Following the Wikipedia article on the Dirac delta function, I have the following question.
If we consider the Dirac delta as a measure (as it is described in the article), how can we prove the scaling property?
It seems to me that for a measure we have $\delta(A) = \mathbb{1}_A(0)$ and thus, for $k \neq 0$, $\delta(A) = \delta(kA)\neq \frac{1}{|k|}\delta(A)$ in general.
How can this be reconciled with the proof (also in the Wikipedia article) that uses the substitution rule?
Indeed, the definition of "$\delta(ax)$" in Wikipedia should be understood as a change of variable (i.e. a composition) for distributions. It is designed to extend the behavior of functions, so that one can manipulate $\delta(ax)$ as if it were a function.
Let me define, for $a \neq 0$ and a function $g$ (for example continuous), $m_ag(x) := g(ax)$. Then for any smooth and compactly supported test function $\varphi$ (continuous and compactly supported is enough), the change of variables $y = ax$ gives $$ \int_{\Bbb R^d} m_ag(x)\,\varphi(x)\,\mathrm d x = \frac{1}{|a|^d} \int_{\Bbb R^d} g(x)\,\varphi(x/a)\,\mathrm d x. $$ Therefore, if $f$ is a distribution on $\Bbb R^d$, one defines $m_af$ as the distribution such that for any test function $\varphi$, $$ \langle m_af, \varphi\rangle = \frac{1}{|a|^d}\,\langle f, m_{1/a}\varphi\rangle, $$ where $\langle f, \varphi\rangle$ denotes the action of $f$ on $\varphi$. In particular this applies to measures, since they can be seen as distributions of order $0$ by Riesz's representation theorem.
For the particular case of the Dirac delta $\delta=\delta_0$, which satisfies $$ \langle\delta,\varphi\rangle = \int_{\Bbb R^d} \varphi(x)\,\delta(\mathrm d x) = \varphi(0), $$ it gives $$ \langle m_a\delta, \varphi\rangle = \frac{1}{|a|^d}\,\langle \delta, m_{1/a}\varphi\rangle = \frac{1}{|a|^d}\,m_{1/a}\varphi(0) = \frac{1}{|a|^d}\,\varphi(0) = \frac{1}{|a|^d}\,\langle \delta, \varphi\rangle, $$ that is, $m_a\delta = \frac{1}{|a|^d}\,\delta$ (your case being $d=1$).
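If it helps to see the factor $\frac{1}{|a|}$ appear numerically, here is a minimal sketch in dimension $d=1$. It assumes the delta is approximated by a narrow Gaussian (a standard mollifier, not something taken from the Wikipedia proof), and the names `gaussian`, `phi` and `pair_scaled` are made up for this illustration. The function `phi` is only a rapidly decaying stand-in for a test function with $\varphi(0)=1$.

```python
import math

def gaussian(x, eps):
    """Gaussian of width eps with unit integral; approximates delta as eps -> 0."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def phi(x):
    """Smooth, rapidly decaying stand-in for a test function, with phi(0) = 1."""
    return math.cos(x) * math.exp(-x * x)

def pair_scaled(a, eps=1e-3, half_width=0.05, n=200_001):
    """Trapezoid approximation of the integral of gaussian(a*x, eps) * phi(x) dx.

    The integration window [-half_width, half_width] is an assumption: it is
    chosen wide enough to contain essentially all of the mass of the scaled
    Gaussian for the values of a used below.
    """
    h = 2 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * gaussian(a * x, eps) * phi(x)
    return total * h

if __name__ == "__main__":
    for a in (2.0, -3.0, 0.5):
        # Compare the numerical pairing with the predicted value phi(0) / |a|.
        print(a, pair_scaled(a), phi(0.0) / abs(a))
```

The two printed columns should agree up to a small error that shrinks with `eps`, which is the statement $\langle m_a\delta,\varphi\rangle = \frac{1}{|a|}\varphi(0)$ seen through an approximation of $\delta$ by functions.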
This is indeed not a natural operation to perform on sets. If we want to work only with measures acting on sets, then this amounts to saying that for a measure $\mu$ we define $$ m_a\mu(A) = \frac{1}{|a|^d}\int_{\Bbb R^d} \Bbb 1_{A}(x/a)\,\mu(\mathrm d x) = \frac{1}{|a|^d}\int_{\Bbb R^d} \Bbb 1_{aA}(x)\,\mu(\mathrm d x) = \frac{1}{|a|^d}\,\mu(aA) $$ with the notation $aA = \{x\in\Bbb R^d: x/a\in A\}$. So to summarize, to be compatible with the behavior of functions, it is better to define "$\mu(ax)$" as $$ m_a\mu(A) = \frac{1}{|a|^d}\,\mu(aA). $$
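As a quick check, one can apply the formula $m_a\mu(A) = \frac{1}{|a|^d}\,\mu(aA)$ to $\mu = \delta$. Since $a \neq 0$, we have $0 \in aA$ if and only if $0 \in A$, so $$ m_a\delta(A) = \frac{1}{|a|^d}\,\delta(aA) = \frac{1}{|a|^d}\,\Bbb 1_{aA}(0) = \frac{1}{|a|^d}\,\Bbb 1_{A}(0) = \frac{1}{|a|^d}\,\delta(A). $$ So your observation $\delta(aA) = \delta(A)$ is correct. The factor $\frac{1}{|a|^d}$ does not come from rescaling the set; it comes from the prefactor built into the definition of $m_a$.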