On Wikipedia, the definition of the Dirac delta function is given as:

Suppose I have a function that goes to infinity at two points, and the distance between the two points is $a$. If I let $a$ tend to 0, so that the two spikes "superimpose", will I get a Dirac delta function? The domain is $[0,a]$ and the function is zero everywhere between the two endpoints.

The Dirac delta, viewed as a function, takes continuous functions as arguments and returns their value at $x=0$. Another name for it would therefore be "evaluation operator". In yet another characterization, it is the unit of convolution, $f*\delta=f$.
If you want to think in terms of real functions, you can only sensibly define approximations to the behavior of the delta distribution. A robust class of such approximations is given by function sequences $\phi_n\ge 0$ with $\int_{\Bbb R}\phi_n\,dx=1$ whose mass concentrates at the origin, that is, $\int_{|x|>\varepsilon}\phi_n\,dx\to 0$ for every $\varepsilon>0$. (Pointwise convergence $\phi_n(x)\to0$ for $x\ne 0$ alone is not enough, since a bump of unit mass could drift off to infinity.) Examples are $\phi_n(x)=n\phi_1(nx)$, where $\phi_1\ge 0$ is any integrable function with $\int_{\Bbb R}\phi_1\,dx=1$.
A rectangular pulse of width $a$ and height $1/a$ located around the origin (meaning its center converges to zero as $a\to 0$) would also satisfy these conditions: as $a$ goes to zero, the height $1/a$ goes to infinity, the area stays equal to 1, and the shape of the graph stays rectangular.
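As a numerical illustration of the rectangular pulse, here is a minimal sketch that integrates the pulse against a continuous test function and lets $a$ shrink. The names `rect_pulse` and `pair_with_pulse` are made up for this example, and the midpoint rule is just one convenient way to approximate the integral.

```python
import math

def rect_pulse(x, a):
    """Rectangular pulse of width a and height 1/a centered at the origin."""
    return 1.0 / a if abs(x) <= a / 2 else 0.0

def pair_with_pulse(f, a, samples=10_000):
    """Midpoint-rule approximation of the integral of rect_pulse(x, a) * f(x).

    The pulse vanishes outside [-a/2, a/2], so only that interval is sampled.
    """
    h = a / samples
    total = 0.0
    for i in range(samples):
        x = -a / 2 + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += rect_pulse(x, a) * f(x) * h
    return total

# As a shrinks, the values should approach cos(0) = 1.
for a in (1.0, 0.1, 0.01):
    print(a, pair_with_pulse(math.cos, a))
```

For $f=\cos$ the integral can be computed exactly as $\frac{2}{a}\sin\frac{a}{2}$, which tends to $1=f(0)$ as $a\to 0$, in line with the delta acting as evaluation at $0$.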
What does not work is a function that is not integrable, or an object that is not a function in the strict sense. Your description of a function that jumps to infinity and stays there over a positive length is such a non-function.
Manipulation of the Dirac delta as a function follows intuitive rules, for example $\lim_{a\to 0}(\delta_a+\delta_0)=2\delta_0$, since applying the left-hand side to a continuous $f$ gives $f(a)+f(0)\to 2f(0)$. What you cannot do is start from the informal characterization as an infinite peak, because it is not precise enough. An infinite peak can correspond to $0\delta$, $\delta$, $3\delta$, or to something that gives no useful finite result at all.
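To make the last point concrete, here is a small worked example (the constants $c$ and $p>0$ are introduced only for this illustration). Take a rectangular pulse of width $a$ and height $c/a^{p}$ centered at the origin and pair it with a continuous $f$:

$$\int_{-a/2}^{a/2}\frac{c}{a^{p}}\,f(x)\,dx \;=\; c\,a^{1-p}\cdot\frac1a\int_{-a/2}^{a/2}f(x)\,dx, \qquad \frac1a\int_{-a/2}^{a/2}f(x)\,dx\;\to\; f(0)\quad (a\to 0).$$

So, as $a\to 0$,

$$\int_{-a/2}^{a/2}\frac{c}{a^{p}}\,f(x)\,dx\;\to\;\begin{cases}0, & 0<p<1 \quad(\text{the limit is } 0\delta),\\ c\,f(0), & p=1 \quad(\text{the limit is } c\,\delta),\\ \text{no finite limit}, & p>1 \text{ and } c\,f(0)\ne 0.\end{cases}$$

In all three cases the height $c/a^{p}$ goes to infinity, yet the limits differ. Knowing only that there is an infinite peak does not determine the limit; the area under the peak does.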