The dot product formula is
$a \cdot b = \lvert a \rvert \lvert b \rvert \cos (\theta)$
This is the projection of $a$ onto $b$ scaled by $\lvert b \rvert$. Scaling by $\lvert b \rvert$ normalizes the result so that the magnitudes, and not just the angle, are taken into consideration when measuring vector alignment.
My question is about why we scale by $\lvert b \rvert$ for normalization instead of, say, adding $\lvert b \rvert$, so that the dot product would be defined by
$a \cdot b = \lvert a \rvert \cos (\theta) + \lvert b \rvert$
Wouldn't this be an equally effective means of normalization when measuring the similarity of two vectors? Is there something about scaling the quantity that is better for normalization?
Several reasons.
The main reason is that in linear algebra you want to use the dot product to define the angle, not the other way around. With your definition, the angle between $a$ and $b$ would change if you doubled the length of $b$.
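To make this concrete, here is a small worked example. The vectors are chosen purely for illustration, and it assumes the dot product is computed from coordinates as $a_1 b_1 + a_2 b_2$, with the angle then solved for from your formula. Take $a = (1, 0)$ and $b = (1, 1)$, so $a \cdot b = 1$, $\lvert a \rvert = 1$ and $\lvert b \rvert = \sqrt{2}$. Solving $a \cdot b = \lvert a \rvert \cos (\theta) + \lvert b \rvert$ for the cosine gives
$\cos (\theta) = \frac{a \cdot b - \lvert b \rvert}{\lvert a \rvert} = 1 - \sqrt{2} \approx -0.414$
Replacing $b$ by $2b = (2, 2)$ gives
$\cos (\theta) = \frac{2 - 2\sqrt{2}}{1} \approx -0.828$
so the "angle" changes even though the direction of $b$ does not. With the usual formula, $\cos (\theta) = \frac{a \cdot b}{\lvert a \rvert \lvert b \rvert} = \frac{1}{\sqrt{2}}$ in both cases.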
Algebraically, your definition destroys important properties of the dot product. Compare what it gives for $a \cdot (b+c)$ and for $a \cdot b + a \cdot c$: for the dot product these should be equal, but with your definition they are not in general.
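One possible check, with vectors picked only for illustration and writing $a \ast b = \lvert a \rvert \cos (\theta) + \lvert b \rvert$ for your proposed product (the symbol $\ast$ is just a label introduced here): take $a = (1, 0)$, $b = (1, 0)$ and $c = (0, 1)$. Then $b + c = (1, 1)$ makes a $45^\circ$ angle with $a$, so
$a \ast (b + c) = 1 \cdot \frac{1}{\sqrt{2}} + \sqrt{2} = \frac{3}{\sqrt{2}} \approx 2.121$
while
$a \ast b + a \ast c = (1 \cdot 1 + 1) + (1 \cdot 0 + 1) = 3$
The two sides disagree, so distributivity over addition fails. With the usual dot product both sides equal $1$.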