Can someone explain to me what normalization is, in layman's terms? If we have a vector $a$, we normalize it by dividing it by $|a|$. That is $$\frac{a}{|a|}$$
Why do we need normalization?
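For concreteness (a made-up example, using the ordinary Euclidean length): with $a = (3, 4)$ we have $|a| = \sqrt{3^2 + 4^2} = 5$, so
$$\frac{a}{|a|} = \left(\frac{3}{5}, \frac{4}{5}\right),$$
which points in the same direction as $a$ but has length 1.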
Why do we need normalisation?
Mostly for simplifying things. You can think of a vector as consisting of two pieces of information - its magnitude ("length") and argument ("direction"). Normalisation is tantamount to throwing away the information related to the magnitude, grouping together all vectors that point in the same direction.
In this way, you obtain a set of unit vectors - exactly one for each possible "direction", a simplified set that you can study in contexts where magnitude is not important. Due to its simplified nature, this set will be a lot more tractable and malleable - for finite-dimensional vector spaces, for example, the unit n-sphere (the set of unit vectors) will be compact. Compactness is an exceptionally useful property that will enable us to draw fruitful conclusions about the topology and structure of the unit sphere.
Additionally, vector spaces are homogeneous, which means structurally invariant under transformations that (continuously) shrink/stretch the magnitude of vectors without affecting the direction. This means that, often, results that hold for the unit sphere (or individual unit vectors) will hold for the rest of the space (or any arbitrary vector) also. To summarise: in certain cases, we can throw away information (magnitude), draw conclusions by analysing the simplified set, and then recover the information that we lost by homogeneity. This can be a very fruitful approach!
Throwing away information about the magnitude can also help by simplifying calculations. We often choose to normalise our basis vectors in order to introduce an extra, rotational symmetry into the structure of our vector space (if the basis vectors all have the same magnitude, then one "step" in any direction spans exactly the same "distance"). Homogeneity means that we do not lose any information by making this choice.
Normalizing simplifies many calculations. For example the cosine of the angle between two vectors $u$ and $v$ is
$$\cos(\theta) = \dfrac{u\cdot v}{|u||v|} = \dfrac{u}{|u|}\cdot \dfrac{v}{|v|}$$
So if we know $u$ and $v$ are already normalized, then we only need to take their dot (or other inner) product: $\cos\theta = u\cdot v$. If many such calculations are being performed, it is better to normalize once up front. Similarly, for unit vectors in 3-space,
$$\sin(\theta) = |u\times v|$$
which gives
$$u\cdot v = 0 \iff |u\times v|=1$$
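A minimal NumPy sketch of this shortcut (the specific vectors are just made-up examples):

```python
import numpy as np

u = np.array([3.0, 4.0, 0.0])
v = np.array([0.0, 0.0, 2.0])

# Normalize once up front.
u_hat = u / np.linalg.norm(u)
v_hat = v / np.linalg.norm(v)

# For unit vectors, the dot product is cos(theta) directly.
cos_theta = u_hat @ v_hat                            # 0.0 here: u and v are perpendicular

# In 3-space, the length of the cross product of unit vectors is sin(theta).
sin_theta = np.linalg.norm(np.cross(u_hat, v_hat))   # 1.0 here

print(cos_theta, sin_theta)
```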
A set of normalized vectors with pairwise zero dot products forms a useful coordinate system (called an orthonormal basis) that makes describing points easy. If $u,v,w$ are normalized and orthogonal vectors in 3-space, then every point $p$ can be written
$$p = (p\cdot u)u+(p\cdot v)v+(p\cdot w)w$$
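As a quick sanity check of this formula, here is a small NumPy sketch with one arbitrarily chosen orthonormal basis:

```python
import numpy as np

# An orthonormal basis of 3-space (chosen arbitrarily for illustration).
u = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
v = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
w = np.array([0.0, 0.0, 1.0])

p = np.array([2.0, -1.0, 3.0])

# p = (p.u)u + (p.v)v + (p.w)w
p_rebuilt = (p @ u) * u + (p @ v) * v + (p @ w) * w
print(np.allclose(p, p_rebuilt))  # True
```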
Normalizing a vector is basically the process of preserving the direction of a vector but forgetting its magnitude.
Say I have a vector $H$, and I want to produce a vector in the same direction as $H$ but with a length of 10.
What I can do is:
Normalize $H$ (meaning set its length to 1) and then multiply that new vector by 10 to create a vector in the same direction with length 10.
This of course is not the only nor the most useful application, but it shows the general idea.
Any time you want to forget the magnitude and preserve the direction, normalization does the trick.
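If it helps, here is that rescaling trick as a small NumPy sketch (the value of $H$ is just an example):

```python
import numpy as np

H = np.array([2.0, 3.0, 6.0])       # |H| = 7

H_unit = H / np.linalg.norm(H)      # same direction, length 1
H_10 = 10 * H_unit                  # same direction, length 10

print(np.linalg.norm(H_10))         # 10.0
```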
Normalization of a vector also appears in probability, where you might have a vector describing states. This normalization is different, since instead of dividing by the length of the vector we divide by the sum of its entries. Consider, for example, flipping 2 fair coins in a row:
After flipping 2 coins in a row you could end up with two heads, two tails, heads first then tails, or tails first then heads.
Suppose we flip the pair of coins for 4 rounds and each state happens to occur exactly once:
$$[HH, HT, TH, TT] \rightarrow [1, 1, 1, 1]$$
And if we normalize this vector
$$\text{Norm}([1,1,1,1]) = \left[\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4} \right]$$
Our vector now gives the probability of each state occurring. Suppose you now had an unfair coin that you flipped twice, and you repeated the procedure, say, 29 times, with the following result:
$$\text{Frequency}([HH, HT, TH, TT]) = [13,11,1,2]$$
To recover the probabilities for all the entries, we just normalize:
$$\text{Probability} = \left[\frac{13}{29},\frac{11}{29},\frac{1}{29},\frac{2}{29} \right] $$
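In code, this "divide by the sum" normalization looks like the following sketch (NumPy, using the counts from above):

```python
import numpy as np

counts = np.array([13, 11, 1, 2])   # frequencies of [HH, HT, TH, TT]

# Divide by the sum of the entries (not the Euclidean length)
# so the result is a probability vector summing to 1.
probs = counts / counts.sum()

print(probs)        # [13/29 11/29 1/29 2/29], roughly [0.448 0.379 0.034 0.069]
print(probs.sum())  # 1.0
```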
But these two seemingly different definitions of normalization actually have a lot in common. They both capture the same idea: I have a vector, I want to keep one core piece of information (e.g., direction, or probability), and I want to discard the rest (e.g., magnitude, or frequency count).
So that's why we use it: to standardize our data, to make it fit a 'norm' such as length = 1 in the first case, or sum of entries = 1 in the second.
Hope that made sense :)