Suppose we are given an arbitrary metric space $X$ and we want to construct a compact space from it. A metric space is compact if and only if every sequence has a convergent subsequence. To construct a compact space from $X$ (at least in the case of $\mathbb{R}$, this is what we have done), we consider all sequences and add limit points of these sequences to $X$. So adding some points to $X$ and giving the result a topology removes the possibility of having a sequence with no convergent subsequence. This gives an injective continuous map $X\rightarrow \tilde{X}$, and we call this $\tilde{X}$ a compactification of $X$.
Suppose we are given an arbitrary group $G$ and we want to construct an abelian group. What we do is remove some elements from $G$ (by quotienting by the commutator subgroup $[G,G]$); we get an abelian group $\tilde{G}$ and a map $G\rightarrow \tilde{G}$, and we call this the abelianization of $G$.
So, the basic idea is if our set is too big to be something that we want, we remove something from it and if it is too small to be something that we want, we add something to it. I am trying to understand sheafification from this point of view.
Suppose we are given a presheaf $\mathcal{F}$ and we want to construct a sheaf from it. To satisfy the gluability condition, it is reasonable to think of adding some sections so that, given an open cover $\{V_i\}$ of $U$ and sections $s_i\in \mathcal{F}(V_i)$ such that $s_i|_{V_i\cap V_j}=s_j|_{V_i\cap V_j}$, there exists $s\in \mathcal{F}(U)$ such that $s|_{V_i}=s_i$. I do not know how to think about the identity axiom from this point of view. Any suggestion regarding this is welcome.
We define the sheafification $\tilde{\mathcal{F}}$ of $\mathcal{F}$ by $\tilde{\mathcal{F}}(U)=\{s:U\rightarrow \bigsqcup \mathcal{F}_p \text{ satisfying some conditions }\}$. I want to know how to see this as fixing the problems with the gluability and identity axioms. Any suggestions are welcome.
You need more fuel for analogy!
Suppose you have a set of vectors. This set is missing some linear combinations of elements! You want to solve the problem of filling in these missing linear combinations, while preserving whatever linear relations already exist between your vectors.
The solution is called the span of those vectors. We solve this problem in a very different way: these vectors were given to us as elements of some ambient vector space. We construct the span as a subset of that ambient vector space that satisfies the desired property.
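Concretely, for a set of vectors $S$ sitting inside an ambient vector space $V$ over a field $k$ (the names $S$, $V$, $k$ are mine, just for illustration), the span is carved out of $V$ as

$$\operatorname{span}(S)=\Big\{\textstyle\sum_{i=1}^{n} a_i v_i \;:\; n\ge 0,\ a_i\in k,\ v_i\in S\Big\}\subseteq V.$$

No new elements have to be invented: every linear combination already exists in $V$, and the linear relations among the $v_i$ are whatever they were in $V$.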
The lesson here is that sometimes we already have some ambient notion that the thing we are trying to construct can be described in terms of.
That is closer to what's going on with the definition of sheafification you reference. In nice cases (sheaves on topological spaces count as 'nice'), one of the things we know about the sheafification of a presheaf is that it has the same stalks as the presheaf.
So if we're given a presheaf $P$ and want to find its sheafification $S$, we already know the stalks of $S$. So we can assemble the stalks into a discrete bundle $E$, and then we know that $S$ is (isomorphic to) a subsheaf of the sheaf of sections of $E$. So all we need to do is identify which sections are the members of $S$!
We can actually do better: we can give $E$ a topology that makes it locally homeomorphic to the base space, and it turns out that $S$ will be exactly the sheaf of sections of this bundle. We call this bundle the étale space of the sheaf.
(some sources even define a sheaf to be an étale space, since the category of sheaves on $X$ is equivalent to the category of spaces equipped with local homeomorphisms to $X$)
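As a rough sketch of that topology (the notation is mine: $E=\bigsqcup_{p} P_p$, and $t_q$ denotes the germ at $q$ of a section $t$): for each open set $V$ of the base space $X$ and each $t\in P(V)$, declare the set

$$[V,t]=\{\,t_q : q\in V\,\}\subseteq E$$

to be open, and take the topology these sets generate. The projection $E\to X$ sending a germ in $P_p$ to $p$ restricts to a bijection $[V,t]\to V$, which is where the local homeomorphism comes from.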
Incidentally, describing things in terms of stalks solves the identity problem. Hartshorne's "some conditions" presumably can be clearly seen to solve the gluing problem (probably in terms of defining a section as something that can be covered by things). The étale space construction also solves the gluing problem, since sheaves of sections of bundles automatically have the gluing property.
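For reference, here is what I would expect those conditions to be, assuming the definition is the standard one (I am writing $P$ for the presheaf, $S$ for its sheafification, and $t_q$ for the germ of $t$ at $q$; check this against your source). A function $s:U\to\bigsqcup_{p\in U}P_p$ belongs to $S(U)$ when

$$s(p)\in P_p \ \text{ for every } p\in U, \qquad\text{and}\qquad \forall p\in U\ \ \exists\, V\ni p \text{ open},\ V\subseteq U,\ \ \exists\, t\in P(V):\ \ s(q)=t_q \ \text{ for all } q\in V.$$

Identity holds because elements of $S(U)$ are honest functions: two of them that agree on every member of an open cover of $U$ agree at every point of $U$. Gluing holds because the second condition is local: compatible $s_i$ on the $V_i$ patch together to a function on $U$, and each point still has a neighborhood on which that function comes from a section of $P$.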