I am currently reading "Methods of homological algebra" by S. I. Gelfand and Yu. I. Manin.
I was wondering if someone could explain the intuition for why $X(id) = id$ and $X(g \circ f) = X(f) \circ X(g)$ imply that different elements of $X_n(f)$ correspond to different simplices and that a face of a face is a face.

This definition amounts to the following: consider the category $\Delta_+$ whose objects are $$[0], [1], [2], [3], \ldots$$ and morphisms $f\colon [m] \to [n]$ are strictly increasing maps $$\{ 0 < 1 < \cdots < m \} \to \{ 0 < 1 < \cdots < n \}.$$ Then what they call "gluing datum" is a contravariant functor $$\tag{*} X\colon \Delta_+^\mathrm{op} \to \mathcal{Set}.$$ Such a thing is better known as a semi-simplicial set. If you consider the bigger category $\Delta \supset \Delta_+$ where the morphisms are nondecreasing maps, then a functor $$X\colon \Delta^\mathrm{op} \to \mathcal{Set}$$ is called a simplicial set.
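To make the morphisms of $\Delta_+$ concrete: a strictly increasing map $[m] \to [n]$ is determined by its image, which is an $(m+1)$-element subset of $\{0, \ldots, n\}$, so there are $\binom{n+1}{m+1}$ such maps. For instance, there are exactly three morphisms $[1] \to [2]$. The names $\delta^i$ are introduced here only for illustration (they are not the book's notation); $\delta^i$ is the map whose image misses $i$: $$\delta^2\colon 0\mapsto 0,\ 1\mapsto 1, \qquad \delta^1\colon 0\mapsto 0,\ 1\mapsto 2, \qquad \delta^0\colon 0\mapsto 1,\ 1\mapsto 2.$$ One can think of them as the three edges of a triangle with vertices $0, 1, 2$.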
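Specifying (*) is equivalent to specifying a set $X_{(n)}$ for each $n = 0, 1, 2, \ldots$ and a map $X (f)\colon X_{(n)} \to X_{(m)}$ for each morphism $f\colon [m] \to [n]$ in $\Delta_+$.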
And all this should be functorial, i.e. $$X (id) = id, \quad X (g\circ f) = X (f)\circ X (g).$$ The elements of $X_{(n)}$ are called $n$-simplices (this is standard terminology), and Gelfand and Manin call any map $X (f)\colon X_{(n)} \to X_{(m)}$ a face; intuitively, it describes how simplices of dimensions $m$ and $n$ are identified/glued.
Now for each $n\in \mathbb{N}$, there is only one morphism $[n] \to [n]$ in $\Delta_+$, the identity map $id$, and requiring that $X (id) = id$ intuitively means that in each dimension $n$, the $n$-simplices in $X_{(n)}$ are not identified among themselves.
The condition $X (g\circ f) = X (f)\circ X (g)$ is a kind of transitivity: if you have morphisms $f\colon [\ell] \to [m]$ and $g\colon [m] \to [n]$, then gluing simplices in dimensions $n$ and $\ell$ via $X (g\circ f)\colon X_{(n)} \to X_{(\ell)}$ should correspond to gluing via $X (g)\colon X_{(n)}\to X_{(m)}$ followed by gluing via $X (f)\colon X_{(m)} \to X_{(\ell)}$. In other words, a face of a face, $X (f)\circ X (g)$, is again a face, namely $X (g\circ f)$.
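As a sanity check, here is how this plays out for a single triangle. This is only an illustrative sketch: the names $t$, $e_{ab}$, $v_a$ are introduced here and do not come from the book. Take $$X_{(2)} = \{ t \}, \quad X_{(1)} = \{ e_{01}, e_{02}, e_{12} \}, \quad X_{(0)} = \{ v_0, v_1, v_2 \},$$ with $X_{(n)} = \emptyset$ for $n \ge 3$, where $e_{ab}$ is the edge from $v_a$ to $v_b$, and let $X (f)$ send a simplex to the sub-simplex spanned by its vertices in positions $f (0) < \cdots < f (m)$. The morphism $h\colon [0] \to [2]$, $0 \mapsto 1$, factors in two different ways, $$h = g\circ f = g'\circ f',$$ where $g\colon [1] \to [2]$ has image $\{ 0, 1 \}$ and $f\colon [0] \to [1]$ sends $0 \mapsto 1$, while $g'\colon [1] \to [2]$ has image $\{ 1, 2 \}$ and $f'\colon [0] \to [1]$ sends $0 \mapsto 0$. Then $$X (f) (X (g) (t)) = X (f) (e_{01}) = v_1, \qquad X (f') (X (g') (t)) = X (f') (e_{12}) = v_1,$$ and both agree with $X (h) (t) = v_1$. So in this example functoriality says that the endpoint of $e_{01}$ and the starting point of $e_{12}$ are glued to the same vertex of $t$.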