The following is exercise 2.12 in Hartshorne's Algebraic Geometry, from chapter II.
This cocycle condition seems somewhat bizarre to me and I am trying to figure out why he constructed it as such. Why does $\varphi_{ij}$ map $U_{ij}\cap U_{ik}$ isomorphically onto $U_{ji}\cap U_{jk}$? It seems unnatural to insist that these two be identified. I don't see what this has to do with anything. If you compare the other cocycle condition, for gluing sheaves (see below; I am fine with this other exercise), he restricts to agreement on triple intersections. What I would expect, then, for 2.12, is a cocycle condition which says that $\varphi_{ik}$ restricted to $U_{ij}\cap U_{ik}\cap \varphi_{ij}^{-1}(U_{ji}\cap U_{jk})$ should agree with $\varphi_{jk}\circ \varphi_{ij}$ on the same domain. Here's a picture of my conception of what the cocycle condition should be:
I am having trouble even drawing a reasonable picture of what Hartshorne is talking about. Does anyone have a suggestion for intuition about why Hartshorne would do this?



If you want to glue a collection of topological spaces $\{X_\alpha\}_{\alpha\in A}$, what do you do? You define an equivalence relation on the points of the disjoint union $\coprod_{\alpha\in A} X_\alpha$ which identifies the points you glue together to get the underlying set, and then you construct the topology. An equivalence relation is reflexive, symmetric, and transitive, so you need to ensure your gluing conditions give such a thing. Reflexivity is clear, the condition that $\varphi_{ij}=\varphi_{ji}^{-1}$ gives symmetry, and the cocycle condition gives transitivity - if $(x_0,X_0)\sim (x_1,X_1)$ by $\varphi_{01}$ and $(x_1,X_1)\sim (x_2,X_2)$ by $\varphi_{12}$, then the condition that $\varphi_{12}\circ \varphi_{01}=\varphi_{02}$ on $U_{01}\cap U_{02}$ (the part of $X_0$ that becomes the triple overlap $X_0\cap X_1\cap X_2$ after gluing) exactly gives that $(x_0,X_0)\sim (x_2,X_2)$. This is exactly what you need to glue your topological spaces together. Once you actually glue the topological spaces in exercise II.2.12, the cocycle condition turns into the cocycle condition for sheaves, and then you can apply exercise II.1.22, which is your second excerpted image.
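To spell the transitivity step out in Hartshorne's notation (a sketch, assuming $i,j,k$ are distinct and writing $x\sim y$ to mean $x\in U_{ij}$ and $y=\varphi_{ij}(x)$): suppose $x\in X_i$, $y\in X_j$, $z\in X_k$ with $x\sim y$ and $y\sim z$. Then $y\in U_{ji}\cap U_{jk}$, and since $\varphi_{ji}=\varphi_{ij}^{-1}$ carries $U_{ji}\cap U_{jk}$ onto $U_{ij}\cap U_{ik}$,

$$x=\varphi_{ji}(y)\in U_{ij}\cap U_{ik},\qquad \varphi_{ik}(x)=\varphi_{jk}(\varphi_{ij}(x))=\varphi_{jk}(y)=z,$$

so $x\sim z$. Note where the "maps onto" hypothesis is used: without it there would be no guarantee that $x$ lies in $U_{ik}$, the domain of $\varphi_{ik}$, so $x$ and $z$ could fail to be identified directly even though both are identified with $y$.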
Your complaint about expecting the condition to be written differently doesn't really seem like much of a complaint to me: in fact, $\varphi_{ij}^{-1}(U_{ji}\cap U_{jk})=U_{ij}\cap U_{ik}$ under the hypotheses here.
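Concretely, applying the hypothesis with the roles of $i$ and $j$ swapped gives

$$\varphi_{ij}^{-1}(U_{ji}\cap U_{jk})=\varphi_{ji}(U_{ji}\cap U_{jk})=U_{ij}\cap U_{ik},$$

so the domain you propose, $U_{ij}\cap U_{ik}\cap \varphi_{ij}^{-1}(U_{ji}\cap U_{jk})$, is just $U_{ij}\cap U_{ik}$, and your condition coincides with Hartshorne's.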
As for intuition, the reason Hartshorne requires these things is that this is how you glue stuff together! There are slightly different presentations in Stacks or Vakil exercise 4.4.A, but the theme is the same: you have to glue your topological spaces, and then you glue your sheaves. To glue topological spaces, you can think of stacking your disjoint union of $X_i$ vertically over your final target space so that all the points which glue together are above the point that they glue to, and then operating a big press which smushes everything together. The conditions we require on gluing a topological space are necessary to make sure everything "lines up" and the press can actually do its job.