I am very puzzled by the definition on the Wiki page. I understand that a sheaf of abelian groups assigns to each open subset $U$ an abelian group, e.g. the analytic functions on $U$. So we consider that these functions form a nice abelian group (satisfying all requirements). At the same time these functions form a ring on $U$, right? To me these two notions already seem quite redundant. Actually, I think the ring already contains more information than the sheaf. So, first of all, what is the point of defining the sheaf?
Then we define a sheaf of modules such that, for each open subset $U$, the sections $\mathcal F(U)$ form an $\mathcal O_X(U)$-module. What exactly is this, and what additional information does it contain compared to $\mathcal F$ being just a sheaf?
Since I am a physicist, I like to assume that the underlying space is some "relatively" nice space, e.g. an algebraic variety like the projective plane $\mathbb{CP}^2$. Then the sections over an open subset $U$ would be the analytic functions $\mathcal F(U)$ that live on $U$. Am I right that they form the structure sheaf of $\mathbb{CP}^2$? And then, what would the corresponding sheaf of modules be?
Are there any other illuminating examples?
I would say that the orientation sheaf of a topological manifold is a helpful example. It's defined in terms of singular homology, if you have some familiarity with that. It's one way to formulate rigorously what we mean by a consistent choice of orientation at each point (even when there is no smooth structure).
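To make this concrete, here is a sketch of one common way to write the definition down. It assumes $M$ is a topological $n$-manifold and uses integer coefficients; the symbol $\mathcal{O}r_M$ is just notation chosen here. The orientation sheaf is the sheaf associated to the presheaf

$$
U \;\longmapsto\; H_n(M,\, M \setminus U;\, \mathbb{Z}),
\qquad
\text{with stalks}\quad
(\mathcal{O}r_M)_x \;\cong\; H_n(M,\, M \setminus \{x\};\, \mathbb{Z}) \;\cong\; \mathbb{Z}.
$$

For $V \subset U$ the restriction map is induced by the inclusion of pairs $(M, M \setminus U) \to (M, M \setminus V)$. In this language, an orientation of $M$ is a global section that restricts to a generator of every stalk, and $M$ is orientable exactly when $\mathcal{O}r_M$ is isomorphic to the constant sheaf $\underline{\mathbb{Z}}$.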
Also look up the connection between analytic continuation in complex analysis and sheaf theory. In that case, arguments that were perhaps formerly ad hoc become simple consequences of covering space theory.
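Roughly, the dictionary looks like this. The notation is chosen here for illustration: $X$ is an open subset of $\mathbb{C}$ or a Riemann surface, $\mathcal{O}$ is its sheaf of holomorphic functions, and $|\mathcal{O}|$ is the étalé space of germs with projection $\pi$. An analytic continuation of a germ $f_a$ along a path $\gamma$ starting at $a$ is then a continuous lift of $\gamma$:

$$
\pi \colon |\mathcal{O}| \to X, \quad (x, f_x) \mapsto x,
\qquad
\tilde\gamma \colon [0,1] \to |\mathcal{O}|, \quad
\pi \circ \tilde\gamma = \gamma, \quad
\tilde\gamma(0) = (a, f_a).
$$

The map $\pi$ is a local homeomorphism, and $|\mathcal{O}|$ is Hausdorff by the identity theorem, so such a lift is unique when it exists. In general $\pi$ is not a covering map, because a lift may fail to exist, but the homotopy-lifting argument still gives the monodromy theorem: if continuation is possible along every path in a homotopy with fixed endpoints, the germs obtained at the endpoint agree.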