Let $X$ be a $\delta$-hyperbolic space. Define an equivalence relation on geodesic rays $\gamma_1, \gamma_2 :[0,\infty) \rightarrow X$ by $\gamma_1 \sim \gamma_2 \iff \sup_{t \geq 0} d(\gamma_1(t),\gamma_2(t)) < \infty$. The Gromov boundary $\partial X$ is defined as the set of all equivalence classes of geodesic rays in $X$.
Consider the upper half plane $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im}(z)>0\}$ equipped with the hyperbolic metric. I first have to show that its Gromov boundary $\partial \mathbb{H}$ is homeomorphic to the unit circle $\mathbb{S}^1$, and then show that we can impose a topology on $\bar{\mathbb{H}} = \mathbb{H} \cup \partial \mathbb{H}$ such that $\bar{\mathbb{H}}$ is compact under this topology.
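As a numerical sanity check of the equivalence relation in $\mathbb{H}$, here is a minimal Python sketch. It assumes the standard distance formula $d(z,w) = \operatorname{arccosh}\bigl(1 + \frac{|z-w|^2}{2\operatorname{Im}(z)\operatorname{Im}(w)}\bigr)$ and uses the unit-speed geodesic rays $t \mapsto a + ie^{t}$ and $t \mapsto ie^{-t}$. The function names and sample parameters are illustrative choices of mine, not taken from either textbook.

```python
import math

def hyp_dist(z: complex, w: complex) -> float:
    """Hyperbolic distance between two points of the upper half plane."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))

def vertical_ray(a: float):
    """Unit-speed geodesic ray t -> a + i e^t, heading towards infinity."""
    return lambda t: complex(a, math.exp(t))

def downward_ray():
    """Unit-speed geodesic ray t -> i e^{-t}, heading towards the boundary point 0."""
    return lambda t: complex(0.0, math.exp(-t))

def gaps(g1, g2, times):
    """Distances d(g1(t), g2(t)) at the sampled times."""
    return [hyp_dist(g1(t), g2(t)) for t in times]

times = [0, 2, 4, 8, 16]
# Two vertical rays: the gap shrinks, so they are equivalent (both represent infinity).
same_class = gaps(vertical_ray(0.0), vertical_ray(5.0), times)
# One ray up, one ray down: the gap equals 2t, so they are not equivalent.
different_class = gaps(vertical_ray(0.0), downward_ray(), times)

print(same_class)
print(different_class)
```

The first list decreases towards $0$, so the supremum is finite and the two vertical rays define the same boundary point. The second list grows like $2t$, so those rays define different boundary points. A finite sample like this only suggests the behaviour; the closed-form distances give the actual proof.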
In the textbook Geometric Group Theory by Cornelia Druţu and Michael Kapovich, the basis for the topology on $\bar{X} = X \cup \partial X$, where $X$ is a $\delta$-hyperbolic geodesic space, is defined as consisting of metric balls $B(z,r)$ for $z \in X$, together with neighbourhoods $U_{x,y}(\xi)=\{z \in \bar{X} : [xz] \cap B(y,k) \neq \emptyset\}$, where $\xi \in \partial X$, $[xz]$ is any geodesic (a segment if $z \in X$, a ray if $z \in \partial X$) joining $x$ to $z$, $y$ is a point on a geodesic ray joining $x$ to $\xi$, and $k > 3\delta$.
In the textbook A Primer on Mapping Class Groups, $\bar{\mathbb{H}}$ is topologized by the basis consisting of the usual open sets of $\mathbb{H}$ plus one open set $U_P$ for each open half-plane $P$ in $\mathbb{H}$. A point of $\mathbb{H}$ lies in $U_P$ if it lies in $P$, and a point of $\partial\mathbb{H}$ lies in $U_P$ if every representative ray $\gamma(t)$ eventually lies in $P$.
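To make the condition "every representative ray eventually lies in $P$" concrete, the following Python sketch tests it on sampled rays $t \mapsto a + ie^{t}$ representing the boundary point $\infty$. The two half-planes, $\{|z| > R\}$ and $\{\operatorname{Re}(z) > c\}$, as well as the helper names and sampling window, are hypothetical choices for illustration. This is a finite sampling check, not a proof.

```python
import math

def vertical_ray(a: float, t: float) -> complex:
    """Point at time t on the geodesic ray t -> a + i e^t (heading to infinity)."""
    return complex(a, math.exp(t))

def eventually_in(P, a: float, t_min: float = 10.0, t_max: float = 20.0, steps: int = 50) -> bool:
    """Sample-based check that the ray based at a + i lies in P for t in [t_min, t_max]."""
    ts = [t_min + k * (t_max - t_min) / steps for k in range(steps + 1)]
    return all(P(vertical_ray(a, t)) for t in ts)

R, c = 3.0, 1.0
outside_semicircle = lambda z: abs(z) > R   # half-plane bounded by the geodesic |z| = R
right_of_line = lambda z: z.real > c        # half-plane bounded by the geodesic Re(z) = c

# A sample of representatives of the boundary point "infinity".
base_points = [-100.0, -1.0, 0.0, 2.0, 100.0]
inf_in_U_outside = all(eventually_in(outside_semicircle, a) for a in base_points)
inf_in_U_right = all(eventually_in(right_of_line, a) for a in base_points)

print(inf_in_U_outside, inf_in_U_right)
```

In this sample every ray to $\infty$ ends up outside the semicircle, consistent with $\infty \in U_P$ for $P = \{|z| > R\}$. For $P = \{\operatorname{Re}(z) > c\}$ only the rays with $a > c$ lie in $P$, so $\infty \notin U_P$: it is an endpoint of the boundary arc of $P$, not an interior point of it. This is why the word "every" in the definition matters.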
I understand why the Gromov boundary of $\mathbb{H}$ is homeomorphic to $\mathbb{S}^1$, and I also understand how an arbitrary hyperbolic space $X$ can be compactified to get $\bar{X}$ with the topology defined above. What I cannot understand is why $\bar{\mathbb{H}}$ is compact in the half-plane topology, and how this case serves as motivation for the general one.
Would appreciate some ideas to push me in the right direction.
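You are leaving out one very important feature of $\mathbb H$, namely the Morse lemma, which says (with the quantifiers made explicit): for every $\lambda \geq 1$ and $\epsilon \geq 0$ there is a constant $D = D(\lambda,\epsilon)$ such that every $(\lambda,\epsilon)$-quasigeodesic in $\mathbb H$ has Hausdorff distance at most $D$ from a geodesic.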
This applies in particular to quasigeodesic rays: every quasigeodesic ray in $\mathbb H$ has finite Hausdorff distance from a geodesic ray. Furthermore, two geodesic rays in $\mathbb H$ that have finite Hausdorff distance are equivalent and hence represent the same point in $\partial\mathbb H$.
It follows that the equivalence relation on geodesic rays that defines $\partial\mathbb H$ can be widened to obtain an equivalence relation on quasigeodesic rays --- namely, finite Hausdorff distance --- which still defines $\partial\mathbb H$.
This fact is useful in studying the large scale geometry of $\mathbb H$. For example, you can use geodesic rays to prove that an isometry of $\mathbb H$ induces a homeomorphism of $\partial\mathbb H$. But in the proof of Mostow rigidity you need more: in one of the first steps of that proof, one uses the quasigeodesic ray definition of $\partial\mathbb H$ and the Morse Lemma to prove that a quasi-isometry of $\mathbb H$ induces a homeomorphism of $\partial\mathbb H$. One might summarize this by saying that Gromov's definition of $\partial\mathbb H$ --- namely, the quasigeodesic ray definition --- is more robust, i.e. it is applicable in broader situations than the geodesic ray definition.
And in the general study of $\delta$-hyperbolic spaces $X$, the Morse lemma is still true, providing a robust, "quasigeodesic ray" definition of $\partial X$, which is behind pretty much everything in our understanding of $\partial X$.
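To see the Morse lemma at work in $\mathbb H$ on one explicit example, consider the Euclidean ray $c(t) = e^{t}e^{i\theta}$ with $0 < \theta < \pi/2$. It has constant hyperbolic speed $1/\sin\theta$ and is not a geodesic, but it is a quasigeodesic ray staying at constant distance $\operatorname{arccosh}(1/\sin\theta)$ from the geodesic ray $t \mapsto ie^{t}$. The Python sketch below checks this numerically; the choice $\theta = \pi/6$ and the function names are illustrative assumptions.

```python
import math

def hyp_dist(z: complex, w: complex) -> float:
    """Hyperbolic distance between two points of the upper half plane."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))

theta = math.pi / 6  # angle of the Euclidean ray with the real axis (illustrative choice)

def quasi_ray(t: float) -> complex:
    """Quasigeodesic ray c(t) = e^t e^{i theta}: a Euclidean ray from 0, not a geodesic."""
    return math.exp(t) * complex(math.cos(theta), math.sin(theta))

def geo_ray(t: float) -> complex:
    """Unit-speed geodesic ray t -> i e^t along the imaginary axis."""
    return complex(0.0, math.exp(t))

times = [0.5 * k for k in range(21)]
pointwise_gaps = [hyp_dist(quasi_ray(t), geo_ray(t)) for t in times]
predicted = math.acosh(1 / math.sin(theta))

print(max(pointwise_gaps), predicted)
```

Since each point $c(t)$ is within the same fixed distance of $ie^{t}$, the Hausdorff distance between the two rays is at most $\operatorname{arccosh}(1/\sin\theta)$, so both rays represent the boundary point $\infty$ under the quasigeodesic ray definition.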