I'm a bit confused as to the difference between $dA$ and $dS$. I understand the semantic difference, and the relation between the two formula-wise. But what is the most basic difference between them?
Is $\int dS$ equal to the area of the shape in question? If that's the case, then what does $\int dA$ give us?
Let me know if I've misunderstood what you're trying to ask. Loosely, $dS$ refers to the surface area element of a surface that is not necessarily flat, while $dA$ typically refers to the area element of a flat region. Let me expand on that a bit. The two are related: say that $S$ is a surface, $f(x,y,z)$ is a real-valued function, and you want to compute the integral of $f$ over the surface $S$, $$\iint_S f(x,y,z) \, dS.$$ Then, if you have a parameterization for the surface, say $\textbf{r}(u,v)$ where $(u,v)$ ranges over some domain $D$ in the $uv$-plane, you can compute the surface integral by computing the magnitude of the normal vector $\textbf{r}_u \times \textbf{r}_v$ to $S$ and integrating over $D$: $$\iint_S f(x,y,z) \, dS = \iint_D f(\textbf{r}(u,v)) \, |\textbf{r}_u \times \textbf{r}_v | \, dA.$$ Here $dA = du\,dv$ is the area element of the flat region $D$.
For instance, if $S$ is the upper hemisphere of the unit sphere $x^2+y^2+z^2=1$, then $S$ can be parameterized by $\textbf{r}(\theta, \varphi) = (\cos\theta\sin\varphi, \,\,\sin\theta\sin\varphi, \,\,\cos\varphi)$ for $(\theta,\varphi)$ in the region $D$ given by $D = \{(\theta,\varphi) \, : \, 0\leq\theta\leq 2\pi, \, 0\leq \varphi \leq \pi/2\}$. Now you can compute the surface integral of some function $f$ via
$$\iint_S f(x,y,z) \, dS = \iint_D f(\cos\theta\sin\varphi, \,\,\sin\theta\sin\varphi, \,\,\cos\varphi) \, |\textbf{r}_\theta \times \textbf{r}_\varphi | \, dA$$ and the integral on the right-hand side can be computed as a standard iterated integral in whichever order is convenient; i.e., $$\iint_D dA = \int_0^{2\pi} \int_0^{\pi/2} d\varphi \, d\theta = \int_0^{\pi/2}\int_0^{2\pi} d\theta \, d\varphi.$$
I leave the details of the computations to you. You are correct that $\iint_S 1 \, dS$ returns the surface area of $S$, and to translate it into an integral involving $dA$ you could use the above procedure. By contrast, $\iint_D 1 \, dA$ returns the area of the flat parameter region $D$, which in general differs from the surface area of $S$. Very roughly speaking, $dA$ is used when you have a flat 2-dimensional region of integration, as opposed to a surface which lives in three dimensions.
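If you'd like to sanity-check the hemisphere example numerically, here is a minimal sketch in plain Python. It is only an illustration, not part of the derivation above: the helper names (`r_theta`, `r_phi`, `cross`, `norm`, `surface_integral`) and the choice of a midpoint rule are my own assumptions. It uses the partial derivatives of the parameterization $\textbf{r}(\theta,\varphi)$, for which one can check by hand that $|\textbf{r}_\theta \times \textbf{r}_\varphi| = \sin\varphi$ on $D$.

```python
import math

# Partial derivatives of r(theta, phi) = (cos t sin p, sin t sin p, cos p).
def r_theta(theta, phi):
    return (-math.sin(theta) * math.sin(phi),
            math.cos(theta) * math.sin(phi),
            0.0)

def r_phi(theta, phi):
    return (math.cos(theta) * math.cos(phi),
            math.sin(theta) * math.cos(phi),
            -math.sin(phi))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def surface_integral(f, n=200):
    """Midpoint-rule approximation of the integral of f over the upper
    unit hemisphere, computed as an integral over the flat region D."""
    d_theta = 2 * math.pi / n      # theta ranges over [0, 2*pi]
    d_phi = (math.pi / 2) / n      # phi ranges over [0, pi/2]
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            x = math.cos(theta) * math.sin(phi)
            y = math.sin(theta) * math.sin(phi)
            z = math.cos(phi)
            # |r_theta x r_phi| converts the flat element dA into dS.
            stretch = norm(cross(r_theta(theta, phi), r_phi(theta, phi)))
            total += f(x, y, z) * stretch * d_theta * d_phi
    return total

# Integral of 1 dS: surface area of the hemisphere, close to 2*pi.
area_S = surface_integral(lambda x, y, z: 1.0)

# Integral of 1 dA: area of the flat rectangle D, exactly pi**2.
area_D = (2 * math.pi) * (math.pi / 2)

print(area_S, area_D)
```

The two numbers differ: integrating $1$ against $dS$ approximates the hemisphere's area $2\pi$, while integrating $1$ against $dA$ gives the area $\pi^2$ of the rectangle $D$ in the $(\theta,\varphi)$-plane. The factor $|\textbf{r}_\theta \times \textbf{r}_\varphi|$ is exactly what accounts for the difference.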