In Classical Electrodynamics, Jackson derives the electric potential of a dipole layer, i.e. a surface carrying a dipole-moment density.
Here is his derivation. I will omit constants for brevity.
Let $D(\textbf{x}) := \lim_{d(\textbf{x}) \to 0} \sigma(\textbf{x}) d(\textbf{x})$, where $d(\textbf{x})$ is the local separation of $S$ and $S'$, with $S$ having charge density $\sigma(\textbf{x})$ and $S'$ having equal and opposite charge density.
The potential due to the two surfaces is:
$$ \phi(\textbf{x}) = \int_S \frac{\sigma(\textbf{x}')}{|\textbf{x} - \textbf{x}'|} da' - \int_{S'} \frac{\sigma(\textbf{x}')}{|\textbf{x} - \textbf{x}' + \textbf{n}d|} da'' \tag{1}$$
where $\textbf{n}$ is the unit normal to the surface $S$ pointing away from $S'$.
He uses a Taylor expansion
$$ \frac{1}{|\textbf{x} + \textbf{a}|} = \frac{1}{x} + \textbf{a} \cdot \nabla \Big( \frac{1}{x} \Big) \tag{2}$$
He says this is valid when $|\textbf{a}| \ll |\textbf{x}|$ (and I assume $x := |\textbf{x}|$). Then as $d \to 0$ (and I believe he redefines $\textbf{x} := \textbf{x} - \textbf{x}'$ and $\textbf{a} := \textbf{n}d$) he arrives at
$$ \phi(\textbf{x}) = \int_S D(\textbf{x}') \textbf{n} \cdot \nabla'\Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) da' \tag{3}$$
and since $\textbf{p} = \textbf{n}\ D\ da'$ then the potential at $\textbf{x}$ caused by a dipole at $\textbf{x}'$ is
$$ \phi(\textbf{x}) = \frac{\textbf{p} \cdot (\textbf{x} - \textbf{x}')}{|\textbf{x} - \textbf{x}'|^3} \tag{4}$$
There are many steps I don't understand:
In (1), Jackson uses $\sigma(\textbf{x}')$ on $S$ and $-\sigma(\textbf{x}')$ on $S'$. But if $\textbf{x}'$ traces out $S$, wouldn't this be starting with the assumption that $S$ and $S'$ are the same surface?
Why is $|\textbf{a}| \ll |\textbf{x}|$ a necessary assumption to use the Taylor expansion? The 1D case would be analogous to expanding the function $1/(x+a)$ and I do not see a reason that $a \ll x$ is necessary to do this.
After substituting the Taylor expansion into (1) (and using $\textbf{x} := \textbf{x} - \textbf{x}'$ and $\textbf{a} := \textbf{n}d$) we get
$$ \phi(\textbf{x}) = \int_S \frac{\sigma(\textbf{x}')}{|\textbf{x} - \textbf{x}'|} da' - \int_{S'} \sigma(\textbf{x}') \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} + \textbf{n}d \cdot \nabla \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) \Big) da''$$
$$ \phi(\textbf{x}) = \int_S \frac{\sigma(\textbf{x}')}{|\textbf{x} - \textbf{x}'|} da' - \int_{S'} \sigma(\textbf{x}') \frac{1}{|\textbf{x} - \textbf{x}'|} da'' - \int_{S'} \sigma(\textbf{x}') \textbf{n}d \cdot \nabla \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) da''$$
which somehow reduces to (3). It seems like Jackson cancelled the first two terms, but how is this valid when we are integrating over $S$ in one and $S'$ in the other? Also, it seems like Jackson is missing a negative sign from the third term above. Finally, the third term above differs from (3) in that he switches from integrating over $S'$ to integrating over $S$. Is this change justified because, after the limiting process, the two surfaces coincide, allowing the swap?
EDIT: I just realized these issues (besides the missing negative sign) can be resolved by doing the limiting process first and then expanding the integral into two terms. But this raises the question: how do we know in which order to do these steps when modeling with differentials and limiting processes?
Should $\textbf{n}d$ be $\textbf{n}d(\textbf{x}')$?
I just don't see the jump from equation (3) to equation (4).
Equation (2) should be something like
$$ \frac{1}{|\textbf{x} + \textbf{a}|} = \frac{1}{x} + \textbf{a} \cdot \nabla \Big( \frac{1}{x} \Big) +...$$
but he is silently discarding the second- and higher-order terms.
Discarding those terms is what requires $|\textbf{a}| \ll |\textbf{x}|$. The expansion itself is just the standard Taylor expansion of $f(\textbf{x}+\textbf{a})$ about $\textbf{x}$ (for this function the full series converges whenever $|\textbf{a}| < |\textbf{x}|$). For example, see Lang, "Calculus of Several Variables", chapter 6.
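To make the size of the discarded terms concrete, here is a sketch of the expansion carried to second order (my own working, not Jackson's text; $\hat{\textbf{x}} := \textbf{x}/x$ is notation I am introducing):
$$ \frac{1}{|\textbf{x} + \textbf{a}|} = \frac{1}{x} - \frac{\textbf{a} \cdot \hat{\textbf{x}}}{x^2} + \frac{3(\textbf{a} \cdot \hat{\textbf{x}})^2 - a^2}{2x^3} + \dots $$
Each successive term is smaller than the previous one by a factor of order $a/x$, so truncating after the linear term is a good approximation only when $|\textbf{a}| \ll |\textbf{x}|$. In the dipole layer, with $\textbf{a} = \textbf{n}d$, the linear term multiplied by $\sigma$ is of order $\sigma d \to D$, while the quadratic term is of order $\sigma d^2 = (\sigma d)\, d$, which should vanish as $d \to 0$ with $D$ held finite.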
Your assumption is correct.
It's not really redefining so much as recycling the same symbol $\textbf{x}$ to mean "a general vector". That can be a bit annoying I suppose but it's quite a common practice so you'll just have to be wary about it.
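This means that the charge density at the point $\textbf{x}' - \textbf{n}d$ on $S'$ has the same magnitude as the charge density at the point $\textbf{x}'$ on $S$; the opposite sign is the explicit minus in front of the second integral. That is the reason he's using the normal vector $\textbf{n}$: the point on $S'$ lying along the normal from $\textbf{x}'$ is at distance $|\textbf{x} - (\textbf{x}' - \textbf{n}d)| = |\textbf{x} - \textbf{x}' + \textbf{n}d|$ from $\textbf{x}$, and it carries the value $\sigma(\textbf{x}')$ taken from $S$.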
As I mentioned above, this condition does not justify using the Taylor expansion, only the discarding of the higher-order terms of the expansion. The dipole term of the potential is basically the second term in the Taylor expansion anyway.
Jackson's result (3) is written with $\nabla'$, the gradient with respect to $\textbf{x}'$, rather than $\nabla$. See below for why that is important.
By Jackson's third equation, you mean (1.24), right?
The value $x'$ is the same for the two surfaces, so these cancel out. The difference between the two surfaces is in the final term of the integral.
$\nabla'$ is differentiation with respect to $\textbf{x}'$. Because the function depends only on $\textbf{x} - \textbf{x}'$, $\nabla'$ acting on it gives minus what $\nabla$ gives, and that negative sign is absorbed into Jackson's answer.
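As a sketch of where the sign goes (my own working, using the symbols of the question), the gradients of $1/|\textbf{x} - \textbf{x}'|$ with respect to the two variables are
$$ \nabla \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) = -\frac{\textbf{x} - \textbf{x}'}{|\textbf{x} - \textbf{x}'|^3}, \qquad \nabla' \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) = +\frac{\textbf{x} - \textbf{x}'}{|\textbf{x} - \textbf{x}'|^3} $$
so the one surviving term of the expanded potential, taken over $S$ in the limit $d \to 0$, becomes
$$ -\int_S \sigma(\textbf{x}') d\, \textbf{n} \cdot \nabla \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) da' = \int_S D(\textbf{x}')\, \textbf{n} \cdot \nabla' \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) da' $$
which is (3) with no sign missing.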
$x'$ is the same for either surface, the difference between the surfaces is all in the $nd$ part.
Strictly it is $d(\textbf{x}')$, since $d$ is defined as the local separation; the argument is simply suppressed in (1), and in the limit the dependence is absorbed into $D(\textbf{x}')$.
Jackson does not derive your equation (4) from (3). That equation is in the book for the purpose of comparison.
It is the potential of a point dipole. Jackson is a graduate-level text; in other words, it is for people who've already done undergraduate physics. The potential of an electric dipole, equation (4) in your question, is a standard part of an undergraduate electricity and magnetism course, so Jackson doesn't derive it.
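That said, the connection between the two is short. As a sketch (my own working, not a quotation from the book), evaluating the gradient in the integrand of (3) for a single surface element gives
$$ D(\textbf{x}')\, \textbf{n} \cdot \nabla' \Big( \frac{1}{|\textbf{x} - \textbf{x}'|} \Big) da' = \frac{(\textbf{n} D\, da') \cdot (\textbf{x} - \textbf{x}')}{|\textbf{x} - \textbf{x}'|^3} = \frac{\textbf{p} \cdot (\textbf{x} - \textbf{x}')}{|\textbf{x} - \textbf{x}'|^3} $$
so each element $da'$ of the layer contributes like a point dipole of moment $\textbf{p} = \textbf{n} D\, da'$, and (3) is the sum of those contributions over $S$.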