Say we have the function $(1+x)^{-1/2}$.
Using a Taylor series centered at $x_0=0$, it's easy to see that:
$$(1+x)^{-1/2}\approx1-\frac{1}{2}x+\frac{3}{8}x^2+\mathcal{O}(x^3)$$
In the above, $\mathcal{O}(x^3)$ just represents the higher-order terms. Having studied Taylor series, I understand the above approximation.
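(To make the approximation concrete, here is a minimal numeric sketch of my own, not from any textbook: it compares $(1+x)^{-1/2}$ with its second-order Taylor polynomial at a small $x$, where the error should be of order $x^3$.)

```python
# Compare (1+x)^(-1/2) with its 2nd-order Taylor polynomial
# 1 - x/2 + 3x^2/8 at a small value of x.
x = 0.1
exact = (1 + x) ** -0.5
approx = 1 - x / 2 + 3 * x**2 / 8
print(exact, approx, abs(exact - approx))  # error ~ 3e-4, i.e. O(x^3)
```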
However, in many physics textbooks, it's commonplace for the author to replace $x$ with whatever expression he feels like and make the same approximation.
For example, in Purcell's E&M, when explaining multi-pole expansions he writes:
However, while reading this, it occurred to me that I have never seen it explained why we can just replace any expression for $x$.
If someone could explain this, I'd really appreciate it! Thanks!
Here, perhaps this will help. Taylor's Theorem says:
$$f(x)\approx f(x_0)+f'(x_0)(x-x_0)+\frac{f''(x_0)}{2}(x-x_0)^2+\mathcal{O}\left((x-x_0)^3\right)$$
However, if we instead try to substitute in some other function for $x$, say $g(x)$, we couldn't just substitute in $g(x)-g(x_0)$ everywhere there is an $(x-x_0)$, right? Or could we?

Taylor's theorem says that (of course, this is not the most general version of the theorem)
The precise meaning of the $\mathcal{O}$ notation (I know this isn't what you asked, but bear with me) is that the remainder function $\rho_{n,x_0}: I \to \Bbb{R}$, defined by \begin{align} \rho_{n,x_0}(x):= f(x) - \left[f(x_0) + f'(x_0)(x-x_0) + \dots + \dfrac{f^{(n)}(x_0)}{n!}(x-x_0)^n\right] \end{align} satisfies the following condition (this condition gives a quantitative meaning to "the remainder is small")
Note that in all this business, things like $x$ and $x_0$ should be thought of as numbers. Honest to god numbers. So, $f(x)$ is a number! It is no longer a function anymore. $f'(x_0)$ is a number. Something like $f'''(\ddot{\smile})$ is also another number. The reason I keep saying "for all $x \in I$" is that I'm explicitly telling you that for any real number I pick, if that real number lies in the domain, $I$, of the function $f$, then the equations above are true. For example, suppose I take $x_0 = 0$, and suppose that the domain of $f$ is $I = \Bbb{R}$, the whole real line. Then,
And so on. Literally any real number $x$ you think of, as long as the number $x$ lies inside the domain of the function $f$, you can plug it into the above equations and they remain true.
It may seem silly to spend so much time on these simple cases, but that's exactly what we need to do to understand the fundamentals. Now, suppose I have two functions in the game, $f:I_f \to \Bbb{R}$ and $g:I_g \to I_f$, where $I_f, I_g \subset \Bbb{R}$ are intervals in the real line. Now, let's pick a number $x_0 \in I_f$, to "Taylor-expand the function $f$ about". Well, now let's pick ANY number $t \in I_g$. Then, $g(t)$ is a specific real number, which lies inside $I_f$ (the domain of $f$). Now, since $g(t)$ is a real number lying inside the domain of $f$, by Taylor's theorem, I can clearly say: \begin{align} \begin{cases} f(g(t)) &= f(x_0) + f'(x_0)(g(t) - x_0) + \dots + \dfrac{f^{(n)}(x_0)}{n!}(g(t) - x_0)^n + \rho_{n,x_0}(g(t)) \\ |\rho_{n,x_0}(g(t))| & \leq B_n|g(t) - x_0|^{n+1} \end{cases} \end{align}
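(Here is a small numeric sketch of my own illustrating exactly this point, using the question's $f(x) = (1+x)^{-1/2}$ and a hypothetical choice $g(t) = t^2$: since $g(t)$ is just a real number in the domain of $f$, we may evaluate $f$'s expansion at the point $g(t)$.)

```python
# A number g(t) in the domain of f can be fed to f's expansion.
# Here f(x) = (1+x)^(-1/2) expanded about x0 = 0, and g(t) = t^2
# is a hypothetical choice of g.
def taylor_f(x):          # 2nd-order Taylor polynomial of f at 0
    return 1 - x / 2 + 3 * x**2 / 8

def g(t):
    return t**2

t = 0.2
value = g(t)              # 0.04 -- just a real number in dom(f)
exact = (1 + value) ** -0.5
approx = taylor_f(value)
print(abs(exact - approx))  # tiny, since g(t) is close to x0 = 0
```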
Here's something to take note of: I'm not saying anything like "f is a function of $x$ or $g$ is a function of $t$" or anything like that, because really such statements are meaningless in this context. All I care about is functions, their domains, and numbers. That's it.
Never EVER get hung up on what letters we use. Math does NOT care what your favourite letter is (forgive the caps... don't think of this as shouting... I really just want to emphasize an obvious fact, which sometimes people seem to forget; I know I sure forget this from time to time). So, don't pay much attention to the fact that I used the letter $t$ instead of $x$. If you want, I can say the following statement, and it says literally the same thing as what I said above:
Just to emphasize once again that symbols shouldn't change the intended meaning, note that the following statement is just as mathematically valid:
One more time just for the sake of fun:
In each of these statements, $t, x, \ddot{\smile}, \#$ were all just names/symbols I gave to specific numbers in the domain $I_g$. Therefore, $g(t), g(x), g(\ddot{\smile}), g(\#)$ are all specific real numbers which lie in $I_f$, which happens to be the domain of $f$.
So, if you're ever in doubt if you can plug something into a function, just ask yourself one very simple question: is the thing I'm about the plug in part of the domain of validity of my function? If the answer is "yes", then of course, you're allowed to plug it in, otherwise, you can't (simply by definition of "domain of a function").
By the way, I know I haven't directly addressed your question about the multipole expansion. The reason is that your problem seemed to be more of a conceptual one about what one means by substitution (lol, I remember being confused by these matters too). Given what I've written so far, I invite you to read through the multipole argument again, and try to convince yourself that the manipulations are all valid. If you still have trouble, then let me know.
Edit: Responding to OP's comments.
The bounding condition on the $(n+1)^{th}$ derivative has nothing to do really with plugging in a number like $g(t)$, because like I mentioned in my first sentence, the theorem stated above is not the most general version. Here is the version of Taylor's theorem which I first learnt, and which has the weakest hypotheses:
The precise meaning of the little-$o$ notation here is as follows: we first define the "remainder function" $\rho_{n,x_0}: I \to \Bbb{R}$ as before: \begin{align} \rho_{n,x_0}(x):= f(x) - \left[f(x_0) + f'(x_0)(x-x_0) + \dots + \dfrac{f^{(n)}(x_0)}{n!}(x-x_0)^n\right] \end{align} Then, the claim is that \begin{align} \lim_{x \to x_0} \dfrac{\rho_{n,x_0}(x)}{(x-x_0)^n} &= 0. \end{align}
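(A quick numeric sanity check of this little-$o$ claim, my own sketch: for a concrete choice $f(x) = e^x$, $n = 2$, $x_0 = 0$, the ratio $\rho_{n,x_0}(x)/x^2$ should shrink to $0$ as $x \to 0$.)

```python
import math

# Check the little-o claim numerically for f(x) = exp(x), n = 2,
# x0 = 0: the ratio rho(x) / x^2 should shrink as x -> 0.
def rho(x, n=2):
    taylor = sum(x**k / math.factorial(k) for k in range(n + 1))
    return math.exp(x) - taylor

ratios = [rho(x) / x**2 for x in (0.1, 0.01, 0.001)]
print(ratios)  # roughly x/6 each time: ~0.017, ~0.0017, ~0.00017
```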
Now, for the sake of notation let me introduce $T_{n,f,x_0}:I \to \Bbb{R}$ to mean the Taylor polynomial of $f$ of order $n$, based at the point $x_0$. So, we have by definition that $f = T_{n,f,x_0} + \rho_{n,f,x_0}$ (because $\rho_{n,f,x_0}$ is literally defined as $f- T_{n,f,x_0}$).
Notice the differences between this version of the theorem and the previous version:
So, you're right, the $B_n$ is somehow related to the $(n+1)^{th}$ derivative. This form of the bound on the remainder is clearly very good, because if you have a specific function, you can try to estimate an upper bound for the derivative, and then you get a really explicit bound on the remainder: $|\rho_{n,x_0}(x)| \leq B_n |x-x_0|^{n+1}$. It tells you literally that the remainder is always smaller than a certain $(n+1)$-order polynomial. And for example, if you take $x= x_0 + 0.1$, then $|\rho_{n,x_0}(x_0 + 0.1)| \leq B_n |0.1|^{n+1}$. If you take a number $x$ which is even closer to $x_0$, then clearly you can make the RHS extremely small, extremely "quickly", because of the power $n+1$.
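(Here is a numeric sketch of my own of this explicit bound, for the concrete choice $f(x) = e^x$ about $x_0 = 0$: on $[-1,1]$ the $(n+1)^{th}$ derivative of $e^x$ is at most $e$, so $B_n = e/(n+1)!$ works, and the remainder indeed sits under $B_n|x|^{n+1}$.)

```python
import math

# Check the explicit bound |rho_n(x)| <= B_n |x|^{n+1} for
# f(x) = exp(x) about x0 = 0.  On [-1, 1] the (n+1)th derivative
# of exp is at most e, so B_n = e / (n+1)! works.
n = 3
B = math.e / math.factorial(n + 1)

for x in (0.5, 0.1, 0.01):
    taylor = sum(x**k / math.factorial(k) for k in range(n + 1))
    remainder = abs(math.exp(x) - taylor)
    bound = B * abs(x) ** (n + 1)
    assert remainder <= bound
    print(x, remainder, bound)
```

Notice how dividing $x$ by 10 divides the bound by $10^{n+1}$, which is exactly the "extremely small, extremely quickly" behaviour described above.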
Anyway, the reason I mentioned this form of Taylor's theorem is to say that regardless of the bound on the $(n+1)^{th}$ derivative, you can always plug in another function's values, $g(t)$, as long as the composition $f \circ g$ makes sense. That's the only restriction you have. More explicitly (with notation very similar to the one above),
This is trivially true, and you don't even need Taylor's theorem for this. Why? Because each equality I wrote above, $:=$, is true by definition (that's why I put the "$:$" in front of "$=$"). Why is it true by definition? Because I first defined $T_{n,f,x_0}$ to be a certain function (namely the Taylor polynomial), and then I defined the remainder $\rho_{n,f,x_0}$ to be $f- T_{n,f,x_0}$, so of course it's trivially true that $f = T_{n,f,x_0} + \rho_{n,f,x_0}$. Said another way, all I did is add and subtract the same thing; it is as trivial as saying something like $1 = (\pi^e) + (1-\pi^e)$. The non-trivial part is in saying that \begin{align} \lim_{x \to x_0}\dfrac{\rho_{n,f,x_0}(x)}{(x-x_0)^n} &= 0. \end{align} Suppose we have that $g(0) = x_0$. Then, what you should NOT do is make any false inferences like \begin{align} \lim_{t \to 0} \dfrac{\rho_{n,f,x_0}(g(t))}{t^n} &= 0 \end{align}
Anyway, the major conclusion here is that: As long as the composition $f \circ g$ makes sense, I can always write things like $f(g(t))$. And of course, once you think about this for a while, it becomes one of the most obvious things in the world.
Note that what I've been talking about so far is "Taylor's theorem", which deals with "Taylor polynomials", and NOT "Taylor series". A polynomial has a finite sum of terms, while a series is defined as a limit of partial sums of finitely many terms. And this is probably more of what you're confused about in your comment.
One is very much tempted to write things like $T_{f,x_0} = \sum_{k=0}^{\infty}\dfrac{f^{(k)}(x_0)}{k!}(x-x_0)^k$, and call it the Taylor series of $f$ around $x_0$, and then say something like $f(x) = T_{f,x_0}(x)$, so that the function $f$ is equal to its Taylor series. But of course, before you can do this, you have to clarify a few things first:
Then, we define $C_{f,x_0} := \{x \in I_f| \, \, \lim_{n \to \infty}T_{n,f,x_0}(x) \text{ exists}\}$; i.e. this is the set of points in the domain of $f$ for which the series converges ($C$ for convergence lol) to a (finite) number. Well, we know for sure that $x_0 \in C_{f,x_0}$, because we're simply taking the limit $\lim_{n \to \infty} T_{n,f,x_0}(x_0) = \lim_{n \to \infty}f(x_0) = f(x_0)$; i.e. this limit exists. In standard analysis texts, one proves that $C_{f,x_0}$ is actually an interval; i.e. if $x \in C_{f,x_0}$, then any number $\xi$ such that $|\xi- x_0| < |x-x_0|$ will also lie in $C_{f,x_0}$, i.e. $\xi \in C_{f,x_0}$. This is why we call $C_{f,x_0}$ the interval of convergence.
So, as a summary, to write something like $f(x) = T_{f,x_0}(x) = \sum_{k=0}^{\infty}\dfrac{f^{(k)}(x_0)}{k!}(x-x_0)^k$, one has to check two things:
It is only with these two conditions being satisfied that we can say that $f(x) = T_{f,x_0}(x)$.
An example:
Here's a very simple example. Let $I = \Bbb{R} \setminus\{1\}$, and define the function $f: I \to \Bbb{R}$ by \begin{align} f(x) &:= \dfrac{1}{1-x}. \end{align} Then, you can check that $f$ is infinitely differentiable at the origin, and that for every $k \geq 0$, $f^{(k)}(0) = k!$. So, the $n$-th Taylor polynomial for $f$ about the origin is \begin{align} T_{n,f, x_0 = 0}(x) &= \sum_{k=0}^{n} \dfrac{k!}{k!} x^k = \sum_{k=0}^n x^k = \dfrac{1-x^{n+1}}{1-x}. \end{align} Now, it is easy to see that the limit \begin{align} \lim_{n \to \infty} T_{n,f,x_0=0}(x) \end{align} exists if and only if $|x|< 1$ (if this isn't clear, refer to any standard calculus/analysis text; this will be explained in more detail). Also, it is clear that for $|x|<1$, the limit as $n \to \infty$ is $\dfrac{1}{1-x}$. Thus, we have seen that
i.e. it is only for $|x|<1$ that the Taylor series of $f$ converges, AND actually equals $f$.
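(A quick numeric sketch of my own verifying both the closed form $T_{n,f,x_0=0}(x) = \frac{1-x^{n+1}}{1-x}$ and the convergence of the Taylor polynomials to $f$ for an $|x| < 1$.)

```python
# Verify the closed form T_n(x) = (1 - x^{n+1}) / (1 - x) for the
# Taylor polynomials of f(x) = 1/(1-x) about 0, and watch them
# converge to f(x) at an |x| < 1.
def T(n, x):
    return sum(x**k for k in range(n + 1))

x = 0.5
f_x = 1 / (1 - x)
for n in (2, 5, 20):
    closed = (1 - x ** (n + 1)) / (1 - x)
    assert abs(T(n, x) - closed) < 1e-12
    print(n, abs(T(n, x) - f_x))  # shrinks toward 0 as n grows
```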
For example, let's now define $g: \Bbb{R} \to \Bbb{R}$ by $g(t):= t^2$. Here are a couple of statements we can make which hopefully illustrate the key points:
When can we write down $f(g(t))$? Well, by definition, we can do this if and only if $g(t) \in I_f = \Bbb{R} \setminus \{1\}$. i.e if and only if $g(t) = t^2 \neq 1$. i.e if and only if $t \notin \{-1, 1\}$. Repeating, for every $t \in \Bbb{R} \setminus \{-1,1\}$, we have that $g(t) \in I_f$, so \begin{align} f(g(t)) &= \dfrac{1}{1-g(t)} = \dfrac{1}{1-t^2} \end{align} (this shouldn't be surprising because it is pretty much a review of the definition of composition of functions).
Writing $f(g(1))$ is nonsense, because $g(1) = 1$ is not in the domain of $f$.
For every $t \in \Bbb{R} \setminus \{-1,1\}$, and every $n \geq 0$, we have that \begin{align} f(g(t)) &= T_{n,f,x_0=0}(g(t)) + \rho_{n,f,x_0=0}(g(t))\\ f(t^2) &= T_{n,f,x_0=0}(t^2) + \rho_{n,f,x_0=0}(t^2) \\ &= \sum_{k=0}^n t^{2k} + \rho_{n,f,x_0=0}(t^2) \end{align} Again, this is simply true by definition of how the remainder $\rho_{n,f,x_0=0}$ is defined (think back to the trivially true equation $1 = (\pi^e) + (1-\pi^e)$). The non-trivial statement (which is exactly the statement made in Taylor's theorem) is that \begin{align} \lim_{x \to 0}\dfrac{\rho_{n,f,x_0=0}(x)}{x^n} = 0 \end{align}
Another true statement is the following: we have $|g(t)| < 1$ if and only if $|t| < 1$. So, for every real number $t$ such that $|t|<1$, we have \begin{align} \dfrac{1}{1-t^2} &= f(t^2)\\ &= T_{f,x_0=0}(t^2) \tag{since $|t|< 1 \implies |t^2| < 1$}\\ &= \sum_{k=0}^{\infty}(t^2)^k \\ &= \sum_{k=0}^{\infty}t^{2k}. \end{align} Again, at this point don't be confused by the symbols. Everything is a number. $t$ is a number such that $|t|<1$. So, $t^2$ is also a number such that $|t^2| < 1$. So, of course, I can plug it into the Taylor series (which I've shown converges and equals the function $f$ on the interval $(-1,1)$). Again, think of particular numbers. $|0.1|< 1$, so $0.1^2 = 0.01$ clearly satisfies $|0.01|<1$. So, \begin{align} \dfrac{1}{1-0.01} &= f(0.01)\\ &= T_{f,x_0=0}(0.01) \tag{since $|0.01|< 1$}\\ &= \sum_{k=0}^{\infty}(0.01)^k \end{align} When you think of everything as particular numbers (which is exactly how you should think of them anyway), it becomes extremely easy to convince yourself that these manipulations are true.
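(The same manipulation, as a numeric sketch of my own: since $|t| < 1$ implies $|t^2| < 1$, the number $t^2$ is a legitimate input to the geometric series, and the partial sums of $\sum_k t^{2k}$ indeed approach $\frac{1}{1-t^2}$.)

```python
# Since |t| < 1 implies |t^2| < 1, the number t^2 may be plugged
# into the geometric series: sum over k of t^(2k) = 1/(1 - t^2).
t = 0.3
exact = 1 / (1 - t**2)
partial = sum(t ** (2 * k) for k in range(50))
print(exact, abs(exact - partial))  # agreement to machine precision
```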
On a similar note, it is very important to remember that $f(x) = T_{f,x_0=0}(x)$ if and only if $|x| < 1$. This is in spite of the fact that the function $f$ is defined from $\Bbb{R} \setminus\{1\} \to \Bbb{R}$; the point is that the series on the RHS only converges when $|x| < 1$ (and when this happens, it also happens to equal the function $f$). For example, $f(2)$ clearly makes sense, because $2 \in \text{domain}(f) = \Bbb{R} \setminus\{1\}$; also $f(2) = \frac{1}{1-2} = -1$. However, writing something like $T_{f,x_0=0}(2)$ is complete nonsense, because the limit \begin{align} \lim_{n \to \infty}T_{n,f,x_0=0}(2) = \lim_{n \to \infty} \sum_{k=0}^n 2^k = \infty \end{align} is not a (finite) number; i.e. the limit doesn't exist in $\Bbb{R}$.
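(One last numeric sketch of my own, making this contrast vivid: $f(2)$ is a perfectly good number, while the partial sums of the Taylor series at $x = 2$ blow up.)

```python
# f(2) = 1/(1-2) = -1 makes perfect sense, but the Taylor series
# at x = 2 does NOT: its partial sums grow without bound.
f_of_2 = 1 / (1 - 2)
partial_sums = [sum(2**k for k in range(n + 1)) for n in range(10)]
print(f_of_2)        # -1.0
print(partial_sums)  # [1, 3, 7, 15, ..., 1023]
```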
Hopefully these remarks show you what statements you can and can't make in regards to substituting things inside functions. As a summary:
When can I substitute one function's values inside another, like $f(g(t))$? Answer: whenever $t\in \text{domain}(g)$ and $g(t) \in \text{domain}(f)$ (this is literally the definition of composition).
The equation $f(x) = T_{n,f,x_0}(x) + \rho_{n,f,x_0}(x)$ is true for every number $x \in \text{domain}(f)$, simply because I defined the terms on the RHS such that this equation is true. (think of this as the $1 = (\pi^e) + (1-\pi^e)$ business).
A completely different question is asking where the Taylor series of a function $f$ converges, and does it equal the function $f$? To answer this question, refer to my discussion above.