The textbook Numerical Solution of Boundary Value Problems for Ordinary Differential Equations (by Ascher, Mattheij, and Russell) states on page 329:
[W]e observe that Newton's method is affine invariant; i.e., if $$\hat{f}(s)=Bf(s)$$ where $B$ is a constant, nonsingular matrix, then Newton's method yields precisely the same sequence of iterates for the two functions $f$ and $\hat{f}$. (Note that $f$ and $\hat{f}$ have the same zeroes.)
And I've seen pretty much the same definition of affine invariance in other texts.
I get the invariance part: since $\hat{f}'(s)=Bf'(s)$, $$\hat{f}'(s)^{-1}\hat{f}(s)=(Bf'(s))^{-1}Bf(s)=f'(s)^{-1}B^{-1}Bf(s)=f'(s)^{-1}f(s),$$ so the Newton step is the same for both functions. But if $B$ is just any nonsingular matrix, what is so affine about it? E.g., if we compose $f$ with a translation, the invariance no longer seems to hold. Why do we call the invariance affine?
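To make the two observations concrete, here is a small numerical sketch in plain Python. Everything in it is an assumption chosen for illustration and not taken from the textbook: the test function `f`, the matrix `B`, the shift `c`, the starting point `s0`, and the helper names. It runs Newton's method on $f$, on $\hat{f}=Bf$, and on $g(s)=f(s+c)$, all from the same starting point, and compares the iterates.

```python
# Illustrative 2x2 test problem (an assumption, not from the textbook):
# f(x, y) = (x^2 + y^2 - 4, x*y - 1)
def f(s):
    x, y = s
    return [x * x + y * y - 4.0, x * y - 1.0]

def jac_f(s):
    x, y = s
    return [[2.0 * x, 2.0 * y], [y, x]]

B = [[2.0, 1.0], [1.0, 3.0]]   # constant matrix, det = 5, so nonsingular
c = [0.5, -0.25]               # translation applied to the argument of f

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def matmul(A, C):
    return [[sum(A[i][k] * C[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def solve2(A, b):
    # Solve the 2x2 linear system A x = b by Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def newton(func, jac, s0, steps=6):
    # Plain Newton iteration: s_{k+1} = s_k - J(s_k)^{-1} func(s_k).
    s = list(s0)
    iterates = [s]
    for _ in range(steps):
        delta = solve2(jac(s), func(s))
        s = [s[0] - delta[0], s[1] - delta[1]]
        iterates.append(s)
    return iterates

# f_hat(s) = B f(s), with Jacobian B f'(s)
def f_hat(s):
    return matvec(B, f(s))

def jac_f_hat(s):
    return matmul(B, jac_f(s))

# g(s) = f(s + c), with Jacobian f'(s + c)
def g(s):
    return f([s[0] + c[0], s[1] + c[1]])

def jac_g(s):
    return jac_f([s[0] + c[0], s[1] + c[1]])

def max_diff(seq_a, seq_b):
    # Largest componentwise gap between two iterate sequences.
    return max(abs(a - b) for p, q in zip(seq_a, seq_b) for a, b in zip(p, q))

s0 = [2.0, 0.3]
its_f = newton(f, jac_f, s0)
its_hat = newton(f_hat, jac_f_hat, s0)
its_g = newton(g, jac_g, s0)

max_diff_hat = max_diff(its_f, its_hat)        # f versus B f
max_diff_g = max_diff(its_f[1:], its_g[1:])    # f versus f(. + c)

print("max gap between iterates of f and B f:    ", max_diff_hat)
print("max gap between iterates of f and f(. + c):", max_diff_g)
```

For this particular example, the first gap should be at the level of floating-point rounding, while the second is clearly nonzero, which matches the identity above and the observation about translations. It is only one example with one starting point, so it illustrates the question without settling the terminology.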