Basically I'm trying to find good starting values for algorithms that determine the roots of a polynomial (e.g. Newton's method). Obviously we want to start as close to a root as we can, but how can we estimate where the roots of a polynomial lie?
Is an argument like: "If the coefficients are relatively small compared to the degree of the polynomial, then the magnitude of the roots is somewhere near the coefficients" correct?
Are there counterexamples of polynomials with very small coefficients and very large roots?
There exist estimates for the size of the largest root. The most general go back to the idea that $z$ is not a root of $$ p(z)=a_nz^n+a_{n-1}z^{n-1}+...+a_1z+a_0 $$ if $|z|>R>0$ with an outer root radius $R$ that satisfies the inequality $$ |a_n|R^n\ge |a_{n-1}|R^{n-1}+...+|a_1|R+|a_0| $$ This polynomial inequality for $R$ is easier to solve numerically than zeroing in on any specific root of the original polynomial, especially since only a low relative accuracy is needed for the further numerical purposes. The smallest such $R$ is the positive root of the polynomial $|a_n|R^n-|a_{n-1}|R^{n-1}-...-|a_1|R-|a_0|$. Its coefficient sequence has only one sign change, so by Descartes' rule of signs there is exactly one positive root. This situation allows for the safe use of simple scalar root-finding methods like Newton's method.
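To illustrate, here is a minimal Python sketch of that computation. The function name `cauchy_root_radius`, the coefficient order `[a_0, ..., a_n]`, the starting value and the tolerance are my own choices for illustration, not part of any library. It runs Newton's method on $q(R)=|a_n|R^n-|a_{n-1}|R^{n-1}-...-|a_0|$ starting from an upper bound. To the right of its positive root $q$ is increasing and convex, so the iterates decrease towards the root and every iterate is itself a valid bound.

```python
def cauchy_root_radius(coeffs, rel_tol=1e-3, max_iter=100):
    """Outer root radius R for p(z) = sum(coeffs[k] * z**k).

    coeffs is [a_0, a_1, ..., a_n] with a_n != 0 (assumed ordering).
    Only a low relative accuracy is requested by default.
    """
    mags = [abs(c) for c in coeffs]
    lead, lower = mags[-1], mags[:-1]
    if lead == 0:
        raise ValueError("leading coefficient must be nonzero")
    if not any(lower):
        return 0.0  # p(z) = a_n z^n, all roots are at 0

    # Start from a simple over-estimate, i.e. to the right of the root of q.
    r = 1.0 + max(lower) / lead
    for _ in range(max_iter):
        # Horner scheme for q(r) and q'(r),
        # q(R) = lead*R^n - lower[n-1]*R^(n-1) - ... - lower[0]
        q, dq = lead, 0.0
        for m in reversed(lower):
            dq = dq * r + q
            q = q * r - m
        step = q / dq
        r -= step
        if abs(step) <= rel_tol * r:
            break
    return r
```

For example, `cauchy_root_radius([2, -3, 1])` for $p(z)=z^2-3z+2$ (roots 1 and 2) approximates the positive root $(3+\sqrt{17})/2\approx 3.56$ of $R^2-3R-2$.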
But one can also obtain simple (over-)estimates like $$ R=\max\left(1,\frac{|a_{n-1}|+...+|a_0|}{|a_n|}\right) $$ or $$ R=1+\frac{\max_{k<n} |a_k|}{|a_n|} $$
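As a quick sketch, both simple estimates can be evaluated directly from the coefficient magnitudes. The function name `simple_root_bounds` and the coefficient order `[a_0, ..., a_n]` are assumptions for this example, and the degree is assumed to be at least 1.

```python
def simple_root_bounds(coeffs):
    """Return (sum_bound, max_bound) for p(z) = sum(coeffs[k] * z**k).

    coeffs is [a_0, a_1, ..., a_n] with a_n != 0 and n >= 1 (assumed ordering).
    """
    mags = [abs(c) for c in coeffs]
    lead, lower = mags[-1], mags[:-1]
    # R = max(1, (|a_{n-1}| + ... + |a_0|) / |a_n|)
    sum_bound = max(1.0, sum(lower) / lead)
    # R = 1 + max_{k<n} |a_k| / |a_n|
    max_bound = 1.0 + max(lower) / lead
    return sum_bound, max_bound
```

For $p(z)=z^2-3z+2$ with roots 1 and 2, `simple_root_bounds([2, -3, 1])` gives the bounds 5 and 4.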
These estimates support the general idea: if the coefficients are small relative to the leading coefficient, then the roots will also be small. What matters is the ratio to $|a_n|$, not the absolute size of the coefficients; for instance $\epsilon z-1$ has small coefficients but the large root $1/\epsilon$.
The last two bounds always give $R\ge 1$. To get beyond that restriction, "guess" a smaller scale $\rho$ and compute the root bound $R_\rho$ from $p_\rho(z)=p(\rho z)$. Then $R=\rho R_\rho$ gives a better bound. $\rho$ can be estimated as a power of 2 from the exponents of the coefficients as floating-point numbers. The aim is that the coefficient sequence of $p_\rho$ is about balanced, with the leading coefficient dominating and at least one other coefficient of similar magnitude.
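One possible way to carry out this scaling is sketched below. The specific rule for picking the exponent of $\rho$ (the largest rounded-up value of (exponent of $a_k$ minus exponent of $a_n$) divided by $n-k$) is my own assumption about how to "balance" the coefficients, and `scaled_root_bound` is a hypothetical name. Any $\rho>0$ yields a valid bound, so the heuristic only affects how tight the result is.

```python
import math

def scaled_root_bound(coeffs):
    """Root bound R = rho * R_rho using p_rho(z) = p(rho * z).

    coeffs is [a_0, a_1, ..., a_n] with a_n != 0 (assumed ordering).
    rho is a power of 2 chosen from the binary exponents of the coefficients.
    """
    mags = [abs(c) for c in coeffs]
    n = len(mags) - 1
    lead = mags[n]
    e_lead = math.frexp(lead)[1]  # binary exponent of |a_n|

    # Heuristic (assumption): pick e so that |a_k| * rho^k <~ |a_n| * rho^n
    # for every k < n, with near equality for at least one k.
    exps = [math.ceil((math.frexp(m)[1] - e_lead) / (n - k))
            for k, m in enumerate(mags[:n]) if m > 0]
    if not exps:
        return 0.0  # p(z) = a_n z^n, all roots are at 0
    rho = math.ldexp(1.0, max(exps))  # rho = 2**e

    # Coefficient magnitudes of p_rho are |a_k| * rho^k.
    scaled = [m * rho ** k for k, m in enumerate(mags)]
    r_rho = 1.0 + max(scaled[:n]) / scaled[n]
    return rho * r_rho
```

For $p(z)=z^2-10^{-6}$ with roots $\pm 10^{-3}$, the unscaled bound is slightly above 1, while `scaled_root_bound([-1e-6, 0, 1])` picks $\rho=2^{-10}$ and returns a bound of about $2\cdot 10^{-3}$.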
I'd recommend studying the technical report on the Jenkins-Traub RPOLY method. I have also reproduced some of that in the corresponding Wikipedia article.