Let $f:\mathbb{R}^n\to\mathbb{R}$ be a continuously differentiable function and let $\nabla f$ be its gradient. I am interested in approximating this gradient near some point $x_0$ using a Taylor expansion of the form $$ \nabla f(x)=\nabla f(x_0)+\nabla^2 f(x_0)(x-x_0)+O\left(\|x-x_0\|^2\right) $$ where $\|x-x_0\|$ is some norm of $x-x_0$ and $\nabla^2 f$ is the Hessian matrix of $f$. Is this expansion correct? Does it matter which norm I use?
If the expansion is correct, how should we think of the vector $O\left(\|x-x_0\|^2\right)$? If $n=1$, this notation means that there exist $M>0$ and $\delta>0$ such that $$\left|O\left(\|x-x_0\|^2\right)\right|\leq M\|x-x_0\|^2$$ for all $x$ with $|x-x_0|<\delta$. But how does this work when $n>1$? Does the condition hold element by element? If so, are the $M$'s and $\delta$'s the same across elements?
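To make the question concrete, here is a small numerical sanity check, offered only as a sketch. It assumes a particular smooth test function, $f(x_1,x_2)=\sin(x_1)e^{x_2}+x_1^2x_2$, which is my own choice and much smoother than the hypothesis above requires; the names `grad`, `hessian`, `remainder` and `remainder_ratios` are likewise hypothetical helpers. It computes the remainder $\nabla f(x_0+h)-\nabla f(x_0)-\nabla^2 f(x_0)h$ and divides its norm by $\|h\|^2$ for the 1-, 2- and $\infty$-norms as $h$ shrinks.

```python
import math

# Assumed test function (not from the question): f(x1, x2) = sin(x1)*exp(x2) + x1^2 * x2
def grad(x):
    """Analytic gradient of the assumed test function."""
    x1, x2 = x
    return [math.cos(x1) * math.exp(x2) + 2 * x1 * x2,
            math.sin(x1) * math.exp(x2) + x1 ** 2]

def hessian(x):
    """Analytic Hessian of the assumed test function."""
    x1, x2 = x
    off = math.cos(x1) * math.exp(x2) + 2 * x1
    return [[-math.sin(x1) * math.exp(x2) + 2 * x2, off],
            [off, math.sin(x1) * math.exp(x2)]]

# Three common norms on R^n
NORMS = {
    "1": lambda v: sum(abs(c) for c in v),
    "2": lambda v: math.sqrt(sum(c * c for c in v)),
    "inf": lambda v: max(abs(c) for c in v),
}

def remainder(x0, h):
    """Vector remainder: grad f(x0 + h) - grad f(x0) - Hessian(x0) @ h."""
    g0, H = grad(x0), hessian(x0)
    g1 = grad([a + b for a, b in zip(x0, h)])
    n = len(x0)
    return [g1[i] - g0[i] - sum(H[i][j] * h[j] for j in range(n))
            for i in range(n)]

def remainder_ratios(x0, direction, ts):
    """For each norm, return ||remainder|| / ||h||^2 along h = t * direction."""
    out = {name: [] for name in NORMS}
    for t in ts:
        h = [t * d for d in direction]
        r = remainder(x0, h)
        for name, norm in NORMS.items():
            out[name].append(norm(r) / norm(h) ** 2)
    return out

ratios = remainder_ratios([0.5, -0.3], [0.6, -0.8], [1e-1, 1e-2, 1e-3, 1e-4])
for name, vals in ratios.items():
    print(name, [round(v, 4) for v in vals])
```

For this example the ratios stay bounded in all three norms as $h\to 0$, with a different constant for each norm. That fits the fact that all norms on $\mathbb{R}^n$ are equivalent, though a single test function along a single direction is of course not a proof.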
EDIT: Fixed the expansion