The following is part of Problem 12 in Milnor's Topology from the Differentiable Viewpoint. Some context is needed from Problem 11, which I will give here. Let $M$ be a manifold embedded in $\mathbb{R}^k$.
From Problem 11: The normal bundle space $$E=\{(x,v)\in M\times \mathbb{R}^k\;|\;v\perp TM_x\}$$ is a smooth manifold. If $M$ is compact and boundaryless, then the correspondence $$(x,v)\mapsto x+v$$ from $E$ to $\mathbb{R}^k$ maps some $\epsilon$-neighborhood of $M\times 0$ in $E$ diffeomorphically onto the $\epsilon$-neighborhood $N_\epsilon$ of $M$ in $\mathbb{R}^k$.
Problem 12: Define $r:N_\epsilon\to M$ by $r(x+v)=x$. Show that $r(x+v)$ is closer to $x+v$ than any other point of $M$.
I have done the latter part of Problem 12; my question is how to do the part I have stated here. Intuitively, this makes perfect sense, but I am not sure how to justify the statement using what is either given or presupposed in the book. Any help would be appreciated.
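Let me name some of the things in your post: write $f:E\to\mathbb{R}^k$ for the map $f(x,v)=x+v$, and write $E_\epsilon=\{(x,v)\in E\;|\;|v|<\epsilon\}$ for the $\epsilon$-neighborhood of $M\times 0$ in $E$, so that by Problem 11 the map $f$ restricts to a diffeomorphism from $E_\epsilon$ onto $N_\epsilon$.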
Suppose that there exists a point in $M$ which is closer to $x+v$ than $x=r(x+v)$ itself. By compactness of $M$, there exists a point $p \in M$ which minimizes the distance to $x+v$, and it follows that $$d(x+v,p) < d(x+v,r(x+v)) = |v| $$ Consider the segment $\overline{x+v,p}$. Because $p \in M$ minimizes the distance to $x+v$, this segment meets $M$ orthogonally at the point $p$. Geometrically this is kind of obvious. For a rigorous proof using calculus, just apply the Lagrange multiplier method of constrained optimization.
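For completeness, here is a sketch of the orthogonality step that uses curves in $M$ instead of Lagrange multipliers. The names $y$, $u$, $\gamma$ and $h$ are introduced only for this sketch. Write $y = x+v$, let $u \in T_pM$ be any tangent vector, and choose a smooth curve $\gamma:(-\delta,\delta)\to M$ with $\gamma(0)=p$ and $\gamma'(0)=u$ (possible because $M$ has no boundary). The function $$h(t) = |\gamma(t)-y|^2$$ has a minimum at $t=0$, because $p$ minimizes the distance to $y$ among points of $M$. Therefore $$0 = h'(0) = 2\,\langle \gamma'(0),\, \gamma(0)-y\rangle = 2\,\langle u,\, p-y\rangle $$ Since $u \in T_pM$ was arbitrary, $y - p \perp T_pM$, which is the statement that the segment meets $M$ orthogonally at $p$.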
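It follows that $$f(x,v) = x+v = p+w = f(p,w) $$ where the vector $w = x+v-p$ satisfies $w \perp T_p M$ and $$|w| = d(p+w,p) = d(x+v,p) < |v| < \epsilon $$ Thus both $(x,v)$ and $(p,w)$ are contained in $E_\epsilon$, and they are distinct because $|w| < |v|$. Hence $f$ is not one-to-one on $E_\epsilon$, a contradiction. The same argument handles ties: if no point of $M$ is strictly closer to $x+v$ than $x$, then a point $p \neq x$ of $M$ at distance exactly $|v|$ from $x+v$ would also minimize the distance, so $(p,w)$ would again lie in $E_\epsilon$ with $(p,w) \neq (x,v)$ and $f(p,w) = f(x,v)$.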
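As a sanity check, here is how this plays out in a simple example that is not taken from the book: the unit circle $M = S^1 \subset \mathbb{R}^2$. The normal vectors at $x \in S^1$ are the multiples $v = tx$ with $t \in \mathbb{R}$, so $$f(x,tx) = (1+t)\,x $$ For $\epsilon \le 1$ this map is one-to-one on $|t| < \epsilon$, with image the annulus $1-\epsilon < |y| < 1+\epsilon$, and $r(y) = y/|y|$. For any $q \in S^1$, the Cauchy-Schwarz inequality gives $$|y-q|^2 = |y|^2 - 2\langle y,q\rangle + 1 \;\ge\; |y|^2 - 2|y| + 1 = (|y|-1)^2 = |y - r(y)|^2 $$ with equality only when $q = y/|y| = r(y)$. So $r(y)$ is strictly closer to $y$ than any other point of the circle. Note that injectivity fails once $t=-1$ is allowed, since every $(x,-x)$ maps to the origin, and the origin has no unique nearest point on the circle.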