Look at this part:
Define the vector $p = -\nabla f(x^*)$ and note that $p^T\nabla f(x^*) = -||\nabla f(x^*)||^2 <0$. Because $\nabla f$ is continuous near $x^*$, there is a scalar $T>0$ such that
$p^T\nabla f(x^*+tp) <0, \forall t\in [0,T]$
Why does the continuity of the gradient imply that? I understand that because the gradient is continuous, we can move around smoothly and retain the sign. But I'd suppose that works for $\nabla f$ only. Why does it work for $p^T\nabla f(x^*+tp)$?
Also, what if I chose $p = \nabla f(x^*)$ instead of the negative?

I think what you're missing is the following fact: if $F:\mathbb{R}^n\rightarrow\mathbb{R}$ is continuous and satisfies $F(x_0)<0$, then there exists some $\delta>0$ such that $F(x)<0$ for all $x$ with $|x-x_0|<\delta$. You should try to prove this from the $\varepsilon$-$\delta$ definition of continuity. In your particular example, $F$ is the function $F(x)=p^T\nabla f(x)$. Since $p$ is a fixed vector, $F$ is a linear combination of the components of $\nabla f$, so it is continuous wherever $\nabla f$ is, and $F(x^*)=-|\nabla f(x^*)|^2<0$. By the fact above, $p^T\nabla f(x)<0$ for all $|x-x^*|<\delta$ for some $\delta>0$. We then just choose $T>0$ so that $|(x^*+tp)-x^*| = t|p|<\delta$ for all $t\in[0,T]$; for instance, $T=\delta/(2|p|)$ works, and $p\neq 0$ because $\nabla f(x^*)\neq 0$.
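
In case it helps to check your own attempt, here is one way the proof of that fact could go. The choice $\varepsilon = -F(x_0)/2$ is just a convenient one made for this sketch; any $\varepsilon$ with $0<\varepsilon\le -F(x_0)$ would do.

$$\varepsilon := -\tfrac{1}{2}F(x_0) > 0.$$

By continuity of $F$ at $x_0$, there is a $\delta>0$ such that

$$|x-x_0|<\delta \;\Longrightarrow\; |F(x)-F(x_0)|<\varepsilon,$$

and therefore, for every such $x$,

$$F(x) < F(x_0)+\varepsilon = F(x_0) - \tfrac{1}{2}F(x_0) = \tfrac{1}{2}F(x_0) < 0.$$

The same argument with the inequalities reversed shows that a continuous function that is positive at a point stays positive near it. So if you chose $p=\nabla f(x^*)$ instead, you would get $p^T\nabla f(x^*)=|\nabla f(x^*)|^2>0$ and hence $p^T\nabla f(x^*+tp)>0$ on some interval $[0,T]$. That only tells you $f$ increases in that direction, which does not contradict $x^*$ being a local minimizer.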