$\newcommand{\N}{\mathcal{N}}$ $\newcommand{\M}{\mathcal{M}}$ $\newcommand{\g}{\mathfrak{g}}$ $\newcommand{\g}{\mathfrak{h}}$ $\newcommand{\IP}[2]{\left\langle #1,#2 \right\rangle}$ $\newcommand{\Volg}{\operatorname{Vol}_\g}$ Let $\M,\N$ be oriented Riemannian manifolds, and suppose the sectional curvature of $\N$ is negative. (Suppose also $\M$ is compact and connected). Let $\phi:\M \to \N$ be harmonic map.
Is it true that $\phi$ is a local minimizer for the Dirichlet energy? i.e let $\phi_t$ be a smooth variation of $\phi$. Is it true that $E(\phi) \le E(\phi_t) $ for sufficiently small $t>0$?
(The sufficiently small interval $I$ of the "good" $t$'s can depend on the variation of course).
Let us restrict the discussion for variations $\phi_t$ whose variation fields $\left. \frac{\partial\phi_t}{\partial t} \right|_{t=0}$ are not identically zero.
I proved that if the rank of $d\phi$ is everywhere $\ge 2$, then $\phi$ is a local minimizer. (See my proof below).
Also, if $\text{rank }d\phi=0$ everywhere, i.e $\phi$ is constant, then $\phi$ is also a local minimizer (it's a global minimizer...).
So, the question remains: Suppose $\phi$ is a non-constant harmonic map which has at least one point of degree less than $2$. Is $\phi$ a local minimizer?
Proof that $\text{rank }d\phi \ge 2 \Rightarrow$ $\phi$ is locally minimizing:
We first note that $$ \left. \frac{\partial^2}{\partial t^2} E(\phi_{t}) \right|_{t=0}=H_{\phi}(V,V), $$ where $H_{\phi}$ is the hessian of the energy functional at $\phi$, and $V=\left. \frac{\partial\phi_t}{\partial t} \right|_{t=0}$ is the corresponding variation field.
From the second variation formula, we obtain
$$ \begin{split} H(E)_{\phi}(V,V)&= \int_{\M} \IP{J_{\phi}(V)}{V} \Volg \\ &=\int_{\M} -\sum_i \IP{R^{T\N}(V,d\phi(e_i))d\phi(e_i)}{V}+\IP{d_{\nabla^{\phi^*(T\N)}}V}{d_{\nabla^{\phi^*(T\N)}}V} \Volg \\ & \ge -\sum_i \int_{\M} \IP{R^{T\N}(V,d\phi(e_i))d\phi(e_i)}{V}. \end{split} $$
By our assumption (non-zero variation field), there exist $p \in \M$ such that $V_p \neq 0$. If $V_p,d\phi_p(e_i(p))$ are linearly dependent, then since $V_p \neq 0$, $ d\phi_p(e_i(p)) \in \text{span} \{V_p\}$. Since we assumed $\text{rank }(d\phi_p) \ge 2$, not all the $d\phi_p(e_i(p))$ are linearly dependent of $V_p$; Thus, there exist an $1 \le i \le d$ where they are independent, hence $\IP{R^{T\N}(V,d\phi(e_i))d\phi(e_i)}{V} < 0$ by the curvature assumption.
Recall that the sectional curvature of the plane spanned by $x,y \in T_q\N$ is $$ K(x,y)=\frac{\IP{R^{T\N}(x,y)y}{x} }{|x \wedge y|^2}.$$
I will assume that the domain manifold $M$ is compact, otherwise, one needs to be much more careful with the definition of energy and which variations are admissible. However, the same argument works if $M$ is noncompact, but you have to restrict to variations which are trivial outside of a compact subset of $M$. It also works if $N$ is a locally CAT(0) space (the theory was worked out by Jost, Schoen and Korevaar in this setting). But, instead of working with the maps to $N$, you lift to a map between the unuversal covers and work equivariantly.
The key is the following convexity property of energy of maps to manifolds of nonpositive curvature:
Let $f_0, f_1: M\to N$ be smooth and let $f_t$ be a geodesic interpolation between these maps, i.e. for each $x\in M$ the curve $f_t(x)$ is a parameterized geodesic in $N$. Then the energy function $E(f_t)$ is convex as a function of $t$. This convexity property is already present in Eells-Sampson's 1964 paper on harmonic maps (the formula is more-or-less the one that you wrote): $$ \frac{d^2}{dt^2} E(f_t)= 2\int_M \left( \sum_i \left[|\nabla^N_{e_i} V|^2 -K_N(V, (f_t)_*(e_i)) + \langle \nabla^N_{e_i} \nabla^N_{\frac {\partial}{\partial t}} V, (f_t)_*(e_i)\rangle \right] \right)dt. $$ Here $\{e_i(x)\}$ is an orthonormal frame at $T_x(M)$, $V$ is the variational vector field of $f_t$ which we pull-back to $M\times [0,1]$, $\nabla^N$ is the pull-back of the L-C connection on $TN$. The key is that $$ \nabla^N_{\frac {\partial}{\partial t}} V= 0 $$ since $f_t(x), t\in [0,1]$, is geodesic. Hence, the integrand is $\ge 0$.
Then, if $f_0$ is harmonic, for every $f_1$ near $f_0$ we can construct a geodesic interpolation $f_t$ (using compactness of $M$ to ensure that the injectivity radius along $f_0(M)$ is bounded below) and obtain that $\frac{d}{dt}E(f_t)=0$ at $t=0$.
Notice that I am not using the variation $\phi_t$ that you are considering, only $\phi_0=f_0, \phi_1=f_1$.
Convexity of $E(f_t)$ then will imply that $E(f_0)\le E(f_t)$ for all $t$. qed