The following problem is from the Vector Analysis Chapter 1 of Griffiths' Electrodynamics
1.17 In two dimensions show that the divergence transforms as a scalar under rotations.
A question has been asked about this problem before, and it has a solution. However, I don't completely understand that solution, and I'd like to know specifically why the reasoning below does not seem to work.
Let $F$ and $R$ be the vector fields
$$F(x,y)=\langle f_1(x,y),f_2(x,y)\rangle\tag{1}$$
$$R(\overline{x},\overline{y})=\langle R_1(\overline{x},\overline{y}), R_2(\overline{x},\overline{y}) \rangle$$
$$=\langle \overline{x}\cos{\phi}-\overline{y}\sin{\phi}, \overline{x}\sin{\phi}+\overline{y}\cos{\phi} \rangle$$
where $f_1, f_2, R_1$, and $R_2$ are scalar fields and
$$\begin{bmatrix} \overline{x} \\ \overline{y}\end{bmatrix}=\begin{bmatrix} \cos{\phi}&\sin{\phi} \\ -\sin{\phi}&\cos{\phi}\end{bmatrix}\begin{bmatrix}x \\ y\end{bmatrix}\tag{2}$$
$$\begin{bmatrix} \overline{x} \\ \overline{y}\end{bmatrix}=T\begin{bmatrix}x \\ y\end{bmatrix}\tag{3}$$
That is, $\langle \overline{x},\overline{y}\rangle$ are the coordinates of $\langle x,y\rangle$ when we rotate by an angle $\phi$, and this linear transformation has matrix representation $T$.
Note that
$$\begin{bmatrix} x \\ y\end{bmatrix}=\begin{bmatrix} \cos{\phi}&-\sin{\phi} \\ \sin{\phi}&\cos{\phi}\end{bmatrix}\begin{bmatrix}\overline{x} \\ \overline{y}\end{bmatrix}\tag{4}$$
$$\begin{bmatrix} x \\ y\end{bmatrix}=T^{-1}\begin{bmatrix}\overline{x} \\ \overline{y}\end{bmatrix}\tag{5}$$
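As a sanity check, the relation between (2) and (4)–(5) can be verified symbolically. The snippet below (a sketch using SymPy, which is not part of the question) confirms that the matrix in (4) is indeed $T^{-1}$; since $T$ is orthogonal, the inverse is just the transpose.

```python
import sympy as sp

phi = sp.symbols('phi', real=True)

# T rotates coordinates by phi, as in equation (2)
T = sp.Matrix([[sp.cos(phi), sp.sin(phi)],
               [-sp.sin(phi), sp.cos(phi)]])

# Candidate inverse, the matrix appearing in equation (4)
T_inv = sp.Matrix([[sp.cos(phi), -sp.sin(phi)],
                   [sp.sin(phi), sp.cos(phi)]])

# T * T_inv should simplify to the identity, confirming (4)-(5);
# note T_inv equals T.T, since rotation matrices are orthogonal
product = sp.simplify(T * T_inv)
print(product)  # Matrix([[1, 0], [0, 1]])
```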
I think what we want to show is that $\text{div} F(x,y)=\text{div} H(\overline{x},\overline{y})$ where
$$H(\overline{x},\overline{y})=F(R(\overline{x},\overline{y}))\tag{6}$$
That is, $H$ gives us the value of $F$ in a coordinate system in which all previous vectors were rotated by $\phi$. At a point $(\overline{x},\overline{y})$ we expect the divergence of $H$ to equal the divergence of $F$ at the corresponding point $(x,y)$.
$$H(\overline{x},\overline{y})=F(R(\overline{x},\overline{y}))=\langle f_1(R(\overline{x},\overline{y})),f_2(R(\overline{x},\overline{y}))\rangle\tag{7}$$
$$\text{div}H(\overline{x},\overline{y})=\nabla\cdot H(\overline{x},\overline{y})=\langle D_1,D_2\rangle\cdot \langle f_1(R(\overline{x},\overline{y})),f_2(R(\overline{x},\overline{y}))\rangle\tag{8}$$
where $D_i$ is the partial differentiation operator relative to coordinate vector $i$. (I think the issue might be in this step (8)).
$$=D_1f_1(R(\overline{x},\overline{y}))+D_2f_2(R(\overline{x},\overline{y}))\tag{9}$$
At this point I am not sure of the best way to compute these partial derivatives, but I think I can use the total derivative of a composition, as in
$$=Df_1(R(\overline{x},\overline{y}))\cdot DR(\overline{x},\overline{y})\cdot\langle 1,0\rangle+Df_2(R(\overline{x},\overline{y}))\cdot DR(\overline{x},\overline{y})\cdot\langle 0,1\rangle$$
where the notation $D$ means the total derivative. This gives
$$=\begin{bmatrix} D_1f_1(R(\overline{x},\overline{y})) & D_2f_1(R(\overline{x},\overline{y}))\end{bmatrix}\begin{bmatrix}\frac{\partial R_1}{\partial\overline{x}} & \frac{\partial R_1}{\partial\overline{y}} \\ \frac{\partial R_2}{\partial\overline{x}}& \frac{\partial R_2}{\partial\overline{y}}\end{bmatrix}\begin{bmatrix}1 \\ 0\end{bmatrix}$$
$$+\begin{bmatrix} D_1f_2(R(\overline{x},\overline{y})) & D_2f_2(R(\overline{x},\overline{y}))\end{bmatrix}\begin{bmatrix}\frac{\partial R_1}{\partial\overline{x}} & \frac{\partial R_1}{\partial\overline{y}} \\ \frac{\partial R_2}{\partial\overline{x}}& \frac{\partial R_2}{\partial\overline{y}}\end{bmatrix}\begin{bmatrix}0 \\ 1\end{bmatrix}$$
$$=\begin{bmatrix} D_1f_1(R(\overline{x},\overline{y})) & D_2f_1(R(\overline{x},\overline{y}))\end{bmatrix}\begin{bmatrix}\cos{\phi} & -\sin{\phi} \\ \sin{\phi} & \cos{\phi}\end{bmatrix}\begin{bmatrix}1 \\ 0\end{bmatrix}+\begin{bmatrix} D_1f_2(R(\overline{x},\overline{y})) & D_2f_2(R(\overline{x},\overline{y}))\end{bmatrix}\begin{bmatrix}\cos{\phi} & -\sin{\phi} \\ \sin{\phi} & \cos{\phi}\end{bmatrix}\begin{bmatrix}0 \\ 1\end{bmatrix}$$
$$=\left( D_1f_1(R(\overline{x},\overline{y}))\cos{\phi}+ D_2f_1(R(\overline{x},\overline{y}))\sin{\phi} \right)+\left( -D_1f_2(R(\overline{x},\overline{y}))\sin{\phi}+ D_2f_2(R(\overline{x},\overline{y}))\cos{\phi} \right)$$
$$=\cos{\phi}(D_1f_1(R(\overline{x},\overline{y}))+D_2f_2(R(\overline{x},\overline{y})))+\sin{\phi}(D_2f_1(R(\overline{x},\overline{y}))-D_1f_2(R(\overline{x},\overline{y})))$$
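The stalled computation can be checked on a concrete example. The sketch below (using SymPy, with the hypothetical test field $F=\langle x^2, xy\rangle$, neither of which appears in the question) confirms the final line above, and also shows that the naive composition $F\circ R$ does not preserve the divergence:

```python
import sympy as sp

xb, yb, phi = sp.symbols('xbar ybar phi', real=True)
x, y = sp.symbols('x y', real=True)

# A hypothetical test field F = (f1, f2); any C^1 field would do
f1, f2 = x**2, x*y

# R from the question: R(xbar, ybar) rotates (xbar, ybar) back by phi
Rx = xb*sp.cos(phi) - yb*sp.sin(phi)
Ry = xb*sp.sin(phi) + yb*sp.cos(phi)

# Naive composition H = F(R(xbar, ybar)) -- no extra factor on the left
H1 = f1.subs({x: Rx, y: Ry}, simultaneous=True)
H2 = f2.subs({x: Rx, y: Ry}, simultaneous=True)
div_H = sp.simplify(sp.diff(H1, xb) + sp.diff(H2, yb))

# The formula derived above: cos(phi)*(D1f1 + D2f2) + sin(phi)*(D2f1 - D1f2),
# with every derivative evaluated at R(xbar, ybar)
div_F = (sp.diff(f1, x) + sp.diff(f2, y)).subs({x: Rx, y: Ry}, simultaneous=True)
curl_term = (sp.diff(f1, y) - sp.diff(f2, x)).subs({x: Rx, y: Ry}, simultaneous=True)
predicted = sp.cos(phi)*div_F + sp.sin(phi)*curl_term

print(sp.simplify(div_H - predicted))  # 0: matches the derivation above
print(sp.simplify(div_H - div_F))      # nonzero: divergence is NOT preserved
```

So the algebra above is correct as far as it goes; the extra $\sin\phi$ term is real, which suggests the setup in (6)–(8) is what needs revisiting.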
You’re proving the wrong thing. The correct setup is as follows: let $U,V\subset\Bbb{R}^n$ be open sets, let $F:U\to\Bbb{R}^n$ be a $C^1$ vector field on $U$, and let $\Phi:U\to V$ be a diffeomorphism. Define a new vector field $H$ on $V$ as follows: \begin{align} H(y):=D\Phi_{\Phi^{-1}(y)}[F(\Phi^{-1}(y))]. \end{align} Here, I’m using the notation $DG_a$ to mean the (Fréchet) derivative of the map $G$ at a point $a$. In matrix language (which is a can of worms by itself), this says you take the matrix $D\Phi_{\Phi^{-1}(y)}$, i.e. the derivative of $\Phi$ at the point $\Phi^{-1}(y)$, and multiply it by the column vector $F(\Phi^{-1}(y))$. In differential-geometry jargon, this is the push-forward of the vector field $F$ by the map $\Phi$ (i.e. you take a vector field $F$ defined on $U$ and ‘push it forward’ to a vector field $H=\Phi_*F$ defined on $V$).
To make contact with your notation, you’re working with $n=2$, and your rotation $R$ (thought of as a linear transformation on $\Bbb{R}^2$) is my $\Phi^{-1}$. What you’re missing is the extra composition with $D\Phi_{\Phi^{-1}(y)}$ on the left. Without it, it’s as if you haven’t modified the target space of the function $F$ to rewrite it in the rotated coordinates. Now, for simplicity, suppose $\Phi=A$ is an invertible linear map. Then the definition of $H$ simplifies (because linear maps are their own derivatives at each point) to \begin{align} H&:=A_*F= A\circ F\circ A^{-1}. \end{align} Now, you can easily calculate using the chain rule that for each $y\in V$, \begin{align} DH_y&=A\circ DF_{A^{-1}(y)}\circ A^{-1}, \end{align} and so by taking traces (which is how divergence is defined in these linear coordinates) and using the cyclic property of the trace, we get \begin{align} \text{div}(H)&=[\text{div}(F)]\circ A^{-1}, \end{align} or, rearranging, $[\text{div}(H)]\circ A=\text{div}(F)$. This is what you were meant to show.
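The identity $\text{div}(H)=[\text{div}(F)]\circ A^{-1}$ for the push-forward can likewise be checked symbolically. This sketch uses SymPy with a hypothetical test field $F=\langle x^2, xy\rangle$ and a rotation for $A$ (neither is fixed by the answer; any $C^1$ field and invertible linear map would do):

```python
import sympy as sp

x, y, phi = sp.symbols('x y phi', real=True)

# Hypothetical test field F = (f1, f2)
f1, f2 = x**2, x*y

# A = rotation by phi: an invertible linear map, so A is its own derivative
A = sp.Matrix([[sp.cos(phi), -sp.sin(phi)],
               [sp.sin(phi), sp.cos(phi)]])
A_inv = A.T  # rotations are orthogonal

# Push-forward H = A o F o A^{-1}, evaluated at the point (x, y)
p = A_inv * sp.Matrix([x, y])   # the point A^{-1}(x, y)
Fp = sp.Matrix([f1, f2]).subs({x: p[0], y: p[1]}, simultaneous=True)
H = A * Fp

div_H = sp.simplify(sp.diff(H[0], x) + sp.diff(H[1], y))

# div(F), composed with A^{-1}
div_F = sp.diff(f1, x) + sp.diff(f2, y)   # = 3x for this test field
div_F_Ainv = div_F.subs({x: p[0], y: p[1]}, simultaneous=True)

print(sp.simplify(div_H - div_F_Ainv))  # 0: div(H) = div(F) o A^{-1}
```

Note the left factor of $A$ in `H = A * Fp`: deleting it reproduces the naive composition from the question, and the identity fails.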
Remarks.
Strictly speaking, the general definition of the divergence is not the trace of the (Fréchet) derivative, which is why I had to make the unnatural assumption that $\Phi=A$ is a linear map. More generally, note that by the chain rule and ‘product’ rule, we have \begin{align} DH_y(\cdot)&=D^{2}\Phi_{\Phi^{-1}(y)}\left[D(\Phi^{-1})_y(\cdot), F(\Phi^{-1}(y))\right] + D\Phi_{\Phi^{-1}(y)}\left[DF_{\Phi^{-1}(y)}[D(\Phi^{-1})_y(\cdot)]\right]. \end{align} It is only when we specialize to the case that $\Phi=A$ is linear that the second derivative vanishes, and we can write the above equality as \begin{align} DH_y&=0+ A\circ DF_{A^{-1}(y)}\circ A^{-1}. \end{align} Now you see algebraically where the linearity comes into play. But abstractly/differential-geometrically, the correct way to think of the divergence of a vector field is with respect to a scalar density (or a volume form). This may be a little too much, so I’ll stop here.