In my textbook, it is stated that $\nabla\times(\nabla\times A)=\nabla(\nabla\cdot A)-\nabla^2A $
So, I thought that del can be treated as if it were a vector (even though it is an operator). However, when solving $\nabla\times (k\times r)$ where $r= x \hat{i}+y\hat{j}+z\hat{k} $, the result I obtained from the BAC-CAB rule (treating $\nabla$ as a constant vector) differed from the result obtained by computing the curl directly.
Why is this the case? Is it because $\nabla$ is an operator, so it can't be used like a vector, and therefore the BAC-CAB rule doesn't apply? But the textbook uses the BAC-CAB rule as follows: $$\nabla\times(\nabla\times A)=\nabla(\nabla\cdot A)-\nabla^2A $$
So does this mean that the textbook's derivation of $\nabla\times(\nabla\times A)$ was incorrect, and I should use the Levi-Civita definition of the cross product instead?
In summary, my question is: when is it safe to regard $\nabla$ as a vector?
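For concreteness, here is a quick symbolic check of the example from the question, sketched with sympy (the component names `k1`, `k2`, `k3` are just illustrative symbols for the constant vector $k$). The direct computation gives $2k$, whereas naively treating $\nabla$ as a constant vector in BAC-CAB would give $k(\nabla\cdot r) - r(\nabla\cdot k) = 3k$:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
k1, k2, k3 = sp.symbols('k1 k2 k3')  # components of the constant vector k

k = sp.Matrix([k1, k2, k3])
r = sp.Matrix([x, y, z])

def curl(F):
    """Curl of a 3-component field in Cartesian coordinates."""
    return sp.Matrix([
        sp.diff(F[2], y) - sp.diff(F[1], z),
        sp.diff(F[0], z) - sp.diff(F[2], x),
        sp.diff(F[1], x) - sp.diff(F[0], y),
    ])

direct = curl(k.cross(r))  # honest, term-by-term computation
print(direct.T)            # → Matrix([[2*k1, 2*k2, 2*k3]])

# Naive BAC-CAB with nabla treated as a constant vector would give
#   k (nabla . r) - r (nabla . k) = 3k - 0 = 3k,
# which disagrees with the direct result 2k.
```

So the two answers really do differ, which is exactly the discrepancy the question describes.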
To the extent that the BAC-CAB rule makes sense for $\nabla\times(\nabla \times\vec A)$, it does hold. In particular, if you look at it in component notation, i.e. if you compare the BAC-CAB rule $$ \left[ \vec a \times(\vec b\times \vec c)\right ]_i = b_i \,c_j a_j - c_i b_j a_j $$ with the double curl, $$ \left[ \nabla\times(\nabla \times\vec A) \right]_i = \partial_i(\partial_j A_j) - \partial_j \partial_j A_i, $$ where I've used Einstein summations and abbreviated partial derivatives as $\partial_i = \frac{\partial}{\partial x_i}$, then you get exactly the same structure.
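As a sanity check, the double-curl identity can be verified symbolically; this is a sketch using sympy with generic component functions `A1`, `A2`, `A3` (names chosen purely for illustration):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)

# A generic vector field with arbitrary smooth components
A = sp.Matrix([sp.Function(f'A{i}')(*coords) for i in (1, 2, 3)])

def curl(F):
    return sp.Matrix([
        sp.diff(F[2], y) - sp.diff(F[1], z),
        sp.diff(F[0], z) - sp.diff(F[2], x),
        sp.diff(F[1], x) - sp.diff(F[0], y),
    ])

div = lambda F: sum(sp.diff(F[i], coords[i]) for i in range(3))
grad = lambda f: sp.Matrix([sp.diff(f, c) for c in coords])
lap = lambda F: sp.Matrix([sum(sp.diff(F[i], c, 2) for c in coords)
                           for i in range(3)])

lhs = curl(curl(A))            # nabla x (nabla x A)
rhs = grad(div(A)) - lap(A)    # grad(div A) - laplacian(A)
print(sp.simplify(lhs - rhs))  # → Matrix([[0], [0], [0]])
```

The difference vanishes identically, relying only on the equality of mixed partial derivatives.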
The only difference now is that you need to be very careful with how you place things: where before you could place the real numbers $a_i$, $b_j$ and $c_k$ in any order you wanted, you now need to have all the $\partial_i$'s before the $A_j$'s for the expression to make sense. Once you do, though, everything works out as expected.
On the other hand, it's important to note that the $\nabla$ operator is still an operator, and that vector-calculus identities should never be assumed to hold without specific confirmation. (As an example, the BAC-CAB rule definitely doesn't hold for $\nabla \times (\vec A \times \vec B)$; instead, a version of it does hold, but you need to include two copies, one with the derivatives acting on $\vec A$ and one with the derivatives acting on $\vec B$.)
More generally, it really is easiest to work in component notation. In this case, for example, I would argue that the "true" (or, at least, the most useful) instantiation of the BAC-CAB rule is the (pretty trivial*) identity $$ \epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} $$ between the Levi-Civita symbol and the Kronecker delta. Why? Because the Levi-Civita tensor is the core of both cross products, $$ \left[ \vec a \times \vec b \right]_i = \epsilon_{ijk} a_j b_k, $$ as well as curls, $$ \left[ \nabla \times \vec A \right]_i = \epsilon_{ijk} \partial_j A_k, $$ so the left-hand side of $\epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$ gives you a concatenation of two cross-producty objects, while the right-hand side has two contractions: one of which becomes an inner product and the other of which transfers one internal index to the externally-queried $i$.

Moreover, this makes it extremely simple to work through the tensor algebra, regardless of whether it is simple vectors at play or a more complicated calculus identity: thus, for the double curl you get \begin{align} \left[ \nabla\times(\nabla \times\vec A) \right]_i & = \epsilon_{ijk} \partial_j \left[ \nabla \times\vec A \right]_k \\ & = \epsilon_{ijk}\epsilon_{klm} \partial_j \partial_l A_m \\ & = (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}) \partial_j \partial_l A_m \\ & = \partial_j\partial_i A_j - \partial_j \partial_j A_i \\ & = \partial_i(\partial_j A_j) - \partial_j \partial_j A_i \\ & = \left[ \nabla (\nabla \cdot \vec A) - \nabla^2 \vec A \right]_i \end{align} as usual, but for the curl of a cross product it allows you to do much the same while still making it clear, at the stage \begin{align} \left[ \nabla\times(\vec A \times\vec B) \right]_i & = \epsilon_{ijk} \partial_j \left[ \vec A \times\vec B \right]_k \\ & = \epsilon_{ijk}\epsilon_{klm} \partial_j \left[ A_l B_m \right] \end{align} that the partial derivative needs to act in a 
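The epsilon-delta contraction identity itself is finite-dimensional, so it can be checked exhaustively; here is a sketch using numpy's `einsum` to compare both sides for all index combinations:

```python
import numpy as np

# Build the Levi-Civita symbol explicitly from its nonzero entries
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1   # even permutations
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1  # odd permutations

delta = np.eye(3)  # Kronecker delta

# Left-hand side: eps_ijk eps_klm, contracted over the shared index k
lhs = np.einsum('ijk,klm->ijlm', eps, eps)

# Right-hand side: delta_il delta_jm - delta_im delta_jl
rhs = (np.einsum('il,jm->ijlm', delta, delta)
       - np.einsum('im,jl->ijlm', delta, delta))

print(np.array_equal(lhs, rhs))  # → True
```

Since both sides are rank-4 arrays over indices that each run over only three values, this check covers all $3^4 = 81$ cases of the identity.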
product-rule kind of way to give $$ \partial_j \left[ A_l B_m \right] = B_m \, \partial_j A_l + A_l \, \partial_j B_m, $$ a product that can then be fed into the rest of the calculation.
Hopefully that helps in conceptualizing the result and placing it in a more coherent context.
* Why trivial? Because the combinations that contribute to the Levi-Civita symbol with nonzero weight are very restricted. Basically, for $\epsilon_{ijk}$ to count at all, you need $i\neq j$, and that fixes $k$ to be whatever $i$ and $j$ are not, i.e. if $i=1$ and $j=2$ then the only $k$ that survives the summation is $k=3$. Then, to do the product, you first use the cyclic property on the second term, $\epsilon_{ijk}\epsilon_{klm} = \epsilon_{ijk}\epsilon_{lmk}$, and since $k$ is now fixed, the set $\{l,m\}$ needs to be the same as $\{i,j\}$, and this can only happen (a) if $i=l$ and $j=m$, in which case $\epsilon_{ijk}$ and $\epsilon_{klm}$ have matching signs, or (b) if the pairs $(i,j)$ and $(l,m)$ are swapped, in which case you accrue a minus sign. That train of thought is enough to reconstruct the full form of the symbol identity simply by matching indices on $\epsilon_{ijk}\epsilon_{lmk}$.