In chess, a normal knight goes two steps forward and one step to the side, in some orientation. Thanic thought that he should spice the game up a bit, so he introduced a new kind of piece called a warrior. A warrior can either go three steps forward and one step to the side, or two steps forward and two steps to the side in some orientation.
Given a $2020\times2020$ chess board. Find, with proof, the maximum number of warriors one can put on its cells such that no two warriors attack each other.
The question is a modified version of a problem from Bangladesh Mathematical Olympiad 2019. For more clarity, here is a picture that shows example moves of a warrior:

This is my first time solving this kind of problem. I've made the following progress in solving the question:
We place the warriors in each cell of $n$-th column where $n\equiv1\ (\bmod 4)$. The following picture shows this strategy in an $8\times8$ board:

It can be seen that no two warriors can
attack each other. Hence, the answer to our original problem should be $2020\times505$.
Though this result matches with the original answer, I have still some confusions. Firstly, the optimal strategy is that in the $2020\times2020$ board, we place a warrior in each cell of $n$-th column. But what if we don't place them with that strategy or we just randomly place the warriors so that they cannot attack each other? How will I know other strategies would not give a result greater than $2020\times 505$? More specifically, how do I write a formal proof for this kind of problems?
$505\times 2020=1,\!020,\!100$ is certainly not optimal. By tiling a $2019\times 2020$ board with a rectangular pattern of the following $3\times 5$ rectangle, you can fit $$4\times \frac{2019}3\times \frac{2020}{5}=1,\!087,\!568\newcommand{\W}{\mathsf{W}}$$ warriors onto a $2019\times 2020$ board alone. $$ \begin{array}{|c|c|c|c|c|} \hline &&\;\W\;&&\phantom{\Big(\;\;} \\\hline &\;\W\,&&\;\W\;&\phantom{\Big(\;\;\;\,} \\\hline \phantom{\Big(\;\;\;}&&\W&& \\\hline \end{array} $$ You strategy achieves a density of $1/4$ warriors/square, while mine has a density of $4/15$, so the latter should always be better for large enough boards. I do not know if this can be improved at all.
Let $D$ be the optimal packing density for warriors. In addition to the lower bound of $D\ge 4/15$, I can prove the upper bound $D\le 1/3$.
For each warrior, imagine placing a token on the $12$ squares that the warrior can attack. Some squares will have multiple tokens. However, you can show that every square will have at most $6$ tokens. Indeed, for any unoccupied square $\mathsf X$, if we partition the $12$ squares that can attack $\mathsf X$ into $6$ attacking pairs as shown in this table, (pairs are labeled $\mathsf A$ through $\mathsf F$), then we see that $\mathsf X$ can be attacked from at most one square in each pair. $$ \begin{array}{|c|c|c|c|c|c|c|} \hline & &\mathsf C& &\mathsf D& & \\\hline &\mathsf B& & & &\mathsf C& \\\hline \mathsf A& & & & & &\mathsf D \\\hline & & &\mathsf X& & & \\\hline \mathsf B& & & & & &\mathsf E \\\hline &\mathsf A& & & &\mathsf F& \\\hline & &\mathsf F& &\mathsf E& & \\\hline \end{array} $$
This means that each warrior effectively occupies $1+12\times \frac16=3$ squares, so you can have no more than $1/3$ warriors per square.
This is only a "long-run" result, since warriors at the boundary of a grid will place fewer than $12$ tokens. However, this effect is negligible in the long run.