If I have a matrix $X$ with $N$ rows and $D$ columns, then the complexity of $(X^TX)^{-1}$ is $O(ND^2 + D^3)$ because
- $X^TX$ is $O(ND^2)$
- $(X^TX)^{-1}$ is $O(D^3)$ since it is a $D \times D$ matrix.
My question is, why is it $ND^2 + D^3 $ instead of $ ND^2D^3$ ? When do we add / multiply complexities ?
The idea is very intuitive. Say it takes $t_1$ time to do operation 1 and $t_2$ time to do operation 2. How much time does it take to do operation 1 followed by operation 2? Well, it takes $t_1 + t_2$ time.
But of course in mathematics we like to prove our intuition. In this case, it is harder than usual, because multiple variables are involved.
Let $f(N, D$) be the time taken to do operation 1 and $g(N, D)$ be the time taken to do operation 2. By assumption, we have $f \in \mathcal{O}(ND^2)$. By the definition of multiple variables big $\mathcal{O}$, there exists $M_1, C_1 > 0$ such that forall $N, D$ with $N \ge M_1$ or $D \ge M_1$ we have $$f(N, D) \le C_1 ND^2$$ Similarly, since $g \in \mathcal{O}(D^3)$, there exists $M_2, C_2 > 0$ such that forall $N, D$ with $N \ge M_2$ or $D \ge M_2$ we have $$g(N, D) \le C_2 D^3$$ Now our goal is to show the time taken to do operation 1 followed by operation 2 $f + g \in \mathcal{O}(ND^2 + D^3)$. Here we go. Let $M = \max\{M_1, M_2\}$ and $C = \max\{C_1, C_2\}$. Then for all $N, D$ with $N \ge M$ or $D \ge M$ we have $$f(N, D) \le C_1 ND^2 \le C ND^2$$ and $$g(N, D) \le C_2 D^3 \le C D^3$$ Adding the two inequalities gives $$f(N, D) + g(N, D) \le C ND^2 + C D^3 = C(ND^2 + D^3)$$ Therefore, by definition $f + g \in \mathcal{O}(ND^2 + D^3)$ and we win!