I'm taking a course in Discrete Mathematics this summer, and my book doesn't offer a very good explanation of Big-O notation. I understand that $f(x)$ being $\mathcal O(g(x))$ means there exist constants $C$ and $k$ such that $|f(x)| \le C\cdot|g(x)|$ whenever $x > k$. However, I don't understand how to actually prove such a statement.
The book's example shows that the function $f(x) = x^2 + 2x + 1$ is $\mathcal O(x^2)$. It states: "we observe that we can readily estimate the size of $f(x)$ when $x > 1$ because $x < x^2$ and $1 < x^2$ when $x > 1$. It follows that $0 \le x^2 + 2x + 1 \le x^2 + 2x^2 + x^2 = 4x^2$."
From that example, I can see that the constant $C$ is $4$ and $k$ is $1$. What I don't understand is why they were allowed to replace every term with $x^2$ and then just sum the coefficients to get $C = 4$. Please help!
Basically: they did it because it was easy!
The real idea behind Big-O notation is to find the term that gives the dominant contribution -- in this case $x^2$, since it is much larger than $x$ when $x$ is large -- and bound the whole function by a constant multiple of it.
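Written out against the definition, the book's estimate is already a complete proof; all that's left is to make the witnesses explicit (this is just the book's computation rearranged, using $C = 4$ and $k = 1$):
$$x > 1 \;\Longrightarrow\; 0 \le x^2 + 2x + 1 \le x^2 + 2x^2 + x^2 = 4x^2 \;\Longrightarrow\; |x^2 + 2x + 1| \le 4\,|x^2|,$$
so the constants $C = 4$ and $k = 1$ do exactly what the definition asks for.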
They could just as easily have said that when $x\geq 2$ we have $2x\leq x^2$ and $1\leq x^2$, so that $x^2 + 2x + 1 \leq x^2 + x^2 + x^2 = 3x^2$, and made the constant $3$. The specifics can vary almost as much as you like, and in the end the exact value of $C$ is of no consequence. It won't change the fact that, when push comes to shove, the rate of growth of this function is on the order of $x^2$.
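If it helps, the same trick works for any polynomial, which is why "replace every term by the top power and sum the coefficients" is a safe default: for $x \ge 1$, every power $x^i$ with $i \le n$ satisfies $x^i \le x^n$, so
$$|a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0| \le \left(|a_n| + |a_{n-1}| + \cdots + |a_0|\right) x^n \qquad (x \ge 1),$$
and the constants $C = |a_n| + \cdots + |a_0|$ and $k = 1$ witness that a degree-$n$ polynomial is $\mathcal O(x^n)$. The book simply applied this with the coefficients $1, 2, 1$, which is exactly where $C = 4$ comes from.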