If anybody could help explain some parts of this discussion I would greatly appreciate it, especially since every other source on spectral sequences seems to use completely different notation and ideas.

My main confusion stems from the statement "the map $i$ need no longer be inclusion". What is this map then? How is it not inclusion? Isn't $i = i_1$ just the induced map from inclusion on cohomology? I understand that this doesn't necessarily give inclusion of the cohomology groups, but then why does it become inclusion after applying it, i.e. why is $iH(K_1) \hookrightarrow H(K)$ inclusion but $H(K_1) \to H(K)$ not?
Also when they write $H(A)$, I don't quite understand what this means, especially if the original complex $K$ is not graded! In my mind $H(A) = \oplus_k H^k(A) = \oplus_k H^k(\oplus_p K_p) = \oplus_k \oplus_p H^k(K_p)$ but what is $H^k(K_p)$ if there is no grading on $K$?

The map $i$ is, indeed, the map induced by $i_1$ on cohomology, usually denoted $(i_1)^*$. As you know, even if $i_1$ is an inclusion, its induced map need not be.
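Here is a small toy example of this (my own, not taken from Bott & Tu; the names $a$ and $b$ are just for illustration). Take an ungraded complex $K$ with basis $\{a, b\}$ and differential $da = b$, $db = 0$, filtered by $K = K_0 \supset K_1 \supset K_2 = 0$ with $K_1 = \operatorname{span}\{b\}$. Then $K_1$ is a subcomplex with zero differential, and

$$
H(K_1) = \frac{\ker(d|_{K_1})}{d(K_1)} = \frac{\operatorname{span}\{b\}}{0} \cong \operatorname{span}\{[b]\},
\qquad
H(K) = \frac{\ker d}{\operatorname{im} d} = \frac{\operatorname{span}\{b\}}{\operatorname{span}\{b\}} = 0 .
$$

So $i_1 : K_1 \to K$ is an inclusion, but $i : H(K_1) \to H(K)$ sends the nonzero class $[b]$ to $0$, because $b = da$ is a coboundary in $K$ even though it is not one in $K_1$. In this example the image $iH(K_1)$ is the zero subspace of $H(K)$.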
It does not "become an inclusion after applying it". The space $iH(K_1)$ is the image of $i$ (maybe you are used to the notation with parentheses, $i(H(K_1))$). It is a subspace of $H(K)$, and the map $iH(K_1) \to H(K)$ is simply the inclusion of this subspace. It is no longer the map $i$.
If the complex $K$ is graded, then $H(K) = \bigoplus_k H^k(K)$. If the complex is not graded, then $H(K)$ is simply the quotient $\ker(d) / \operatorname{im}(d)$ where $d : K \to K$ is the differential (which is simply a linear map such that $d \circ d = 0$). You are right that it has no grading if $K$ is not graded to start with, so $H^k(K)$ is not something that would make sense, only $H(K)$.
Note that there is a small confusion in your question. In the notation of Bott & Tu, $K_p$ is not a direct summand of $K$; it is a subcomplex of $K$ (and $K$ is filtered: $K = K_0 \supset K_1 \supset K_2 \supset \dots$). So the cohomology $H(K)$ is not $\bigoplus_{k,p} H^k(K_p)$: that would be far too big, since there would be redundancy, with some classes appearing several times.
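To see the redundancy concretely, here is a hedged sketch with a made-up ungraded example (the basis names $x$, $y$ are mine). I am assuming, as in your question, that $A = \bigoplus_p K_p$ with the differential acting on each summand separately, so that cohomology is computed summand by summand:

$$
H(A) = \bigoplus_p H(K_p), \qquad H(K_p) = \frac{\ker(d|_{K_p})}{d(K_p)} .
$$

Now let $K = \operatorname{span}\{x, y\}$ with $d = 0$, filtered by $K_0 = K \supset K_1 = \operatorname{span}\{y\} \supset K_2 = 0$. Then

$$
H(K) = \operatorname{span}\{[x], [y]\}, \qquad
H(A) = H(K_0) \oplus H(K_1) = \operatorname{span}\{[x], [y]\} \oplus \operatorname{span}\{[y]\} .
$$

So $H(K)$ has dimension $2$ while $H(A)$ has dimension $3$: the class of $y$ is counted once in $H(K_0)$ and once more in $H(K_1)$. No grading by $k$ is needed anywhere in this computation.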