I want to better understand how the covariance operator behaves with random vectors. Let's start with standard notation to get an idea of what the operator needs to do when multiplying vectors. Say we have:
$$\text E\left(\sum_{i=1}^{n} x_iy_i\right)= \sum_{i=1}^{n} \text E (x_iy_i)=\sum_{i=1}^{n} \left(\text E (x_i) \text E(y_i)+\text {cov}(x_i,y_i)\right)$$
So here the covariance term is the sum of the covariances between corresponding elements of $x$ and $y$.
Now, in vector notation (x and y are random vectors):
$$\text E(x'y) =\text E(x')\text E(y)+\text {cov}(x',y) $$
Question: the covariance in this case needs to be the sum of covariances, since the operation is exactly the same as above. Do I need to augment the covariance sign with extra information, or is this the default behavior when taking the covariance of a vector (i.e., does the sign indicate the covariance of each element, summed)?
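As a sanity check on the identity above, here is a small numerical sketch. The toy distribution (four equally likely joint outcomes of 3-dimensional $x$ and $y$) is made up purely for illustration; it verifies that $\text E(x'y)$ equals $\sum_i\left(\text E(x_i)\text E(y_i)+\text{cov}(x_i,y_i)\right)$.

```python
import numpy as np

# Made-up toy joint distribution: each row is one of four equally likely
# outcomes, columns are the components x_1..x_3 (resp. y_1..y_3).
X = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [2., 0., 3.],
              [1., 1., 2.]])
Y = np.array([[0., 1., 2.],
              [1., 3., 0.],
              [2., 2., 1.],
              [3., 0., 1.]])

Ex, Ey = X.mean(axis=0), Y.mean(axis=0)          # E(x_i), E(y_i)
cov_ii = ((X - Ex) * (Y - Ey)).mean(axis=0)      # cov(x_i, y_i), componentwise

lhs = (X * Y).sum(axis=1).mean()                 # E(sum_i x_i y_i) = E(x'y)
rhs = (Ex * Ey + cov_ii).sum()                   # sum_i [E(x_i)E(y_i) + cov(x_i,y_i)]
assert np.isclose(lhs, rhs)                      # both are 4.25 for this data
```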
Now change the order of operations:
$$\text E(xy')=\text E(x)\text E(y')+\text {cov}(x,y')$$
Here the covariance has a different meaning: now it is the covariance between all pairs of elements, with each pair in its own row/column.
Of course, you may also write the covariance using the expectation operator:
$$\text {cov}(x',y)=\text E\left[(x'-\text E(x'))(y-\text E(y))\right]$$
which further complicates the issue, since I am not sure whether this gives the right answer (the answer has to be the sum of the covariances between corresponding elements). You would hope it to be consistent, of course...
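The two orderings can be contrasted numerically as well. Below is a sketch with the same kind of made-up toy data: the row-vector-times-column-vector order produces a scalar equal to the sum of the componentwise covariances, while the column-vector-times-row-vector order produces a matrix of all pairwise covariances.

```python
import numpy as np

# Made-up toy distribution: each row is one of four equally likely joint outcomes.
X = np.array([[1., 2., 0.], [0., 1., 1.], [2., 0., 3.], [1., 1., 2.]])
Y = np.array([[0., 1., 2.], [1., 3., 0.], [2., 2., 1.], [3., 0., 1.]])
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)   # centered realizations

# Row vector times column vector: E[(x - E x)'(y - E y)] is a scalar ...
inner = (Xc * Yc).sum(axis=1).mean()
# ... equal to the sum of the componentwise covariances:
assert np.isclose(inner, (Xc * Yc).mean(axis=0).sum())

# Column vector times row vector: E[(x - E x)(y - E y)'] is a 3x3 matrix
# whose (i, j) entry is cov(x_i, y_j).
outer = np.einsum('ni,nj->nij', Xc, Yc).mean(axis=0)
assert outer.shape == (3, 3)
```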
So the question remains: how is the covariance operator notation used correctly with random-vector operations? How do I correctly write 1) the sum of covariances, 2) the covariance between each pair of elements, and 3) the covariance between the complete vectors themselves (i.e., say we have $(1,2,3)$ and $(1,2,3)$, and then calculate a covariance based on them)?
The generalization of the covariance between two random variables is the cross-covariance between two random vectors. If $x$ and $y$ are random vectors, not necessarily of the same dimension, you can always form the matrix $xy'$, so it makes sense to define the cross covariance as: $$ \operatorname{Cov}(x,y):=E\left( (x-E(x))(y-E(y))'\right). $$ So yes, the cross-covariance involves matching each component of $x$ with each component of $y$ and computing the (scalar) covariance between the matched components. Algebraically the definition is equivalent to $$ \operatorname{Cov}(x,y):=E(xy') -E(x)E(y'),$$ so your last two displayed equations are correct. However, the covariance operator treats both $x$ and $y$ as column vectors so we don't write $\operatorname{Cov}(x',y)$ or $\operatorname{Cov}(x,y')$; it's always $\operatorname{Cov}(x,y)$.
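To make the definition concrete, here is a quick numerical sketch (the data and dimensions are made up): $x$ and $y$ need not have the same length, and the definitional form $E\left((x-E(x))(y-E(y))'\right)$ agrees with the algebraic form $E(xy')-E(x)E(y')$.

```python
import numpy as np

# Made-up joint outcomes: x is 3-dimensional, y is 2-dimensional,
# four equally likely outcomes (one per row).
X = np.array([[1., 0., 2.], [0., 1., 1.], [2., 2., 0.], [1., 1., 3.]])
Y = np.array([[0., 1.], [2., 0.], [1., 2.], [3., 1.]])

Ex, Ey = X.mean(axis=0), Y.mean(axis=0)

# Definition: Cov(x, y) = E[(x - E x)(y - E y)']  -> a 3x2 matrix
cov_def = np.einsum('ni,nj->nij', X - Ex, Y - Ey).mean(axis=0)

# Equivalent algebraic form: E(x y') - E(x) E(y')
cov_alg = np.einsum('ni,nj->nij', X, Y).mean(axis=0) - np.outer(Ex, Ey)

assert cov_def.shape == (3, 2)
assert np.allclose(cov_def, cov_alg)
```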
Your second displayed equation is not the right definition of $\operatorname{Cov}(x,y)$: it is not true that the covariance between two vectors is obtained by summing componentwise covariances, i.e., you can't say $$\operatorname{Cov}(x,y)=\sum_i \operatorname{Cov}(x_i,y_i).\tag1$$ There isn't any simpler form for the RHS of (1); it's just a quantity that arose during the calculation of the expectation of a dot product. (Note that the RHS isn't meaningful unless $x$ and $y$ have the same dimension, which is a hint that (1) cannot be the right definition for cross-covariance.)
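For what it's worth, when $x$ and $y$ do have the same dimension, the sum on the RHS of (1) can be recovered from the cross-covariance matrix as its trace, since the componentwise covariances $\operatorname{Cov}(x_i,y_i)$ sit on its diagonal. A small sketch with made-up data:

```python
import numpy as np

# Made-up equal-dimension toy data (rows = four equally likely outcomes).
X = np.array([[1., 2., 0.], [0., 1., 1.], [2., 0., 3.], [1., 1., 2.]])
Y = np.array([[0., 1., 2.], [1., 3., 0.], [2., 2., 1.], [3., 0., 1.]])
Ex, Ey = X.mean(axis=0), Y.mean(axis=0)

cov_matrix = np.einsum('ni,nj->nij', X - Ex, Y - Ey).mean(axis=0)  # Cov(x, y)
sum_of_covs = ((X - Ex) * (Y - Ey)).mean(axis=0).sum()             # RHS of (1)

# The componentwise covariances are the diagonal of Cov(x, y):
assert np.isclose(np.trace(cov_matrix), sum_of_covs)
```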