I am trying to prove that for some language, S, $$ S^* = S^* S^*$$ Logically, it makes sense to me but I am not sure if my reasoning is rigorous enough.
We know $ S^* $ is given by the Kleene closure definition $$ S^* = S^0 \cup S^1 \cup S^2 \cup....\cup S^n \cup....$$ where $n \geq 0$.
Multiplying this by $S^*$, we obtain- $$ S^* S^*= S^*S^0 \cup S^*S^1 \cup S^*S^2 \cup....\cup S^*S^n \cup....$$ $$ S^* S^*= (S^0 \cup S^1 \cup S^2 \cup....\cup S^n \cup....)S^0 \cup (S^0 \cup S^1 \cup S^2 \cup....\cup S^n \cup....)S^1 \cup (S^0 \cup S^1 \cup S^2 \cup....\cup S^n \cup....)S^2 \cup....\cup (S^0 \cup S^1 \cup S^2 \cup....\cup S^n \cup....)S^n \cup....$$ Which evaluates to: $$S^*S^* = S^0 \cup S^1 \cup S^2 \cup....\cup S^n \cup....$$
I am just trying to see if there is a better method to prove this statement, or if I made any incorrect mathematical assumptions. Thanks!
A rewriting of your approach to not use "$\dots$" would look like: $$S^*S^* = S^*\bigcup_{i=0}^\infty S^i = \bigcup_{i=0}^\infty S^*S^i = \bigcup_{i=0}^\infty\bigcup_{j=0}^\infty S^j S^i = \bigcup_{i=0}^\infty\bigcup_{j=0}^\infty S^{j+i} = \bigcup_{n=0}^\infty S^n = S^*$$ The step going from $\bigcup_{i=0}^\infty\bigcup_{j=0}^\infty S^{j+i}$ to $\bigcup_{n=0}^\infty S^n$ is slightly non-trivial and takes an argument which is what you omitted.
Alternatively, you can use the laws of Kleene algebra and prove the result with a bit more generality. A defining property of a star semiring in general is (continuing to use set-theoretic notation): $S^* = S^0 \cup SS^* = S^0 \cup S^*S$. From this $S^* \subseteq S^*S^*$ is easily shown, $$S^* \subseteq S^*\cup SS^*S^* = (S^0\cup SS^*)S^* = S^*S^*$$ The other direction uses the Kleene algebra property that $ax \leq x \implies a^*x \leq x$. We set $a = S$ and $x = S^*$ which will give us $S^*S^*\subseteq S^*$ if we can show $SS^* \subseteq S^*$ but that follows trivially using, again, $S^*=S^0\cup SS^*$. Star semirings are fairly common and rather handy, so it doesn't hurt to get familiar with their manipulation.
A third approach is to note that $S^*$ is the smallest set, $X$, such that $X = S^0\cup SX$. This is an inductive definition and gives us a rule of induction over $S^*$. Writing $\epsilon$ for the empty sequence (so $S^0 = \{\epsilon\}$), we have: if for some predicate $P$, $P(\epsilon)$ holds and for all $\bar x\in S^*$ and $s\in S$, $P(\bar x)$ implies $P(s\bar x)$, then $P$ holds for all elements of $S^*$. As a rule: $$\cfrac{P(\epsilon)\qquad\forall \bar x\in S^*, s\in S.P(\bar x) \Rightarrow P(s\bar x)}{\forall \bar x\in S^*.P(\bar x)}$$ or set-theoretically: $$\cfrac{\epsilon\in P\qquad\forall \bar x\in S^*, s\in S.\bar x\in P \Rightarrow s\bar x \in P}{\forall \bar x\in S^*.\bar x \in P}$$ and simplified (at which point the connection to the "smallest set such that..." becomes clear): $$\cfrac{\epsilon\in P\qquad SP \subseteq P}{S^*\subseteq P}$$ Now we can go in one of two directions. One direction is to use this induction rule to prove $S^* \subseteq S^*S^*$, i.e. $\forall \bar x\in S^*.\bar x\in S^*S^*$, and $S^*S^*\subseteq S^*$, i.e. $\forall \bar x\in S^*, \bar y \in S^*.\bar x \bar y \in S^*$ (and this latter corresponds to an element-wise view of the omitted step in the first proof). Another direction, inspired by category theory, is to view the induction rule as a universal property and show that $S^*S^*$ also satisfies the universal property, i.e. the rule: $$\cfrac{P(\epsilon)\qquad\forall \bar x\in S^*S^*, s\in S.P(\bar x) \Rightarrow P(s\bar x)}{\forall \bar x\in S^*S^*.P(\bar x)}$$ holds. In the set-theoretic view, this is the statement that $S^*S^*$ is also the smallest set, $X$, such that $X = S^0\cup SX$. Unsurprisingly, doing this mainly reduces to proving a statement very much similar to the $S^*S^* \subseteq S^*$ one. Indeed, you could just prove this statement and then use minimality of $S^*$.
The second approach is the one I'd most prefer because of the generality of the result, the minimal requirements on the ambient logic, and the simple algebraic manipulation form of the proof. The third approach also starts to loosen the requirements of the ambient logic, and presents $S^*$ in arguably a slightly more natural manner (namely as cons-lists, i.e. like the algebraic data type
data List S = Empty | Append S (List S)to use Haskell syntax).