I am struggling with this problem of philosophy of mathematics, that is to apply the axiomatic method to give foundations to the theory of formal languages, without mention explicitly the concepts of "set", "order", "number", "finitude" etc. The axioms may imply implicitly these features (and they absolutely will), but not to use them. The task is to create the theory of formal languages like no previous mathematics (formal or conceptual) exists. My idea is introducing axioms that constructs an universe of things called "strings" that behaves like a free semigroup. I got stuck on the problem of introducing an axiom that imply the freeness of the "semigroup" of strings.
I begun with the primitive notions of "string" (word) and "concatenation" (juxtaposition) and the semigroup axioms:
(A1) For each pair of strings $\alpha$ and $\beta$, there is a unique string (denoted by "$\alpha \beta$") called "concatenation of $\alpha$ followed by $\beta$".
(A2) For any strings $\alpha$, $\beta$ and $\gamma$, if $\delta$ is $\alpha \beta$ and $\varepsilon$ is $\beta \gamma$, then $\delta \gamma$ is $\alpha \varepsilon$. This string is denoted by "$\alpha \beta \gamma$".
We say that a string $\alpha$ is a concatenation (is decomposable) if and only if there exists strings $\beta$ and $\gamma$ such that $\alpha$ is $\beta \gamma$. In this case we call $\beta$ a left-divisor of $\alpha$ and we call $\gamma$ a right-divisor of $\alpha$. We call both $\beta$ and $\gamma$ divisors of $\alpha$. If $\alpha$ is not decomposable we call it indecomposable or we say that $\alpha$ is a symbol, ie, $\alpha$ has no divisors.
Then I postulate an axiom
(A3) There exists symbols.
Now I need to postulate axioms that imply (implicitly) that this semigroup of strings is freely generated by the "set" of symbols. That is, every string is a uniquelly finite concatenation of symbols. The usual intuitive way to say this is: for every string $\alpha$, there exists a unique finite number of symbols, say $\sigma_1$, $\sigma_2$, ..., $\sigma_n$, such that $\alpha$ is the concatenation $\sigma_1 \sigma_2 ... \sigma_n \, $. But I am not permitted to mention numbers, order or finitude (or to write the dots "...").
First attempt.
I postulate the cancellation laws.
(A4) For any strings $\alpha$, $\beta$ and $\gamma$, if $\alpha \beta$ is $\alpha \gamma$, then $\beta$ is $\gamma$.
(A5) For any strings $\alpha$, $\beta$ and $\gamma$, if $\beta \alpha$ is $\gamma \alpha$, then $\beta$ is $\gamma$.
And some axioms.
(A6) If $\alpha$ is a string that is not a symbol, then there exists a symbol $x$ and a string $\beta$ such that $\alpha$ is $x \beta$.
(A7) If $\alpha$ is a string that is not a symbol, then there exists a symbol $y$ and a string $\beta$ such that $\alpha$ is $\beta y$.
(A8) For any symbols $x$ and $y$ and any strings $\alpha$ and $\beta$, if $x \alpha$ is $y \beta$, then $x$ is $y$.
(A9) For any symbols $x$ and $y$ and any strings $\alpha$ and $\beta$, if $\alpha x$ is $\beta y$, then $x$ is $y$.
From (A6) and (A7), informally, I can recursivelly express a string in a sequence of symbols and a medium string in a way like this $$\alpha = x_1 \alpha_1 y_1 = x_1 x_2 \alpha_2 y_2 y_1 = x_1 x_2 x_3 \alpha_3 y_3 y_2 y_1 = \ ...$$ but I can't see a way to make it stop at a finite number of steps, ie, say that for some natural number $n$ the string $\alpha_n$ is a symbol. This will do the work of the "set" of strings to be generated by the "set" of symbols.
From (A4), (A5), (A8) and (A9) I can do the cancellations recursivelly. That is, if $\alpha$ is $\beta$ and, by above, $$\alpha = x_1 \alpha_1 y_1 = x_1 x_2 \alpha_2 y_2 y_1 = x_1 x_2 x_3 \alpha_3 y_3 y_2 y_1 = \ ...$$ and $$\beta = a_1 \beta_1 b_1 = a_1 a_2 \beta_2 b_2 b_1 = a_1 a_2 a_3 \beta_3 b_3 b_2 b_1 = \ ... \ , $$ then $x_j$ is $a_j$, $y_j$ is $b_j$ and $\alpha_j$ is $\beta_j$, for all $j$. Again halting this process is a thing that I could not manage to do.
Second attempt.
I found the book "The algebraic theory of semigroups" vol. 2, by Clifford and Preston. At page 118, they mention a theorem by Levi and Dubreil-Jacotin transcripted bellow (with a remark)
(L&D-J) The semigroup $S$ is a free semigroup if and only if it satisfies each of the following conditions:
(1) the left and right cancellation laws hold in $S$;
(2) $S$ contains no two-sided identity element;
(3) if $ \ ax=by \ $ for $a$, $b$, $x$, $y$ in $S$, then $ \ a=b \ $ or $ \ a=bu \ $ or $ \ b=av \ $ for $u$, $v$ in $S$;
(4) each element of $S$ has only a finite number of left divisors.
Remark. Conditions (3) and (4) of the preceding theorem can be replaced by their left-right duals, when the conclusion of the theorem will continue to hold.
The condition (1) of the theorem above is (A4) and (A5) togheter. The condition (2) can be easilly made an axiom
(A10) There is no string $\lambda$ with the property that, for any string $\alpha$, $\lambda \alpha$ is $\alpha$ and, for any string $\beta$, $\beta \lambda$ is $\beta$.
Condition (3) of (L&D-J) can be made an axiom rewriting it
(A11) For any strings $\alpha$, $\beta$, $\gamma$ and $\delta$, if $\alpha \gamma$ is $\beta \delta$, then $\alpha$ is $\beta$ or there exist some string $\varepsilon$ such that $\alpha$ is $\beta \varepsilon$ ($\beta$ is a left-divisor of $\alpha$) or there exist some string $\omega$ such that $\beta$ is $\alpha \omega$ ($\alpha$ is a left-divisor of $\beta$).
But condition (4) of the theorem is the problem here. I could not rewrite it without explicit mention the concept of finitude, order or number.
Any help is appreciated.
If we rephrase the question, you are asking for a set of formulae of first-order logic in the signature of semigroups that define uniquely the notion of a free semigroup.
So the question is to find a family
$F = \varphi_1, \varphi_2, \dots$
of first-order sentence in the signature of semigroups (i.e., a binary operator '.'), such that there is exactly one model of $F$ (a model is of $F$ is a structure that satisfies all the sentences of $F$ simultaneously), and this structure corresponds to standard words: let us call $W$ be the structure of standard words:
This is impossible: whatever is the choice of $F$ that you make, if $W$ is a model of $F$, then there exists another model for $F$ that is not isomorphism to $W$. This is a consequence of the compactness theorem in model theory.
Compactness: For all sets of formulae $E$, if every finite subset of $E$ has a model, then $E$ has a model.
Here is how to use is. We throw a new constant $\alpha$. Note first that that you can provide for all $n$ a formula $\psi_n$ that means `the word $\alpha$ has length at least $n$'. Let $E=\{\psi_n\mid n\in\mathbb{N}\}$, and consider now the set of formulae:
$F\cup E$.
Assume now that $F$ has as model the standard words $W$, then every finite subset of $F\cup E$ has a model. Indeed, every finite sets of formulae contain finitely many formulae from $E$, so it is sufficient to take as model $W$ together with $\alpha$ as chosen to be a sufficiently long word. By compactness, this means that there exists a model that satisfies simultaneously all the formulae in $F\cup E$. This model contains thus a word, more precisely $\alpha$, that has the property of being `longer than every number $n$'. This model cannot be the standard one $W$.