Consider a string of length $n \geq 3$ over an alphabet $\{1,\dots, \sigma\}$. An edit operation is a single symbol insert, deletion or substitution. The edit distance between two strings is the minimum number of edit operations needed to transform one string into the other one. Given a string $S$ of length $n$ with $S_i \in \{1,\dots, \sigma\}$, my question relates to the number of distinct strings which are edit distance at most $3$ from $S$.
Let us write $g_{k, \sigma}(S)$ for the number of distinct strings over the alphabet $\{1,\dots, \sigma\}$ which are edit distance at most $k$ from $S$, i.e. $g_{k,\sigma}(S) = |\{S' : d(S', S) \leq k\}|$ where $d(-,-)$ is the edit distance.
Let $X_n$ be a random variable representing a random string over the alphabet $\{1,\dots, \sigma\}$ of length $n$, with the symbols chosen uniformly and independently.
This leads directly to my question:
Let $X_n$ be a random variable representing a random string of length $n$, with the symbols chosen uniformly and independently. What is:
$$\mathbb{E}(g_{3, \sigma}(X_n))\;?$$
For $\sigma=2$ we can get an explicit formula $(40+6n-4n^2)/2^n-83/2+(331/12)n-6n^2+(2/3)n^3$. So my question is, what does the dependency on the alphabet size $\sigma$ look like?
Varying v. Unchanged String Length
If, as you initially indicated in response to my comment, the length of the transformed string can differ from the length of the original, then this problem becomes vastly more difficult because the set of distinct editing operations (operations that might potentially yield a distinct result) includes all 18 of the following:
Whenever multiple insertions or multiple deletions are performed, moreover, counting becomes inordinately difficult. If, on the other hand, we require that the length remain unchanged, we have only 6 editing combinations to consider and the problem becomes more tractable because none of those 6 combinations involves multiple insertions or multiple deletions. Indeed, the counting for each of the six cases becomes relatively straightforward; the trickiest bit is discounting to avoid double-counting instances when two different editing operations will produce the same string--a problem solved in an answer to another question.
The Six Cases and the Danger of Overcounting
To get our bearings initially, we can generalize this logic:
A fine-grained consideration of the five possible types of single edits thus yields:
We can now apply that basic logic to each of our six cases:
no edits
Performing no edits whatsoever yields only the original string, so 1 result for this case.
one substitution
There are $n$ different symbols and $\sigma-1$ ways each can be substituted into a different symbol, so $n(\sigma-1)$ results.
two substitutions
There are $\binom{n}{2}$ different pairs and $(\sigma-1)^2$ ways to modify each: $\binom{n}{2}(\sigma-1)^2$ results.
three substitutions
There are $\binom{n}{3}$ different trios and $(\sigma-1)^3$ ways to modify each: $\binom{n}{3}(\sigma-1)^3$.
one deletion, one insertion, no substitutions
For this case, we can generalize this solution for $\sigma=2$ to any $\sigma$, using the same logic to avoid double-counting those instances where two substitutions would yield the same result as one deletion and one insertion.
Because the tricky logic to prevent double-counting carries directly over, the only modification required is to substitute a variable $\sigma$ for the fixed $\sigma=2$:
$2\cdot\frac{1}{\sigma}\sum_{k=3}^n(n+1-k)=2\cdot\frac{1}{\sigma}\sum_{k=1}^{n-2}k=\frac{(n-1)(n-2)}{\sigma}\;$
The overcount of results that have already been tallied as two substitutions can be calculated as follows when $\sigma=2$:
Again, our only modification is to substitute $\sigma$ for 2:
$\sum_{k=3}^n\left(\frac1{\sigma}\right)^{k-2}(n+1-k)=\sum_{k=1}^{n-2}\left(\frac1{\sigma}\right)^{n-k-1}k=n-3+{\sigma}^{-(n-2)}\;$
Swapping in $\sigma$ a final time yields:
$\sum_{k=3}^n\left(\frac1{\sigma}\right)^{k-1}(n+1-k)\;$
These two overcounts (which, alas, cannot be combined as cleanly as when the symbols are binary) are then subtracted from the initial count of deletion/insertion operations to yield the overall results produced by this case, but not by case 3 above:
$\frac{(n-1)(n-2)}{\sigma}\ - \left(n-3+{\sigma}^{-(n-2)}\right) - \sum_{k=3}^n\left(\frac1{\sigma}\right)^{k-1}(n+1-k)\;$
That same calculation carries over to the final case. Here, however, each combination of one deletion and one insertion--likewise discounted to avoid double-counting the triple substitutions already tallied in case 4 above--is accompanied by a third edit: a substitution involving one of the $n-1$ original symbols remaining after the deletion. Since each of these $(n-1)$ symbols admits $(\sigma-1)$ novel substitutions, the total count for the sixth and final case becomes:
$\left(\frac{(n-1)(n-2)}{\sigma}\ - \left(n-3+{\sigma}^{-(n-2)}\right) - \sum_{k=3}^n\left(\frac1{\sigma}\right)^{k-1}(n+1-k)\right)(n-1)(\sigma-1);$
Summing the (previously uncounted) results produced by each of these six cases should yield the expected count when the length of the string remains unchanged. It's ugly (perhaps unnecessarily), but I hope correct.