I am trying to answer this question :
How can we determine the proportions of a box, only by listening to the "noise" it emits, when a particle evolving randomly inside this box bounces on its walls. English isn't my first langage, I hope this is well formulated.
For now, let's play with only 1-dimension. My particle starts at $x=0$ and evolve with descrete steps $x+1$ or $x-1$ (the first step must be $x+1$). When the particle hits a "wall", it emits a sound $i$ when $x=0$ and a sound $a$ when $x=X$, with $X$ the lenght of the 1D box, it emits the sound $0$ otherwise. For example, we could "hear" $(i, 0, 0, 0, i, 0, 0, 0,0, a, 0, 0, 0, 0, i)$. By hypothesis, the particle must start and finish at $x=0$.
I know that these kind of trajectories can be well described with Dyck paths (here, all Dyck paths have the same probability to happen). For now, I'm trying to figure out how to calculate the averages $I_n(X)$ and $A_n(X)$ of $i$'s and $a$'s as a function of the lenght X.
If X=1, the only path we can have is $(i, a, i, a, ..., a, i)$, so if $n$ is big enough, we have $I_n(1)$=$A_n(1)$= $\frac{1}{2}$
If X=2, for $n$ big enough, I find by drawing that $I_n(2)$=$A_n(2)$= $\frac{1}{4}$, and 50% of silence
How can I proove this and more important, how can I figure out for all $X$? The only pattern I can see here is $I_n(X)$=$A_n(X)$=$\frac{1}{2Cat(X)}$ but I'm pretty sure this is stupid. Here Cat(X) is the $X$th Catalan number. Any help about how to structurate my ideas would be awesome.
Have a good day!
This random walk is a random walk on a line graph. A random walk on any connected undirected graph spends the same proportion of time on any edge, so the proportion of time it spends on a vertex is proportional to the degree of the vertex.
In your case, the wall vertices have degree $1$ and the other $X-1$ vertices have degree $2$, so the proportion of time the walk spends at either wall vertex is
$$ \frac1{1+2(X-1)+1}=\frac1{2X}\;. $$
Edit in response to the comment:
If you want to get a quick estimate from limited data, you can use the fact that an $a$ is followed (with some intervening $0$s) by an $i$ (and vice versa) with probability $\frac1X$: If you have $m$ switches from $a$ to $i$ or vice versa, and $n$ returns from $a$ to $a$ or from $i$ to $i$, the likelihood for this (conditional on a total of $n+m$ transitions) is
$$ \binom{n+m}m\left(\frac1X\right)^m\left(1-\frac1X\right)^n\;. $$
Taking the logarithm and setting the derivative with respect to $\frac1X$ to $0$ yields
$$ m\left(1-\frac1X\right)=n\cdot\frac1X $$
and thus $X=1+\frac nm$ as a maximum likelihood estimate of $X$. If the data includes switches, the odd or even number of zeros in between requires $X$ to be even or odd accordingly.
You can also use the full information in the number of $0$s in between, but then you indeed need to use Dyck words to calculate the probabilities of these numbers depending on $X$.
Edit:
Here’s how to make use of the full information you have by counting Dyck paths.
Theorem $10.3.3$ of the article Lattice Path Enumeration by C. Krattenthaler gives the count of lattice paths from $(a,b)$ to $(c,d)$ that don’t go below the line $y=x+s$ or above the line $y=x+t$ as
$$ \sum_{k\in\mathbb Z}\left(\binom{c+d-a-b}{c-a+k(t-s+2)}-\binom{c+d-a-b}{c-b+k(t-s+2)+t+1}\right)\;. $$
If we start at $1$ and end at $1$ after $2n$ steps without hitting $0$ or $X$, we can map this to a lattice path that starts at $(0,0)$, ends at $(n,n)$ and doesn’t go below $y=x$ or above $y=x+X-2$; so we have $a=b=0$, $c=d=n$, $s=0$ and $t=X-2$. Thus the count of these paths is
$$ \sum_{k\in\mathbb Z}\left(\binom{2n}{n+kX}-\binom{2n}{n+kX-1}\right)\tag1\label1\;. $$
If we start at $1$ and end at $X-1$ after $X-2+2n$ steps without hitting $0$ or $X$, we can map this to a lattice path that starts at $(0,0)$, ends at $(n,n+X-2)$ and doesn’t go below $y=x$ or above $y=x+X-2$; so we have the same parameters as before except now $d=n+X-2$, which only affects the upper index. Thus the count of these paths is
$$ \sum_{k\in\mathbb Z}\left(\binom{2n+X-2}{n+kX}-\binom{2n+X-2}{n+kX-1}\right)\tag2\label2\;. $$
For a return from $i$ to $i$ or from $a$ to $a$ with $j$ zeros in between, the path count is given by $\eqref1$ with $n=\frac{j-1}2$, and for a switch from $i$ to $a$ or from $a$ to $i$ with $j$ zeros in between it is given by $\eqref2$ with $n=\frac{j-X+1}2$.
The likelihood of the data given $X$ is the product of these counts for all transitions, divided by a power of $2$ that doesn’t depend on $X$. Note that this is the likelihood of the data given $X$, not the probability of $X$ given the data. This latter probability depends on your prior beliefs about $X$.
Let’s apply this to the example you consider in the question, $(i,0,0,0,i,0,0,0,0,a,0,0,0,0,i)$. (In a comment you wrote $(i,0,0,0,i,0,0,0,a,0,0,0,i)$ instead, but I assume that this was a typo, since this sequence is only possible for $X=4$.)
From the sequence $(i,0,0,0,i,0,0,0,0,a,0,0,0,0,i)$, we know that $X\ge2$ (else there wouldn’t be any $0$s), that $X\le5$ (else the particle couldn’t reach $a$ from $i$ in $5$ steps), and that $X$ is odd (again, else the particle couldn’t reach $a$ from $i$ in $5$ steps). Thus $X=3$ or $X=5$.
For $X=3$, the likelihood for the return $(i,0,0,0,i)$ is, from $\eqref1$ with $n=1$ and $X=3$,
$$ \sum_{k\in\mathbb Z}\left(\binom2{1+3k}-\binom2{0+3k}\right)=\binom21-\binom20=2-1=1\;. $$
(I’m omitting the power of $2$, since it doesn’t depend on $X$ and thus has no bearing on relative likelihoods, which is all we care about here.) The likelihood for each of the switches $(i,0,0,0,0,a)$ and $(a,0,0,0,0,i)$ is, from $\eqref2$ with $n=1$ and $X=3$,
$$ \sum_{k\in\mathbb Z}\left(\binom3{1+3k}-\binom3{0+3k}\right)=\binom31-\binom30-\binom33=3-1-1=1\;. $$
Thus the total likelihood for $X=3$ is the product, $1\cdot1^2=1$.
For $X=5$, the likelihood for the return $(i,0,0,0,i)$ is, from $\eqref1$ with $n=1$ and $X=5$,
$$ \sum_{k\in\mathbb Z}\left(\binom2{1+5k}-\binom2{0+5k}\right)=\binom21-\binom20=2-1=1\;. $$
The likelihood for each of the switches $(i,0,0,0,0,a)$ and $(a,0,0,0,0,i)$ is, from $\eqref2$ with $n=0$ and $X=5$,
$$ \sum_{k\in\mathbb Z}\left(\binom3{0+5k}-\binom3{-1+5k}\right)=\binom30=1\;. $$
So it turns out that this is actually a bit of a boring example, since for each transition and each possible value of $X$ there’s exactly one path. (We could of course also have figured this out by elementary considerations without all the binomial coefficients.) Thus the data has the same likelihood for either value of $X$.
So let’s take a slightly more interesting example and insert two more $0$s in the last transition. The same values $X=3$ and $X=5$ are still possible, but now the last transition has different numbers of paths for these two possibilities: For $X=3$, the likelihood for the switch $(a,0,0,0,0,0,0,i)$ is, from $\eqref2$ with $n=2$ and $X=3$,
$$ \sum_{k\in\mathbb Z}\left(\binom5{2+3k}-\binom5{1+3k}\right)=\binom52+\binom55-\binom51-\binom54=10+1-5-5=1\;, $$
whereas for $X=5$, it is, from $\eqref2$ with $n=1$ and $X=5$,
$$ \sum_{k\in\mathbb Z}\left(\binom5{1+5k}-\binom5{0+5k}\right)=\binom51-\binom50-\binom55=5-1-1=3\;. $$
Thus (since the other path counts are still the same), the data are $3$ times more likely for $X=5$ than for $X=3$.
Now, if you wanted to calculate probabilities for the values of $X$, you’d need to specify your prior beliefs about $X$. For simplicity, let’s assume a uniform prior; that is, prior to the experiment, you considered all values of $X$ equally likely. Then the probabilities would be $P(X=3)=\frac1{1+3}=\frac14$ and $P(X=5)=\frac3{1+3}=\frac34$.