Here are two example papers that use the trellis reasoning approach, which doesn't really click with my brain. In fact, every HMM exposition I've come across seems to duplicate the same work regarding the forward algorithm. Here I want to prove the recursive formula via induction, because it's a perfect application for it!
But I will need some help.
Let $T$ be the length of our observation sequence $O = o_1 \cdots o_T$, and we're trying to prove $P(O|\lambda) = \sum\limits_{i=1}^N \alpha_T(i)$. That is, $P(O|\lambda)$ is the probability of seeing the observation sequence $O$ given our model $\lambda = (A, B, \pi)$, where $A$ is the $N \times N$ state transition probability matrix, $B$ is the $N \times M$ matrix of observation probabilities at each state, and $\pi$ is the initial state distribution.
Notation: for an arbitrary hidden state sequence $Q = q_1 q_2 \cdots$ with $q_t \in \{1, \dots, N\}$ for all $t$, the transition probability is $P(q_t | q_{t-1}) \equiv a_{q_{t-1}q_t}$ and the emission probability is $P(o_t | q_t) \equiv b_{q_t o_t} \equiv b_{q_t}(o_t)$, both by definition.
Now take $T = 1$ as our inductive basis case. In that case $P(O|\lambda) = \sum\limits_{|Q| = 1} P(O|Q, \lambda) P(Q|\lambda) = \sum\limits_{i=1}^N \pi_{i} b_{i}(o_1) = \sum\limits_{i=1}^N \alpha_1(i)$, since $\alpha_1(i) \equiv P(o_1 ; q_1 = i | \lambda) = \pi_i b_i(o_1)$ by definition. The basis case is proved.
Now suppose, as the inductive hypothesis, that $P(O'| \lambda) = \sum\limits_{i=1}^N \alpha_{T-1}(i)$ for any observation sequence $O'$ of length $T-1$. Our job is then to show that, when we append another observation so that $O = O' \cdot o_T$, we get $P(O | \lambda) = \sum\limits_{i=1}^N \alpha_{T}(i)$.
Suppose that the model $\lambda$ has passed through the hidden state sequence $Q'$ while emitting the observation sequence $O'$, with $|O'| = |Q'| = T-1$. Then if we multiply the joint probability $P(O', Q')$ by the probability of transitioning to $q_T$ and observing $o_T$, we just need to sum over all such products to get what we're after. Namely,
$P(O | \lambda) = \sum\limits_{i=1}^N P(O', Q') P(q_T | q_{T-1}) P(o_T | q_T)$.
Now here is where I am stuck: we need $q_{T-1}$ to range over the final states of arbitrary hidden state sequences, but that index does not appear in the summation limits.
I know this method will work out, intuitively. Just need some help over this symbolic hurdle.
$$\begin{aligned}
P(o_1 o_2 \cdots o_{T};\, q_{T} = j | \lambda)
&= \Big[\sum\limits_{j' = 1}^N P(o_1 o_2 \cdots o_{T-1};\, q_{T-1} = j' | \lambda) \cdot P(q_{T} = j | q_{T-1} = j')\Big] \cdot P(o_{T} | q_{T} = j) \\
&= \Big[\sum\limits_{j'=1}^N \alpha_{T-1}(j')\, a_{j'j}\Big]\, b_{j}(o_T) \\
&= \alpha_T(j).
\end{aligned}$$
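The first equality above is where the symbolic hurdle gets cleared. Spelled out (a sketch of that step), it is just the law of total probability over $q_{T-1}$ combined with the two standard HMM assumptions:

$$\begin{aligned}
P(o_1 \cdots o_{T};\, q_{T} = j | \lambda)
&= \sum\limits_{j'=1}^N P(o_1 \cdots o_{T};\, q_{T-1} = j';\, q_{T} = j | \lambda) \\
&= \sum\limits_{j'=1}^N P(o_1 \cdots o_{T-1};\, q_{T-1} = j' | \lambda)\, P(q_{T} = j | q_{T-1} = j')\, P(o_{T} | q_{T} = j),
\end{aligned}$$

where the first line is total probability over the state at time $T-1$, and the second uses that $q_T$ depends only on $q_{T-1}$ (Markov assumption) and $o_T$ depends only on $q_T$ (output independence). Pulling the common factor $P(o_T | q_T = j)$ outside the sum gives the bracketed expression.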
Now all you do is sum over $j \in \{1, \dots, N\}$, which gives $P(O|\lambda) = \sum\limits_{j=1}^N \alpha_T(j)$. Thus a direct proof is sufficient; I'm not sure how you would apply induction here.
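As a numerical sanity check on the recursion (not part of the proof), here is a short Python sketch comparing the forward recursion against brute-force marginalization over all $N^T$ hidden state sequences. The sizes `N`, `M`, `T` and the random model are hypothetical, chosen only to make the check fast.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
N, M, T = 3, 4, 5  # hypothetical: 3 states, 4 symbols, length-5 sequence

# Random row-stochastic model lambda = (A, B, pi)
A = rng.random((N, N)); A /= A.sum(axis=1, keepdims=True)
B = rng.random((N, M)); B /= B.sum(axis=1, keepdims=True)
pi = rng.random(N); pi /= pi.sum()

O = rng.integers(0, M, size=T)  # an arbitrary observation sequence

# Brute force: P(O|lambda) = sum over all N^T hidden state sequences Q
brute = 0.0
for Q in itertools.product(range(N), repeat=T):
    p = pi[Q[0]] * B[Q[0], O[0]]
    for t in range(1, T):
        p *= A[Q[t - 1], Q[t]] * B[Q[t], O[t]]
    brute += p

# Forward recursion: alpha_1(i) = pi_i b_i(o_1),
# alpha_t(j) = (sum_i alpha_{t-1}(i) a_{ij}) b_j(o_t)
alpha = pi * B[:, O[0]]
for t in range(1, T):
    alpha = (alpha @ A) * B[:, O[t]]

assert np.isclose(brute, alpha.sum())  # P(O|lambda) agrees both ways
```

The `alpha @ A` line is exactly the bracketed sum $\sum_{j'} \alpha_{T-1}(j')\, a_{j'j}$ written as a vector-matrix product.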