I'm taking a lecture where we proved $ L \subseteq \mathsf{REG(\Sigma^*)} \iff \exists \phi \in \mathsf{MSO}$ with $L(\phi) = L$. For the first direction $L \subseteq \mathsf{REG(\Sigma^*)} \implies \exists \phi \in \mathsf{MSO}, L(\phi) = L$ we were shown the following:
Proof:
We define $\mathsf{MSO}$ for directed graphs of the form $G=(V, E ,\lambda)$ where $V$ is the set of vertices, $E$ the set of edges and $E \subseteq V \times V$ and a labeling function $\lambda : V \to \Sigma$, with $\Sigma$ being an finite alphabet.
So we label every vertice with a letter $a\in \Sigma$. We use variables $x$ for elements from $V$ and $X$ for subsets of $V$ (monadic pradicates).
Let $L= L(A)$ for some NFA $A= (Q, \Sigma, \delta, I, F)$. We now construct $\phi \in \mathsf{MSO}$ with: $$w \in L(A) \iff w \models \phi$$
Therefore let $w = a_1 \ldots a_n$ with $a_i \in \Sigma$ for $1\leq i \leq n$. Then also $V = \{1, \ldots, n\}$ and $E=\{(i, i+1) \mid 1 \leq i <n\}$, and $\lambda(i) = a_i$.
Let $Q = \{0, \ldots, m\}$. Then there exists set variables $X_0, \ldots, X_m$, such that $\phi$ can be set to:
$$\phi = \Bigl\lbrack(\exists X_q : 1 \in X_q \land q \in I) \land (\exists X_q : n \in X_q \land q \in F) \land (\bigwedge_{p, q \in Q} (p \not = q \implies X_p \cap X_q = \emptyset)) \, \,\land $$
$$ \forall x \forall y ( \bigwedge_{p, q, a} (x \in X_p \land \lambda(x)= a \land (x,y \in E)) \implies \bigvee_{(p,q,a) \in \delta} y \in X_q)\Bigr\rbrack$$
Questions:
1.) Why is it enough to use only one word for constructing $\phi$? I thought $\phi$ should describe all words from the language $L$. If that isn't the case, and we construct a new formulae for every $w\in L$, how is that done for infinte languages like for example $ L = a^*$?
2.) What exactly is the purpose of the graph? In "Finite Model Theory" by Leonid Libkin (page 124), Theorem 7.21 there is a similar proof, but I still don't understand why and how we use graphs with the formulaes.
Instead of using graphs, it is easier to interpret a word $u = a_1 \dotsm a_n$ as a finite structure ${\cal M}_u =\{1, \ldots, n\}$ equipped with the natural order $<$ on ${\cal M}_u$ and, for each letter $a$, a predicate $R_a$ giving the positions of the letter $a$ in $u$. For isntance, if $u = ababba$, then ${\cal M}_u = \{1, \ldots, 6\}$, $R_a = \{1,3,6\}$ and $R_b = \{3, 4, 5\}$.
If you use finite graphs, you simply identify a word $a_1 \cdots a_n$ with the graph $G_u$ $$ (0) \xrightarrow{a_1} (1) \xrightarrow{a_2} (2) \rightarrow \cdots \rightarrow (n-1) \xrightarrow{a_n} (n) $$ So your set $V$ should rather be $\{0, 1, \dots, n\}$.
Now you have \begin{align} L(\phi) &= \{ u \in \Sigma^+ \mid u \text{ satisfies } \phi \} \\ &= \{ u \in \Sigma^+ \mid {\cal M}_u \models \phi \} \quad &&\text{(ordered structure version)}\\ &= \{u \in \Sigma^+ \mid G_u \models \phi \} &&\text{(graph version)} \end{align} The proposed formula $\phi$ just encodes the behaviour of the NFA $\cal A$ accepting $L$. If $$ q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \rightarrow \cdots \rightarrow q_{n-1} \xrightarrow{a_n} q_n $$ is a path in $\cal A$ and $q \in Q$, let $X_q = \bigl\{i \in \{0, \ldots, n\} \mid q_i = q\bigr\}$, then $\phi$ (slightly modified) can be interpreted as follows: \begin{align} &\exists X_q \quad 0 \in X_q \land q \in I && \text{$0 \in X_q$ for some initial state $q$}\\ {} \land {} &\exists X_q \quad n \in X_q \land q \in F && \text{$n \in X_q$ for some final state $q$}\\ {} \land {} &\bigwedge_{p, q \in Q} \bigl(p \not = q \implies X_p \cap X_q = \emptyset\bigr) && \text{the sets $X_q$ are pairwise disjoint}\\ {}\land{} & \forall x \forall y\ \bigwedge_{p, q, a} \Bigl( \bigl(x \in X_p \land \lambda(x)= a \land (x,y \in E)\bigr) && \text{if $i \in X_p$ and if $a$ is a letter,} \\ &\implies \bigvee_{(p,q,a) \in \delta} y \in X_q\Bigr) && \text{then $i+1 \in X_q$ for some state $q$ such that $p \xrightarrow{a} q$ is a transition} \end{align}