I say a formula $\Phi$ labels a model $g$ as $1$ iff $g \models \Phi$. Otherwise the formula labels the model $-1$. I use $\Phi[g]$ to denote the label that $\Phi$ assigns to $g$.I also assume that $g$ is sampled from some distribution $D$.
My goal is to understand if any FOL formula, labelling graphs of a fixed order $n$, sampled from some distribution $D$, can be approximated by a formula with lower rank or lower number of variables. I state my problem more formally in the following:
Let us assume a function-free first order logic language on graphs of size $n$, i.e. $n$ nodes. Let $G_n$ denote the set of such graphs. Let $D$ be some distribution. For any first order logic sentence $\Phi_{v,k}$, with $v$ variables and quantifier rank $k$, can I have another formula $\Phi'_{v',k'}$, where $v' \leq v$ and $k' \leq k$, and either $v' < v$ or $k' < k$, such that:
$$\mathbb{E}_{g \sim D }\big[\Phi_{v,k}[g] \times \Phi'_{v',k'}[g]\big] > \gamma $$
For some $\gamma \in [0,1]$. Also, assume that $v,k << n$.
My motivation is that this could help, some practical low order approximations of complex structures on graphs. But I am not sure how to attack this, I tried thinking of gaifman normal form, but it adds quantifiers for the normal form, so did not go into much details.