Generalized expected Hamming distance

75 Views Asked by At

Let $x,y\in\{0,1\}^n$ be uniformly drawn at random from the set of all length-$n$ bitstrings $\{0,1\}^n$. Let $d(x,y)$ be the Hamming distance of two such bitstrings, i.e. the number of positions at which the bitstrings differ.

Further, let $f:\{0,1\}^n \to \{-1,1\}$ and let $r=\frac{|f^{-1}(-1)|}{2^n}$ be the ratio of bitstrings where $f$ is negative to all bitstrings. I am wondering how to compute $$\underset{x,y}{\mathbb{E}}[f(x)f(y)\;d(x,y)]$$ For symmetry reasons, I would think that this expression should only depend on the ratio $r$. Is this intuition correct? EDIT: No, it's not as pointed out by the comments.

For the constant function $f=+1$ with $|f^{-1}(-1)|=0$, it is easy to calculate the expression. It's just the expected Hamming distance $$\underset{x,y}{\mathbb{E}}[d(x,y)]=n/2.$$ The calculation uses the fact that we know that to every $x$, there are precisely $\binom{n}{k}$ many $y$ at Hamming distance $d(x,y)=k$. So that $$\underset{x,y}{\mathbb{E}}[d(x,y)]=\frac{1}{2^{2n}}\sum_{x,y\in\{0,1\}^n}d(x,y)=\frac{2^n}{2^{2n}}\sum_{k=0}^n \binom{n}{k} k= n/2.$$

However, I am now trying to generalize this calculation to other $f$. I would also be happy, for starters, with ideas on how to bound the expression.