MLE in introductory probabilistic information theory


Consider sending a bit $x \in \{0,1\}$ through a noisy symmetric channel, such that for a given input $x$ and a given (possibly flipped) output $y$, $\forall i,j \in \{0,1\}.\ P(y = i \mid x = j) = \left\{\begin{array}{cc} 1-f & i = j\\ f & i \neq j\end{array}\right.,$ where $f \in [0,1]$. Consider sending $n$ copies of the same input bit $x$ (call this copy vector $\mathbf{x}$) and receiving the output bit vector $\mathbf{y}$ (where $\mathbf{y}_i$ is the output after sending $\mathbf{x}_i$ through the channel). Let $O = |\{i \in [n] \mid \mathbf{y}_i = 1\}|$ and $Z = |\{i \in [n] \mid \mathbf{y}_i = 0\}|$. I want to show that $$\hat{x} = \left\{\begin{array}{cc} 1 & O > Z \\ 0 & \text{otherwise}\end{array} \right.$$

is the maximum likelihood estimator for $x$. However, I am having difficulty formalizing this as a likelihood problem in terms of the observed outputs. Any recommendations on how to go about proving this?


On BEST ANSWER

$$P(\mathbf{x} \mid \mathbf{y}) = \frac{P(\mathbf{y} \mid\mathbf{x})\, P(\mathbf{x})}{P(\mathbf{y})} $$

We seek the estimate $\mathbf{\hat{x}}$, the value of $\mathbf{x}$ that maximizes the above for the given $\mathbf{y}$. First note that the denominator is the same for all $\mathbf{x}$, so we need only maximize the numerator. Further, $P(\mathbf{x})$ is zero except for the constant $n$-tuples $\mathbf{x}=\mathbf{0}$ and $\mathbf{x}=\mathbf{1}$. And since the channel is memoryless (the noise is independent across uses), the conditional probability factors:

$$ P(\mathbf{y} \mid\mathbf{x}={\bf 0}) = \prod P(y_i \mid x_i=0) = f^O (1-f)^Z $$

$$ P(\mathbf{y} \mid\mathbf{x}={\bf 1}) = \prod P(y_i \mid x_i=1) = (1-f)^O f^Z $$

Assuming $P(\mathbf{x}={\bf 0})=P(\mathbf{x}={\bf 1})=1/2$ (this assumption is missing from your problem statement) and $f < 1/2$ (again missing), compare the two likelihoods via their ratio: $$\frac{P(\mathbf{y} \mid\mathbf{x}={\bf 1})}{P(\mathbf{y} \mid\mathbf{x}={\bf 0})} = \left(\frac{1-f}{f}\right)^{O-Z},$$ which exceeds $1$ exactly when $O > Z$, since $(1-f)/f > 1$ for $f < 1/2$. This yields the stated ML estimate (majority voting).
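To see the majority-vote decoder at work, here is a minimal simulation sketch (the function name `bsc_majority_trial` and the parameter choices are my own, not from the question): it sends $n$ copies of a bit through a binary symmetric channel with flip probability $f$ and decodes by comparing $O$ against $Z$.

```python
import random

def bsc_majority_trial(x, n, f, rng):
    """Send bit x through a BSC(f) n times and decode by majority vote."""
    flips = sum(1 for _ in range(n) if rng.random() < f)  # bits flipped by the channel
    ones = n - flips if x == 1 else flips                 # O = count of received 1s
    return 1 if ones > n - ones else 0                    # xhat = 1 iff O > Z (ties -> 0)

rng = random.Random(0)
n, f, trials = 11, 0.1, 2000
errors = sum(bsc_majority_trial(1, n, f, rng) != 1 for _ in range(trials))
print(errors / trials)  # empirical error rate of the repetition code
```

With $n = 11$ and $f = 0.1$, an error requires at least $6$ of the $11$ bits to flip, so the decoded error rate is far below the single-use error rate $f$, as the simulation confirms.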