Let $\theta \sim \mathcal{U}$, and let $A(x)$ be defined in terms of the random variable $x$ by:
$$A(x)=x-\lfloor x\rfloor$$
Calculate the MAP estimator $\hat \theta_{MAP}$ for the inputs $A(x)\in [0,0.001]$, where the conditional density of $x$ given $\theta$ is
$$f(x|\theta)=\frac{2(|\theta|+1)}{\sqrt{2\pi}}e^{-2(x-\theta)^2(|\theta|+1)^2}$$
This is my work so far:
$$\hat \theta_{MAP}=\arg{\max_\theta{P\left(\theta |A(x)\right)}}=\arg{\max_\theta{\frac{P\left( A(x)|\theta\right)P(\theta)}{P\left( A(x)\right)}}}=\arg{\max_\theta{P\left( A(x)|\theta\right)}}$$ This is because $P\left( A(x)\right)$ does not depend on $\theta$, so it is irrelevant for the $\arg{\max}$, and since $\theta \sim \mathcal{U}$, all $P(\theta)$ are equal, so the prior is irrelevant as well.
From here I am not sure how to continue. Looking at $x\in[i-0.001,i+0.001]$ for $i=-k,\ldots,0,1,\ldots,k$, is the following true?
$$\hat \theta_{MAP}=\arg{\max_\theta{\sum_i P\left( A(x)|\theta,x_i\right)P(x_i|\theta)}}=\arg{\max_\theta{\sum_i P\left( A(x)+i|\theta\right)}}$$
- How do I reduce the $\arg{\max}$ here?
- Is it correct to say that $\hat \theta_{MAP}=k$? If so, why?
Assume that the distribution of $x$ given $\theta$ has density $p_\theta(\cdot)$. Then, we have $$ \mathbb P(A(x) \in (z,z+dz)) = \sum_{i \in \mathbb Z} p_\theta(z+i)dz $$ so that $A(x)$ has the density $z \mapsto \sum_{i \in \mathbb Z} p_\theta(z+i)$ on $(0,1)$, assuming that the sum is convergent (which is the case if $p_\theta$ has compact support, for example).
Consider the likelihood given $A(x)$, i.e., $$ F(\theta;z) := \sum_{i \in \mathbb Z} p_\theta(z+i). $$ The MAP estimator is the same as the MLE in this case. Note that since the prior is supported on $[0,1]$, the posterior is also supported on $[0,1]$. The posterior will be proportional to $F(\theta;z)$ on $[0,1]$ and zero elsewhere. We have: $$ \hat \theta := \arg \max_{\theta \in [0,1]} F(\theta;A(x)) $$ which can't be simplified further without knowing more about $p_\theta$. We can find the maximum by plotting $F(\theta;A(x))$ as a function of $\theta$.
For the case where $x$ given $\theta$ is Gaussian with mean $\theta$ and variance $\frac1{4(1+|\theta|)^2}$:
The functions $\theta \mapsto f(z+i|\theta)$ are centered around $z+i$ and die down very quickly when moving away from the center. They also become taller as $|z+i|$ grows, since the prefactor scales with $|\theta|+1$.
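To make the shape of these bumps concrete, the likelihood can be written out with the density from the question. This is only a restatement of the definitions above, not a new result:
$$ F(\theta;z) = \sum_{i \in \mathbb Z} \frac{2(|\theta|+1)}{\sqrt{2\pi}}\, e^{-2(z+i-\theta)^2(|\theta|+1)^2}. $$
Evaluating the $i$-th term at its center $\theta = z+i$ makes the exponent vanish, leaving
$$ f(z+i \mid \theta)\Big|_{\theta = z+i} = \frac{2(|z+i|+1)}{\sqrt{2\pi}}, $$
so for $z$ close to $0$ the bump near $\theta = 1$ (the $i=1$ term) is roughly twice as tall as the one near $\theta = 0$ (the $i=0$ term).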
Since we are only concerned with $\theta \in [0,1]$, a plot helps show that $F(\theta;0.001)$ is maximized at $\hat \theta = 1$ over $[0,1]$. In fact, there seems to be a threshold $\tau$ such that as long as $A(x) \in [0,\tau]$ we have $\hat \theta = 1$, and it seems that $\tau \in [0.22,0.23]$.
(As another example, if $A(x) = 0.5$, we have nontrivial $\hat\theta \approx 0.56$.)
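Instead of reading the maximizer off a plot, one can do a simple grid search. The following is a minimal sketch, assuming NumPy is available and that a sum truncated to $|i| \le 10$ is an adequate stand-in for the infinite sum. The helper names `f`, `F` and `theta_hat` are made up for this illustration.

```python
import numpy as np

def f(x, theta):
    """Conditional density f(x | theta) from the question."""
    s = np.abs(theta) + 1.0
    return 2.0 * s / np.sqrt(2.0 * np.pi) * np.exp(-2.0 * (x - theta) ** 2 * s ** 2)

def F(theta, z, k=10):
    """Truncated likelihood: sum of f(z + i | theta) over |i| <= k (k is an assumption)."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    i = np.arange(-k, k + 1)
    # Rows index the integer shifts i, columns index the theta values.
    return f(z + i[:, None], theta[None, :]).sum(axis=0)

def theta_hat(z, n_grid=10001, k=10):
    """Grid-search maximizer of F(theta; z) over theta in [0, 1]."""
    grid = np.linspace(0.0, 1.0, n_grid)
    return float(grid[np.argmax(F(grid, z, k))])

if __name__ == "__main__":
    for z in (0.001, 0.5):
        print(z, theta_hat(z))
```

The grid resolution limits the accuracy of the interior maximizer to about $10^{-4}$; the boundary case $\hat\theta = 1$ is hit exactly because the grid includes the endpoint.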
The hint in the problem is perhaps trying to say this: $$ F(\theta;z) \approx \sum_{|i| \le k} p_\theta(z+i), \quad \text{for}\; \theta \in [0,1] $$ where even $k$ as small as $2$ or $3$ seems to work.
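A quick way to check how small $k$ can be is to compare the truncated sum against a much longer one. The sketch below assumes NumPy and uses $k = 50$ as a stand-in for the infinite sum; the names `F_trunc` and `truncation_error` are hypothetical helpers, not part of the original problem.

```python
import numpy as np

def f(x, theta):
    """Conditional density f(x | theta) from the question."""
    s = np.abs(theta) + 1.0
    return 2.0 * s / np.sqrt(2.0 * np.pi) * np.exp(-2.0 * (x - theta) ** 2 * s ** 2)

def F_trunc(theta, z, k):
    """Sum of f(z + i | theta) over |i| <= k, for a 1-D array of theta values."""
    i = np.arange(-k, k + 1)
    return f(z + i[:, None], theta[None, :]).sum(axis=0)

def truncation_error(z, k, k_ref=50, n_grid=1001):
    """Largest gap over theta in [0, 1] between the k-term and k_ref-term sums."""
    theta = np.linspace(0.0, 1.0, n_grid)
    return float(np.max(np.abs(F_trunc(theta, z, k) - F_trunc(theta, z, k_ref))))

if __name__ == "__main__":
    for z in (0.001, 0.5):
        for k in (1, 2, 3):
            print(f"z={z}, k={k}, max error={truncation_error(z, k):.2e}")
```

The error is measured only on $\theta \in [0,1]$, which is the range that matters for the estimator here; outside that interval more terms would be needed.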