Let $(a,b)\in\mathbb{R}^2$ be fixed, known constants that satisfy $|a|+|b|<0.5$.$^1$ Let $\theta$ have a $\text{Beta}(\alpha,\beta)$ prior. Given an observation of $X\sim\text{Bernoulli}(a+b\theta)$, what is the posterior distribution of $\theta$?
I am having trouble calculating this. Is it still $\text{Beta}(\cdot,\cdot)$?
$^1\ $ This just ensures that $X$'s distribution is well-defined.
The posterior density is proportional to
$$\begin{align} f(\theta \mid x) &\propto f_X(x \mid \theta, a, b)\, p(\theta) \\ &\propto (a + b\theta)^x (1 - (a + b\theta))^{1-x} \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} \\ &= \begin{cases} (a + b \theta)\theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, & x = 1, \\ (1 - a - b\theta)\theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, & x = 0.\end{cases} \end{align}$$ So no: the posterior is not in general beta distributed, because the extra linear factor $(a + b\theta)$ or $(1 - a - b\theta)$ prevents writing the kernel in the form $\theta^{\alpha^* - 1} (1 - \theta)^{\beta^* - 1}$ for suitable posterior hyperparameters $\alpha^*, \beta^*$ that are functions of $x, a, b$.
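This failure of conjugacy is easy to check numerically. The sketch below uses made-up values (a $\text{Beta}(2,3)$ prior, $a = 0.1$, $b = 0.2$, and observation $x = 1$, none of which come from the question): it confirms the closed-form normalizer $a B(\alpha,\beta) + b B(\alpha+1,\beta)$ of the $x=1$ kernel, then moment-matches a beta density to the posterior and observes that it fails to reproduce it.

```python
import math

# Made-up example values (not from the question): Beta(2, 3) prior,
# a = 0.1, b = 0.2, and observation x = 1.
alpha, beta, a, b = 2.0, 3.0, 0.1, 0.2

def beta_fn(p, q):
    # Beta function B(p, q) via math.gamma.
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def kernel(t):
    # Unnormalized posterior density for x = 1.
    return (a + b * t) * t ** (alpha - 1) * (1 - t) ** (beta - 1)

# Normalizing constant: midpoint rule vs. the closed form
#   a * B(alpha, beta) + b * B(alpha + 1, beta).
m = 200_000
ts = [(i + 0.5) / m for i in range(m)]
ks = [kernel(t) for t in ts]
Z = sum(ks) / m
Z_closed = a * beta_fn(alpha, beta) + b * beta_fn(alpha + 1, beta)

# Moment-match a Beta(a*, b*) to the posterior; if the posterior really
# were beta, the matched density would agree with it everywhere.
m1 = sum(t * k for t, k in zip(ts, ks)) / m / Z
m2 = sum(t * t * k for t, k in zip(ts, ks)) / m / Z
var = m2 - m1 * m1
common = m1 * (1 - m1) / var - 1
a_star, b_star = m1 * common, (1 - m1) * common

def beta_pdf(t, p, q):
    return t ** (p - 1) * (1 - t) ** (q - 1) / beta_fn(p, q)

# Largest density discrepancy over a few interior points.
gap = max(abs(kernel(t) / Z - beta_pdf(t, a_star, b_star))
          for t in (0.1, 0.3, 0.5, 0.7, 0.9))
```

The nonzero `gap` is the point: no choice of $(\alpha^*, \beta^*)$ makes the beta density coincide with this posterior.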
Regarding your follow-up question, the obvious choice of prior would be a suitable location-scaled beta, where the location parameter is $a$ and the scale is $b$:
$$p(\theta \mid a, b) = b\, \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} (a + b \theta)^{\alpha-1} (1 - (a+b \theta))^{\beta - 1}, \quad -\frac{a}{b} < \theta < \frac{1-a}{b}$$ (taking $b > 0$; for $b < 0$, replace the leading $b$ by $|b|$ and swap the endpoints), since $a, b$ are known. Then the posterior for a sample of size $n$ will be proportional to
$$f(\theta \mid x, a, b) \propto (a + b\theta)^{x+\alpha-1} (1 - (a + b\theta))^{n-x+\beta-1},$$ with the support of $\theta$ remaining the same. Thus this prior is conjugate and the posterior hyperparameters are $$\alpha^* = x+\alpha, \quad \beta^* = n-x+\beta.$$
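This conjugate update can be sanity-checked numerically. A sketch with made-up values ($\alpha = 2$, $\beta = 3$, $a = 1/4$, $b = 1/2$, and $n = 10$ trials with $x = 4$ total successes, none of which appear in the question): the posterior mean of $\theta$ computed by direct integration of the kernel should match the back-transformed mean of $\text{Beta}(x+\alpha,\, n-x+\beta)$.

```python
# Made-up values: Beta(2, 3) prior on psi = a + b*theta, a = 1/4, b = 1/2,
# and n = 10 Bernoulli trials with x = 4 total successes.
alpha, beta, a, b = 2.0, 3.0, 0.25, 0.5
n_trials, x = 10, 4

lo, hi = -a / b, (1 - a) / b          # support of theta: (-1/2, 3/2)

def post_kernel(t):
    # Posterior kernel (a + b t)^(x+alpha-1) * (1 - a - b t)^(n-x+beta-1).
    psi = a + b * t
    return psi ** (x + alpha - 1) * (1 - psi) ** (n_trials - x + beta - 1)

# Posterior mean of theta by direct (midpoint-rule) integration.
m = 200_000
ts = [lo + (hi - lo) * (i + 0.5) / m for i in range(m)]
ks = [post_kernel(t) for t in ts]
theta_mean_num = sum(t * k for t, k in zip(ts, ks)) / sum(ks)

# Conjugacy says psi | x ~ Beta(x + alpha, n - x + beta), so
# E[theta | x] = (E[psi | x] - a) / b = (6/15 - 1/4) / (1/2) = 0.3.
alpha_star, beta_star = x + alpha, n_trials - x + beta
theta_mean_closed = (alpha_star / (alpha_star + beta_star) - a) / b
```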
Basically, what we are doing is Bayesian inference on the transformed parameter $$\psi = a + b \theta,$$ with the usual binomial-beta model. Then any properties of $\psi$ (e.g., posterior distribution, credible interval, highest posterior density interval) are simply back-transformed to obtain the corresponding inference on $\theta$.
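For instance, an equal-tailed credible interval for $\theta$ comes from the beta quantiles for $\psi$ via $\theta = (\psi - a)/b$. A rough sketch, using a made-up posterior $\text{Beta}(6, 9)$ for $\psi$ (e.g. $\alpha = 2$, $\beta = 3$, $n = 10$, $x = 4$ under the update above) with $a = 1/4$, $b = 1/2$; the hand-rolled CDF and bisection here are just to keep the example self-contained, and a real implementation would use a library quantile function such as `scipy.stats.beta.ppf`:

```python
import math

a, b = 0.25, 0.5
alpha_star, beta_star = 6.0, 9.0   # made-up posterior Beta(6, 9) for psi

def beta_cdf(x, p, q, m=20_000):
    # Regularized incomplete beta via midpoint rule (crude but adequate here).
    bpq = math.gamma(p) * math.gamma(q) / math.gamma(p + q)
    s = sum((x * (i + 0.5) / m) ** (p - 1) * (1 - x * (i + 0.5) / m) ** (q - 1)
            for i in range(m))
    return s * x / m / bpq

def beta_quantile(u, p, q):
    lo, hi = 0.0, 1.0
    for _ in range(40):            # bisection on the CDF
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if beta_cdf(mid, p, q) < u else (lo, mid)
    return (lo + hi) / 2

# 95% equal-tailed interval for psi, back-transformed to theta.
psi_lo = beta_quantile(0.025, alpha_star, beta_star)
psi_hi = beta_quantile(0.975, alpha_star, beta_star)
theta_lo, theta_hi = (psi_lo - a) / b, (psi_hi - a) / b
```

Because $\theta = (\psi - a)/b$ is monotone (for $b > 0$), the transformed endpoints retain the same posterior coverage.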
There is one caveat: if we know in advance that the parameter satisfies $\theta \in (0,1)$, then this choice of prior will not work as-is, because the support is restricted to a subinterval depending on the values of $a$ and $b$: specifically, $$L = \max(0, -a/b) < \theta < \min(1, (1-a)/b) = U.$$ In this case, the posterior hyperparameters still obey the above relationship, but the distribution for $\theta$ is what might be called a "location-scale transformed, $(0,1)$-truncated" beta. The normalizing constant becomes more complicated, since the density must be divided by the kept prior mass $$\int_{\theta = L}^U p(\theta \mid a, b) \, d\theta.$$
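For example, with the made-up values $\alpha = 2$, $\beta = 3$, $a = 1/4$, $b = 1/2$ (chosen purely for illustration), the bounds are $L = \max(0, -1/2) = 0$ and $U = \min(1, 3/2) = 1$, and the kept prior mass is $P(1/4 < \psi < 3/4)$ for $\psi \sim \text{Beta}(2,3)$, which can be computed numerically:

```python
import math

# Made-up values: alpha = 2, beta = 3, a = 1/4, b = 1/2.
alpha, beta, a, b = 2.0, 3.0, 0.25, 0.5

L = max(0.0, -a / b)        # 0
U = min(1.0, (1 - a) / b)   # 1

def prior(t):
    # Location-scale transformed beta prior p(theta | a, b).
    psi = a + b * t
    c = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return b * c * psi ** (alpha - 1) * (1 - psi) ** (beta - 1)

# Prior mass kept after truncating theta to (L, U); the truncated prior
# divides by this quantity. Analytically it equals
# P(1/4 < psi < 3/4) for psi ~ Beta(2, 3), i.e. 11/16.
m = 200_000
mass = sum(prior(L + (U - L) * (i + 0.5) / m) for i in range(m)) * (U - L) / m
```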
Let's look at a specific example. Suppose we have $a = -1/4$ and $b = 1/5$. Then the observations $X_i$ will be Bernoulli with success parameter $\psi = -1/4 + \theta/5$ for some random variable $\theta$. If there are no additional restrictions on $\theta$, then the requirement $0 < \psi < 1$ means we would have $5/4 < \theta < 25/4$, and we may proceed with inference using a prior on the transformed parameter $\psi$, then back-transforming to get $\theta$. Specifically, a uniform prior ($\alpha = \beta = 1$) would be $$p(\psi \mid a = -1/4, b = 1/5) = \begin{cases}1, & 0 < \psi < 1, \\ 0, & \text{otherwise,} \end{cases}$$ which if written for $\theta$ would be $$p(\theta \mid a = -1/4, b = 1/5) = \begin{cases}1/5, & 5/4 < \theta < 25/4, \\ 0, & \text{otherwise}. \end{cases}$$ And a beta prior with $\alpha = 2$, $\beta = 1$ would give us $$p(\psi \mid a = -1/4, b = 1/5) = \begin{cases} 2\psi, & 0 < \psi < 1, \\ 0, & \text{otherwise}, \end{cases}$$ and, after multiplying by the Jacobian $d\psi/d\theta = 1/5$, $$p(\theta \mid a = -1/4, b = 1/5) = \begin{cases} \frac{2}{5}\left(-\frac{1}{4} + \frac{\theta}{5}\right) = \frac{4\theta - 5}{50}, & 5/4 < \theta < 25/4, \\ 0, & \text{otherwise}. \end{cases}$$
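Under this change of variables, the $\text{Beta}(2,1)$ density $2\psi$ picks up the Jacobian factor $|d\psi/d\theta| = 1/5$, and the resulting density $2(-1/4 + \theta/5)\cdot\frac15$ must integrate to one over $(5/4, 25/4)$; a quick numerical check:

```python
a, b = -0.25, 0.2
lo, hi = (0 - a) / b, (1 - a) / b   # 5/4 and 25/4

def prior(t):
    # Beta(2, 1) density 2*psi, times the Jacobian |d psi / d theta| = b.
    return 2 * (a + b * t) * b

# Midpoint-rule integral over (5/4, 25/4); for a linear integrand the
# midpoint rule is exact, so this should be 1 up to float rounding.
m = 100_000
total = sum(prior(lo + (hi - lo) * (i + 0.5) / m)
            for i in range(m)) * (hi - lo) / m
```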
But notice that if we imposed the additional condition $0 < \theta < 1$, then $\psi = -1/4 + \theta/5$ would lie in $(-1/4, -1/20)$, entirely outside $(0,1)$, so no valid success probability exists.
So now consider a second example: $a = 1/4$ and $b = 1/3$. In this case, the requirement $0 < \psi < 1$ implies $-3/4 < \theta < 9/4$, which has nonempty intersection with $(0,1)$. So if $0 < \theta < 1$ is an added constraint, this would mean $1/4 < \psi < 7/12$, so a uniform prior for $\psi$ would need to be $$p(\psi \mid a = 1/4, b = 1/3) = \begin{cases} 3, & 1/4 < \psi < 7/12, \\ 0, & \text{otherwise}. \end{cases}$$ The fact that the support for $\psi$ is not $(0,1)$ is due to the constraint on $\theta$: we had to truncate it. Then in terms of $\theta$, we would have the uniform prior $$p(\theta \mid a = 1/4, b = 1/3) = \begin{cases} 1, & 0 < \theta < 1, \\ 0, & \text{otherwise}; \end{cases}$$ namely, $\theta$ has full support on $(0,1)$ because we truncated both ends of $\psi$. But the picture is a bit more complicated when the prior is not uniform; e.g., in the $\alpha = 2$, $\beta = 1$ case, the prior for $\psi$ is $$p(\psi \mid a = 1/4, b = 1/3) = \begin{cases} \frac{36}{5} \psi, & 1/4 < \psi < 7/12, \\ 0, & \text{otherwise}, \end{cases}$$ where $36/5$ comes from dividing the $\text{Beta}(\alpha = 2, \beta = 1)$ density on $0 < \psi < 1$ by the probability of the interval we are keeping, namely the normalizing factor $$\int_{\psi = 1/4}^{7/12} 2\psi \, d\psi = \frac{5}{18}.$$ Then back-transforming for $\theta$ gives $$p(\theta \mid a = 1/4, b = 1/3) = \begin{cases} (4\theta + 3)/5, & 0 < \theta < 1, \\ 0, & \text{otherwise}. \end{cases}$$
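Every quantity in this second example is rational, so the whole calculation can be verified exactly, e.g. with Python's `fractions` module (variable names here are just for illustration):

```python
from fractions import Fraction

a, b = Fraction(1, 4), Fraction(1, 3)

# psi-range corresponding to 0 < theta < 1: (1/4, 7/12).
lo_psi, hi_psi = a, a + b

# Kept prior mass: integral of 2*psi over (1/4, 7/12) = psi^2 at the
# endpoints, which should come out to 5/18.
kept_mass = hi_psi ** 2 - lo_psi ** 2

# Truncated prior for psi is (2 / kept_mass) * psi = (36/5) * psi;
# back-transforming with Jacobian b gives
# p(theta) = (36/5) * (a + b*theta) * b = (4*theta + 3)/5 on (0, 1).
norm = 2 / kept_mass            # 36/5
c0 = norm * a * b               # constant term: 3/5
c1 = norm * b * b               # linear coefficient: 4/5

# The back-transformed density integrates to one over (0, 1):
# integral of c0 + c1*theta is c0 + c1/2.
integral = c0 + c1 / 2
```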