I have an understanding question about Bayes' theorem: in
$$p(z|x) = \frac{p(x|z)p(z)}{p(x)},$$
the term $p(z)$ is usually interpreted as the prior probability distribution of a hypothesis $z$ before observing any data $x$.
However, if we write $p(z)$ as the marginal
$$p(z) = \int p(z, x)\, dx = \int p(z|x)p(x)\, dx = \mathbb{E}_{x\sim p(x)}\left[p(z|x)\right],$$
then the term $p(z)$ seems to contain the knowledge about all data $x$.
Therefore, does the prior really represent the hypothesis with no data, or with all data?
Are we no smarter with all the data than we are with no data?
Or is it a question of perspective?
How should I understand the prior correctly?
Thank you!
The prior should be interpreted as representing the uncertainty in the hypothesis before observing any data.
The marginalization computation that you've written should not be viewed as "using information from more data," but rather as averaging over all possible outcomes of what the data $x$ could be. This is less informative than if you have a particular instance of the data $x$, which gives you extra information about $z$.
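This distinction is easy to check numerically. The sketch below uses a made-up discrete model (hypothetical probabilities, binary $z$ and $x$) to show that averaging the posterior $p(z|x)$ over $p(x)$ recovers the prior exactly, while conditioning on a *particular* observation $x$ gives a distribution that differs from the prior:

```python
import numpy as np

# Hypothetical discrete model: z in {0, 1}, x in {0, 1}.
prior = np.array([0.3, 0.7])           # p(z), the prior
lik = np.array([[0.9, 0.1],            # p(x|z=0) for x = 0, 1
                [0.2, 0.8]])           # p(x|z=1) for x = 0, 1

# Marginal p(x) = sum_z p(x|z) p(z)
p_x = prior @ lik                      # shape (2,)

# Posterior p(z|x) = p(x|z) p(z) / p(x), shape (z, x)
post = (lik * prior[:, None]) / p_x[None, :]

# Averaging the posterior over all possible data recovers the prior:
recovered = post @ p_x                 # E_{x~p(x)}[p(z|x)]
print(recovered)                       # -> [0.3 0.7], equal to the prior

# But conditioning on one actual observation (say x = 1) is informative:
print(post[:, 1])                      # differs from [0.3 0.7]
```

In other words, the marginalization identity holds because the expectation "washes out" whatever each individual $x$ would have told you about $z$; only a concrete observation moves you away from the prior.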