I am trying to work through the paper "Repairing Neural Networks by Leaving the Right Past Behind" (arXiv) and am really struggling with the mathematics. The paper states that the key idea is that they can express:
$$ p(\mathcal{D}\setminus\mathcal{C} | \theta) = p(\mathcal{D}|\theta) / p(\mathcal{C}|\theta), \quad\forall\mathcal{C} \subset \mathcal{D} $$
This is possible due to the "i.i.d. modelling assumption".
I tried to understand this formulation using my (limited) intuition, via reformulations, and even by writing out small sets and computing the conditional probabilities by hand, but none of these matched the formulation above.
Under what conditions is the above formulation correct, and what is the intuition behind it?
The symbols $\mathcal D, \mathcal C$ represent sets of data obtained by independent and identically distributed (i.i.d.) sampling from some distribution with parameter $\theta$.
Therefore, the data in $\mathcal{D\smallsetminus C}$ are conditionally independent of the data in $\mathcal{D\cap C}$ given $\theta$, since these two parts of $\mathcal D$ are disjoint.
Further, the text specifies that $\mathcal{\forall C\subset D}$, which means that $\mathcal{C = D\cap C}$.
And so we have this:
$$\begin{align}p(\mathcal D\mid\theta) &=p(\mathcal{(D\smallsetminus C)\cup(D\cap C)}\mid\theta)&&\text{by definition of the union}\\ &=p(\mathcal{D\smallsetminus C}\mid\theta)\cdot p(\mathcal{D\cap C}\mid\theta)&&\text{by independence} \textit{ of the data} \text{ given } \theta\\ &=p(\mathcal{D\smallsetminus C}\mid\theta)\cdot p(\mathcal C\mid\theta)&&\text{when }\mathcal{C\subset D}\\[2ex]\therefore\quad p(\mathcal{D\smallsetminus C}\mid\theta) &= p(\mathcal D\mid\theta) / p(\mathcal C\mid\theta)&&\forall \mathcal{C\subset D}\end{align}$$
That is all.
$p(\mathcal E\mid\theta)$ is the probability of obtaining the data points in $\mathcal E$ across $\lvert\mathcal E\rvert$ trials, each governed by the same parameter $\theta$.
To clarify: the sets are not events; the individual data points are. A set of data corresponds to the conjunction of events from separate trials, so a union of sets of data corresponds to the conjunction of those events.
Thus, for independent sets of data $\mathcal E$ and $\mathcal F$, we have $p(\mathcal{E\cup F}\mid\theta)=p(\mathcal E\mid\theta)\cdot p(\mathcal F\mid\theta)$.
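The identity is easy to check numerically on a toy model. Below is a minimal sketch, assuming a Bernoulli likelihood (my choice for illustration, not the model from the paper), where the first few draws play the role of $\mathcal C$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.3  # Bernoulli parameter (illustrative choice)

# D: the full i.i.d. dataset; C: a subset of it
D = rng.binomial(1, theta, size=10)
C = D[:4]          # first 4 points play the role of C
D_minus_C = D[4:]  # the remaining points, D \ C

def likelihood(data, theta):
    # p(data | theta) for i.i.d. Bernoulli observations:
    # the product of the per-trial probabilities
    return float(np.prod(theta ** data * (1 - theta) ** (1 - data)))

# Product rule for disjoint sets: p(D|theta) = p(D\C|theta) * p(C|theta)
product = likelihood(D_minus_C, theta) * likelihood(C, theta)

# The quotient form from the paper: p(D\C|theta) = p(D|theta) / p(C|theta)
lhs = likelihood(D_minus_C, theta)
rhs = likelihood(D, theta) / likelihood(C, theta)
```

Both identities hold (up to floating point) because the i.i.d. likelihood is just a product over trials; dividing out the factors belonging to $\mathcal C$ leaves exactly the factors belonging to $\mathcal{D\smallsetminus C}$.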