Conditional distribution for a label given a scalar feature

I am trying to create a simple simulation setup for classifiers on toy data. Each data point has a scalar feature $X$, which is uniformly distributed between -1 and 1. Depending on the feature, the data point is given a label $T \in\{-1,1\}$. One of the things I want to specify in this scheme is the fraction of positive instances, $\theta$. Thus:

$P(X=x)= 1/2, x\in [-1,1]$

$P(T=+1) = \theta$ and $P(T=-1) = 1-\theta$

What I am interested in is the distribution of $T$ given $X$. This has to be analytically tractable. I define, somewhat arbitrarily, $P(T=+1|X=x) = Ce^{ax}$, with $a$ a tuning parameter and $C$ a constant chosen so that the marginal $P(T=+1)=\theta$ comes out right. Substituting this into Bayes' rule:

$$P(T=+1|X=x) = \frac{P(X=x|T=+1)P(T=+1)}{P(X=x)} $$ or

$$Ce^{ax} = 2\theta P(X=x|T=+1) $$ Integrating both sides over $x$ (the conditional density on the right integrates to 1) gives:

$$\int_{-1}^1Ce^{ax} dx= 2\theta \int_{-1}^1P(X=x|T=+1) dx$$ $$\frac{C}{a} (e^a-e^{-a})= 2\theta $$

So I get my constant $C = \theta a/\sinh(a)$, and therefore: $$P(T=+1|X=x) = \frac{\theta a\, e^{ax}}{\sinh (a)}$$ This is not a valid probability, because for some values of $a$, for example $a=1$, I obtain values larger than 1: $P(T=+1|X=0.9) = 2.09\theta$, which is incorrect for fractions of positive examples larger than about 0.5.
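The numbers above can be reproduced with a small sketch. It assumes $C = \theta a/\sinh(a)$ from the normalization step and $P(T=+1|X=x) = Ce^{ax}$ as defined earlier; the function names `p_pos_given_x` and `marginal_pos` are only illustrative. It also checks numerically that averaging $Ce^{ax}$ over the uniform density $1/2$ returns $\theta$, so the constant itself seems consistent even where individual values exceed 1.

```python
import math

def p_pos_given_x(x, a, theta):
    # C * exp(a x) with C = theta * a / sinh(a), as derived in the post
    c = theta * a / math.sinh(a)
    return c * math.exp(a * x)

def marginal_pos(a, theta, n=10_000):
    # Midpoint-rule estimate of the integral of P(T=+1|X=x) * (1/2) over [-1, 1]
    h = 2.0 / n
    total = sum(p_pos_given_x(-1.0 + (i + 0.5) * h, a, theta) for i in range(n))
    return 0.5 * total * h

# Factor multiplying theta at a = 1, x = 0.9
ratio = p_pos_given_x(0.9, 1.0, 1.0)
print(round(ratio, 2))

# Largest theta for which the value at x = 0.9 stays at or below 1
print(round(1.0 / ratio, 3))

# The marginal P(T=+1) recovered by integrating over x
print(round(marginal_pos(1.0, 0.3), 6))
```

The first printed value is the factor quoted above, and its reciprocal is the threshold on $\theta$ at $x = 0.9$.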

I have the feeling my reasoning is wrong because I mix probability mass functions with probability density functions.

If I work with cumulative distribution functions, analogous reasoning yields:

$$P(T=+1|X\leq x) = \theta \frac{e^{ax}-e^{-a}}{(x+1)\sinh (a) }$$ This seems more reasonable, but I don't see how to obtain from it the distribution of $T$ given $X=x$.
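As a cross-check of the cumulative expression, here is a rough sketch that integrates $\tfrac{1}{2}Ce^{at}$ from $-1$ to $x$ numerically, divides by $P(X\leq x) = (x+1)/2$, and compares the result with the closed form $\theta\,(e^{ax}-e^{-a})/((x+1)\sinh(a))$ obtained by carrying out that integral. The helper names are hypothetical, and $C = \theta a/\sinh(a)$ is taken from the normalization above.

```python
import math

def joint_cdf_numeric(x, a, theta, n=10_000):
    # Midpoint-rule estimate of P(X <= x, T = +1), the integral of
    # C * exp(a t) * (1/2) for t from -1 to x
    c = theta * a / math.sinh(a)
    h = (x + 1.0) / n
    total = sum(c * math.exp(a * (-1.0 + (i + 0.5) * h)) for i in range(n))
    return 0.5 * total * h

def p_pos_given_x_le(x, a, theta):
    # Closed form: theta * (exp(a x) - exp(-a)) / ((x + 1) * sinh(a))
    return theta * (math.exp(a * x) - math.exp(-a)) / ((x + 1.0) * math.sinh(a))

a, theta, x = 1.0, 0.3, 0.2
numeric = joint_cdf_numeric(x, a, theta) / ((x + 1.0) / 2.0)
closed = p_pos_given_x_le(x, a, theta)
print(numeric, closed)

# At x = 1 the condition X <= 1 is always true, so the value should equal theta
print(p_pos_given_x_le(1.0, a, theta))
```

At $x = 1$ the conditioning event has probability 1, so the expression should reduce to $\theta$, which gives a quick sanity check on the formula.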

Does someone see what is wrong with my reasoning? Any suggestions for a function of the probability of the label given the feature with the restrictions of the marginal distribution of X and T?

Thanks in advance!