I am trying to create a simple simulation setup for classifiers on toy data. Each data point has a scalar feature $X$, uniformly distributed between $-1$ and $1$. Depending on the feature, the data point is given a label $T \in\{-1,1\}$. One of the things I want to specify in this scheme is the fraction of positive instances, $\theta$. Thus:
$P(X=x)= 1/2, x\in [-1,1]$
$P(T=+1) = \theta$ and $P(T=-1) = 1-\theta$
What I am interested in is the distribution of $T$ given $X$. This has to be analytically tractable. I define, somewhat arbitrarily, $P(T=+1|X=x) = Ce^{ax}$, with $a$ a tuning parameter and $C$ a constant chosen to make this a proper distribution. Filling this into Bayes' rule:
$$P(T=+1|X=x) = \frac{P(X=x|T=+1)P(T=+1)}{P(X=x)} $$ or
$$Ce^{ax} = 2\theta P(X=x|T=+1) $$ Integrating over $x$ gives:
$$\int_{-1}^1Ce^{ax} dx= 2\theta \int_{-1}^1P(X=x|T=+1) dx$$ $$\frac{C}{a} (e^a-e^{-a})= 2\theta $$
So I get my constant $C$: $$C = \frac{\theta a}{\sinh (a)}, \quad\text{so}\quad P(T=+1|X=x) = \frac{\theta a}{\sinh (a)}\,e^{ax}$$ This is not a good distribution because, for some values of $a$ (for example $a=1$), I obtain probabilities larger than $1$: $P(T=+1|X=0.9) \approx 2.09\,\theta$, which is invalid for fractions of positive examples larger than about $0.48$.
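To make this concrete, here is a quick numeric check in plain Python (the function name is mine): averaging the posterior over $X \sim U(-1,1)$ does recover the marginal $\theta$, yet the value at $x=0.9$ is not a valid probability.

```python
import math

def p_pos_given_x(x, theta, a):
    # Posterior as derived above: C * exp(a*x) with C = theta*a/sinh(a)
    c = theta * a / math.sinh(a)
    return c * math.exp(a * x)

theta, a = 0.6, 1.0

# Averaging over X ~ Uniform(-1, 1) (midpoint rule) should recover P(T=+1) = theta.
n = 100_000
avg = sum(p_pos_given_x(-1 + 2 * (k + 0.5) / n, theta, a) for k in range(n)) / n
print(avg)                             # close to 0.6

# ...but the pointwise value exceeds 1, so it cannot be a probability:
print(p_pos_given_x(0.9, theta, a))    # about 2.09 * theta = 1.256
```

So the normalization is consistent with the marginal of $T$, but nothing in the construction forces $Ce^{ax} \leq 1$.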
I have the feeling my reasoning is wrong because I mix probability mass functions with probability density functions.
If I work with cumulative probability functions, analogous reasoning yields:
$$P(T=+1|X\leq x) = \theta \frac{e^{ax}-e^{-a}}{(x+1)\sinh (a) }$$ This seems more reasonable, but I don't see how to get from it back to the distribution of $T$ given $X=x$.
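Numerically this quantity does behave like a probability, at least for moderate $a$ and $\theta$. The sketch below (function names mine) computes $P(T=+1\mid X\leq x)$ directly as $P(T=+1, X\leq x)/P(X\leq x)$ by integrating the posterior from before against the uniform density, without assuming any closed form:

```python
import math

def p_pos_given_x_le(x, theta, a, n=100_000):
    # P(T=+1, X <= x) / P(X <= x), integrating C*exp(a*t) * (1/2)
    # over [-1, x] with the midpoint rule.
    c = theta * a / math.sinh(a)
    width = x + 1.0
    joint = sum(c * math.exp(a * (-1 + (k + 0.5) * width / n)) * 0.5 * width / n
                for k in range(n))
    return joint / ((x + 1) / 2)

theta, a = 0.6, 1.0
print(p_pos_given_x_le(1.0, theta, a))   # recovers theta = 0.6
print(p_pos_given_x_le(0.9, theta, a))   # about 0.562, stays below 1
```

For these parameter values every $x$ gives a value in $[0,1]$, which matches my impression that the cumulative version looks better behaved, even though it averages away exactly the pointwise information I want.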
Does someone see what is wrong with my reasoning? Any suggestions for a form of the probability of the label given the feature that respects the marginal distributions of $X$ and $T$?
Thanks in advance!