I'm rebuilding the methods described in the paper "Strong statistical parity through fair synthetic data", and on page 3 it describes the following methodology:
We align both distributions by learning a linear function f which transforms a set of M = 100 equidistantly spaced quantiles of $P(Y | X, S = s_2)$ to match the same set of quantiles of $P(Y | X, S = s_1)$ .
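As a minimal sketch of what "learning a linear function f" over matched quantiles could look like, here is one hypothetical reading: compute M = 100 equidistant quantiles of each group's scores and fit f(q) = a*q + b by least squares. The data below is synthetic stand-in data, not the paper's; whether the authors mean a single affine fit or something else is exactly the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two conditional distributions P(Y | X, S = s_1)
# and P(Y | X, S = s_2) (hypothetical beta-distributed scores).
scores_s1 = rng.beta(2, 5, size=5000)
scores_s2 = rng.beta(4, 3, size=5000)

# M = 100 equidistantly spaced quantile levels, as in the paper.
M = 100
levels = np.linspace(0, 1, M)
q_s1 = np.quantile(scores_s1, levels)
q_s2 = np.quantile(scores_s2, levels)

# Fit a single linear (affine) map f(q) = a*q + b sending the s_2
# quantiles onto the s_1 quantiles by ordinary least squares.
a, b = np.polyfit(q_s2, q_s1, deg=1)
transformed = a * scores_s2 + b
```

Note that a single affine fit like this can push transformed scores below 0 or above 1, which is part of what prompted my question.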
The notation $P(Y | X, S = s_2)$ means the probability of the outcome Y given X, when the variable S is $s_2$.
My question is: how would I go about creating a linear function that can transform the quantiles like this?
The quantile distributions look like this:
The last 10 quantile distributions look like this for both groups:
I've tried interpolation, but that doesn't seem right: an interpolant is not a single linear function, and a fitted linear function maps part of the data to negative values, which are impossible for probabilities.
What am I misunderstanding here?
More info: here are the empirical conditional probability distributions for both groups:

I spoke with the authors, and what is referred to is a linear interpolation function. I used the Python library scipy.interpolate.interp1d and ended up with the following result on the right:
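For completeness, here is a sketch of how the interp1d approach can be set up, using synthetic stand-in scores (the beta distributions here are hypothetical, not the paper's data). The interpolant is piecewise linear between matched quantiles, and clamping out-of-range values to the end quantiles keeps the output inside the observed range of the $s_1$ scores, so no negative probabilities arise.

```python
import numpy as np
from scipy.interpolate import interp1d

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two groups' conditional scores.
scores_s1 = rng.beta(2, 5, size=5000)
scores_s2 = rng.beta(4, 3, size=5000)

# M = 100 equidistantly spaced quantiles of each group.
M = 100
levels = np.linspace(0, 1, M)
q_s1 = np.quantile(scores_s1, levels)
q_s2 = np.quantile(scores_s2, levels)

# Piecewise-linear map sending each s_2 quantile to the matching
# s_1 quantile; values outside the observed s_2 range are clamped
# to the end quantiles of s_1.
f = interp1d(q_s2, q_s1, kind="linear", bounds_error=False,
             fill_value=(q_s1[0], q_s1[-1]))

# Transformed s_2 scores now follow (approximately) the s_1 distribution.
aligned = f(scores_s2)
```

Because the output is bounded by the $s_1$ end quantiles, the aligned scores stay in [0, 1] whenever the original $s_1$ scores do.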