Finding a transformation function for a distribution that looks exponential


Suppose that we have two data sets, R and P. R is always greater than or equal to P. Both R and P can be negative or positive, so in all cases (R-P) is positive or zero. R and P are dollar amounts on a large scale, such as 10,000,000 or 500,000,000. I want to create a criterion that looks like this:

E = 1 - 1 / (1 + (R-P))

So E lies between 0 and 1, which is the scale I want. The problem is that (R-P) takes very large values, so nearly all outputs of this criterion are close to 1. I revised the function, and now I have this:

E = 1 - 1 / (1 + (R-P)/mean(R-P))

Dividing by mean(R-P) spreads the values better between 0 and 1. For reference, this is the distribution of R-P:

[image: distribution of R-P]
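The effect of the rescaling can be sketched in a few lines of Python. This is only an illustration: the function name `efficiency` and the sample values are hypothetical, not taken from the data in the question. Note that the formula is equivalent to x / (1 + x) with x = (R-P)/scale, so a sample whose difference equals the scale gets E = 0.5.

```python
from statistics import mean

def efficiency(r, p, scale):
    """E = 1 - 1 / (1 + (r - p) / scale), assuming r >= p and scale > 0."""
    d = r - p
    if d < 0:
        raise ValueError("expected r >= p")
    return 1.0 - 1.0 / (1.0 + d / scale)

# Hypothetical dollar amounts, chosen only for illustration.
R = [12_000_000, 500_000_000, 30_000_000, 8_000_000]
P = [10_000_000, 100_000_000, 30_000_000, -2_000_000]

diffs = [r - p for r, p in zip(R, P)]

# First version of the criterion: scale = 1, so any large difference gives E close to 1.
unscaled = [efficiency(r, p, 1.0) for r, p in zip(R, P)]

# Revised version: differences are divided by mean(R-P) first.
scaled = [efficiency(r, p, mean(diffs)) for r, p in zip(R, P)]

print(unscaled)
print(scaled)
```

With the mean as the scale, samples below the mean difference land below 0.5 and samples above it land above 0.5, while a zero difference still maps to exactly 0.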

After that, I want to classify the data using E. For example, in a binary classification problem, samples with E greater than 0.7 are in class 1 and samples with E lower than 0.7 are in class 2. To get a more robust criterion in the final version, I will first remove the outliers of (R-P), defined as the values satisfying:

abs(X-mean(X)) >=2*std(X)

(X here is (R-P))
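A minimal sketch of this outlier rule, assuming `std` means the sample standard deviation (the question does not say which one is intended). The helper name `remove_outliers` and the example values are hypothetical.

```python
from statistics import mean, stdev

def remove_outliers(values, k=2.0):
    """Drop every x with abs(x - mean) >= k * std; keep the rest."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1)
    if s == 0:
        # All values are identical, so there is nothing to flag.
        return list(values)
    return [x for x in values if abs(x - m) < k * s]

# Hypothetical differences (R-P) with one extreme value.
diffs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]

kept = remove_outliers(diffs)
print(kept)                      # the extreme value 1000 is dropped
print(mean(diffs), mean(kept))   # the mean before and after removal
```

The mean used as the scale in E changes a lot once the extreme value is gone, which is why the order of the two steps (filter first, then compute the mean) matters.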

After removing the outliers from R-P, I will calculate mean(R-P). What would you recommend as a better transformation for this problem? I have a theoretical concern about using the mean in this criterion, so I would appreciate comments on that.

Thanks.

P.S. There is one more requirement.

The criterion should behave as follows:

  • if P is constant and R increases, E should increase
  • if R is constant and P moves toward more negative values, E should increase
  • if P is constant and R decreases, E should decrease
  • if R is constant and P moves toward more positive values, E should decrease

  • the sample with the highest (R-P) should have the highest E (1 or near 1)

  • the sample with the lowest (R-P) should have the lowest E (0 or near 0)
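The required behavior can be checked numerically. The sketch below holds the scale fixed at a hypothetical value (`SCALE`), which is an assumption: if the scale is mean(R-P) computed from the same data, changing one sample also shifts the mean slightly. All numbers are made up for illustration.

```python
def efficiency(r, p, scale):
    """E = 1 - 1 / (1 + (r - p) / scale), assuming r >= p and scale > 0."""
    return 1.0 - 1.0 / (1.0 + (r - p) / scale)

SCALE = 1_000_000.0  # hypothetical fixed scale, standing in for mean(R-P)

base = efficiency(5_000_000, 1_000_000, SCALE)

r_up = efficiency(6_000_000, 1_000_000, SCALE)              # R increases
p_more_negative = efficiency(5_000_000, -1_000_000, SCALE)  # P moves toward negative
r_down = efficiency(4_000_000, 1_000_000, SCALE)            # R decreases
p_more_positive = efficiency(5_000_000, 2_000_000, SCALE)   # P moves toward positive

print(r_up > base, p_more_negative > base)    # both should be True
print(r_down < base, p_more_positive < base)  # both should be True

# The smallest possible difference, R - P = 0, maps to E = 0.
print(efficiency(3_000_000, 3_000_000, SCALE))
```

Because E depends on R and P only through (R-P) and is strictly increasing in it, the sample with the largest difference always gets the largest E and the sample with the smallest difference gets the smallest E.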