How to correct the output when using LPM (Linear Probability Model)?

I have a .csv file with 10 inputs and 1 output. The "problem" is that the output is binary ("0" or "1"). I am using multiple linear regression to predict the output. From searching, I have found that this is called an LPM (Linear Probability Model). The issue is that y_predicted does not always fall inside the [0, 1] range; I can get values such as -0.3 or +1.2. I know that logistic regression is one solution, and I have tested it and it works, but that is not what I want. What I am asking is how to use the LPM and make it mathematically compliant. As it is now, outputs like -0.3 or 1.2 do not seem mathematically correct to me, since probabilities must fall in the range [0, 1]. So the outputs I get from the LPM need some correction. What do you suggest? How should I proceed?

There is 1 best solution below

Mathematical Problem:
When the output is computed from the variables by linear or continuous calculations and happens to equal $0$ or $1$ at some point, a small change in a variable (e.g. $x_1$ changes by $\delta x_1$) makes the continuous output $y$ change by a small amount $\delta y$. The output is then no longer $0$ or $1$ but $0 \pm \delta y$ or $1 \pm \delta y$. There is no way around this.
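To make this concrete, here is a minimal sketch in Python with NumPy. The toy data (one input instead of ten) and all variable names are assumptions made for illustration, not taken from the question. It fits an ordinary least-squares line to a binary output and shows that the fitted values leave $[0, 1]$, and that a small $\delta x$ produces a proportionally small $\delta y$.

```python
import numpy as np

# Toy data (made up for illustration): one input, binary output.
x = np.arange(10, dtype=float)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)

# Ordinary least squares with an intercept column: y ~ b0 + b1 * x.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Fitted values of the linear probability model.
y_pred = X @ coef
print(y_pred.min(), y_pred.max())  # below 0 and above 1 on this data

# A small change in x shifts the prediction by slope * delta_x,
# so the output moves continuously instead of staying at 0 or 1.
delta_x = 0.01
delta_y = coef[1] * delta_x
print(delta_y)
```

The same effect should appear with ten inputs; only the shape of `X` changes.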

Mathematical Solution:
You have to use non-linear or discrete calculations. In general, there is no other way.
There are many ways to obtain a binary output with non-linear or discrete calculations.
One way is to use an SVM, a decision boundary, or whatever method you prefer to calculate a continuous intermediate output $Y$.
Then calculate the final output as $y=\operatorname{sign}(Y)$ or $y=\operatorname{sign}(Y-\text{Bias})$, etc., to get $-1$ or $+1$, which can easily be converted to $0$ or $1$.
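The thresholding step can be sketched as follows, assuming the intermediate output $Y$ comes from the least-squares linear model itself. The helper names `fit_linear` and `predict_binary`, the default bias of `0.5`, and the toy data are all hypothetical choices for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_binary(X, coef, bias=0.5):
    """Continuous score Y, then sign(Y - bias) mapped from {-1, +1} to {0, 1}."""
    A = np.column_stack([np.ones(len(X)), X])
    Y = A @ coef                         # continuous intermediate output
    s = np.where(Y - bias >= 0, 1, -1)   # sign, with ties sent to +1
    return (s + 1) // 2                  # -1 -> 0, +1 -> 1

# Toy data (made up): a single input column and a binary output.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

coef = fit_linear(X, y)
labels = predict_binary(X, coef)
print(labels)
```

The bias is a free choice here; `0.5` is the midpoint of the two target values, and another value may suit the data better.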

Another way is to map the intermediate output $Y$ to a number between $-1$ and $1$ (e.g. with exponential and logarithmic functions, inverse trigonometric functions, or hyperbolic functions), then round the result to the nearer of $-1$ and $+1$, and finally convert it to $0$ or $1$ to get the final output $y$.
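A minimal sketch of this squash-then-round idea, assuming $\tanh$ (a hyperbolic function) or a scaled $\arctan$ (an inverse trigonometric function) as the squashing map. The function name `squash_and_round`, the `center` and `scale` parameters, and the sample values of $Y$ are hypothetical.

```python
import numpy as np

def squash_and_round(Y, center=0.5, scale=1.0, use_arctan=False):
    """Map Y into (-1, 1), snap to -1 or +1, then convert to 0 or 1."""
    shifted = scale * (np.asarray(Y, dtype=float) - center)
    if use_arctan:
        Z = (2.0 / np.pi) * np.arctan(shifted)  # inverse trigonometric map
    else:
        Z = np.tanh(shifted)                    # hyperbolic map
    s = np.where(Z >= 0, 1, -1)                 # nearer of -1 and +1
    return (s + 1) // 2                         # -1 -> 0, +1 -> 1

# Example continuous outputs, including out-of-range ones like those in the question.
Y = np.array([-0.3, 0.2, 0.7, 1.2])
print(squash_and_round(Y))
print(squash_and_round(Y, use_arctan=True))
```

Both maps are monotone and pass through zero at `center`, so they should give the same labels; they differ only in the intermediate value $Z$.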

In every case, the final calculation must use a non-linear or discretizing step to produce a discrete output.

[[ You might get more targeted answers if you include the inputs and outputs in your post. ]]