How to estimate the coefficients of a logistic model


Consider the model $\text{logit}(p)=a+bx$. I would like to get an analytic formula for $a$ and $b$, as in linear regression, where we have closed-form formulas for the estimates of $a$ and $b$.

I tried using the MLE to estimate them, but it is too complicated for me.

2 Answers

Accepted answer

To estimate the values of $a$ and $b$ in your model

$$\text{logit}(p)=a+bx,$$

you can simplify by treating $a$ as the coefficient of an extra feature $x_0$ whose value is always $1$, and switch to matrix notation:

$$ Z = \theta^T x $$

In logistic regression the sigmoid function turns $Z$ into a probability:

$$ h_\theta (x) = \dfrac{1}{1 + e^{-\theta^T x}} $$

Now we need to define a cost function $J(\theta)$; the MSE (mean squared error) is a common choice (for logistic regression the cross-entropy is more standard, but the derivation below uses the MSE):

$$J(\theta) = \dfrac {1}{2m} \Big[\displaystyle (h_\theta (x) - y)^T (h_\theta (x) - y) \Big]$$
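
As a minimal sketch of these two pieces in NumPy (the helper names `sigmoid` and `cost_mse` are my own, not from the derivation):

```python
import numpy as np

def sigmoid(z):
    # h_theta(x) = 1 / (1 + exp(-z)), applied element-wise
    return 1.0 / (1.0 + np.exp(-z))

def cost_mse(theta, X, y):
    # J(theta) = (1/2m) (h - y)^T (h - y)
    m = len(y)
    h = sigmoid(X @ theta)
    return (h - y) @ (h - y) / (2 * m)
```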

The gradient-descent update of $\theta$ is defined by:

$$ \theta = \theta - \gamma \dfrac{dJ(\theta)}{d\theta} $$

To compute the gradient $\dfrac{dJ(\theta)}{d\theta}$, use the chain rule:

$$ \dfrac{dJ(\theta)}{d\theta} = \dfrac{dJ(\theta)}{dh_\theta (x)}\,\dfrac{dh_\theta (x)}{dZ}\,\dfrac{dZ}{d\theta} $$

The derivative of the MSE with respect to $h_\theta(x)$ is:

$$\dfrac{dJ(\theta)}{dh_\theta (x)} = \dfrac{1}{m}(h_\theta (x) - y)$$

The derivative of the sigmoid function is:

$$ \dfrac{dh_\theta (x)}{dZ} = h_\theta (x)\odot(1-h_\theta (x)) $$

And since $\dfrac{dZ}{d\theta} = x$, the final update rule is:

$$ \theta = \theta - \dfrac{\gamma}{m}\, x^T \cdot \Big[(h_\theta (x) - y) \odot h_\theta (x)\odot(1-h_\theta (x))\Big] $$
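
Putting the pieces together, here is a minimal gradient-descent sketch in NumPy (the function name `fit_logistic_mse` is hypothetical, and the step size `gamma` and iteration count will need tuning for real data):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_mse(x, y, gamma=0.5, n_iter=10000):
    # Prepend x_0 = 1 so that theta = [a, b] absorbs the intercept.
    m = len(x)
    X = np.column_stack([np.ones(m), x])
    theta = np.zeros(2)
    for _ in range(n_iter):
        h = sigmoid(X @ theta)                   # h_theta(x)
        # dJ/dtheta = (1/m) X^T [(h - y) * h * (1 - h)]
        grad = X.T @ ((h - y) * h * (1.0 - h)) / m
        theta -= gamma * grad                    # theta <- theta - gamma dJ/dtheta
    return theta                                 # [a, b]
```

One caveat: with the sigmoid, the MSE cost is not convex in $\theta$, which is one reason the cross-entropy cost is usually preferred in practice.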

Second answer

Start from the definition $$\text{logit}(p)=\log\left(\frac{p}{1-p}\right)$$ So, if you have data $(x_i,p_i)$, compute $$y_i=\log\left(\frac{p_i}{1-p_i}\right)$$ and the regression is just $$y=a+bx$$ Standard linear regression will then give you estimates of the parameters $a,b$.
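
A minimal sketch of this first step, assuming NumPy and that every $p_i$ lies strictly between $0$ and $1$ (the name `fit_logit_linear` is mine):

```python
import numpy as np

def fit_logit_linear(x, p):
    # Transform p_i to y_i = log(p_i / (1 - p_i)), then ordinary least squares.
    y = np.log(p / (1.0 - p))
    b, a = np.polyfit(x, y, 1)   # polyfit returns [slope, intercept]
    return a, b
```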

But if $p$ is the measured value, you need to go to nonlinear regression for the model $$p=\frac{e^{a+b x}}{1+e^{a+b x}}$$ because what is measured is $p_i$ and not $\log\left(\frac {p_i}{1-p_i}\right)$.

If you have a nonlinear regression program, the problem is simple, since the first step already gave you at least reasonable estimates of the parameters $a,b$.

If you do not have this, start from the definition. You want to minimize $$SSQ=\sum _{i=1}^n \left(\frac{e^{a+b x_i}}{1+e^{a+b x_i}}-p_i\right)^2$$ Then $$\frac {\partial\, SSQ}{\partial a}=\sum _{i=1}^n \frac{1}{1+\cosh (a+b x_i)}\left(\frac{e^{a+b x_i}}{1+e^{a+b x_i}}-p_i\right)$$ $$\frac {\partial\, SSQ}{\partial b}=\sum _{i=1}^n \frac{x_i}{1+\cosh (a+b x_i)}\left(\frac{e^{a+b x_i}}{1+e^{a+b x_i}}-p_i\right)$$ and you could use the Newton-Raphson method to solve $$\frac {\partial\, SSQ}{\partial a}=\frac {\partial\, SSQ}{\partial b}=0$$ In such a case, to make life simpler, I would suggest numerical derivatives.
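
Rather than hand-coding Newton-Raphson, a library least-squares solver minimizes the same $SSQ$; here is a sketch assuming SciPy is available (`fit_logistic_ssq` is a hypothetical name), seeded with the estimates from the first step:

```python
import numpy as np
from scipy.optimize import least_squares

def fit_logistic_ssq(x, p, a0, b0):
    # Residuals e^(a+bx)/(1+e^(a+bx)) - p_i; least_squares minimizes their
    # sum of squares, i.e. the SSQ above (up to a constant factor of 1/2).
    def residuals(theta):
        a, b = theta
        q = np.exp(a + b * x)
        return q / (1.0 + q) - p
    return least_squares(residuals, x0=[a0, b0]).x   # refined [a, b]
```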

To give an example, I generated eleven data points according to $$p=\frac{1}{10} \left\lfloor \frac{10\, e^{0.234+0.456 x}}{1+e^{0.234+0.456 x}}\right\rfloor$$ in order to have some significant noise in the data (the $x_i$ are integers between $-5$ and $+5$).
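
For reference, these simulated points can be regenerated with a couple of NumPy lines (a direct transcription of the formula above):

```python
import numpy as np

# x_i are the integers -5, ..., 5; p is floored to one decimal place
# to inject noise, exactly as in the formula above.
x = np.arange(-5, 6, dtype=float)
s = np.exp(0.234 + 0.456 * x) / (1.0 + np.exp(0.234 + 0.456 * x))
p = np.floor(10.0 * s) / 10.0
```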

The first step leads to $y=-0.0737209+0.443856 x$; starting with these values as initial guesses for the nonlinear regression led to $$p=\frac{e^{-0.0343898+0.438486 x}}{1+e^{-0.0343898+0.438486 x}}$$ As you can see, the parameters changed significantly going from the first step to the second.