Using linear regression with binary and categorical variables.


I have a dataset with a binary variable "Religious", with $0$ meaning "no" and $1$ meaning "yes", and a categorical variable "Contraceptive" taking the values $1$, $2$ and $3$, where $1$ = no use, $2$ = short-term use and $3$ = long-term use. The original dataset looks something like this:

\begin{align} &\text{Religious}\:\:\: \text{Contraceptive}\\ &1 \hspace{42pt} 1 \\ &1 \hspace{42pt} 1 \\ &0 \hspace{42pt} 2 \\ &0 \hspace{42pt} 3 \\ &\vdots \end{align}

Then I created three new columns and recoded the Contraceptive column as:

\begin{align} &\text{no use} \:\:\: \text{short} \:\:\: \text{long}\\ &1 \hspace{28pt} 0 \hspace{28pt} 0 \\ &1 \hspace{28pt} 0 \hspace{28pt} 0 \\ &0 \hspace{28pt} 1 \hspace{28pt} 0\\ &0 \hspace{28pt} 0 \hspace{28pt} 1\\ &\vdots \end{align}

Is it now possible to use a simple linear regression with these variables? How does one do that? I'm using R.

Best answer

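No. Your dependent variable is binary, i.e., it takes values in $\{0,1\}$, so a more suitable model is a logistic regression, i.e., $$ \widehat{ \mathbb{P}( Y_i = 1) } = \frac{1} { 1 + \exp\{ -(\beta_0 + \beta_1\text{short} + \beta_2 \text{long} ) \}}. $$ Only two of the three dummy columns enter the model: "no use" is the reference category and is absorbed by the intercept $\beta_0$, since including all three alongside an intercept would make them perfectly collinear. In R: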

glm( formula = Religious ~ short + long, family = binomial(link = "logit") )
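As a minimal sketch of the same idea, you can skip the hand-made dummy columns and let R build them from a factor. The data frame `df` and its simulated values below are assumptions made only so the example runs; they are not the asker's data.

```r
# Simulated data for illustration only; these values are NOT from the question.
set.seed(1)
n <- 200
df <- data.frame(
  Religious     = rbinom(n, size = 1, prob = 0.5),
  Contraceptive = sample(1:3, size = n, replace = TRUE)
)

# Treat the 1/2/3 codes as a factor. R then creates the dummy columns itself,
# using the first level ("no use") as the reference category.
df$Contraceptive <- factor(df$Contraceptive,
                           levels = 1:3,
                           labels = c("no use", "short", "long"))

# Logistic regression of Religious on the contraceptive category.
fit <- glm(Religious ~ Contraceptive, data = df,
           family = binomial(link = "logit"))

summary(fit)                           # coefficients are on the log-odds scale
exp(coef(fit))                         # odds ratios relative to "no use"
head(predict(fit, type = "response"))  # fitted P(Religious = 1)
```

With the default treatment contrasts, the two slope coefficients should correspond to $\beta_1$ (short) and $\beta_2$ (long) in the formula above, each measured against the "no use" baseline.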