I want to do regression on a dataset. It has one continuous dependent variable that I want to predict. It has many categorical and some continuous predictors. It only has a few rows.
A simplified example might look like this (Dx means a discrete predictor, Cx means a continuous predictor, and IV, despite the label, is the dependent variable I'm trying to predict):
+----+-------+---------+-----+---------+---------+-----+-----+
| IV | D1    | D2      | D3  | D4      | D5      | C1  | C2  |
+----+-------+---------+-----+---------+---------+-----+-----+
| 5  | 1,2,3 | 1,2     | 1   | 1       | 1       | 2   | 3.5 |
| 4  | 2,3,4 | 1,3     | 2   | 2,3     | 1,2     | 2.5 | -2  |
| -2 | 1,3,5 | 2,3     | 3,4 | 2       | 1,3     | 2.2 | 50  |
| 4  | 4,6,7 | 4,5,6,7 | 4,5 | 3,4,5,6 | 1,2,3,4 | 2.2 | 50  |
+----+-------+---------+-----+---------+---------+-----+-----+
You can see that each categorical variable can take multiple discrete values in a single row, so I can dummy-code it like this (one 0/1 column per level) to simplify the data:
+----+------+------+------+------+------+------+------+-----+
| IV | D1-1 | D1-2 | D1-3 | D1-4 | D1-5 | D1-6 | D1-7 | ... |
+----+------+------+------+------+------+------+------+-----+
| 5  | 1    | 1    | 1    | 0    | 0    | 0    | 0    | ... |
| 4  | 0    | 1    | 1    | 1    | 0    | 0    | 0    | ... |
| -2 | 1    | 0    | 1    | 0    | 1    | 0    | 0    | ... |
| 4  | 0    | 0    | 0    | 1    | 0    | 1    | 1    | ... |
+----+------+------+------+------+------+------+------+-----+
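The dummy coding above can be sketched in Python roughly as follows. This is only an illustration on the D1 column; the helper name `multi_hot` and the set-per-row representation are my own assumptions, not part of the dataset.

```python
# D1 column from the example table: each row holds a set of category codes.
d1_rows = [{1, 2, 3}, {2, 3, 4}, {1, 3, 5}, {4, 6, 7}]


def multi_hot(rows):
    """Return (levels, matrix); matrix[i][j] is 1 if rows[i] contains levels[j]."""
    # Collect every level that appears anywhere in the column, in sorted order.
    levels = sorted(set().union(*rows))
    # One 0/1 indicator per (row, level) pair.
    matrix = [[1 if level in row else 0 for level in levels] for row in rows]
    return levels, matrix


levels, d1_dummies = multi_hot(d1_rows)
for row in d1_dummies:
    print(row)
```

Each inner list corresponds to one row of the D1-1 ... D1-7 columns; the other discrete predictors would be encoded the same way and concatenated.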
Now the key problem: I can't fit an ordinary multiple linear regression, because the number of rows is smaller than the number of columns (so the least-squares coefficients are not uniquely determined), and I don't want to remove any of the predictors.
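To spell out why fewer rows than columns is a problem for ordinary least squares, here is a short sketch. The notation is my own assumption: X is the n x p matrix of dummy-coded and continuous predictors, y is the vector of responses, and n < p.

```latex
% X: n x p design matrix (n rows, p predictor columns); y: response vector.
% Ordinary least squares solves the normal equations:
X^\top X \,\hat\beta = X^\top y
% X^\top X is p x p, but its rank is bounded by the number of rows:
\operatorname{rank}(X^\top X) = \operatorname{rank}(X) \le \min(n, p) = n < p
% so X^\top X is singular and infinitely many \hat\beta fit the data equally well.
```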
One way (Method A) I thought could work is to fit a separate simple linear regression for each predictor and combine the resulting models into one. However, I don't really know how to combine them. One possibility is to weight each predictor by its share of the squared correlation, r^2 / (sum of all r^2), but I don't know whether there's a better way.
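To make Method A concrete, here is a minimal sketch of what I have in mind, using three columns from the example (C1, C2 and the dummy column D1-1). The function names `fit_simple`, `fit_method_a` and `predict_method_a` are hypothetical, and the handling of constant columns (weight 0) is an assumption on my part. I'm not claiming this is statistically sound; it only shows the weighting scheme.

```python
def fit_simple(x, y):
    """Simple least-squares fit y ~ a + b*x; returns (intercept, slope, r^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # A constant predictor (or response) carries no information: predict the mean.
    if len(set(x)) < 2 or len(set(y)) < 2:
        return my, 0.0, 0.0
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy ** 2 / (sxx * syy)


def fit_method_a(columns, y):
    """Fit one simple regression per predictor; weight each by r^2 / sum of r^2."""
    fits = {name: fit_simple(x, y) for name, x in columns.items()}
    total = sum(r2 for _, _, r2 in fits.values())
    # Assumption: fall back to equal weights if every r^2 is zero.
    weights = {
        name: (r2 / total if total > 0 else 1 / len(fits))
        for name, (_, _, r2) in fits.items()
    }
    return fits, weights


def predict_method_a(fits, weights, row):
    """Weighted average of the individual one-predictor predictions."""
    return sum(weights[name] * (a + b * row[name]) for name, (a, b, _) in fits.items())


y = [5, 4, -2, 4]  # the IV column
columns = {
    "C1": [2, 2.5, 2.2, 2.2],
    "C2": [3.5, -2, 50, 50],
    "D1-1": [1, 0, 1, 0],
}
fits, weights = fit_method_a(columns, y)
print(weights)
print(predict_method_a(fits, weights, {"C1": 2.2, "C2": 50, "D1-1": 0}))
```

Because the weights sum to 1, the combined prediction is a weighted average of the individual predictions, so it shrinks toward the mean of the response rather than adding the effects together.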
Another way (Method B), which I feel is better, is to use a separate simple linear regression for each continuous variable and just a mean/variance analysis (like ANOVA) for each discrete variable, then combine the models using some transform of the standard deviation, f(sigma), as the weight for each discrete variable and f(r^2) as the weight for each continuous variable.
Therefore, my questions are:
- If I use Method B, what should f(sigma) and f(r^2) be? (How should I combine the individual models?)
- Is there a better way?