Regression with many discrete and continuous predictors and few rows

121 Views Asked by Bumbble Comm At 01 Apr 2026 - 3:12

I want to do regression on a dataset. It has one continuous dependent variable that I want to predict. It has many categorical and some continuous predictors. It only has a few rows.

A simplified example might look like this (Dx means discrete predictor, Cx means continuous predictor, and IV is the dependent variable I'm trying to predict):

+----+-------+---------+-----+---------+---------+-----+-----+
| IV |  D1   |   D2    | D3  |   D4    |   D5    | C1  | C2  |
+----+-------+---------+-----+---------+---------+-----+-----+
|  5 | 1,2,3 | 1,2     | 1   | 1       | 1       | 2   | 3.5 |
|  4 | 2,3,4 | 1,3     | 2   | 2,3     | 1,2     | 2.5 | -2  |
| -2 | 1,3,5 | 2,3     | 3,4 | 2       | 1,3     | 2.2 | 50  |
|  4 | 4,6,7 | 4,5,6,7 | 4,5 | 3,4,5,6 | 1,2,3,4 | 2.2 | 50  |
+----+-------+---------+-----+---------+---------+-----+-----+

You can see that each categorical variable can take multiple discrete values in a single row, so I can dummy code them like this to simplify the data:

+----+------+------+------+------+------+------+------+-----+
| IV | D1-1 | D1-2 | D1-3 | D1-4 | D1-5 | D1-6 | D1-7 | ... |
+----+------+------+------+------+------+------+------+-----+
|  5 |    1 |    1 |    1 |    0 |    0 |    0 |    0 | ... |
|  4 |    0 |    1 |    1 |    1 |    0 |    0 |    0 | ... |
| -2 |    1 |    0 |    1 |    0 |    1 |    0 |    0 | ... |
|  4 |    0 |    0 |    0 |    1 |    0 |    1 |    1 | ... |
+----+------+------+------+------+------+------+------+-----+
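
For illustration, the dummy coding above could be produced with a few lines of Python. This is only a minimal sketch: the helper name `multi_hot` and the assumption that each cell arrives as a comma-separated string are mine, not part of the dataset.

```python
def multi_hot(cells, prefix):
    """Expand a multi-valued categorical column into 0/1 indicator columns."""
    # Parse each cell such as "1,2,3" into a set of integer levels.
    parsed = [{int(v) for v in cell.split(",")} for cell in cells]
    # One output column per level seen in any row, in sorted order.
    levels = sorted(set().union(*parsed))
    names = [f"{prefix}-{level}" for level in levels]
    matrix = [[1 if level in row else 0 for level in levels] for row in parsed]
    return names, matrix


# Column D1 from the first table (assumed to be stored as strings).
d1 = ["1,2,3", "2,3,4", "1,3,5", "4,6,7"]
names, matrix = multi_hot(d1, "D1")

print(names)
for row in matrix:
    print(row)
```

Applying the same helper to D2 through D5 and concatenating the results gives the full dummy-coded table.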

Now the key is: I can't do multiple linear regression because the number of rows is less than the number of columns, and I don't want to remove any of the predictors.

One way (Method A) I thought could work is to fit an individual linear regression for each predictor and combine the linear models into one. However, I don't really know how to combine them. One possibility is to weight each predictor by its share of the total coefficient of determination (r^2 / sum of all r^2), but I don't know whether there's a better way.
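
To make Method A concrete, here is a rough sketch of the r^2-share weighting using the example rows. The function names `fit_simple` and `combine` are hypothetical, only three of the predictor columns are used, and with four rows the r^2 values are very noisy, so this shows the mechanics only, not a recommended estimator.

```python
def fit_simple(x, y):
    """Ordinary least squares for one predictor: (intercept, slope, r^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        # A constant column carries no information: predict the mean of y.
        return my, 0.0, 0.0
    slope = sxy / sxx
    return my - slope * mx, slope, sxy ** 2 / (sxx * syy)


def combine(predictors, y):
    """Weight each one-predictor model by r^2 / (sum of all r^2)."""
    fits = {name: fit_simple(x, y) for name, x in predictors.items()}
    total = sum(r2 for _, _, r2 in fits.values())
    if total == 0:
        raise ValueError("no predictor explains any variance")
    weights = {name: fit[2] / total for name, fit in fits.items()}

    def predict(row):
        # Weighted average of the individual models' predictions.
        return sum(weights[k] * (fits[k][0] + fits[k][1] * row[k]) for k in fits)

    return weights, predict


# IV column and three predictor columns taken from the example tables.
y = [5, 4, -2, 4]
predictors = {
    "C1": [2, 2.5, 2.2, 2.2],
    "C2": [3.5, -2, 50, 50],
    "D1-1": [1, 0, 1, 0],
}
weights, predict = combine(predictors, y)
train_preds = [
    predict({k: v[i] for k, v in predictors.items()}) for i in range(len(y))
]
print(weights)
print(train_preds)
```

Because the weights sum to 1, the combined prediction is a weighted average of the single-predictor fits, so it ignores any correlation between predictors.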

Another way (Method B), which I feel is better, is to use individual linear regression for the continuous variables and just a mean/variance analysis (like ANOVA) for the discrete variables, then combine the models using some transform of the standard deviation, f(sigma), as the weight for each discrete variable and f(r^2) as the weight for each continuous variable.

Therefore, my questions are:

If I use method B, what should be the f(sigma) and f(r^2)? (How should I combine the individual models?)
Is there a better way?
