I'm trying to analyze my sleep using regression analysis. I rate each night (the dependent variable) and try to explain this rating with, for example, my sleep duration and the deviation of each night's bedtime from a moving average of bedtimes over n periods (the independent variables).
The maximum rating I can give is 5. Logically, a longer sleep duration does not necessarily result in a higher sleep rating (=> better sleep), so I assume there is a roughly bell-shaped curve whose peak is the ideal sleep duration, everything else held equal.
My understanding of a linear regression is that "more is better" and that a polynomial regression gives me a formula that maps this bell shaped curve more accurately. Is that correct?
If so, I'm guessing that I need to run a polynomial regression. Since I've got more than one independent variable, I need a multivariate polynomial regression. My understanding of this is that a polynomial regression takes only one independent variable. Is my understanding correct? What's the best way to go about this and how would others approach this problem? How do I restrict the dependent variable to a value [0..5] in the resulting regression formula?
Per request, here's a sample set of my data:
| Rating | Sleep duration | Bed time dev. | Wake up time dev. |
|---|---|---|---|
| 3,5 | 7,033333333 | 0 | 0 |
| 2 | 5,533333333 | -0,021527778 | -0,516666667 |
| 1,5 | 5 | -0,044907407 | -1,077777778 |
| 1 | 6,9 | 0,016319444 | 0,391666667 |
| 2,5 | 8,966666667 | 0,843055556 | 1,033333333 |
| 3 | 7,516666667 | -0,057291667 | 2,625 |
| 2,5 | 8,033333333 | -0,062797619 | 1,921428571 |
Since a subjective rating usually has an ordinal rather than a cardinal meaning (a higher score is better, but the difference $3-2$ is not necessarily equal to the difference $2-1$), I would recommend using an ordinal logit model instead of linear regression. With either model you can use, say, a second-degree polynomial in your sleep duration as an explanatory variable, along with bed time deviation (and probably also autoregressive terms, i.e., how much you slept the night before, to control for "tiredness").
In general, you are right, you need at least a second order polynomial in your sleep time in order to capture a nonlinear effect on your rating. With only the linear effect, sleep time will be estimated to have either a strictly increasing effect, strictly decreasing effect, or no effect at all.
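To make this concrete, here is a short derivation for the plain linear-regression version of the model, $\text{Rating}_t=\alpha+\beta_1 \text{Duration}_t+\beta_2 \text{Duration}_t^2+\dots$ (the symbols are introduced here for illustration; in an ordinal logit the same algebra applies to the linear index inside the link function rather than to the rating itself). Holding everything else fixed, the effect of duration is

$$\frac{\partial\,\text{Rating}_t}{\partial\,\text{Duration}_t}=\beta_1+2\beta_2\,\text{Duration}_t,$$

which is zero at

$$\text{Duration}^{*}=-\frac{\beta_1}{2\beta_2}.$$

This turning point is a maximum, i.e. an "ideal" sleep duration, only if $\beta_2<0$ (and it is positive only if $\beta_1>0$ as well). With $\beta_2=0$ the derivative is the constant $\beta_1$, which is exactly the strictly increasing, strictly decreasing, or flat effect described above. Note that a quadratic gives an inverted U rather than a true bell shape, which is usually close enough near the peak.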
Try something like $$\text{Rating}_t=f(\alpha+\beta_1 \text{Duration}_t+\beta_2 \text{Duration}_t^2+\beta_3 \text{Duration}_{t-1}+\gamma \mathbf{X}),$$ where $f$ is the ordinal logit function (appropriate statistics programs have this pre-programmed), $\beta_3$ is the effect of the previous night's sleep duration and $\mathbf{X}$ is a vector of other explanatory variables. In particular, you can also try including Rating$_{t-1}$ (but then exclude Duration$_{t-1}$, since the two are likely to be strongly correlated). I would expect sleep to be better after a night with bad sleep.
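As a rough sketch of what the ordinal logit does with such an index, the following standard-library Python snippet turns the linear part of the model into a probability for each rating level using the usual proportional-odds form, $P(\text{Rating}\le k)=\operatorname{logistic}(c_k-\eta)$. It does not estimate anything: the coefficients, the cutpoints $c_k$, the integer rating levels 0 to 5, and all function names are hypothetical values picked for illustration (half-point ratings would simply add more levels and cutpoints). In practice you would let a statistics package estimate them from your data.

```python
import math

# All numbers below are hypothetical, chosen only for illustration.
BETA_DURATION = 3.0        # beta_1
BETA_DURATION_SQ = -0.2    # beta_2 (negative, so the index peaks at 7.5 hours)
BETA_PREV_DURATION = -0.1  # beta_3 (less sleep last night -> higher index)
# Increasing thresholds; 5 cutpoints separate 6 rating levels.
# In an ordinal logit the intercept is absorbed into these cutpoints.
CUTPOINTS = [7.0, 8.0, 9.0, 10.0, 11.0]
RATING_LEVELS = [0, 1, 2, 3, 4, 5]


def logistic(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_index(duration, prev_duration):
    """Linear part of the model: quadratic in duration plus lagged duration."""
    return (BETA_DURATION * duration
            + BETA_DURATION_SQ * duration ** 2
            + BETA_PREV_DURATION * prev_duration)


def rating_probabilities(duration, prev_duration):
    """Return P(Rating = level) for each level in RATING_LEVELS."""
    eta = linear_index(duration, prev_duration)
    # Cumulative probabilities P(Rating <= level); the last one is 1.
    cumulative = [logistic(c - eta) for c in CUTPOINTS] + [1.0]
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


def expected_rating(duration, prev_duration):
    """Probability-weighted average rating (treats levels as numbers)."""
    probs = rating_probabilities(duration, prev_duration)
    return sum(level * p for level, p in zip(RATING_LEVELS, probs))


if __name__ == "__main__":
    for hours in (5.0, 7.5, 10.0):
        probs = rating_probabilities(hours, prev_duration=7.0)
        print(hours, [round(p, 3) for p in probs],
              round(expected_rating(hours, 7.0), 2))
```

Because the model assigns probabilities only to the rating levels that exist, its predictions cannot leave the rating scale, which addresses the question of restricting the dependent variable to [0..5]. The negative coefficient on the squared term is what makes both very short and very long nights shift probability towards lower ratings.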