Least squares regression with two predictor variables (exponential functions of time)


Question cropped from a textbook. (Apologies for the link: I don't have enough rep to post the actual image.) [Now pasted below. Ed.]

[Textbook image: fit the model $y = Ue^{-0.03t} + Ve^{-0.05t}$ by least squares to the data $(t, y) = (5, 8.8),\ (6, 8.6),\ (7, 8.2),\ (8, 7.9)$.]

I've come across a question in a textbook (linked above) requiring a least-squares fitted model of a sum of exponential terms. I have some experience using the least-squares criterion with single terms, i.e., fitting a curve $y = Ae^x$ to some data, but none with sums of terms. Would it suffice to fit a separate curve to each term?

I'm not at all confident in my approach here; thanks for any help.

BEST ANSWER

You have received excellent advice in the Comments and an earlier Answer. Maybe it will help you to understand what they are suggesting if you see it done in a more familiar format.

I have chosen to use Minitab software because it has clearly labeled output. To start I put amount (y) and time (t) from the data table (in your link) into two columns of the Minitab worksheet. At the beginning of the Minitab session, I print these out for reference.

 MTB > print c1 c2

 Data Display 

 Row    y  t
   1  8.8  5
   2  8.6  6
   3  8.2  7
   4  7.9  8

 MTB > name c3 'x1' c4 'x2'
 MTB > let  'x1' = exp(-.03*'t')
 MTB > let  'x2' = exp(-.05*'t')
 MTB > print c1-c4

 Data Display 

 Row    y  t        x1        x2
   1  8.8  5  0.860708  0.778801
   2  8.6  6  0.835270  0.740818
   3  8.2  7  0.810584  0.704688
   4  7.9  8  0.786628  0.670320

Then I do linear regression of y on the two columns x1 and x2, defined as suggested in the helpful Comment by @Jean-ClaudeArbaut. I do regression without a constant term ('forced through the origin') because your model has no constant term.

 MTB > Regress 'y' 2 'x1' 'x2';
 SUBC>   NoConstant;            # force through origin
 SUBC>   Brief 2.               # amount of detail in printout

Minitab results agree with those in the excellent Answer by @LorenLugosch. (+1) [If there is a lot of regression in your future, please make it your goal to understand that approach.]

 Regression Analysis: y versus x1, x2 

 The regression equation is
 y = 6.66 x1 + 4.00 x2          # more accurate coef's in table


 Predictor    Coef  SE Coef     T      P
 Noconstant
 x1          6.657    1.763  3.78  0.063
 x2          3.999    2.003  2.00  0.184

 S = 0.0647461                 # Square of this is MS(Resid Err)

 Analysis of Variance

 Source          DF      SS      MS         F      P
 Regression       2  281.04  140.52  33520.73  0.000
 Residual Error   2    0.01    0.00
 Total            4  281.05


 Source  DF  Seq SS
 x1       1  281.02
 x2       1    0.02
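For readers without Minitab, the same no-intercept regression can be sketched in NumPy (not part of the original answer; the data values are copied from the worksheet printed above):

```python
import numpy as np

# Data from the question: amount y measured at times t
# (values copied from the Minitab worksheet above).
t = np.array([5, 6, 7, 8], dtype=float)
y = np.array([8.8, 8.6, 8.2, 7.9])

# Predictor columns as in the Minitab session:
# x1 = exp(-0.03 t), x2 = exp(-0.05 t).
X = np.column_stack([np.exp(-0.03 * t), np.exp(-0.05 * t)])

# Least squares WITHOUT a constant term -- the model has no intercept,
# so we simply omit a column of ones from the design matrix.
coef, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # ≈ [6.657, 3.999], matching the Minitab table
```

Leaving the column of ones out of `X` is exactly what Minitab's `NoConstant` subcommand does.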
ANSWER

If we call the data we measured $z$, we can model $z$ as $y + w$, where $w$ is some noise. We can rewrite $y = Ue^{-0.03t} + Ve^{-0.05t}$ as $y = H\theta$, where
$$ H = \begin{bmatrix} e^{-0.03\cdot 5} & e^{-0.05\cdot 5}\\ e^{-0.03\cdot 6} & e^{-0.05\cdot 6}\\ e^{-0.03\cdot 7} & e^{-0.05\cdot 7}\\ e^{-0.03\cdot 8} & e^{-0.05\cdot 8} \end{bmatrix} $$
and $\theta = \begin{bmatrix} U \\ V \end{bmatrix}$. The least-squares estimate of $\theta$ is $(H^TH)^{-1}H^Tz$. ($(H^TH)^{-1}H^T$ is called the pseudo-inverse of $H$: https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse.) If we do the linear algebra, we get that the estimate of $\begin{bmatrix} U & V \end{bmatrix}$ is equal to $[6.6571 \;\; 3.9994]$.

(There might be a less complicated way to do this, but this is just the mechanical way I learned recently.)
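The pseudo-inverse computation above is a few lines of NumPy (my sketch, not part of the original answer; `H` and `z` are built from the question's data):

```python
import numpy as np

# Design matrix H and observations z from the answer above.
t = np.array([5, 6, 7, 8], dtype=float)
H = np.column_stack([np.exp(-0.03 * t), np.exp(-0.05 * t)])
z = np.array([8.8, 8.6, 8.2, 7.9])

# theta_hat = (H^T H)^{-1} H^T z, the normal-equations solution.
theta_hat = np.linalg.inv(H.T @ H) @ H.T @ z
print(theta_hat)  # ≈ [6.6571, 3.9994]

# np.linalg.pinv computes the same pseudo-inverse (more stably, via SVD).
theta_pinv = np.linalg.pinv(H) @ z
```

In practice `np.linalg.pinv` (or `lstsq`) is preferred over forming $(H^TH)^{-1}$ explicitly, since inverting $H^TH$ can be numerically ill-conditioned.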