In a calculus textbook by the late James Stewart I encountered an exercise that asks for a power function as a mathematical model approximately representing some data:
The table shows the number N of species of reptiles and
amphibians inhabiting Caribbean islands and the area A of
the island in square miles.
a) Use a power function to model N as a function of A.
| Island | A | N |
|---|---|---|
| Saba | 4 | 5 |
| Montserrat | 40 | 9 |
| Puerto Rico | 3,459 | 40 |
| Jamaica | 4,411 | 39 |
| Hispaniola | 29,418 | 84 |
| Cuba | 44,218 | 76 |
The confusing thing is that the preceding pages of the textbook give no explanation of how to do this, other than using special software like Mathematica©, and even then only for a linear model.
The correct answer from the textbook is: $N = 3.1046 A^{0.308}$
Could anyone explain to me, please, how this answer was obtained?
I myself found another answer, $N = A^{-2.47295\times 10^{-6} A + 0.504732}$,
by computing $\log_A N$ for every item in the table (except the first one, which looked like an outlier) and then fitting a linear regression to these values with Mathematica©'s Fit function; I took the fitted line as the exponent of my function:
| Island | A | N (actual) | N (textbook's function) | N (my function) |
|---|---|---|---|---|
| Saba | 4 | 5 | 4.75817 | 2.01314 |
| Montserrat | 40 | 9 | 9.6703 | 6.43358 |
| Puerto Rico | 3,459 | 40 | 38.1946 | 57.0098 |
| Jamaica | 4,411 | 39 | 41.1645 | 63.0608 |
| Hispaniola | 29,418 | 84 | 73.848 | 85.1851 |
| Cuba | 44,218 | 76 | 83.724 | 68.6738 |
The textbook's answer is definitely closer to reality, so could anyone please explain how it was obtained?
You have to log-linearize your relation, i.e.
$N = \alpha A^\beta \iff \ln N =\ln\alpha +\beta \ln A$
Let $\boldsymbol{X}$ be the $6\times2$ matrix whose first column is a $6\times1$ vector of $1$s and whose second column is the $6\times1$ vector of $\ln A$ values. Then computing $(\boldsymbol{X}'\boldsymbol{X})^{-1}\boldsymbol{X}'\ln N$ yields the least-squares estimate of the $2\times1$ coefficient vector $(\ln \alpha, \beta)'$. In matrix terms, the model is
$\ln N = \boldsymbol{X}(\ln \alpha, \beta)' + \boldsymbol{\varepsilon}$
where $\boldsymbol{\varepsilon}$ is a $6\times1$ vector of errors (centered at $0$ thanks to the inclusion of the intercept $\ln \alpha$ in the estimation). Doing so yields $e^{\ln \alpha} = \alpha = 3.10462040170919$ and $\beta=0.308044235476563$.
Note that $\boldsymbol{\varepsilon}$ conceptually does not reflect the quality of your approximation approach; it reflects the intrinsic stochastic nature of the phenomenon under study. By contrast, the vector of residuals $\widehat{\boldsymbol{\varepsilon}}$, defined by $\ln N - \boldsymbol{X}\widehat{(\ln \alpha, \beta)'} = \widehat{\boldsymbol{\varepsilon}}$, contains both $\boldsymbol{\varepsilon}$ and an additional component that depends on the estimation procedure.
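For anyone who wants to reproduce the numbers, here is a minimal sketch of the log-linear fit in Python. It assumes NumPy is available; the variable names (`X`, `y`, `theta`, `residuals`) are my own choices for illustration. It applies the formula $(\boldsymbol{X}'\boldsymbol{X})^{-1}\boldsymbol{X}'\ln N$ to the six data points by solving the normal equations, then forms the residuals on the log scale.

```python
import numpy as np

# Data from the exercise: island area A (square miles) and species count N.
A = np.array([4, 40, 3459, 4411, 29418, 44218], dtype=float)
N = np.array([5, 9, 40, 39, 84, 76], dtype=float)

# Design matrix X: a column of ones (for the intercept ln(alpha))
# stacked horizontally with the column of ln(A).
X = np.column_stack([np.ones_like(A), np.log(A)])
y = np.log(N)

# Solve the normal equations (X'X) theta = X'y, which is equivalent to
# theta = (X'X)^{-1} X'y without forming the inverse explicitly.
theta = np.linalg.solve(X.T @ X, X.T @ y)
ln_alpha, beta = theta
alpha = np.exp(ln_alpha)

# Residuals on the log scale: ln(N) minus the fitted values X @ theta.
residuals = y - X @ theta

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
print("sum of residuals:", residuals.sum())
```

The printed `alpha` and `beta` should agree with the textbook's $3.1046$ and $0.308$ to the digits shown, and the residuals should sum to zero up to rounding error because the model includes an intercept. Keep in mind that this minimizes squared errors in $\ln N$, not in $N$ itself, so a direct nonlinear least-squares fit of $N = \alpha A^\beta$ would generally give slightly different coefficients.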