This is a reinterpretation of my old question *Fit data to function $g(t) = \frac{100}{1+\alpha e^{-\beta t}}$ by using least squares method (projection/orthogonal families of polynomials)*. I need to understand the problem in terms of orthogonal projections and inner products, but the answers there used common regression techniques.
$$\begin{array}{c|ccccccc} t & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline F(t) & 10 & 15 & 23 & 33 & 45 & 58 & 69 \end{array}$$
Fit $F$ with a function of the form $$g(t) = \frac{100}{1+\alpha e^{-\beta t}}$$ using the discrete least squares method.
First of all, we cannot work with the function $g(t)$ as it is, since it is not linear in the parameters $\alpha$ and $\beta$. The way I'm trying to see the problem is via projections.
So let's try to transform the problem like this:
$$\frac{100}{g(t)}-1 = \alpha e^{-\beta t}\implies \ln \left(\frac{100}{g(t)}-1\right) = \ln \alpha -\beta t$$
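As a sanity check on this transformation, here is a minimal sketch in Python (my own addition, not part of the exercise; the names `t`, `F`, `h` are mine) that computes the linearized data $h_i = \ln\left(\frac{100}{F_i}-1\right)$ from the table:

```python
import math

# Data from the table
t = [0, 1, 2, 3, 4, 5, 6]
F = [10, 15, 23, 33, 45, 58, 69]

# Linearized data: h_i = ln(100/F_i - 1), which should be approximately
# ln(alpha) - beta * t_i if the logistic model fits well
h = [math.log(100 / Fi - 1) for Fi in F]

print(h[0])  # ln 9 ≈ 2.1972
```

Note that $h_i$ changes sign where the data cross $50$, as expected at the midpoint of a logistic curve with ceiling $100$.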
Since we want to fit the function to the data points, we want to minimize the sum of squared deviations at those points, that is:
$$\min_{\alpha,\beta} \sum_{i}\left(\ln\left(\frac{100}{F(t_i)}-1\right)-\ln\alpha + \beta t_i\right)^2$$
Without taking derivatives and setting them equal to $0$, there's a way to see this as an orthogonal projection problem.
I know I need to end up with something like this:
$$\langle \ln\left(\frac{100}{F(t)}-1\right)-\ln\alpha + \beta t,\ 1\rangle = 0\\ \langle \ln\left(\frac{100}{F(t)}-1\right)-\ln\alpha + \beta t,\ t\rangle=0$$
And I know this comes from the fact that the minimizer is an orthogonal projection: the residual must have inner product $0$ with everything in $\operatorname{span}\{1, t\}$ (these being the basis functions attached to $\ln\alpha$ and $\beta t$).
In order to end up with
$$\begin{bmatrix} \langle 1,1\rangle & \langle t,1\rangle \\ \langle 1,t\rangle & \langle t,t\rangle \\ \end{bmatrix} \begin{bmatrix} \ln \alpha \\ -\beta \\ \end{bmatrix}= \begin{bmatrix} \langle \ln\left(\frac{100}{F(t)}-1\right) , 1\rangle \\ \langle \ln\left(\frac{100}{F(t)}-1\right) , t\rangle \\ \end{bmatrix}$$
Where the inner product is
$$\langle u,v\rangle = \sum_i u(t_i)\, v(t_i)$$
Can someone tell me what reasoning leads to the inner products above (and why this is the right inner product), whether I did everything right, and how to finish the exercise?
Linear regression is linear algebra in disguise.
You are searching for a function $$l(t)= c_1 +c_2t$$ (where in your case $c_1= \ln \alpha$ and $c_2=-\beta$) that is a linear combination of the functions $v_1(t)=1$ and $v_2(t)=t$. Your goal is to minimize $$e(l,h)=\sum_i (l(t_i)-h(t_i))^2$$ (where in your case $h(t_i)=\ln \left(\frac{100}{F(t_i)}-1 \right)$, computed from the tabulated data).
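To make the objective concrete, here is a small Python sketch (my own illustration; the helper `e` is hypothetical, not from the answer) of the error $e(l,h)$ as a function of the coefficients:

```python
import math

t = [0, 1, 2, 3, 4, 5, 6]
F = [10, 15, 23, 33, 45, 58, 69]
h = [math.log(100 / Fi - 1) for Fi in F]  # the transformed data h(t_i)

def e(c1, c2):
    """Sum of squared deviations between l(t) = c1 + c2*t and h at the data points."""
    return sum((c1 + c2 * ti - hi) ** 2 for ti, hi in zip(t, h))

# Any trial coefficients give a nonnegative error; least squares
# picks the (c1, c2) that minimizes it
print(e(2.2, -0.5), e(0.0, 0.0))
```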
The "sum of squares" formula is suggestive of Pythagoras' theorem/a norm on some vector space. We want to view $e(l,h)$ as a squared distance on, say, the vector space $F$ of functions $f: \mathbb{R}\to\mathbb{R}$, coming from the dot product
$$<f,g>=\sum_i f(t_i) g(t_i)$$
(Recall that the squared distance between two vectors in a vector space with a dot product is $d(u,v)^2=<u-v, u-v>$, so we recover $e=d^2$ from the dot product above.)
A slight problem is that on this vector space of functions $F$ the "distance" $d(l,h)=\sqrt{e(l,h)}$ is not really a distance, since it vanishes as soon as $l(t_i)=h(t_i)$ for all $i$ (in math-speak we get only a pseudometric, not a metric). We can either ignore this, or use the standard solution, which is to work on the quotient space $V=F/F_0$ of functions modulo the subspace $F_0=\{f: \mathbb{R}\to\mathbb{R} \mid f(t_i)=0 \text{ for all } i\}$ -- the functions that are "distance zero from the origin". This has the advantage that $V$ is now a finite-dimensional vector space (of dimension equal to the number of data points), so we can be more confident using standard linear algebra. Note that $V$ inherits the dot product $<f,g>=\sum_i f(t_i) g(t_i)$.
In any case, we are now looking for a function $l(t)= c_1 +c_2t$ that is closest to $h(t)$ in the sense of the Euclidean distance $d$, that is, a point in the subspace spanned by $1, t$ (in $F$, or more precisely by their equivalence classes in $V$). We can forget all the complicated setup and just think: given a point $h$ and a plane spanned by two vectors, how do we find the point $l$ in the plane closest to $h$? Of course we must project $h$ onto the plane! That is, $l$ must be such that $h-l$ is orthogonal to the plane, meaning orthogonal to both spanning vectors. Thus, we are looking for $l=c_1+c_2t$ such that $<h-l, 1>=0$ and $<h-l, t>=0$ (where the dot product is still $<f,g>=\sum_i f(t_i) g(t_i)$). These are the equations in your question.
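The orthogonality conditions are easy to verify numerically. Here is a sketch using NumPy (my own check, not part of the original answer) that projects $h$ onto the plane by least squares and confirms the residual is orthogonal to both spanning vectors:

```python
import numpy as np

t = np.arange(7.0)
F = np.array([10., 15., 23., 33., 45., 58., 69.])
h = np.log(100.0 / F - 1.0)  # transformed data

# Columns 1 and t span the plane {c1*1 + c2*t}
A = np.column_stack([np.ones_like(t), t])

# Least-squares solution = orthogonal projection of h onto that plane
c, *_ = np.linalg.lstsq(A, h, rcond=None)
residual = h - A @ c

# <h - l, 1> and <h - l, t> should both vanish (up to rounding)
print(residual @ np.ones_like(t), residual @ t)
```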
Now you just need to solve them. To do so, plug in $l=c_1+c_2 t$ and rewrite the equations as
$<h,1>=c_1<1,1>+c_2<1,t>$
$<h,t>=c_1<1,t>+c_2<t,t>$
This is a linear system with 2 equations and 2 unknowns, which you can write as the matrix equation -- the one you have in the question.
To finish the exercise, just compute all the dot products (for example, in your case $<1,1>=\sum_i 1 \cdot 1=7$, $<1,t>=\sum_i 1 \cdot i=0+1+\ldots+6=21$, $<t,t>=91$, $<h, 1>=\sum_{i=0}^6 h(i)$, $<h, t>=\sum_{i=0}^6 h(i) \cdot i$) and solve the 2 by 2 linear system by whatever method you like (Gaussian elimination, or multiplying by $\begin{bmatrix}7&21\\21&91\end{bmatrix}^{-1}=\frac{1}{196}\begin{bmatrix}91&-21\\-21&7\end{bmatrix}$, or even Cramer's rule, as Yuri used in another answer). You will get $c_1= \ln \alpha$ and $c_2=-\beta$, and hence can solve for $\alpha$ and $\beta$ as well.
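For completeness, the whole recipe can be scripted. This Python sketch (my own, following the steps above with Cramer's rule) recovers $\alpha$ and $\beta$:

```python
import math

t = [0, 1, 2, 3, 4, 5, 6]
F = [10, 15, 23, 33, 45, 58, 69]
h = [math.log(100 / Fi - 1) for Fi in F]

# Dot products appearing in the 2x2 normal equations
s11 = len(t)                                 # <1,1> = 7
s1t = sum(t)                                 # <1,t> = 21
stt = sum(ti * ti for ti in t)               # <t,t> = 91
h1  = sum(h)                                 # <h,1>
ht  = sum(hi * ti for hi, ti in zip(h, t))   # <h,t>

# Cramer's rule on [[s11, s1t], [s1t, stt]] [c1, c2]^T = [h1, ht]^T
det = s11 * stt - s1t * s1t                  # 7*91 - 21*21 = 196
c1 = (h1 * stt - ht * s1t) / det             # = ln(alpha)
c2 = (s11 * ht - s1t * h1) / det             # = -beta

alpha, beta = math.exp(c1), -c2
print(alpha, beta)  # roughly alpha ≈ 9.17, beta ≈ 0.504
```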