Is the slope of a line of best fit through points the same as the y-intercept of a regression through their derivatives?


I'm struggling with a proof that I need in order to decide how to calculate erosion rates. It is computationally much more efficient for me to calculate erosion month to month than it is to redo a regression of shoreline position every month (assuming monthly data).

In the top graph, I plot shoreline position against time $t_n$. A regression through these points has some slope, which is the erosion rate of that part of the shoreline. In the bottom graph, I instead take the slope point to point and fit a regression through those values. I would prefer the second approach: with the first, every month I get new data I have to refit the regression over all of the shoreline positions, whereas with the second I only compute one additional slope and update an average.

[Figure: top, shoreline position vs. time $t_n$; bottom, point-to-point slopes vs. time]

I added the limit to the bottom function to stress that if erosion occurs at a roughly constant rate, the $m$ value in that equation should be 0.

Logically, this makes sense to me, but when I try to write a proof of it, I don't know where to begin.

Is it enough to simply say

$$y = mx + b$$

$$\frac{dy}{dx} = m$$

i.e., that $m$ is the constant of the regression fit on the individual slopes?

If it isn't the same, it could be because one approach minimizes the error in shoreline position while the other minimizes the error in the rate itself; maybe proving that these are (or aren't) equivalent is sufficient?

Best Answer

Let us start with the least squares fit. We want to minimize over $m$ and $b$: $$ \chi^2 = \sum_j \left(y_j - m x_j -b\right)^2 $$

Setting $\partial \chi^2 / \partial m = 0$ and $\partial \chi^2 / \partial b = 0$, the result is: $$ m = \frac{\sum_j x_j \sum_j y_j - \sum_j 1 \sum_j x_j y_j }{\left(\sum_j x_j\right)^2 - \sum_j 1 \sum_j x_j^2} \,\,\,\,\,\,\,\,\,\,\,\,\,\, (1)$$
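As a quick numerical sanity check (on hypothetical, illustrative monthly data, not the asker's), the closed-form least-squares slope can be verified against `numpy.polyfit`:

```python
import numpy as np

# Hypothetical equally spaced monthly data with noise (illustrative only).
rng = np.random.default_rng(0)
x = np.arange(12.0)                             # months
y = -2.5 * x + 100 + rng.normal(0, 1, x.size)   # eroding shoreline position

n = x.size
# Closed-form least-squares slope (normal equations for y = m*x + b);
# this is Eq. (1) with numerator and denominator both multiplied by -1.
m = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)

# Cross-check against numpy's degree-1 polynomial fit.
m_np, b_np = np.polyfit(x, y, 1)
assert np.isclose(m, m_np)
```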

This is different from the average of the finite differences (with $n$ points there are only $n-1$ differences): $$ m' = \frac{1}{n-1}\sum_{j=1}^{n-1} \frac{y_{j+1}-y_j}{x_{j+1} - x_j} $$

It is particularly easy to see this if the $x_j$ are equally spaced, so that $x_{j+1}-x_j = x_2 - x_1$ for every $j$: $$ m' = \frac{\sum_{j=1}^{n-1} \left( y_{j+1}-y_j\right)}{(n-1)\left(x_2 - x_1\right)} $$ The sum telescopes: $$ m' = \frac{y_n-y_1}{(n-1)\left(x_2 - x_1\right)} = \frac{y_n-y_1}{x_n - x_1} $$ This $m'$ depends only on the first and last values of $y_j$. Intuitively, this estimate is less precise than the one from Eq. 1, which takes the whole data set into consideration.
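The telescoping identity is easy to confirm numerically (again on made-up data): for equally spaced $x_j$, the average of the finite differences collapses to the endpoint slope, while the least-squares slope generally comes out different.

```python
import numpy as np

# Hypothetical noisy data on an equally spaced grid (illustrative only).
rng = np.random.default_rng(1)
x = np.arange(10.0)                           # equally spaced "months"
y = -1.0 * x + 50 + rng.normal(0, 2, x.size)

# Average of the n-1 finite-difference slopes.
m_prime = np.mean(np.diff(y) / np.diff(x))

# For equally spaced x the sum telescopes to the endpoint slope.
assert np.isclose(m_prime, (y[-1] - y[0]) / (x[-1] - x[0]))

# The least-squares slope uses all of the data, so in general it differs.
m_ls = np.polyfit(x, y, 1)[0]
print(m_prime, m_ls)
```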