Linear equations applications


I'm reading Jane Eyre in class and we have to do questions. However, there are page numbers in the questions that relate to a different copy of the novel that I have. I was wondering if I could create a formula to translate the page number in the questions to the page number in my book.

Here are the page numbers where each chapter starts in both books (a table of the numbers I used).

I tried to do this with simultaneous linear equations and ended up with the formula $y = \frac56 x + 31.5$.

This worked for the first 2 chapters, as you can see in the screenshot, but after that the numbers drifted further and further away.

Could someone help me to make a formula? Thanks for all your help!



We have no reason to expect that there will be a simple formula to exactly match the page numbers. While the "size" of a page in each edition should be fairly consistent, varying line breaks and the like mean that there will be plenty of small irregularities. As such, we should be looking for approximations.

Two approaches:

  • Linear regression. Take all of our page numbers, and look for the best-fit line. With the given data, that's $\text{(new)}=a\cdot\text{old}+b$, for $a\approx 0.844$ and $b\approx 31.6$. This results in an estimate that's off by less than $1$ at every chapter start. A couple of them are off by enough to round wrong; this estimates the start of chapter $3$ as page $50$ instead of $51$, and the start of chapter $9$ as page $108$ instead of $107$. Overall, though, it should always be pretty close.
  • Piecewise linear fitting. We have a bunch of points that we know are right - so let's just interpolate between them. This leads to a different formula for each chapter; in chapter $4$, for example, we're between pages $33$ and $49$ in the other edition or between pages $59$ and $73$ in yours, for an estimate of $\frac{14}{16}(x-33)+59=\frac78x +30+\frac18$. The drawback for this one? More formulas to juggle, and also we need one more data point for the end of the book to account for page numbers in chapter $12$.
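The piecewise approach can be sketched in a few lines of Python. This is only an illustration under assumptions: the chapter-start pages are the twelve pairs tabulated in the regression answer on this page, and the names `QUESTION_PAGES`, `MY_PAGES` and `translate` are made up for the example. Pages past the start of chapter $12$ are rejected, since there is no end-of-book data point to interpolate towards.

```python
from bisect import bisect_right

# Chapter-start pages (assumed: taken from the table in the regression answer).
QUESTION_PAGES = [9, 15, 22, 33, 49, 63, 71, 81, 90, 99, 111, 128]
MY_PAGES = [39, 44, 51, 59, 73, 85, 92, 100, 107, 115, 125, 140]


def translate(page):
    """Estimate the page in my edition by interpolating between chapter starts."""
    if not QUESTION_PAGES[0] <= page <= QUESTION_PAGES[-1]:
        raise ValueError("page lies outside the known chapter starts")
    # Index of the chapter start at or before `page`, capped so that a
    # following chapter start always exists to interpolate towards.
    k = min(bisect_right(QUESTION_PAGES, page) - 1, len(QUESTION_PAGES) - 2)
    x0, x1 = QUESTION_PAGES[k], QUESTION_PAGES[k + 1]
    y0, y1 = MY_PAGES[k], MY_PAGES[k + 1]
    return y0 + (y1 - y0) * (page - x0) / (x1 - x0)


if __name__ == "__main__":
    for page in (33, 41, 49):
        print(page, translate(page))
```

Inside chapter $4$ this reproduces $\frac78 x + 30 + \frac18$, and at every chapter start it returns the tabulated page exactly.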

Either way, you should be able to get it within a page or so everywhere.


Here is a hard proof that it cannot be done exactly using a linear equation together with a consistent rounding strategy. Consider:

[Figure not preserved: a comparison of the formula's output with the target page numbers]

Notice that for pages 22 and 81 from the original book, the formula gives exactly 1 page below the target value, so we would need to round up to get to the target value (we could tweak the formula so it adds a super small value, in which case you could say that we round up anything to the next whole number). But note that for page 33 from the original book, we are already above the target value ... so any rounding convention cannot get to the target value there.

Moreover, we can't push this curve any lower; the $(22,50)$ and $(81,99)$ points are as low as we can get while still making these points get to their target value.

So, any linear equation with a consistent rounding convention can never match your desired values.
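The same conclusion can be checked mechanically. The sketch below rests on one modelling assumption that is mine, not the answer's: a "consistent rounding convention" is taken to mean that each integer $y$ is produced from a fixed window of width $1$, which after shifting $b$ becomes $y \le ax+b < y+1$. Every pair of data points then forces a lower and an upper bound on the slope $a$, and a line exists only if the largest lower bound is below the smallest upper bound. The data are the chapter starts tabulated in the regression answer on this page; `slope_bounds` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import combinations

# Chapter-start pages (assumed: taken from the table in the regression answer).
X = [9, 15, 22, 33, 49, 63, 71, 81, 90, 99, 111, 128]
Y = [39, 44, 51, 59, 73, 85, 92, 100, 107, 115, 125, 140]


def slope_bounds(xs, ys):
    """Return (lower, upper): some b satisfies y <= a*x + b < y + 1 at every
    point if and only if lower < a < upper."""
    pairs = list(combinations(zip(xs, ys), 2))  # xs ascending, so x1 < x2
    lower = max(Fraction(y2 - y1 - 1, x2 - x1) for (x1, y1), (x2, y2) in pairs)
    upper = min(Fraction(y2 - y1 + 1, x2 - x1) for (x1, y1), (x2, y2) in pairs)
    return lower, upper


if __name__ == "__main__":
    lower, upper = slope_bounds(X, Y)
    print("slope must exceed", lower, "and stay below", upper)
    print("a line exists:", lower < upper)
```

Pages $22$ and $33$ alone cap the slope at $\frac{9}{11}$, while pages $33$ and $71$ require more than $\frac{16}{19}$, so no slope satisfies both.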


As already noted, no equation of the form $y=ax+b$ will give the exactly correct page number in every case. This can happen because one edition of a book may end a chapter with almost a full page of text while the same chapter in a different edition, or a different chapter in the same edition, ends in an almost blank page with two lines of text. So while you could accurately compare locations in both editions by counting words, there are likely to be seemingly random “jumps” in the page count relative to the word count.

If you counted the words on a page in each book, you could use the ratio of the two counts as the slope within each chapter. (Word counts are not perfectly comparable, since one page might have longer words, but they are likely to be more linear than pages per chapter.) The constant term for each chapter would come from your table plus an estimate of the blank space at the start of that chapter. That might give you a chance of making accurate correspondences between pages most of the time.

If you want something as simple as your attempt but more accurate, it appears that in your approach you took the starting pages of the first two chapters and extrapolated a formula over the later chapters. In general, you will usually get better results from interpolation than from extrapolation. To obtain interpolated page numbers, ideally you would take the first and last page numbers of the story in each book—the first page of chapter 1 and the last page of chapter 12—and derive your equation from those numbers. Given only the numbers you have provided in the table, one could base an equation on the starting pages of chapters 1 and 12, which would extrapolate over chapter 12 but would interpolate over the other chapters.
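To see the difference between extrapolating and interpolating, here is a small sketch. It assumes the chapter-start pages are the twelve pairs tabulated in the regression answer on this page, and the helper names `line_through` and `max_error` are invented for the example. It builds one line through the starts of chapters 1 and 2 (the original attempt) and another through the starts of chapters 1 and 12, then reports the largest miss of each over all chapter starts.

```python
# Chapter-start pages (assumed: taken from the table in the regression answer).
X = [9, 15, 22, 33, 49, 63, 71, 81, 90, 99, 111, 128]
Y = [39, 44, 51, 59, 73, 85, 92, 100, 107, 115, 125, 140]


def line_through(p, q):
    """Return the function x -> y for the straight line through points p and q."""
    (x0, y0), (x1, y1) = p, q
    slope = (y1 - y0) / (x1 - x0)
    return lambda x: y0 + slope * (x - x0)


def max_error(f):
    """Largest absolute miss of f over all known chapter starts."""
    return max(abs(f(x) - y) for x, y in zip(X, Y))


# Line through chapters 1 and 2: extrapolates over everything after chapter 2.
extrapolated = line_through((X[0], Y[0]), (X[1], Y[1]))
# Line through chapters 1 and 12: interpolates over chapters 1 to 11.
interpolated = line_through((X[0], Y[0]), (X[-1], Y[-1]))

if __name__ == "__main__":
    print("worst miss, chapters 1-2 line: ", round(max_error(extrapolated), 2))
    print("worst miss, chapters 1-12 line:", round(max_error(interpolated), 2))
```

On this data the line through chapters 1 and 12 stays within one page of every chapter start, while the line through chapters 1 and 2 drifts by more than a page in the later chapters.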


As already said in comments and answers, you can have a better linear regression. Working using whole numbers in the regression, I got $$y=\frac{6420676}{203043}+\frac{57106 }{67681}x$$ and below are the results $$\left( \begin{array}{ccc} x & y & y_{calc} \\ 9 & 39 & 39.22 \\ 15 & 44 & 44.28 \\ 22 & 51 & 50.18 \\ 33 & 59 & 59.47 \\ 49 & 73 & 72.97 \\ 63 & 85 & 84.78 \\ 71 & 92 & 91.53 \\ 81 & 100 & 99.97 \\ 90 & 107 & 107.56 \\ 99 & 115 & 115.15 \\ 111 & 125 & 125.28 \\ 128 & 140 & 139.62 \end{array} \right)$$ which corresponds to $R^2=0.999848$ which is very good.
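For anyone who wants to reproduce these coefficients, the following is a minimal sketch using Python's `fractions` module, so the arithmetic stays exact. It applies the standard closed-form least-squares formulas to the twelve $(x, y)$ pairs from the table above; the function name `least_squares` is just a label chosen for this example.

```python
from fractions import Fraction

# (x, y) pairs from the table above.
X = [9, 15, 22, 33, 49, 63, 71, 81, 90, 99, 111, 128]
Y = [39, 44, 51, 59, 73, 85, 92, 100, 107, 115, 125, 140]


def least_squares(xs, ys):
    """Exact intercept and slope of the ordinary least-squares line."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = Fraction(n * sxy - sx * sy, n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


intercept, slope = least_squares(X, Y)

# Chapter starts where rounding the fitted value gives the wrong page.
misses = [x for x, y in zip(X, Y) if round(intercept + slope * x) != y]

if __name__ == "__main__":
    print("intercept =", intercept, "~", round(float(intercept), 4))
    print("slope     =", slope, "~", round(float(slope), 4))
    print("rounded estimate is wrong at x =", misses)
```

Rounding the fitted values shows which chapter starts the plain regression line would still get wrong by a page.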

We would have slightly worse results trying to minimize $$\sum_{i=1}^{12} \left(\text{Round}[A+B x_i]-y_i \right)^2$$ This is not a very pleasant numerical task, but it leads to $A=31.0705$ and $B=0.849069$, which are slightly different from the values obtained by the simple linear regression. This would lead to the following results (mismatches in red) $$\left( \begin{array}{ccc} x & y & \text{Round}[A+B x] \\ 9 & 39 & 39 \\ 15 & 44 & 44 \\ 22 & 51 & \color{red}{50} \\ 33 & 59 & 59 \\ 49 & 73 & 73 \\ 63 & 85 & 85 \\ 71 & 92 & \color{red}{91} \\ 81 & 100 & 100 \\ 90 & 107 & 107 \\ 99 & 115 & 115 \\ 111 & 125 & 125 \\ 128 & 140 & 140 \end{array} \right)$$
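The rounded objective is easy to evaluate for any candidate pair $(A, B)$, even though it is awkward to minimize: rounding makes it piecewise constant, so gradient-based methods get no useful information. The sketch below only evaluates it, and does not attempt the minimization. It assumes Python's built-in `round` is an acceptable stand-in for $\text{Round}$ (the two differ only at exact halves, which do not occur here), and `rounded_sse` is a name made up for the example.

```python
# (x, y) pairs from the tables above.
X = [9, 15, 22, 33, 49, 63, 71, 81, 90, 99, 111, 128]
Y = [39, 44, 51, 59, 73, 85, 92, 100, 107, 115, 125, 140]


def rounded_sse(a, b):
    """Sum of squared differences between Round[a + b*x] and y."""
    return sum((round(a + b * x) - y) ** 2 for x, y in zip(X, Y))


def wrong_pages(a, b):
    """The x values whose rounded estimate differs from y."""
    return [x for x, y in zip(X, Y) if round(a + b * x) != y]


if __name__ == "__main__":
    candidates = {
        "rounded fit": (31.0705, 0.849069),
        "plain regression": (6420676 / 203043, 57106 / 67681),
    }
    for name, (a, b) in candidates.items():
        print(name, rounded_sse(a, b), wrong_pages(a, b))
```

Evaluated this way, both parameter sets miss two chapter starts by one page each; they just miss different ones.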