I was trying to understand why the method works (specifically for square roots, at first), and I ended up crafting what is, by my standards, a fairly convincing argument. Of course, it is probably not up to the standard of a mathematical proof, so I'd really appreciate some feedback on how my "proof" can be improved, the ways in which it is not valid, etc.
Let $g > 0$ be our initial guess (it must be nonzero, since we divide by it), and $x \ge 0$ the number whose square root we are looking for.
We separate our proof into $3$ cases.
Case 1: $g = \sqrt{x}$. $\quad$Then, $\frac{\big(\frac{x}{g} + g \big)}{2} = \frac{\big(\frac{x}{\sqrt{x}} + \sqrt{x}\big)}{2} = \sqrt{x}$
Case 2: $g < \sqrt{x}$. $\quad$Let $g = \sqrt{x} - \epsilon$ with $\epsilon > 0$, and let $h = \frac{x}{g}$. $\quad$Then $g \cdot h = x$. $\quad$But we know that $(\sqrt{x} - \epsilon) (\sqrt{x} + \epsilon) = g (\sqrt{x} + \epsilon) = x - \epsilon^2 < x$.
Therefore it follows that $h > \sqrt{x} + \epsilon$.
Since $\sqrt{x} - \epsilon + \sqrt{x} + \epsilon = 2 \sqrt{x} \quad$ and $\quad h > \sqrt{x} + \epsilon$, $\quad$ then $\quad g + h = g + \frac{x}{g} > 2\sqrt{x}$.
This shows that our new guess, namely $\frac{\big(\frac{x}{g} + g \big)}{2}$, is always going to be $> \sqrt{x}$, which leads to our third case.
Case 3: $g > \sqrt{x}$. $\quad$Here we want to show that our new guess at each step is always $< g$, our current guess, and also $> \sqrt{x}$.
We have that $g^2 > x \implies \frac{x}{g} < g \implies \frac{x}{g} + g < 2g \implies \frac{\big(\frac{x}{g}+ g \big)}{2} < g$
By letting $g = \sqrt{x} + \epsilon$ with $\epsilon > 0$, and again $h = \frac{x}{g}$, we can show, using the same logic as in Case 2, that our new guess is also going to be $> \sqrt{x}$.
Since each guess is less than the previous guess while still being $> \sqrt{x}$, the limit of our successive guesses as the number of iterations goes to infinity is $\sqrt{x}$. (I'M NOT SURE THIS IS OK)
In conclusion, Case 1 would end the search immediately, and Case 2 could only happen on the first iteration of the process, after which we would stay in Case 3 for as many iterations as we need for a given degree of precision.
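For concreteness, here is a small Python sketch (my own illustration, not part of the argument) of the iteration. It shows the Case 2 → Case 3 behavior: a starting guess below $\sqrt{x}$ overshoots once, and every guess after that decreases toward $\sqrt{x}$ while staying above it.

```python
def babylonian_sqrt_steps(x, g, n):
    """Return the first n+1 successive guesses for sqrt(x), starting from g."""
    guesses = [g]
    for _ in range(n):
        g = (x / g + g) / 2  # the averaging step discussed above
        guesses.append(g)
    return guesses

# Start below sqrt(25) = 5 (Case 2): the next guess overshoots to 13,
# then the sequence decreases monotonically toward 5 (Case 3).
steps = babylonian_sqrt_steps(25.0, 1.0, 6)
```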
You are right to be unsure about that final statement in your argument:
I'M NOT SURE THIS IS OK.
Showing that your guess keeps getting smaller is not in and of itself enough. Consider the sequence:
$$2 > \frac32 > \frac54 > \frac98 > \dots > \frac{2^n + 1}{2^n} > \dots > 0$$
This is always between $2$ and $0$ but never reaches $0$. The limit is in fact $1$.
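If it helps to see this numerically, here is a quick Python check (just an illustration) that the terms decrease but stay above $1$, so they cannot converge to $0$:

```python
# Terms of (2**n + 1) / 2**n: decreasing, bounded below by 0,
# yet converging to 1, not 0.
terms = [(2**n + 1) / 2**n for n in range(30)]
```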
Like Winther said in the comments, you need to show not just that the sequence is decreasing, but that it actually converges to your value. If you want to go back to first principles, you need to show that for any $\epsilon > 0$, there exists a point in the algorithm from which the absolute value of the error remains bounded by $\epsilon$ forever after. Since you already showed the sequence is monotone, you just need to find one such point for any arbitrary $\epsilon$.
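To illustrate that last step (a Python sketch of my own; the function name is invented for this example), one can search for the first index $n$ at which the error drops below a given $\epsilon$ — by the monotonicity you already proved, the error then stays below $\epsilon$ from that point forward:

```python
import math

def first_index_within_eps(x, g, eps, max_iter=100):
    """Return the first iteration index n with |g_n - sqrt(x)| < eps,
    or None if it is not reached within max_iter iterations."""
    for n in range(max_iter):
        if abs(g - math.sqrt(x)) < eps:
            return n
        g = (x / g + g) / 2  # same averaging step as in the question
    return None

# For x = 2 starting from g = 1, a handful of iterations suffice
# even for a very small eps, since the error shrinks quadratically.
n = first_index_within_eps(2.0, 1.0, 1e-12)
```

Of course, the code only finds the index for one particular $\epsilon$; the actual proof must produce such an index for every $\epsilon > 0$.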