tl;dr (too long; didn't read):
What is the intuition/conceptual idea behind why Rudin used the number:
$$ k = \frac{y^n-x}{n y^{n-1}} $$
in his proof and not some other number? That number does not seem random, so how could he have come up with it?
My attempt:
I was trying to prove Theorem 1.21 from Rudin's real analysis book on my own, without looking at Rudin's proof. The way I attempted to prove it was to prove what seemed true from the intuition about the real numbers I had already collected before starting real analysis. So I drew lots of pictures, which led me to think of defining $E = \{ \bar y \in R_{>0} : \bar y < x^{1/n} \}$:
Thus, I knew that $E = \{ \bar{y} \in R_{>0} : \bar y^n < x \}$ is just the same as $E = \{ \bar y \in R_{>0} : \bar y < x^{1/n} \}$. So it was obvious that what I had to show was that $\alpha = \sup E = y = x^{1/n}$ (the reason for defining a set like this is that we probably need to use the Least Upper Bound (LUB) property, because it is one of the few tools we are supposed to know at this point in analysis). Thus, I proceeded to show that $E$ is bounded above and non-empty, so that the supremum is guaranteed to exist by the Least Upper Bound (LUB) property (a.k.a. the completeness axiom).
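As an aside, those two facts can be spot-checked numerically with the standard witnesses (the ones Rudin himself uses: $x/(1+x) \in E$ and $1+x$ as an upper bound); the helper `in_E` and the sample values below are just my illustration:

```python
# Numeric sanity check (sample values, not a proof): for a few x and n,
# t = x/(1+x) lies in E, and b = 1+x does not (in fact b is an upper bound,
# since t >= 1+x implies t^n >= t > x).
def in_E(t, x, n):
    """Membership test for E = { t > 0 : t^n < x }."""
    return t > 0 and t**n < x

for x in [0.3, 2.0, 17.5]:
    for n in [2, 3, 5]:
        t = x / (1 + x)               # 0 < t < 1 and t < x, so t^n <= t < x
        assert in_E(t, x, n)          # hence E is non-empty
        assert not in_E(1 + x, x, n)  # and 1+x lies outside E
```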
Then I thought: I want $y = \alpha$, so one option was to show that $y < \alpha$ AND $y > \alpha$ are both false, so that by trichotomy $y=\alpha$. Since I didn't have direct access to $y$, I decided to take the same strategy with $y^n = x$ instead of $y$ and $\alpha^n$ instead of $\alpha$. Intuitively I thought: let's assume $\alpha^n < x$ and then $x < \alpha^n$. In the first case $\alpha^n$ is too small, so hopefully that leads to some contradiction; similarly, in the second case $\alpha^n$ should be too large somehow. Then maybe we can use trichotomy to get $\alpha^n = x$, which completes the proof.
I attempted the first case, $\alpha^n < x$. The only other thing I knew about $\alpha$ was that $\forall \bar y \in E,\ \bar y^n < x$. Then I decided to combine both inequalities (since I was hinted to use $a^n - b^n$, because I saw that identity in the solution when I checked my proof that $E$ is bounded and non-empty):
$$ \alpha^n - \bar y^n < x - \bar y ^n < 0$$
then, because of the hint (which I probably wouldn't have realized I needed), I was extremely lucky and decided to subtract $\bar y^n$ from both sides (notice that subtracting $\alpha^n$ instead wouldn't have worked):
$$ \alpha^n - \bar y^n = (\alpha - \bar y)\left(\sum_{\substack{i+j = n-1 \\ i,j \geq 0}} \bar y^i \alpha^j \right) = (\alpha - \bar y)K < 0$$
where I noticed that $K > 0$, since every element of $E$ is greater than zero and so is its least upper bound $\alpha$. Thus $K > 0$, and dividing it out gives me:
$$ (\alpha - \bar y) < 0 \implies \alpha < \bar y$$
which is obviously false, since it would imply that $\alpha$ is not an upper bound. Thus $\alpha^n < x$ is false.
Now assume $x < \alpha^n$. One can't use the same argument as in my previous attempt, because this time we are trying to produce an upper bound smaller than $\alpha$, and it's not clear how elements from $E$ are useful for that.
It's clear to me that we have to choose an $h$ such that:
$$ x < (\alpha - h)^n < \alpha^n$$
I have given it a proper attempt (see the end of my question), but I am unable to prove the desired result no matter how much I play with the algebra and the facts I know.
Thus, my question is how did Rudin come up with the following:
$$k = \frac{y^n-x}{n y^{n-1}}$$
from his explanation, it seems it just came out of a hat. I am sure that if I plugged it in I would see that "it works"; however, I wanted to know/see how to come up with it myself.
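To be concrete, "plugging it in" numerically does confirm that it works, before any intuition arrives; `rudin_k` is just my name for the formula, and the sample triples are arbitrary:

```python
# Numeric spot-check (not a proof) of Rudin's k = (y^n - x) / (n*y^(n-1))
# in the case y^n > x: then 0 < k < y and (y - k)^n > x, so y - k is an
# upper bound for E smaller than y -- the desired contradiction.
def rudin_k(x, y, n):
    return (y**n - x) / (n * y**(n - 1))

for x, y, n in [(2.0, 1.5, 2), (5.0, 2.0, 3), (10.0, 1.7, 5)]:
    assert y**n > x       # the case under consideration
    k = rudin_k(x, y, n)
    assert 0 < k < y      # k is a genuine decrease
    assert (y - k)**n > x # so t >= y - k forces t^n > x: y - k is an upper bound
```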
Similarly I don't see how/why he came up with this one:
$$h < \frac{x - y^n}{n(y+1)^{n-1}}$$
it seems it's not even required, considering my first proof/argument, but I assume it must use the same idea, since it seems to use the same identity $b^n - a^n = (b-a)(b^{n-1}+b^{n-2}a+ \cdots + b a^{n-2} + a^{n-1})$.
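The $h$ bound can be spot-checked the same way (again arbitrary sample values; `h_bound` is just an illustrative name): any positive $h$ below both $1$ and the bound keeps $(y+h)^n$ below $x$.

```python
# Numeric spot-check (not a proof) of the case y^n < x: for any h with
# 0 < h < 1 and h < (x - y^n) / (n*(y+1)^(n-1)), we get (y + h)^n < x,
# i.e. y + h would lie in E despite exceeding y = sup E.
def h_bound(x, y, n):
    return (x - y**n) / (n * (y + 1)**(n - 1))

for x, y, n in [(2.0, 1.2, 2), (9.0, 2.0, 3), (40.0, 2.0, 5)]:
    assert y**n < x                       # the case under consideration
    h = min(1.0, h_bound(x, y, n)) / 2    # strictly below both constraints
    assert (y + h)**n < x
```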
Would anyone care to share the trick I missed? Is there some way to understand how one would have come up with this? Is there some conceptual idea in the proof that he did not make explicit and that I missed?
I am hoping for a more satisfying proof than the feeling of having played around with symbols until I forced the paper to tell me the truth. It seems I missed some insight, because even the hints (like using the identity) didn't yield a solution.
What I tried:
If we have:
$$ x < \alpha^n$$
then, at least intuitively, that must imply there is some element $y_{BAD}$ smaller than our supremum $\alpha$ that is still an upper bound (this intuition comes from the assumption that $\alpha^n = x \iff \alpha = x^{1/n}$). Therefore it seems reasonable to try to decrease $\alpha$ by the right amount $h$ such that:
$$ x < (\alpha - h)^n < \alpha^n $$
then, since $h$ is some distance we go down from $\alpha$, we probably don't need to go down further than $\alpha$ itself, so it seems reasonable to require $0 < h < \alpha$. With that we have, using algebra:
$$ x < (\alpha - h)^n = (\alpha - h)\left( \alpha^{n-1}+\alpha^{n-2}h + \dots + \alpha h^{n-2} + h^{n-1} \right) = (\alpha - h) K < \alpha^n $$
the reason for that factorization is that we can hopefully extract an inequality for $h$ (intuitively, we are trying to make $h$ the subject so that we can choose the right one to get the contradiction we need). So let's try to remove all the nasty powers of $h$ by assuming $h < \alpha$ (otherwise $\alpha$ decreases by too much):
$$ K = \alpha^{n-1}+\alpha^{n-2}h + \dots + \alpha h^{n-2} + h^{n-1} < n \alpha^{n-1}$$
so let's plug it into:
$$ x < (\alpha - h) K < \alpha^n $$
after plugging in the inequality, isolating $h$, and some algebra I skipped, I got:
$$ \alpha - \frac{ \alpha^n }{K} < h < \alpha - \frac{x}{n \alpha^{n-1}}$$
unfortunately, I didn't manage to plug $K < n \alpha^{n-1}$ into both sides successfully, so I got stuck with the above, which is still in terms of $h^j$, $j>1$, i.e. it still has high-order terms in $h$... so close, it feels... (note that I've also tried more things, but it would be ridiculous to put it all in here).
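One step above that I do trust is the bound on $K$, and at least that survives a numeric spot-check for $0 < h < \alpha$ (sample values only; `K` is just an illustrative helper):

```python
# Numeric check (sample values, not a proof) that for 0 < h < a the factor
# K = a^(n-1) + a^(n-2)*h + ... + h^(n-1) satisfies K < n*a^(n-1):
# each of its n terms is at most a^(n-1), and all but the first are smaller.
def K(a, h, n):
    return sum(a**(n - 1 - j) * h**j for j in range(n))

for a, h, n in [(1.5, 0.4, 3), (2.0, 1.9, 5), (3.0, 0.01, 2)]:
    assert 0 < h < a
    assert K(a, h, n) < n * a**(n - 1)
```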
so I feel I got some of the main insights:
- Require the constraint $x < (\alpha - h)^n < \alpha^n$ using $x<\alpha^n$ and $h>0$.
- Use $(\alpha - h)^n = (\alpha - h)\left( \alpha^{n-1}+\alpha^{n-2}h + \dots + \alpha h^{n-2} + h^{n-1} \right) = (\alpha - h)K$ to get $h$ alone.
- Use the upper bound $K < n \alpha^{n-1}$ to remove the annoying higher-order terms of $h$ (since we are assuming we don't know how to take roots), i.e. $(\alpha - h)K < (\alpha - h)\, n \alpha^{n-1}$.
Those seem to be the main ingredients, but when I tried putting them together, it seemed I was still missing something, because playing with the algebra didn't lead to the answer. Does someone know what it is?
Answer:
First, Rudin is using $n\gt0$, and the inequality:
$$b^n-a^n\lt (b-a)nb^{n-1}$$
only holds for $n\in\mathbb{Z}$, $n\gt1$ (and for $0\lt a\lt b$).
But this isn't massively important here.
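If you want quick reassurance that the inequality holds in that range, here is a random numeric spot-check (not a proof):

```python
import random

# Spot-check of b^n - a^n < (b - a)*n*b^(n-1) for 0 < a < b and integer n > 1.
random.seed(0)
for _ in range(1000):
    a = random.uniform(0.01, 10.0)
    b = a + random.uniform(0.01, 10.0)  # guarantees 0 < a < b
    n = random.randint(2, 8)
    assert b**n - a**n < (b - a) * n * b**(n - 1)
```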
We are trying to find a contradiction to $y^n\lt x$, where $x$ is given and $y=\sup E$.
A good place to begin is with $y+h$, where $h$ is arbitrarily small and positive. In that case we get:
$$(y+h)^n-y^n \lt hn(y+h)^{n-1}$$
by the above inequality (for $n\gt1$!), applied with $b=y+h$ and $a=y$.
As $h$ is arbitrarily small, we can say $h\lt 1$, and so we continue:
$$(y+h)^n-y^n \lt hn(y+h)^{n-1}\lt hn(y+1)^{n-1}$$
We now use the fact that we are trying to find a contradiction: if $(y+h)^n\lt x$, then $y+h\in E$ even though $y+h\gt\sup E$, which is the contradiction Rudin then uses.
So we need:
$$hn(y+1)^{n-1}\lt x-y^n$$
or:
$$h\lt \frac{x-y^n}{n(y+1)^{n-1}}$$
We can see that:
$$0\lt \frac{x-y^n}{n(y+1)^{n-1}}$$
and so we are free to pick an $h$ with $0\lt h\lt 1$ lying below this bound, as required.
$k=\dfrac{y^n-x}{ny^{n-1}}$ is similar, except that $k$ is fixed outright, and $k\lt y$ follows from the definitions given.
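In fact, $k$ can be reverse-engineered rather than guessed. Assuming $y^n \gt x$, we want $y-k$ to still be an upper bound of $E$, i.e. $(y-k)^n \gt x$, which is equivalent to:
$$y^n-(y-k)^n \lt y^n-x$$
The inequality above, applied with $b=y$ and $a=y-k$, gives:
$$y^n-(y-k)^n \lt kny^{n-1}$$
so it suffices to pick $k$ with:
$$kny^{n-1} \le y^n-x \quad\iff\quad k \le \frac{y^n-x}{ny^{n-1}}$$
Rudin simply takes the largest allowed value. Note also that $k \lt \dfrac{y^n}{ny^{n-1}} = \dfrac{y}{n} \le y$, so $a=y-k\gt0$ as the inequality requires.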