There are many great posts on the Stack discussing intuitions for isomorphisms, and some discuss isomorphisms between vector spaces. However, I have not found any of the answers very helpful for what I have in mind, hence this new post. Consider the following:
Given $\phi:E \to (-\infty, \infty]$, we define the convex conjugate of $\phi$ to be the function $\phi^*: E^* \to (-\infty, \infty]$ given by $\phi^*(f) = \sup_{x \in E} (f(x) - \phi(x))$ for all $f \in E^*$.
Note that $\phi^*(f) \geq f(x) - \phi(x)$ for all $x \in E$ and so we have $f(x) \leq \phi(x) + \phi^*(f)$ for all $x \in E$ and $f \in E^*$.
From here (this is where my confusion starts), we can discuss the following example:
Let $E = \mathbb{R}$ and $\phi(x) = \frac{1}{p}|x|^p$ with $1 < p < \infty$. Then $\phi^*(y) = \frac{1}{p'}|y|^{p'}$, where $\frac{1}{p} + \frac{1}{p'} = 1$. From the inequality above, we obtain the classical Young inequality: $$ xy \leq \frac{|x|^p}{p} + \frac{|y|^{p'}}{p'}. $$
My question is: what do we even mean by $\phi^*(y)$? It seems to me that $y \in \mathbb{R}$, whereas the domain of $\phi^*$ is $E^*$, so this expression makes no sense to me. I know this perhaps has something to do with isomorphisms, that is, identifying an element of one space with an element of another space while preserving the structure of the space. However, to me an isomorphism preserves how elements interact with each other within the same space, whereas here we are essentially treating $f$ exactly as $y$, including how it acts on elements of other spaces, which to me is not really what isomorphisms do. Can someone please explain this in detail? I have seen such arguments many times, but have never made sense of them. In general, which properties of an isomorphism guarantee that we can do this without worrying about the consequences? What is the rigorous way of seeing this?
Update after comments: I still have some remaining concerns about my question, but here is what I have understood so far from the comments:
Let $\phi: \mathbb{R} \to (-\infty, \infty]$ be the convex function $\phi(x) = \frac{1}{p}|x|^p$. Since $\mathbb{R}$ is finite dimensional, it is isomorphic to its dual $\mathbb{R}^*$. (I am not sure about this part. Again, there are many isomorphisms between the two spaces, so what makes the one mentioned in the comments special? I still have not understood this from the comments.)
Anyway, we can obtain a canonical isomorphism $J: \mathbb{R}^* \to \mathbb{R}$ in the following way:
Notice that we have $$ \mathbb{R}^* = \{ f: \mathbb{R} \to \mathbb{R} \mid f(x) = mx \text{ for some } m \in \mathbb{R} \} =: A. $$ To see this, let $f \in \mathbb{R}^*$. Then for all $x \in \mathbb{R}$, linearity gives $f(x) = f(x \cdot 1) = x f(1) = mx$ with $m := f(1)$. Thus $f \in A$. Conversely, if $f \in A$, then $f(x) = mx$ for some $m \in \mathbb{R}$, which is a continuous linear function on $\mathbb{R}$. Hence $f \in \mathbb{R}^*$.
Great, let $J: \mathbb{R}^* \to \mathbb{R}$ be defined by $J(f) = m = f(1) \in \mathbb{R}$. This map is linear, since $J(af + bg) = (af + bg)(1) = a f(1) + b g(1) = aJ(f) + bJ(g)$. Why is it injective? If $J(f_1) = J(f_2)$, then $f_1(1) = f_2(1)$. But $f_1, f_2 \in \mathbb{R}^*$, so by the above, $f_1(x) = f_1(1)x$ and $f_2(x) = f_2(1)x$ for all $x$. Thus $f_1 = f_2$, as $f_1(1) = f_2(1)$. Why is it surjective? Given $m \in \mathbb{R}$, the function $f(x) = mx$ lies in $\mathbb{R}^*$ and satisfies $J(f) = f(1) = m$. Hence $J$ is an isomorphism.
Now $\phi^*: \mathbb{R}^* \to (-\infty, \infty]$ is defined by $\phi^*(f) = \sup_{x \in \mathbb{R}} (f(x) - \phi(x))$. We need to make sense of $\phi^*(y)$ for $y \in \mathbb{R}$. Since $J$ is a bijection and $J^{-1}(y)$ is the functional $x \mapsto yx$, we can safely define $$\phi^*(y) := \phi^*(J^{-1}(y)) = \phi^*(x \mapsto yx) = \sup_{x \in \mathbb{R}} (xy - \phi(x)) = \sup_{x \in \mathbb{R}} \left(xy - \frac{1}{p}|x|^p\right).$$ Equivalently, we are constructing a new function $\tilde{\phi}^*: \mathbb{R} \to (-\infty, \infty]$ such that for all $y \in \mathbb{R}$ $$ \tilde{\phi}^*(y) := \phi^*(J^{-1}(y)) = \phi^*(x \mapsto yx) = \sup_{x \in \mathbb{R}} \left(xy - \frac{1}{p}|x|^p\right), $$ and then, by abuse of notation, writing $\phi^*(y)$ for $\tilde{\phi}^*(y)$.
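To convince myself that $\phi^*$ and $\tilde{\phi}^*$ really are different objects with different domains, here is a small numerical sketch. It is only a sanity check under my own assumptions: the exponent `P = 3`, the grid `GRID`, and all function names are hypothetical choices of mine, and the supremum is approximated by a maximum over a finite grid.

```python
import numpy as np

P = 3.0                    # assumed exponent, any 1 < p < infinity would do
P_CONJ = P / (P - 1.0)     # conjugate exponent p', so that 1/p + 1/p' = 1

# Finite grid standing in for "all x in R" (an approximation, not the true sup).
GRID = np.linspace(-10.0, 10.0, 200_001)


def phi(x):
    """phi(x) = |x|^p / p."""
    return np.abs(x) ** P / P


def J(f):
    """J : R^* -> R, sending a linear functional f to the number f(1)."""
    return f(1.0)


def J_inv(y):
    """J^{-1} : R -> R^*, sending the number y to the functional x -> y*x."""
    return lambda x: y * x


def phi_star(f):
    """Conjugate on R^*: takes a FUNCTIONAL f, approximates sup_x (f(x) - phi(x))."""
    return float(np.max(f(GRID) - phi(GRID)))


def phi_star_tilde(y):
    """Conjugate transported to R: takes a NUMBER y, defined as phi_star(J_inv(y))."""
    return phi_star(J_inv(y))


def closed_form(y):
    """The claimed value |y|^{p'} / p'."""
    return abs(y) ** P_CONJ / P_CONJ
```

For example, `phi_star_tilde(2.0)` and `closed_form(2.0)` should agree up to the grid error, while `phi_star(2.0)` fails because a number is not a functional. That failure is exactly the type mismatch I was worried about, and `J_inv` is what repairs it.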
For $1 < p < \infty$ we have $$ \sup_{x \in \mathbb{R}} \left(xy - \frac{|x|^p}{p}\right) = \frac{|y|^{p'}}{p'}, $$ where we omit the computation (the cases separate into $x > 0$ and $x < 0$, with sub-cases for $y$, etc.). So we have our desired result: choosing $f(x) = yx$ for some $y \in \mathbb{R}$, we get $$ xy = f(x) \leq \phi(x) + \phi^*(f) = \phi(x) + \phi^*(J^{-1}(y)) = \phi(x) + \tilde{\phi}^*(y) = \frac{|x|^p}{p} + \frac{|y|^{p'}}{p'}. $$ This concludes the discussion.
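For completeness, I think the omitted computation can also be sketched with calculus instead of case analysis, assuming $1 < p < \infty$ so that the function below is differentiable and concave in $x$ (the names $g$ and $x_0$ are mine). Fix $y \in \mathbb{R}$ and set $$ g(x) := xy - \frac{|x|^p}{p}, \qquad g'(x) = y - \operatorname{sgn}(x)\,|x|^{p-1}. $$ Solving $g'(x_0) = 0$ gives $x_0 = \operatorname{sgn}(y)\,|y|^{1/(p-1)}$, and since $g$ is concave this critical point is a global maximum. Using $p' = \frac{p}{p-1}$, we have $x_0 y = |y|^{p'}$ and $|x_0|^p = |y|^{p'}$, so $$ \sup_{x \in \mathbb{R}} g(x) = g(x_0) = |y|^{p'} - \frac{|y|^{p'}}{p} = \left(1 - \frac{1}{p}\right)|y|^{p'} = \frac{|y|^{p'}}{p'}. $$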
Again, I am still confused about the isomorphism part (the remark in parentheses above). Moreover, do we have to go through all of this reasoning again the next time we encounter a similar situation? Are there some key steps that we can check in order to safely conclude that we can define $\phi^*(y)$ just as above in future scenarios?
Some further thoughts:
I might have realized the answer to my last question above: as long as we know the explicit form of the isomorphism between the two spaces, in this case $\mathbb{R}$ and $\mathbb{R}^*$, we can simply substitute the corresponding element of the other space into the expression we have, replacing the original element. The same reasoning as above then applies in other scenarios. Please correct me if this is not the right way to think about it.
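If I understand this correctly, the general pattern would look as follows. This is my own tentative formulation, and the symbols $F$ and $T$ are names I am introducing. Suppose $T: F \to E^*$ is an isomorphism from some space $F$ onto $E^*$. Define $$ \tilde{\phi}^* := \phi^* \circ T : F \to (-\infty, \infty], \qquad \tilde{\phi}^*(y) = \sup_{x \in E} \big( (Ty)(x) - \phi(x) \big). $$ Then applying the inequality $f(x) \leq \phi(x) + \phi^*(f)$ to $f = Ty$ gives $$ (Ty)(x) \leq \phi(x) + \tilde{\phi}^*(y) \quad \text{for all } x \in E,\ y \in F. $$ In the example above, $E = F = \mathbb{R}$, $T = J^{-1}$ and $(Ty)(x) = xy$, which recovers Young's inequality. So the only thing to check each time seems to be the explicit formula for $(Ty)(x)$.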