I am not very deep into advanced real analysis. Could you help me understand whether the following two statements hold?
1) If $f$ is in a Banach space $\mathcal{B}$, then $\nabla f$ is in the dual space $\mathcal{B}^*$.
2) If $f$ is in a Banach space $\mathcal{B}$ and $g$ is in the dual space $\mathcal{B}^*$, then $f+g$ does NOT make sense.
Remark: These statements come from [1]:
"we are optimizing the function in some Banach space $\mathcal{B}$ (for example $\mathcal{B} = \ell_1)$. In that case the Gradient Descent strategy does not even make sense: indeed the gradients (more formally the Fréchet derivative) $\nabla f(x)$ are elements of the dual space $\mathcal{B}^*$ and thus one cannot perform the computation $x - \eta \nabla f(x)$ (it simply does not make sense)."
[1] https://blogs.princeton.edu/imabandit/2013/04/16/orf523-mirror-descent-part-iii/
Let $f : V \to W$ be a function between two Banach spaces.
Then by definition, the Fréchet derivative at $x$ is the unique bounded (= continuous) linear operator such that... Hence $\nabla f(x) \in L(V,W)$. So $\nabla f(x) \in V'$ (the dual of $V$, written $\mathcal{B}^*$ in your question) if and only if $W = \Bbb R$ (or $\Bbb C$).
For your question 2), if $f:V\to W$, then $\nabla f(x) : V\to W$, so it doesn't make sense to add an element of $V$ and a function $V \to W$: they are not the same kind of object.
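For reference, the defining condition is usually stated as follows. This is the standard textbook form; the symbol $A$ for the candidate operator and the norm subscripts are notation introduced here, not taken from the question. A bounded linear operator $A : V \to W$ is the Fréchet derivative of $f$ at $x$ if

$$\lim_{\|h\|_V \to 0} \frac{\|f(x+h) - f(x) - A h\|_W}{\|h\|_V} = 0,$$

and one then writes $\nabla f(x) = A$.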
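To make the $\ell_1$ case from the quoted passage concrete, here is a small worked example. The particular function $f$ is chosen here purely for illustration and does not appear in [1]. Take $V = \ell_1$, $W = \Bbb R$ and

$$f(x) = \sum_{i \ge 1} x_i, \qquad |f(x)| \le \|x\|_1 .$$

Since $f$ is linear and bounded, its Fréchet derivative at every $x$ is $f$ itself, which corresponds to the sequence $(1,1,1,\dots) \in \ell_\infty = (\ell_1)'$. If one nevertheless tries to form the gradient step coordinatewise, then for any $\eta \neq 0$

$$x - \eta\,(1,1,1,\dots) = (x_1 - \eta,\ x_2 - \eta,\ \dots), \qquad \sum_{i \ge 1} |x_i - \eta| = \infty ,$$

because $x_i \to 0$ forces $|x_i - \eta| \to |\eta| > 0$. So the result is not an element of $\ell_1$, which is one way to see why the update is not well defined in this setting.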