The questions below are based on screenshots from the source cited at the end.
Can somebody explain how the functional Taylor expansion is related to a "standard" function Taylor expansion? In particular, I am concerned with the term $$ C(F + \epsilon f) = C(F) + \epsilon \langle \nabla C(F), f \rangle, $$ where $\langle \cdot, \cdot \rangle$ is some suitable inner product.
Why is it in general not possible to choose $f = - \nabla C(F)$?
Source: "Functional Gradient Techniques for Combining Hypotheses" by Mason et al. (1999)


It's analogous to a Taylor expansion provided you define a notion of continuity and of functional derivative (such as the Gateaux or Fréchet derivative). Once those concepts are in place, for a functional with suitable smoothness properties you can derive a Taylor expansion (first order, in the case you quoted) in the same way you would for an ordinary real-valued function.
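As a numerical sanity check on the first-order expansion (my own illustration, not from the thread): discretize functions on a grid, take the illustrative functional $C(F) = \int_0^1 F(x)^2\,dx$ with the $L^2$ inner product, so $\nabla C(F) = 2F$, and verify that $C(F + \epsilon f) - C(F) \approx \epsilon \langle \nabla C(F), f \rangle$ up to $O(\epsilon^2)$:

```python
import numpy as np

# Illustrative functional C(F) = ∫_0^1 F(x)^2 dx, discretized on a grid.
# With the L2 inner product <g, f> = ∫ g f dx, its functional gradient is 2F.
x = np.linspace(0.0, 1.0, 1001)
dx = x[1] - x[0]
F = np.sin(2 * np.pi * x)      # base function (arbitrary choice)
f = np.cos(3 * np.pi * x)      # perturbation direction (arbitrary choice)

def C(G):
    return np.sum(G**2) * dx   # Riemann-sum approximation of the integral

grad_C = 2 * F                 # functional gradient of C at F
eps = 1e-4
lhs = C(F + eps * f)                        # exact perturbed value
rhs = C(F) + eps * np.sum(grad_C * f) * dx  # first-order Taylor expansion
# Since C is quadratic, the mismatch is exactly eps**2 * C(f), i.e. O(eps^2).
print(abs(lhs - rhs))
```

Here the residual is $\epsilon^2 C(f) \approx 5 \cdot 10^{-9}$, confirming the first-order expansion dominates for small $\epsilon$.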
I'm not sure exactly what is being asked in the second question, but presumably the point is that in the setting of Mason et al. the perturbation $f$ must come from a restricted class of base hypotheses, and the arbitrary function $-\nabla C(F)$ will generally not belong to that class. More generally, when you have a functional you want to minimize, you look for its stationary points (as a necessary condition), which leads to
$$ \nabla C(F) = 0 $$
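For instance, for the illustrative least-squares functional $$ C(F) = \frac{1}{2} \int \big(F(x) - y(x)\big)^2 \, dx, $$ the functional gradient under the $L^2$ inner product is $\nabla C(F) = F - y$, so the stationarity condition $\nabla C(F) = 0$ has the closed-form solution $F = y$.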
You can either solve this equation in closed form, when that is possible, or use a gradient flow (the continuous-time version of gradient descent). If $F$ is your unknown function, the gradient flow takes the form
$$ \partial_tF = - \nabla C(F) $$
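A minimal sketch of this gradient flow, discretized with an explicit Euler step (my own illustration; the functional, target, and step size are all assumed): take $C(F) = \tfrac{1}{2}\int (F(x) - y(x))^2\,dx$, so $\nabla C(F) = F - y$ and the flow $\partial_t F = -(F - y)$ drives $F$ toward $y$.

```python
import numpy as np

# Gradient flow dF/dt = -grad C(F) for the illustrative least-squares
# functional C(F) = 1/2 ∫ (F(x) - y(x))^2 dx, whose L2 gradient is F - y.
x = np.linspace(0.0, 1.0, 101)
y = np.sin(np.pi * x)          # target function (arbitrary choice)
F = np.zeros_like(x)           # initial guess F_0 = 0
eta = 0.1                      # Euler step size (time discretization)

for _ in range(500):
    F = F - eta * (F - y)      # explicit Euler step of the gradient flow

# Here F_k = (1 - (1 - eta)**k) * y, so F converges geometrically to y.
print(np.max(np.abs(F - y)))
```

Each Euler step contracts the error by a factor $(1 - \eta)$, which is the discrete analogue of the flow converging to the stationary point $F = y$.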