What is the difference between optimization on Banach space versus optimization on Hilbert space?


In Chapter 4 of this book, it says,

Suppose now that we are interested in the more general situation of optimization in some Banach space $B$. In other words, the norm that we use to measure the various quantities of interest does not derive from an inner product (think of $B = l_1$ for example). In that case the gradient descent strategy does not even make sense...

Can someone clarify what it means to "optimize" in some Banach space (as opposed to Hilbert space)? And what an example of it looks like?


In most work in computational optimization, we minimize a function $f(x)$ that maps a vector $x$ in $R^{n}$ to a real number. There may be additional constraints, typically expressed as equations $g_{i}(x)=0$ or inequalities $h_{i}(x) \leq 0$. In solving these problems we can build on many familiar tools from analysis and linear algebra, including the concept of the gradient, the Euclidean norm, the dot product, etc.
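As a minimal sketch of this finite-dimensional setting, here is plain gradient descent on a simple unconstrained quadratic in $R^{n}$ (the step size and test function are illustrative choices, not anything prescribed by the text):

```python
import numpy as np

def gradient_descent(grad, x0, step=0.1, iters=200):
    """Plain gradient descent in R^n: x_{k+1} = x_k - step * grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - step * grad(x)
    return x

# Minimize f(x) = ||x - c||^2, whose gradient is 2(x - c); the minimizer is c.
c = np.array([1.0, -2.0, 3.0])
x_star = gradient_descent(lambda x: 2.0 * (x - c), np.zeros(3))
```

The update $x_{k+1} = x_k - \alpha \nabla f(x_k)$ only makes sense because the gradient, defined via the dot product, lives in the same space as $x$; this is exactly the step that breaks down in a general Banach space.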

In some situations, it's desirable to minimize a functional $J$ whose argument is itself a function $f(x)$. Here $f$, rather than being a vector in $R^{n}$, is a function living in some function space, and $J[\cdot]$ maps each function in that space to a real number.

As an example of this kind of optimization problem, consider the Linear Quadratic Regulator problem in optimal control:

$\min_{u} \int_{0}^{\infty} \left( x(t)^{T}Qx(t) + u(t)^{T}Ru(t) \right) dt$

subject to the ordinary differential equation constraint

$\dot{x}(t)=Ax(t)+Bu(t)$

$x(0)=c$

Here $Q$ and $R$ are symmetric positive definite matrices, and $x(t)$ and $u(t)$ are vector-valued functions of time. The goal is to pick a control function $u(t)$ such that the resulting $x(t)$ satisfies the differential equation and the objective functional is minimized.
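For the infinite-horizon LQR problem above, the optimal control turns out to be a linear state feedback $u(t) = -Kx(t)$, where $K = R^{-1}B^{T}P$ and $P$ solves the algebraic Riccati equation $A^{T}P + PA - PBR^{-1}B^{T}P + Q = 0$. As an illustration (the double-integrator system and cost weights below are my own hypothetical choices, not from the original problem statement):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# A hypothetical double-integrator plant: x1' = x2, x2' = u.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state cost weight
R = np.array([[1.0]])  # control cost weight

# Solve the algebraic Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0.
P = solve_continuous_are(A, B, Q, R)

# Optimal feedback gain: u(t) = -K x(t) with K = R^{-1} B^T P.
K = np.linalg.solve(R, B.T @ P)
```

For these particular weights the gain works out to $K = [1, \sqrt{3}]$, and the closed-loop matrix $A - BK$ is stable. Note how the Riccati equation reduces the function-space optimization over $u(\cdot)$ to a finite-dimensional matrix equation.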

If the function space has a well-behaved inner product (and is complete in the norm that inner product induces), then a great deal of theory from calculus and linear algebra carries over, and you have what's called a Hilbert space. With a few critical exceptions, your intuition from $R^{n}$ will work reasonably well in a Hilbert space. If the function space has a well-behaved norm but no equivalent of the dot product, then it is merely a Banach space. This is a much more limiting setting to work in: much of what you know from calculus and linear algebra simply isn't applicable. The properties of Hilbert and Banach spaces are studied in functional analysis, which is typically taught in Ph.D.-level mathematics programs. If you aren't familiar with this subject, then you have a lot of preparatory work to do before you can follow a reference like the one you've cited.
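One concrete way to see that a norm does not come from an inner product is the parallelogram law: a norm is induced by an inner product if and only if $\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2$ for all $x, y$. A quick numerical check (on $R^2$ for simplicity, with vectors of my own choosing) shows the $l_2$ norm satisfies it while the $l_1$ norm from the quoted passage does not:

```python
import numpy as np

def parallelogram_gap(x, y, p):
    """How far the p-norm deviates from the parallelogram law (0 iff it holds)."""
    n = lambda v: np.linalg.norm(v, ord=p)
    return n(x + y) ** 2 + n(x - y) ** 2 - 2 * n(x) ** 2 - 2 * n(y) ** 2

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

gap_l2 = parallelogram_gap(x, y, 2)  # 0: the l2 norm comes from the dot product
gap_l1 = parallelogram_gap(x, y, 1)  # nonzero: the l1 norm has no inner product
```

This failure is exactly why gradient descent "does not even make sense" in $l_1$: without an inner product there is no canonical way to turn the derivative of the objective (a linear functional) back into a direction in the space.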

If you're only interested in optimization problems where the unknown is a vector in $R^{n}$, or perhaps a matrix in $R^{m \times n}$, then you don't need any of this functional analysis, and the approach used in your reference will be overly complicated for your needs. There are many other texts that introduce convex optimization in a more accessible way by avoiding function spaces.

If you are interested in optimization on Hilbert and Banach spaces, one reasonably accessible introduction (aimed at readers interested in optimal control problems) is Luenberger's Optimization by Vector Space Methods.