Domain and codomain of a generalization of the gradient

148 Views Asked by At

Assuming we have a differentiable function $$ f: \mathcal{X} \times \mathcal{X} \to \mathbb{R} $$ where $ \mathcal{X} = \mathbb{R} \times \mathbb{N}$. What are the domain and codomain of $\nabla f$?

This type of functions appear when the definition of the Reproducing kernel Hilbert space is extended to vector-valued functions.

An example would be a function where the integer elements of the input pairs are used to extract entries of a matrix. More specifically, if $A$ is a positive semi-definite matrix and $g: \mathbb{R}^2 \to \mathbb{R}$, we can define such a function as $$ f((x,i),(y,j)) = g(x,y)A_{ij} $$

1

There are 1 best solutions below

0
On

Since no one answered so far. Here are my thoughts. I am not an expert on this!

Maybe this is interesting for you: https://en.wikipedia.org/wiki/Discrete_calculus#Calculus_of_differences_and_sums


My thoughts below are more about what doesn't work:

There is no single choice for a discrete differential. For specific applications, you have different preferences.

Usually, the derivative of a function evaluated at a point $x$ is the best linear approximation of $f$ around $x$. If $f$ was given over manifolds, $f : \mathcal M \to \mathcal N$, then the derivative at $x \in \mathcal M$ is a linear map between the tangent spaces $$ Df(x) : T_x \mathcal M \to T_{f(x)} \mathcal N. $$ (Think of the tangent spaces are the linear approximation of the nonlinear manifolds around that point.)

If $f$ is a map between linear spaces, $f : X \to Y$, then you don't need to use the tangent spaces since the domain and target are already linear, hence, the derivative is a linear map $$ Df(x) : X \to Y. $$ The defining property is $$ \quad \Vert f(x+v) - (f(x) + Df(x) v) \Vert \to 0 \quad \text{(for }\Vert v \Vert \to 0\text{)} \quad\quad (1). $$

How does this fail for $\mathbb{N}$?

  • $\mathbb{N}$ is not a linear space. There is no clear notion of what a linear approximation could be.
  • The is no way to ask for the same defining property like above in Eq. $(1)$. Functions $f : \mathbb{N} \to \mathbb{R}$ can behave very arbitrary. Hence, you only have the choice to pick that one vector $v$ in Eq. $(1)$ and require that at least this holds. For example, we can pick $v = 1$ and then require that $$ f(x) + \Delta_+ f(x+1) \cdot 1 = f(x+1) $$

How about $\mathbb{N}^2$?

I actually don't know! Probably you have a lot of bad choices to pick from.

Let's pick: $$ \Delta_+ f(x,y) = ( f(x+1,y) - f(x,y), \, f(x,y+1) - f(x,y) ) \in \mathbb{R}^{1 \times 2} $$ (Yes, there are many options how we could interpret this. Is it a linear map? Did we obtain it by extending $f$ to a map $\overline{f} : \mathbb R^2 \to \mathbb R$? Maybe. Anyway, I just consider it as a row vector without getting into this mess.)

This corresponds to requiring equation $(1)$ for $v = (1,0)^T$ and $v = (0,1)^T$. Hence, ​we get: $$ f(x+1,y) = f(x,y) + \Delta_+ f(x,y) \begin{pmatrix} 1 \\ 0 \end{pmatrix} $$ and $$ f(x,y+1) = f(x,y) + \Delta_+ f(x,y) \begin{pmatrix} 0 \\ 1 \end{pmatrix}. $$ Some nice properties follow from that. But most properties of normal derivatives do not hold!

For example, there is no hope for symmetry of second derivatives.

What can a discrete derivative do?

There are calculus rules for discrete derivatives, which are usually not too hard to prove.

To give one example:

The continuous rule $$\int_a^b f'(x) \, \mathrm{d} x = f(a) - f(b)$$ ​becomes for $f : \mathbb{N} \to \mathbb{R}$: $$ \sum_{i=a}^{b-1} \Delta_{+} f(i) = f(a) - f(b). $$ Asking for properties related to sums is easier and a weaker requirement in some sense.

This gives rise to a more abstract definition of what a forward difference is: Quote Wikipedia: It is a $1$-chain. (https://en.wikipedia.org/wiki/Chain_(algebraic_topology))

You can think of a $1$-chain as an object which has a starting point and an endpoint (e.g. $x$ and $x+1$) and you can assign a value to it which basically says how much a quantity of interest changes along that $1$-chain. So, it exactly carries the information the finite-difference has. You can append chains together, i.e. $\Delta_+ f(x+1) '+' \Delta_+f(x)$ is a $1$-chain from $x$ to $x+2$ which tells you how much $f$ changes from $x$ to $x+2$.

These $1$-chain's are objects from algebraic topology/differential geometry which are related to exterior derivatives. One motivation of this theory is to generalise the Stokes theorem (https://en.wikipedia.org/wiki/Generalized_Stokes_theorem).

If these things are interesting for you, you can read up a bit on discrete exterior calculus, etc. But it's for sure not the most practical way if you just want to prove simple properties for you discrete gradient.