How to estimate parameter functions of Sturm-Liouville operators using functional gradient descent?

Question

How to estimate parameter functions of Sturm-Liouville operators using functional gradient descent?

155 Views Asked by user28877 At 30 Mar 2026 - 3:54

Introduction

When discussing Sturm-Liouville theory (Wikipedia), the typical problems which usually came up when I studied this invonved finding eigenvalues & eigenfunctions of SL operators, use these to solve second order non-constant coefficient inhomogeneous differential equations by decomposing the forcing term using orthogonality of eigenfunctions. A fun part of this topic which I haven't encountered yet is finding the SL operator itself, given solutions to an inhomogeneous differential equation in SL form. Probably there are much better ways to do this, but in order to have a deeper understanding of functional derivatives (Wikipedia), I would like to achieve this by guessing the form of SL operator and then adjust it iteratively to make it behave more like the one I am looking for (some sort of "functional gradient descent"). Below I outline the approach in more formal terms.

Problem & proposed solution

Let's say we are given arbitrarily many $(y,f)$ pairs where $y=y(x)$, $f=f(x)$, $x\in[0,1]$, $\mathcal{Ly=f}$ and $$\mathcal{L} = -\frac{\mathrm{d}}{\mathrm{d}x}\left(\rho(x)\frac{\mathrm{d}}{\mathrm{d}x}\right)-q(x)$$.

Let's guess a $\rho_0$ and $q_0$ to create $\mathcal{L} _0$ (let $\rho_0$ and $q_0$ be reasonably behaved, bounded functions).

Take a $(y,f)$ pair, calculate $g=\mathcal{L}_0y$. To quantify how bad our guess is, introduce $E_i$, the error in $\mathcal{L_i}$:

$$E_i=\int_0^1(f-g)^2\mathrm{d}x=\int_0^1(f-\mathcal{L}_{i}y)^2\mathrm{d}x=$$

$$\int_0^1\left[f+\frac{\mathrm{d}}{\mathrm{d}x}\left(\rho_i\frac{\mathrm{d}y}{\mathrm{d}x}\right)+q_iy\right]^2\mathrm{d}x=\int_0^1\left[f+\rho_i\frac{\mathrm{d}^2y}{\mathrm{d}x^2}+\frac{\mathrm{d}\rho_i}{\mathrm{d}x}\frac{\mathrm{d}y}{\mathrm{d}x}+q_iy\right]^2\mathrm{d}x=\int_0^1F(x)dx$$ where $$F(x)=\left[f+\rho_i\frac{\mathrm{d}^2y}{\mathrm{d}x^2}+\frac{\mathrm{d}\rho_i}{\mathrm{d}x}\frac{\mathrm{d}y}{\mathrm{d}x}+q_iy\right]^2=F(\rho, \rho', \rho'', q; x)$$ ($f$ & $y$ are not listed because we know them exactly, they are not variables.)

Consider the functional derivatives of $E$:

$$\frac{\delta E}{\delta \rho} = \frac{\partial F}{\partial \rho}-\frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{\mathrm{\partial}F}{\partial \rho'}\right)+\frac{\mathrm{d}^2}{\mathrm{d}x^2}\left(\frac{\partial F}{\partial \rho''}\right)$$

and

$$\frac{\delta E}{\delta q} = \frac{\partial F}{\partial q}-\frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{\partial F}{\partial q'}\right)$$

In both of them, the last term will be $0$, becuase $F$ does not have $\rho''$ nor $q'$ dependence.

We can calculate all the nonzero terms as well:

$$\frac{\partial F}{\partial \rho} = 2\left(f+\rho_iy''+\rho_i'y'+q_iy\right)y''$$ $$\frac{\partial F}{\partial \rho'}= 2\left(f+\rho_iy''+\rho_i'y'+q_iy\right)y'$$ $$\frac{\partial F}{\partial q}=2\left(f+\rho_iy''+\rho_i'y'+q_iy\right)y$$

For our next $\mathcal{L}$ guess, $\mathcal{L_{i+1}}$, we can use:

For our $\mathcal{L}_{i+1}$ we will use:

$$\rho_{i+1} = \rho_i - \frac{\delta E}{\delta \rho}r_{\rho}$$

$$q_{i+1} = q_i - \frac{\delta E}{\delta q}r_{q}$$

The $r_{\rho}$ and $r_q$ are a small number, expressing a kind of "step size", ie how big steps are we making in the direction of more accurate $\rho$ and $q$ guesses. The terms with the functional derivatives can be calculated using the expressions above. I believe this is some sort of "functional gradient descent": instead of searching for the minimum of a real valued function via finding its gradients and moving along the steepest slope as done in many regression problems, here we calculate the gradient of a functional, and adjust the parameter functions of that functional so that the value of the functional moves towards a minimum.

After all this, we can take another $(y,f)$ pair, and adjust our $\rho$ and $q$ estimate. This process will hopefully make $\mathcal{L}_i$ to be closer and closer to the true $\mathcal{L}$, the one which relates all $y$s to $f$s via the $\mathcal{L}y=f$ relationship.

Numerical checking

Walkthrough

I have written a Python script the check the above approach. As an example, let's have $\rho(x)=1$, $q(x)=0$ as components of an $\mathcal{L}$ we want to find, ie: $$\mathcal{L}=-\frac{\mathrm{d}^2}{\mathrm{d}x^2}$$

As an approximation to the real number line, let's divide up the interval between $0$ and $1$ to $~10000$ chunks.

import numpy as np
xs = np.linspace(0,1,10001)

We need to be able to differentiate functions once and twice: call these functions D and DD , & let's make use of numpy's gradient function:

D = lambda f: np.gradient(f,0.0001)
DD = lambda f: np.gradient(np.gradient(f,0.0001),0.0001)

Every time we have a new estimate for $\rho$ and $q$, we need to combine them to form $\mathcal{L}_i$. So we define:

def get_SL_operator(rho, q):
    return lambda y: - D(rho*D(y)) - q*y

Our "true" SL operator, the one which we want to find, is then:

true_SL = get_SL_operator(rho=1,q=0) # for example

Let's guess $\rho_0$ and $q_0$! Since we know what the values we want to find are ($\rho=1$ and $q=0\quad \forall x$), let's make our job easy and make the initial guess pretty close to the final answer:

rho0 = 0.2*np.sin(xs*np.pi)+1 # for example
q0 = 0.08*np.cos(xs*np.pi) # for example

Define stepsize:

r_rho = r_q = 0.000005

To see how the estimation progresses, let's initialize two lists for saving $\rho_i$s and $q_i$s:

rholist = []
qlist=[]

In total, we are going to make 10 adjustments to $\rho$ and $q$ estimation. Each turn, we come up with a $y$, calculate $f=\mathcal{L}y$, using the true $\mathcal{L}$, thereby simulating that we are given arbitrarily many $(y,f)$ pairs mentioned in the beginning of the problem. Then, we will make use of the expressions for $\partial F / \partial \rho$, $\partial F / \partial \rho'$ and $\partial F / \partial q$ above to calculate $\delta E / \delta \rho$ and $\delta E / \delta q$. We use these functional derivatives to adjust our estimate for $\rho$ and $q$. So:

np.random.seed(42)
for _ in range(10):

    # --- coming up with a (y,f) pair ---

    y = 1+np.random.uniform(low=0.5, high=1)*(1+xs)**np.random.uniform(low=0.5,high=0.8)+np.random.normal()
    # ^ a somewhat random reasonably behaved function

    f = true_SL(y)

    # --- formal derivatives of the integrand ---

    delFdelrho      = 2*(f+rho0*DD(y)+D(rho0)*D(y)+q0*y)*DD(y)
    delFdelrhoprime = 2*(f+rho0*DD(y)+D(rho0)*D(y)+q0*y)*D(y)
    delFdelq        = 2*(f+rho0*DD(y)+D(rho0)*D(y)+q0*y)*y

    # --- functional derivatives ---

    deltaEdeltarho = delFdelrho - D(delFdelrhoprime)
    deltaEdeltaq   =delFdelq

    # --- calculate & save new rho and q estimates ---

    rho1 = rho0 - deltaEdeltarho*r_rho
    q1 = q - deltaEdeltaq*r_q

    rholist.append(rho1)
    qlist.append(q1)

    # --- start the next iteration with our new estimate ---

    rho0 = rho1
    q0 = q1

Putting it all together

The whole script:

import numpy as np
xs = np.linspace(0,1,10001)

D = lambda f: np.gradient(f,0.0001)
DD = lambda f: np.gradient(np.gradient(f,0.0001),0.0001)

def get_SL_operator(rho, q):
    return lambda y: - D(rho*D(y)) - q*y

true_SL = get_SL_operator(rho=1,q=0) # for example

rho0 = 0.2*np.sin(xs*np.pi)+1 # for example
q0 = 0.08*np.cos(xs*np.pi) # for example

r_rho = r_q = 0.000005

rholist = []
qlist=[]

np.random.seed(42)
for _ in range(10):

    # --- coming up with a (y,f) pair ---

    y = 1+np.random.uniform(low=0.5, high=1)*(1+xs)**np.random.uniform(low=0.5,high=0.8)+np.random.normal()
    # ^ a somewhat random reasonably behaved function

    f = true_SL(y)


    # --- formal derivatives of the integrand ---

    delFdelrho      = 2*(f+rho0*DD(y)+D(rho0)*D(y)+q0*y)*DD(y)
    delFdelrhoprime = 2*(f+rho0*DD(y)+D(rho0)*D(y)+q0*y)*D(y)
    delFdelq        = 2*(f+rho0*DD(y)+D(rho0)*D(y)+q0*y)*y

    # --- functional derivatives ---

    deltaEdeltarho = delFdelrho - D(delFdelrhoprime)
    deltaEdeltaq   =delFdelq

    # --- calculate & save new rho and q estimates ---

    rho1 = rho0 - deltaEdeltarho*r_rho
    q1 = q0 - deltaEdeltaq*r_q

    rholist.append(rho1)
    qlist.append(q1)

    # --- start the next iteration with our new estimate ---

    rho0 = rho1
    q0 = q1

The whole code is available here on Google Colab, where it can be run without installing anything.

Visualize results

Let's plot how the $\rho_i$s and $q_i$s looked like at different stages of the above process. The estimates will be marked by red, while the true values ($y=1$ and $y=0$) will be represented by black lines.

import matplotlib.pyplot as plt
for i in [0,2,3,4]:
    plt.figure(figsize=(6,6))
    plt.plot(xs,rholist[i],c='r',zorder=-i)
    plt.plot(xs,qlist[i],c='r',zorder=-i)
    plt.plot(xs,np.ones(np.shape(xs)),c='k')
    plt.plot(xs,np.zeros(np.shape(xs)),c='k')
    plt.title(f'Iteration: {i}')

Output:

As you can see, the estimate for $\rho$ starts to blow up at around $x=0$. This effect continues later as well, completely ruining any convergence to the true $y=1$ value.

Question

Is the theoretical background for estimating parameter functions of Sturm-Liouville operators presented above correct?

If yes, then why the numerical check does not confirm it, or if not, which part is wrong?

Original Q&A

How to estimate parameter functions of Sturm-Liouville operators using functional gradient descent?

Introduction

Problem & proposed solution

Numerical checking

Walkthrough

Putting it all together

Visualize results

Question

Related Questions in PYTHON

Related Questions in STURM-LIOUVILLE

Trending Questions

Popular # Hahtags

Popular Questions