Mathematically, how is SciPy optimizing a complicated function?


My general question is: how does SciPy find the derivative of a function that involves for loops, if statements, and multiple mathematical operations?

I have two functions as follows (although the specifics of the functions aren't that important to this question):

import numpy as np

def binomial(sigma, S, K, r, t, N, option_type='C'):

    # Note: this overwrites the t argument with the per-step size 1/(N-1)
    t = 1; t = t / (N - 1)

    S0 = S

    u = np.exp(sigma * np.sqrt(t))
    d = 1 / u
    p = (np.exp(r * t) - d) / (u - d)

    stock_prices  = np.zeros((N, N))
    option_prices = np.zeros((N, N))

    stock_prices[0, 0] = S0

    #  Fill out the remaining values
    for i in range(1, N):
        M = i + 1
        stock_prices[i, 0] = d * stock_prices[i - 1, 0]
        for j in range(1, M):
            stock_prices[i, j] = u * stock_prices[i - 1, j - 1]

    #  Calculate the option price at expiration.  If the payoff is negative,
    #  the option is out of the money, so we replace those values with zero.
    if option_type == 'C':
        expiration = stock_prices[-1, :] - K
    else:
        expiration = K - stock_prices[-1, :]

    expiration.shape = (expiration.size,)
    expiration = np.where(expiration >= 0, expiration, 0)

    #  Set the last row of the option matrix to our expiration values
    option_prices[-1, :] = expiration

    #  Step backward through the tree to fill in the remaining option prices
    for i in range(N - 2, -1, -1):
        for j in range(i + 1):
            option_prices[i, j] = np.exp(-r * t) * ((1 - p) * option_prices[i + 1, j] + p * option_prices[i + 1, j + 1])

    return option_prices[0, 0]

def objective(sigma, S, K, r, t, N, C0):
    res = binomial(sigma, S, K, r, t, N) - C0
    return res

Calling scipy.optimize.root on the objective function returns a value for sigma:

root(objective, 0.25, args = args)

By default, scipy.optimize.root uses the 'hybr' method (MINPACK's modified Powell hybrid algorithm); the Levenberg-Marquardt algorithm is available via method='lm'. As I understand it, the Levenberg-Marquardt algorithm interpolates between gradient descent and the Gauss-Newton algorithm to update the solution at each iteration. For the sake of simplicity, let's assume that SciPy only uses gradient descent to optimize the objective function.
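To convince myself that the solver only ever sees returned function values, never the code inside the function, here is a toy sketch (the call counter and the piecewise residual are mine, purely for illustration):

```python
import numpy as np
from scipy.optimize import root

calls = {"n": 0}

def residual(x):
    # A toy residual with a branch, like the option objective:
    # root never inspects this code -- it only sees returned values.
    calls["n"] += 1
    if x[0] > 0:
        return x[0] ** 2 - 2.0
    return -x[0] - 2.0

sol = root(residual, [1.0])  # default method is 'hybr'
# sol.x is close to sqrt(2); calls["n"] shows the solver made
# repeated function evaluations to probe the residual.
```

The solver treats `residual` as a black box, which is why loops and branches inside it pose no problem in principle.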

To my understanding, gradient descent computes the gradient of the objective function and then updates its current guess at the local minimum by stepping against that gradient with a step size $\alpha$:

$x_{n+1} = x_n - \alpha \nabla f(x_n)$
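As a sanity check of that update rule, here is a minimal sketch of gradient descent with a numerically estimated gradient (the quadratic test function, step size, and iteration count are arbitrary choices of mine):

```python
import numpy as np

def numerical_grad(f, x, h=1e-6):
    # Central-difference approximation of the gradient of f at x.
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def gradient_descent(f, x0, alpha=0.1, steps=200):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        # x_{n+1} = x_n - alpha * grad f(x_n)
        x = x - alpha * numerical_grad(f, x)
    return x

# Minimize f(x) = (x - 3)^2, whose minimum is at x = 3.
xmin = gradient_descent(lambda x: (x[0] - 3.0) ** 2, [0.0])
```

Note that the gradient here comes from function evaluations alone, not from any symbolic differentiation.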

My source of confusion stems from how SciPy computes the gradient of my objective function. I don't see how the nested for loops and if statements can be expressed as a single differentiable function, so I'm not entirely sure how the gradients of the function are computed.

From my research online, my best guess is that the Jacobian matrix is approximated numerically from the partial derivatives with respect to each function argument (although I have no idea how that works in practice in such a case, and I could very well be completely wrong).
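To make that guess concrete, here is a toy forward-difference sketch of a numerical Jacobian (the helper name and step size `h` are mine, not SciPy's internals):

```python
import numpy as np

def forward_diff_jacobian(f, x, h=1e-8):
    # Approximate J[i, j] = d f_i / d x_j by perturbing one coordinate
    # at a time -- loops and if statements inside f don't matter,
    # because only the returned values are used.
    x = np.asarray(x, dtype=float)
    f0 = np.atleast_1d(f(x))
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h
        J[:, j] = (np.atleast_1d(f(xp)) - f0) / h
    return J

# Example: f(x, y) = [x*y, x + y^2] has analytic Jacobian [[y, x], [1, 2y]].
J = forward_diff_jacobian(
    lambda v: np.array([v[0] * v[1], v[0] + v[1] ** 2]),
    [2.0, 3.0],
)
```

If something like this is what happens under the hood, it would explain why the solver can handle an objective full of loops and branches without ever differentiating it symbolically.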