Why can I replace both sides of the interval in the bisection method for optimization?

In the bisection method for optimization, we look at the sign of the first derivative at the midpoint c and, depending on whether it is positive or negative, replace the boundary point a or b with c. I implemented it in Python and it worked (at least for the simple examples I replicated). But it seems that I can arbitrarily swap the a = c and b = c assignments and the code still works. Does anyone have an intuitive explanation for this, or did I make an implementation mistake? In particular, looking at the picture: if I replace the wrong endpoint with c, I am left with an interval that no longer contains the optimum. How can the code still be correct then?

enter image description here

Updated Code:

```python
import numpy as np
import matplotlib.pyplot as plt

def deriv(f, x, a=0.001):
   
   x1 = x
   x2 = x + a
   
   f_x1 = f(x1)
   f_x2 = f(x2)
   
   return (f_x2 - f_x1) / (x2 - x1)


def Bisection(f, R, tol=1e-9, nIter=100):
   
   a, b = R
   
   for it in range(nIter):
       
       c = (a+b) / 2
       
       df_c = deriv(f, c)
       
       if abs(df_c) < tol:
           break
       
       elif df_c > 0:
           a = c
           
       else:
           b = c
           
   c = (a + b) / 2
   
   return c


def alt_Bisection(f, R, tol=1e-9, nIter=100):
   
   a, b = R
   
   for it in range(nIter):
       
       c = (a+b) / 2
       
       df_c = deriv(f, c)
       
       if abs(df_c) < tol:
           break
       
       elif df_c > 0:
           b = c
           
       else:
           a = c
           
   c = (a + b) / 2
   
   return c


x = np.linspace(0,10, 101)
f = lambda x: -(x-5)**2
R = [2,8]

x_opt = Bisection(f, R)

plt.plot(x, f(x))
plt.plot(x_opt, f(x_opt), "ro")

```
On BEST ANSWER

Swapping the two assignments does break the code. It only appears to work for R = [2, 8] by coincidence: the function is symmetric about the midpoint of that interval, so the very first midpoint, c = 5, is already the maximizer, and once f'(c) = 0 is detected the loop exits before the choice between a = c and b = c matters. Change R to something else, for example [0, 6] or [2, 10], and the swapped version no longer finds the optimum.
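One way to check this claim is to run both update rules over several starting intervals. The following is only a minimal sketch, not the asker's exact code: it assumes an analytic derivative `df` rather than the finite-difference `deriv`, and the helper name `bisect_max` and its `swapped` flag are made up for illustration.

```python
def bisect_max(df, interval, swapped=False, tol=1e-7, n_iter=100):
    """Bisection on the sign of df to locate a maximizer (hypothetical helper)."""
    a, b = interval
    for _ in range(n_iter):
        c = (a + b) / 2
        d = df(c)
        if abs(d) < tol:
            return c
        # Correct rule for a maximum: a positive slope at c means the peak
        # lies to the right of c, so the left endpoint moves up to c.
        # With swapped=True the opposite endpoint is moved instead.
        move_left_end = (d > 0) != swapped
        if move_left_end:
            a = c
        else:
            b = c
    return (a + b) / 2


df = lambda x: -2 * (x - 5)  # derivative of f(x) = -(x - 5)**2, maximum at x = 5

results = {}
for interval in [(2, 8), (0, 6), (2, 10)]:
    correct = bisect_max(df, interval)
    swapped = bisect_max(df, interval, swapped=True)
    results[interval] = (correct, swapped)
    print(interval, "correct:", round(correct, 6), "swapped:", round(swapped, 6))
```

On the symmetric interval (2, 8) both rules stop at the first midpoint, 5. On (0, 6) and (2, 10) the correct rule still converges to 5, while the swapped rule drifts to an endpoint (0 and 10 respectively).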


Example demonstrating how the changes in the code actually return incorrect results:

```python
import numpy as np
from matplotlib import pyplot as plt


def Bisection(f, df, R, nIter=100, tol=1e-7, original=False):
    # original=True uses the update rule from the question;
    # original=False swaps the a = c and b = c assignments.
    a, b = R
    for it in range(nIter):
        c = (a + b) / 2
        df_c = df(c)  # f'(c)
        if abs(df_c) < tol:
            break
        elif df_c > 0:
            if original:
                a = c
            else:
                b = c
        else:
            if original:
                b = c
            else:
                a = c

    c = (a + b) / 2
    return c


x = np.linspace(0, 10, 101)
f = lambda x: -(x - 5)**2
df = lambda x: -2 * (x - 5)
R = [2, 10]

x_opt_original = Bisection(f, df, R, original=True)
x_opt_changed = Bisection(f, df, R, original=False)

print("Original method returns %f" % x_opt_original)  # Prints 5.000000, as expected
print("Changed method returns %f" % x_opt_changed)  # Prints 10.000000, which is incorrect!

plt.plot(x, f(x))
plt.plot(x_opt_original, f(x_opt_original), "ro")
plt.plot(x_opt_changed, f(x_opt_changed), "go")
plt.show()
```

The code also produces the following graph, where the red dot is the point found by the original method and the green dot is the point found by the incorrectly modified method:

enter image description here
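For intuition about why the swap fails: the correct update keeps the slope positive at a and negative at b, so the maximizer stays bracketed, whereas the swapped update discards the half that contains it. The sketch below records, after each update, whether the current interval still contains x = 5. It assumes the same quadratic with an analytic derivative, and `trace_bracket` is a made-up helper, not part of the code above.

```python
def trace_bracket(df, interval, x_star, swapped=False, n_iter=6):
    """Return (a, b, contains_x_star) after each update (hypothetical helper)."""
    a, b = interval
    history = []
    for _ in range(n_iter):
        c = (a + b) / 2
        d = df(c)
        if d == 0:
            break  # the midpoint is exactly the stationary point
        if (d > 0) != swapped:
            a = c  # keep the right half [c, b]
        else:
            b = c  # keep the left half [a, c]
        history.append((a, b, a <= x_star <= b))
    return history


df = lambda x: -2 * (x - 5)  # derivative of f(x) = -(x - 5)**2

good = trace_bracket(df, (2, 10), x_star=5)
bad = trace_bracket(df, (2, 10), x_star=5, swapped=True)

for label, hist in [("correct", good), ("swapped", bad)]:
    for a, b, ok in hist:
        print(label, (a, b), "contains maximizer" if ok else "maximizer lost")
```

With the correct rule every recorded interval contains 5. With the swapped rule the very first update turns [2, 10] into [6, 10], and the maximizer is never recovered.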