I am trying to implement a backtracking line search that uses the Armijo-Goldstein condition to choose my step size $t$ adaptively.
I start with two parameters $\alpha \in (0, 1/2)$ and $\beta \in (0, 1)$, and the initial value $t = 1$. My input is $x \in \mathbb{R}^2$.
I repeat $t := \beta t$ until the Armijo-Goldstein condition is met:
$$f(x+t\Delta x) < f(x) + \alpha t\nabla f(x)^T\Delta x $$
I understand parts of it, but am also confused about how to implement this in code.
I don't really understand what $\Delta x$ means in this context, and I don't understand why I need the transpose of my gradient $\nabla f(x)$ (does the transpose switch the gradient vector from ascent to descent?). I know it has to do with directional derivatives, but maybe I'm just confused by the notation.
Here is a bit of pseudocode (I've tried to translate the condition literally; the loop should keep shrinking $t$ while the condition is *not* yet met, and $\nabla f(x)^T \Delta x$ should be a dot product, not an elementwise multiply):
while f(x + t*delta_x) > f(x) + alpha*t*np.dot(gradient(x), delta_x):
    t *= beta
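For concreteness, here is my fuller attempt as a runnable sketch. I'm assuming the descent direction is $\Delta x = -\nabla f(x)$ (plain gradient descent), which may be exactly where my confusion lies, and I'm using a made-up toy objective `f` just so the loop can run:

```python
import numpy as np

def f(x):
    # toy quadratic objective, only for testing: f(x) = x1^2 + 2*x2^2
    return x[0]**2 + 2 * x[1]**2

def gradient(x):
    # analytic gradient of the toy objective above
    return np.array([2 * x[0], 4 * x[1]])

def backtracking(x, alpha=0.3, beta=0.8):
    # assumption: search direction is the negative gradient (steepest descent)
    delta_x = -gradient(x)
    t = 1.0
    # shrink t until the Armijo-Goldstein condition holds;
    # np.dot computes the directional-derivative term grad(f)^T * delta_x
    while f(x + t * delta_x) > f(x) + alpha * t * np.dot(gradient(x), delta_x):
        t *= beta
    return t

x = np.array([1.0, 1.0])
t = backtracking(x)
```

With this toy objective the loop terminates because, for small enough $t$, the decrease $f(x + t\Delta x)$ falls below the line $f(x) + \alpha t \nabla f(x)^T \Delta x$.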
Thank you for your help