I am currently trying to implement an iterative algorithm in C# for an optimization problem, but I am currently stuck on with the math behind.
I have $n$ blackboxes, each of which expects an input $x$ and outputs $f(x)$. Each blackbox is also parameterized with constants that are individual to each blackbox, but are constant.
The blackboxes are now connected in series, i.e., the output of $Blackbox_0$ (we call it $f_0(x_0)$ is the input of $Blackbox_1$, the output of $Blackbox_1$ is in turn the input of $Blackbox_2$ and so on until $Blackbox_n$.
$$ x_0 \rightarrow Blackbox_0 \rightarrow x_1 \rightarrow Blackbox_1 \rightarrow x_2 \rightarrow ... \rightarrow x_n \rightarrow Blackbox_n \rightarrow x_{n+1}$$
$$ x_0 \rightarrow Blackbox_0 \rightarrow f_0(x_0) \rightarrow Blackbox_1 \rightarrow f_1(x_1) \rightarrow ... \rightarrow x_n \rightarrow Blackbox_n \rightarrow f_n(x_n) $$
The function f is defined by:
$$ f_n(x)\ :=\ (\frac{1}{\frac{x}{L_n}+\sqrt{P_n}}-\frac{1}{\sqrt{P_n}})\ *\ L_n $$
$L_n$ and $P_n$ are constants, but they are individual per blackbox.
Mathematically, I would now have defined this as having an n-fold concatenation of the function $f_n(x)$
$$ f_n(x) = f_n(x) \circ ... f_2(x) \circ f_1(x) \circ f_0(x) = f_n(...f_2(f_1(f_0(x)))) $$
Now I am searching for the input $x_0$ of the $Blackbox_0$, which has the property that the difference $d(x)$) between the input of the first blackbox and the output of the last blackbox is maximal.
$$ d(x)\ :=\ x_{n+1} - x_0 = f_n(x_n) - x_0 $$
Mathematically seen the extreme points of the difference function are looked for, i.e. the zeros of their 1st derivative.
$$ d'(x)\ =\ 0 $$
I am now stuck that I need to implement an algorithm that is best iterative and also works with arbitrary number of $n$ blackboxes. In plain english: what is the best input $x_0$ for the first blackbox so that the output of the last blackbox is maximal.
Who can help me? Pseudo-code is sufficient just to get an approach.
Is it possible to calculate the derivative of the n-fold concatenation of the function? I would like to avoid numeric approaches if possible since the optimization is time-critical and performed on a large dataset.
This is just the chain rule, $$ \frac{d}{dx_0}(f_n∘...∘f_1∘f_0)(x_0)=f_n'(x_n)f_{n-1}'(x_{n-1})...f_1'(x_1)f_0'(x_0) $$ You still need to find the individual derivatives, via symbolic differentiation or via divided differences.