I have software that takes an investment as input and outputs the return on a particular stock. The profit metric $x_i$ is defined as the ratio of the return $g_i$ to the maximum possible return $g_{max}$ in the $i^{th}$ time period,
so $x_i = g_i / g_{max}$.
$g_{max}$ is constant across periods but varies from stock to stock. The return for a particular stock in the $i^{th}$ period is $g_i$, which depends on the investment $s_i$ for that period. The relation between $g_i$ and $s_i$ varies from one period to the next.
The question is how to choose $s_i$ based on $s_{i-1}$ and $g_{i-1}$ in order to maximize $x_i$ for a particular stock. Is it possible to build an online learning mechanism that decides the value of $s_i$ from previous data?
[Note: $0 < g_i \leq g_{max}$, $s_{min} \leq s_i \leq s_{max}$, and the length of each time period is fixed; in this case it is six months.]
This is a direct optimization problem. You want to maximize a function $x(s)$, but a mathematical expression for the functional dependency is not available; instead, you have an "oracle" that returns the function's value when queried. The literature calls this type of problem "derivative-free optimization" as well as "direct optimization".
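To make the idea concrete, here is a minimal sketch of one derivative-free method, golden-section search, which maximizes a function on $[s_{min}, s_{max}]$ using only oracle queries. It rests on assumptions that are not part of the question: $x(s)$ is unimodal on the interval, the oracle is noise-free, and the relation between $s$ and $x$ stays fixed while the search runs. The names `golden_section_maximize` and `toy_oracle` are hypothetical, and the toy function merely stands in for the real software.

```python
import math


def golden_section_maximize(oracle, s_min, s_max, tol=1e-4, max_queries=100):
    """Maximize oracle(s) over [s_min, s_max] using only function values.

    Assumes oracle is unimodal on the interval (an assumption of this sketch).
    Returns the estimated maximizer and the number of oracle queries used.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618
    a, b = s_min, s_max
    c = b - inv_phi * (b - a)  # left interior point
    d = a + inv_phi * (b - a)  # right interior point
    fc, fd = oracle(c), oracle(d)
    queries = 2
    while (b - a) > tol and queries < max_queries:
        if fc > fd:
            # The maximum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = oracle(c)
        else:
            # The maximum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = oracle(d)
        queries += 1
    return (a + b) / 2, queries


def toy_oracle(s):
    # Hypothetical stand-in for the software: x(s) peaks at s = 3.
    return 1.0 - ((s - 3.0) / 10.0) ** 2


s_best, n_queries = golden_section_maximize(toy_oracle, 0.0, 10.0)
```

Each iteration costs exactly one new oracle query, which matters when a query is expensive. If the relation really does change between periods, or the returns are noisy, a plain search like this is not enough on its own and would need to be restarted or replaced by a method designed for that setting.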