Consider the following model.
In each period $t=0,1,\dots$, an agent exerts effort $x\in \mathbb{R}_+$ to solve a problem. The value of solving the problem is $V>0$. The relationship between effort and success is stochastic: for any effort $x$, the probability of solving the problem is $p(x)$, which is strictly increasing in $x$. Effort carries a cost $c(x)$, a quadratic function that is increasing in $x$. If the effort fails in one period, the agent can try again in the next period (effort in one period does not affect the probability of success in the next). However, payoffs are discounted by a factor $\delta\in(0,1)$, so the agent prefers solving the problem now to solving it later.
If so, the decision problem in each period $t$ would be: $$ \max_{x_t \in \mathbb{R}_+} U = p(x_t) \cdot V - c(x_t) + (1-p(x_t))\cdot \delta \cdot U $$
where $U$ denotes the present value of the future payoffs of the agent.
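
To make the recursion concrete, here is a minimal numerical sketch in Python. It treats $U$ as the same number on both sides of the equation (the problem looks identical in every period in which it is still unsolved) and iterates the right-hand side until $U$ stops changing. The functional forms $p(x) = 1 - e^{-x}$ and $c(x) = x^2/2$, the parameter values, the effort grid, and the name `solve_stationary` are all assumptions made purely for illustration; none of them comes from the model above.

```python
import math


def solve_stationary(V=1.0, delta=0.9, grid_max=5.0, n=2001,
                     tol=1e-12, max_iter=10_000):
    """Iterate U <- max_x [p(x) V - c(x) + (1 - p(x)) delta U] on an effort grid.

    Assumed (illustrative) primitives, not given in the question:
      p(x) = 1 - exp(-x)   strictly increasing success probability
      c(x) = x**2 / 2      quadratic, increasing effort cost
    """
    def p(x):
        return 1.0 - math.exp(-x)

    def c(x):
        return 0.5 * x * x

    # Evenly spaced candidate effort levels on [0, grid_max].
    xs = [grid_max * i / (n - 1) for i in range(n)]

    U = 0.0          # initial guess for the continuation value
    best_x = 0.0
    for _ in range(max_iter):
        # Best effort and resulting payoff, given the current guess for U.
        best_x, new_U = max(
            ((x, p(x) * V - c(x) + (1.0 - p(x)) * delta * U) for x in xs),
            key=lambda pair: pair[1],
        )
        converged = abs(new_U - U) < tol
        U = new_U
        if converged:
            break
    return best_x, U


if __name__ == "__main__":
    x_star, U_star = solve_stationary()
    print(f"effort = {x_star:.4f}, value = {U_star:.4f}")
```

The iteration converges because the right-hand side moves by at most $\delta\,(1-p(x))\,|\Delta U| \le \delta\,|\Delta U|$ when $U$ changes by $\Delta U$. Under these assumed forms, the grid maximizer should also approximately satisfy the first-order condition $p'(x)\,(V - \delta U) = c'(x)$.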
What should the agent do to solve this problem optimally?