Why does optimal control always have optimal substructure?

Question

91 Views Asked by Bumbble Comm At 28 Mar 2026 - 1:05

I've seen a lot of phrases relating to solving optimal control problems, like "Bellman equation," "Hamilton-Jacobi-Bellman equation," and so forth. My (amateur) understanding of these theories is that dynamic programming can usually be used to solve optimal control problems, because optimal solutions to subproblems can be reused to build the optimal solution to the full problem.

But I'm a little confused by this. Let's say I had a pathological plant with a button that, if pressed at $t_0$, prevents some catastrophic event from occurring in the plant at time $t_1$ (an event that may result in my loss of control authority). I have a control input $u(t)$ that includes this button. If I synthesize an optimal control over a horizon $0 \le t \le T$ with $T < t_1$, then I may never press this button, because I would never observe the event. However, if I ran optimal control over a horizon with $T > t_1$, then I would want to have pressed the button at $t_0$. But if I used the solution to the subproblem from the first scenario, I wouldn't be optimal, because at $t_0$ I would have needed to "look ahead" until at least $t_1$ to see that I needed to press the button.

Am I making a silly error here? I'm wondering how optimal substructure can always be assumed. (Or maybe I'm totally missing a set of assumptions.)

**Bumbble Comm** · Accepted Answer

The subproblems used in dynamic programming are anchored at the final time of the original problem, $t=T$, and extend some duration back in time, to $t=\tau$ with $\tau<T$. The optimal control input for such a subproblem on the interval $\tau < t < T$ should be identical to the optimal solution of the original problem on that interval. So if $T \geq t_1$ and $\tau \leq t_0$, the optimal solution to the subproblem would still tell you to press the button at $t=t_0$. If instead $\tau > t_0$, the control input at $t=t_0$ is not part of the subproblem, but the subproblem must still be solved for all initial conditions at $t=\tau$, including the ones where you did and did not press the button at $t=t_0$. It would then conclude that the initial conditions where you did press the button yield a better solution (assuming that losing control incurs a significant penalty in the cost function).
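To make this concrete, here is a minimal sketch of backward induction on a discrete-time version of the button example. Everything in it is an assumption made for illustration and not part of the original question: the times `T0` and `T1`, the costs `PRESS_COST` and `FAIL_COST`, and the two-flag state `(pressed, failed)`. The key modelling choice is that "was the button pressed?" is part of the state, so a tail subproblem starting after $t_0$ is evaluated for both possibilities.

```python
# Hypothetical discrete-time version of the button example.
T0, T1 = 2, 6        # button can be pressed only at t = T0; event hits at t = T1
PRESS_COST = 1.0     # one-off cost of pressing the button (assumed)
FAIL_COST = 10.0     # cost per time step spent in the failed state (assumed)

STATES = [(p, f) for p in (False, True) for f in (False, True)]


def solve(horizon):
    """Backward induction from t = horizon down to t = 0.

    V[t][(pressed, failed)] is the optimal cost-to-go from time t.
    policy[t][(pressed, failed)] is True if pressing is optimal at time t.
    """
    V = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]
    for s in STATES:
        V[horizon][s] = 0.0  # no terminal cost (assumed)

    for t in range(horizon - 1, -1, -1):
        for pressed, failed in STATES:
            # The button is only available at T0, before it has been pressed
            # and while the plant is still controllable.
            can_press = (t == T0) and not pressed and not failed
            best_cost, best_press = None, False
            for press in ((False, True) if can_press else (False,)):
                stage = (PRESS_COST if press else 0.0) + (FAIL_COST if failed else 0.0)
                nxt_pressed = pressed or press
                # The catastrophe happens at T1 unless the button was pressed.
                nxt_failed = failed or (t + 1 == T1 and not nxt_pressed)
                total = stage + V[t + 1][(nxt_pressed, nxt_failed)]
                if best_cost is None or total < best_cost:
                    best_cost, best_press = total, press
            V[t][(pressed, failed)] = best_cost
            policy[t][(pressed, failed)] = best_press
    return V, policy


V_short, pol_short = solve(5)    # horizon ends before T1
V_long, pol_long = solve(10)     # horizon extends past T1

print("press at T0, short horizon:", pol_short[T0][(False, False)])
print("press at T0, long horizon: ", pol_long[T0][(False, False)])
# Tail subproblem starting at tau = 4 > T0, evaluated for both initial conditions:
print("cost-to-go at t=4, pressed:    ", V_long[4][(True, False)])
print("cost-to-go at t=4, not pressed:", V_long[4][(False, False)])
```

In this sketch the short-horizon problem ends at a different final time, so it is not one of the tail subproblems of the long-horizon problem, and its "do not press" policy says nothing about the long-horizon optimum. The tail subproblems of the long-horizon problem all end at `horizon = 10`, and the one starting at `t = 4` already assigns a much higher cost to the "not pressed" initial condition, which is what makes pressing at `T0` optimal when the recursion reaches that step.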
