Derivation of solution for simple control problem

58 Views Asked by Bumbble Comm At 23 Feb 2026 - 1:01

While trying to understand the fundamental concepts of control theory by reading the article "Dual Control for Approximate Bayesian Reinforcement Learning" (chapter 3.1, "A toy problem"), I came across the following solutions to a very simple problem:

Consider the linear, scalar system:

$x_{k+1} = ax_k + bu_k + \xi_k$

where $x_k$ denotes the state at time step $k$, $u_k$ the control action at time step $k$, and $\xi_k$ is normally distributed noise.

Consider the following cost function: $L(x, u) = \sum_{k=0}^{T} (x_k - r_k)^T W_k (x_k - r_k) + \sum_{k=0}^{T-1} u_k^T U_k u_k$, where $r = [r_0, \dots, r_T]$ is a target trajectory, and $W_k$ and $U_k$ define the state and control costs, respectively.

If $a$ and $b$ are known, the optimal $u_k$ to drive the current state $x_k$ to zero in one step can be trivially verified to be $u_{k,\text{oracle}}^* = -\frac{abx_k}{U + b^2}$.
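A sketch of one way this expression can be obtained. The following are my assumptions, not stated above: a one-step horizon, target $r = 0$, scalar state weight $W = 1$, a constant scalar control cost $U > 0$, and zero-mean noise $\xi_k$ with variance $q$ that is independent of $x_k$ and $u_k$. Under these assumptions the expected one-step cost and its stationary point are:

$$
\begin{aligned}
J(u_k) &= \mathbb{E}\left[x_{k+1}^2\right] + U u_k^2 = (a x_k + b u_k)^2 + q + U u_k^2, \\
\frac{dJ}{du_k} &= 2b\,(a x_k + b u_k) + 2 U u_k = 0, \\
u_k^* &= -\frac{a b x_k}{U + b^2}.
\end{aligned}
$$

Since $\frac{d^2 J}{du_k^2} = 2(b^2 + U) > 0$, the stationary point is a minimum. The noise variance $q$ only adds a constant to the cost, so it does not appear in the result.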

Now let the parameter $b$ be uncertain, with current belief $p(b) = N(b; \mu_k, \alpha_k^2)$ at time $k$, where $\mu_k$ is the mean estimate of $b$ (not to be confused with the control $u_k$). The naive option of simply replacing the parameter with the current mean estimate is known as certainty equivalence (CE) control in the dual control literature. The resulting control law is $u_{k,\text{ce}}^* = -\frac{a\mu_kx_k}{U + \mu_k^2}$.
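A small numerical sanity check of the CE law, as a sketch only. It assumes a one-step cost with target 0 and state weight 1, and the numbers for `a`, `x`, `U` and the mean estimate of $b$ (called `mu` in the code) are arbitrary illustrative values, not taken from the article. The closed-form expression is compared against a brute-force grid search over the same cost with $b$ replaced by its mean:

```python
import numpy as np


def one_step_control(a, b, x, U):
    """Closed-form minimizer of (a*x + b*u)**2 + U*u**2 over u."""
    return -a * b * x / (U + b**2)


def one_step_cost(u, a, b, x, U, noise_var=0.0):
    """Expected one-step cost; the noise variance only adds a constant."""
    return (a * x + b * u) ** 2 + noise_var + U * u**2


# Illustrative values (assumptions, not from the article).
a, x, U = 0.9, 2.0, 0.5
mu = 1.5  # current mean estimate of b

# Certainty equivalence: plug the mean estimate in place of b.
u_ce = one_step_control(a, mu, x, U)

# Brute-force check: minimize the same cost on a fine grid of controls.
grid = np.linspace(-10.0, 10.0, 200001)
u_grid = grid[np.argmin(one_step_cost(grid, a, mu, x, U))]

print(u_ce, u_grid)
```

The two printed values should agree up to the grid resolution, which is consistent with the CE law being the oracle law with $b$ replaced by its mean estimate.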

How can I derive these solutions?
