Solution to the Average of Several Trials of Discrete-Time LQR with Noise

The discrete-time, finite-horizon LQR problem is well studied. We have the linear system $$x_{k+1}=A x_{k}+B u_{k}+w_k$$ where $w_k$ is a random variable with mean $0$ and finite second moment, and we want to minimize $$J(\pi)=\mathbb{E} \Big\{x_{N}^{\top} P x_{N}+\sum_{k=0}^{N-1} \big(x_{k}^{\top} Q x_{k}+u_{k}^{\top} R u_{k}\big)\Big\}.$$

Given an initial state $x_0$ and i.i.d. noise samples at each time step, the optimal $u_k$ can be found recursively.
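For concreteness, here is a minimal sketch of the recursion I have in mind, assuming the standard backward Riccati recursion and a state-feedback law $u_k=-K_k x_k$ (for zero-mean additive noise independent of the state, the gains are the same as in the noise-free problem). The function name `lqr_gains` and the use of NumPy are my own choices for illustration.

```python
import numpy as np

def lqr_gains(A, B, Q, R, P, N):
    """Backward Riccati recursion for the finite-horizon LQR problem.

    Returns a list [K_0, ..., K_{N-1}] such that u_k = -K_k x_k.
    """
    S = P  # cost-to-go matrix at the final step, S_N = P
    gains = []
    for _ in range(N):
        # K_k = (R + B^T S_{k+1} B)^{-1} B^T S_{k+1} A
        K = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)
        # S_k = Q + A^T S_{k+1} (A - B K_k)
        S = Q + A.T @ S @ (A - B @ K)
        gains.append(K)
    gains.reverse()  # gains were built from k = N-1 down to k = 0
    return gains

if __name__ == "__main__":
    one = np.array([[1.0]])
    print(lqr_gains(one, one, one, one, one, N=2))
```

Note that this gives a feedback policy (each $u_k$ depends on the observed $x_k$), not a single fixed control sequence.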

Now, my question is: what if we want to minimize an "empirical" version of the above expectation, namely an average over $T$ trials, $$J = \frac{1}{T}\sum_{i=1}^{T} \Big(x_{N}^{i\top} P x_{N}^{i}+\sum_{k=0}^{N-1} \big(x_{k}^{i\top} Q x_{k}^{i}+u_{k}^{\top} R u_{k}\big)\Big)$$ where $$x_{k+1}^i=A x_{k}^i+B u_{k}+w_k^i$$
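To make the objective concrete, here is a small sketch that evaluates this empirical cost for a given shared control sequence and a given set of noise samples. The function name `empirical_cost` and the array layout (`u` of shape `(N, m)`, `w` of shape `(T, N, n)`) are assumptions made only for this illustration.

```python
import numpy as np

def empirical_cost(A, B, Q, R, P, x0, u, w):
    """Average cost over T trials that share one control sequence.

    u: array of shape (N, m), the single control sequence u_0..u_{N-1}
    w: array of shape (T, N, n), noise sample w_k^i stored at w[i, k]
    """
    T, N, _ = w.shape
    total = 0.0
    for i in range(T):
        x = np.array(x0, dtype=float)  # every trial starts from the same x_0
        for k in range(N):
            total += x @ Q @ x + u[k] @ R @ u[k]  # stage cost
            x = A @ x + B @ u[k] + w[i, k]        # trial-specific dynamics
        total += x @ P @ x                        # terminal cost
    return total / T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    Q, R, P = np.eye(2), np.eye(1), np.eye(2)
    u = np.zeros((5, 1))
    w = 0.01 * rng.standard_normal((10, 5, 2))
    print(empirical_cost(A, B, Q, R, P, np.array([1.0, 0.0]), u, w))
```

The same `u[k]` is applied in every trial, while the states differ across trials only through `w[i, k]`.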

Of course, the initial point of each trial is the same, i.e., $x_0^i=x_0$ for all $i$. The key point here is that we want to find a SINGLE control sequence that minimizes the empirical mean over several independent trials, where the states may differ across trials due to the independent noise $w_k^i$. Is there any existing theory, or are there any ideas, for dealing with this problem? Thanks!