I am deriving some equations and need to know the correct mathematical notation for expanding brackets involving the following objects: a tensor $A \in {\mathbb R}^{l \times l \times l}$ and two vectors $\mathbf{x}, \mathbf{y} \in {\mathbb R}^{l \times 1}$.
I need to expand the following brackets, and am specifically interested in multiplying out the $A\mathbf{x}$ term: $(\text{something} + A\mathbf{x})(\mathbf{y} + \mathbf{x})$.
Which is correct for the first term: $\mathbf{y}^TA\mathbf{x}$, $\mathbf{x}^TA^T\mathbf{y}$, $(A\mathbf{x})\mathbf{y}$, or $A\mathbf{x}\mathbf{y}$?
And likewise for the second term: $\mathbf{x}^TA\mathbf{x}$, $\mathbf{x}^TA^T\mathbf{x}$, $(A\mathbf{x})\mathbf{x}$, or $A\mathbf{x}\mathbf{x}$?
My actual derivations are below:
A commonly studied nonlinear system describing the evolution of the state takes the form \begin{equation} \dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t),\mathbf{u}(t)) = \widetilde{A}(\mathbf{x}(t))\mathbf{x}(t) + \widetilde{B}(\mathbf{x}(t))\mathbf{u}(t) \end{equation} where $\widetilde{A}(\mathbf{x}(t)) \in \mathbb{R}^{{l}\times{l}}$ and $\widetilde{B}(\mathbf{x}(t)) \in \mathbb{R}^{{l}\times{m}}$ are the state and control weight matrices respectively, both of which depend nonlinearly on the state.
We know that the matrices $A(\mathbf{x}(t))$ and $B(\mathbf{x}(t))$ are the partial derivatives of $\dot{\mathbf{x}}(t)$ with respect to the state $\mathbf{x}(t) \in\mathbb{R}^{l}$ and control $\mathbf{u}(t) \in \mathbb{R}^{m}$ variables respectively, therefore \begin{align} % 1 ------------------------------------------------------------------------------------------------ \frac{\partial \mathbf{f}(\mathbf{x}(t),\mathbf{u}(t)) } {\partial \mathbf{x}(t) } &= \frac{\partial \bigg(\widetilde{A}(\mathbf{x}(t))\mathbf{x}(t) + \widetilde{B}(\mathbf{x}(t))\mathbf{u}(t)\bigg) } {\partial \mathbf{x}(t) } \\ &= \widetilde{A}(\mathbf{x}(t)) + \frac{\partial \widetilde{A}(\mathbf{x}(t)) } { \partial\mathbf{x}(t) }\mathbf{x}(t) + \frac{\partial \widetilde{B}(\mathbf{x}(t)) } { \partial \mathbf{x}(t) }\mathbf{u}(t)\\ % 2 ------------------------------------------------------------------------------------------------ \frac{\partial \mathbf{f}(\mathbf{x}(t),\mathbf{u}(t)) } {\partial \mathbf{u}(t) } &= \frac{\partial \bigg(\widetilde{A}(\mathbf{x}(t))\mathbf{x}(t) + \widetilde{B}(\mathbf{x}(t))\mathbf{u}(t)\bigg) } {\partial \mathbf{u}(t) } = \widetilde{B}(\mathbf{x}(t)) \end{align} where $\frac{\partial \widetilde{A}(\mathbf{x}(t)) } {\partial \mathbf{x}(t) } \in \mathbb{R}^{{l}\times{l}\times{l}}$ and $\frac{\partial \widetilde{B}(\mathbf{x}(t)) } { \partial \mathbf{x}(t) } \in \mathbb{R}^{{l}\times{m}\times{l}}$ are $3$-dimensional tensors, while $\frac{\partial \widetilde{A}(\mathbf{x}(t)) } {\partial \mathbf{x}(t) }\mathbf{x}(t) \in \mathbb{R}^{{l}\times{l}}$ and $\frac{\partial \widetilde{B}(\mathbf{x}(t)) } { \partial \mathbf{x}(t) }\mathbf{u}(t) \in \mathbb{R}^{{l}\times{l}}$ are matrices: each tensor-vector product contracts over the second index (of size $l$ and $m$ respectively), so that every summand of the Jacobian $\frac{\partial \mathbf{f}}{\partial \mathbf{x}(t)}$ has matching dimensions $l \times l$.
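The shapes and the product rule above can be sanity-checked numerically. The following is a minimal NumPy sketch, not part of the derivation: the choices $\widetilde{A}(\mathbf{x}) = P\,\mathbf{x}$ and $\widetilde{B}(\mathbf{x}) = Q\,\mathbf{x}$, with hypothetical constant tensors `P` and `Q`, are illustrative only, picked so that the derivative tensors are exactly `P` and `Q`. The analytic Jacobian from the product rule is compared against a central finite-difference Jacobian.

```python
import numpy as np

l, m = 3, 2
rng = np.random.default_rng(0)

# Hypothetical state-dependent matrices (illustrative choice only):
# A_tilde(x)_{ij} = sum_k P_{ijk} x_k,  B_tilde(x)_{ij} = sum_k Q_{ijk} x_k
P = rng.standard_normal((l, l, l))
Q = rng.standard_normal((l, m, l))

def A_tilde(x):
    return np.einsum('ijk,k->ij', P, x)   # shape (l, l)

def B_tilde(x):
    return np.einsum('ijk,k->ij', Q, x)   # shape (l, m)

def f(x, u):
    return A_tilde(x) @ x + B_tilde(x) @ u

x = rng.standard_normal(l)
u = rng.standard_normal(m)

# Derivative tensors: dA[i, j, k] = d A_tilde_{ij} / d x_k (constant here)
dA = P                                    # shape (l, l, l)
dB = Q                                    # shape (l, m, l)

# Product rule: df/dx = A_tilde(x) + (dA . x) + (dB . u), where the
# tensor-vector products contract over the SECOND index j, giving l x l.
J = A_tilde(x) + np.einsum('ijk,j->ik', dA, x) + np.einsum('ijk,j->ik', dB, u)

# Central finite-difference Jacobian, column k = df/dx_k
eps = 1e-6
J_fd = np.column_stack([
    (f(x + eps * e, u) - f(x - eps * e, u)) / (2 * eps)
    for e in np.eye(l)
])
assert np.allclose(J, J_fd, atol=1e-5)
print(J.shape)  # (3, 3): both tensor-vector products are l x l, not l x m
```

Note that contracting `dB` (shape $l \times m \times l$) with `u` over the $m$-index is what makes the summand an $l \times l$ matrix, consistent with the other terms of the Jacobian.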
The nonlinear dynamics (see the equation for $\dot{\mathbf{x}}(t)$ above) can be linearised around a point $(\mathbf{a}_{x}^{(i)},\mathbf{a}_{u}^{(i)})$, $i=1,\ldots,I$, using a first-order Taylor approximation \begin{align} % 1 ------------------------------------------------------------------------------------------------ \dot{\mathbf{x}}(t) &= \mathbf{f}(\mathbf{a}_{x}^{(i)},\mathbf{a}_{u}^{(i)}) + \Bigg( \left.\frac{\partial \mathbf{f}(\mathbf{x}(t),\mathbf{u}(t))}{\partial \mathbf{u}(t)} \right|_{\substack{\mathbf{x}(t)=\mathbf{a}_{x}^{(i)}\\\mathbf{u}(t)=\mathbf{a}_{u}^{(i)}}} \Big( \mathbf{u}(t) - \mathbf{a}_{u}^{(i)} \Big) \Bigg) + \Bigg( \left.\frac{\partial \mathbf{f}(\mathbf{x}(t),\mathbf{u}(t))}{\partial \mathbf{x}(t)} \right|_{\substack{\mathbf{x}(t)=\mathbf{a}_{x}^{(i)}\\\mathbf{u}(t)=\mathbf{a}_{u}^{(i)}}} \Big( \mathbf{x}(t) - \mathbf{a}_{x}^{(i)} \Big) \Bigg) \nonumber \\ % 2 ------------------------------------------------------------------------------------------------ &= \Bigg( \widetilde{A}(\mathbf{a}_{x}^{(i)})\mathbf{a}_{x}^{(i)} + \widetilde{B}(\mathbf{a}_{x}^{(i)})\mathbf{a}_{u}^{(i)} \Bigg) + \Bigg( \widetilde{B}(\mathbf{a}_{x}^{(i)}) \Big( \mathbf{u}(t) - \mathbf{a}_{u}^{(i)} \Big) \Bigg) \nonumber \\ % 3 ------------------------------------------------------------------------------------------------ & + \Bigg( \widetilde{A}(\mathbf{a}_{x}^{(i)}) + \frac{\partial\widetilde{A}(\mathbf{x}(t))}{\partial \mathbf{x}(t)}\bigg|_{\mathbf{x}(t)=\mathbf{a}_{x}^{(i)}} \mathbf{a}_{x}^{(i)} + \frac{\partial\widetilde{B}(\mathbf{x}(t))}{\partial \mathbf{x}(t)}\bigg|_{\mathbf{x}(t)=\mathbf{a}_{x}^{(i)}} \mathbf{a}_{u}^{(i)} \Bigg) \bigg( \mathbf{x}(t) - \mathbf{a}_{x}^{(i)} \bigg) \end{align}
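The linearisation itself can also be verified numerically: the error of a correct first-order Taylor model shrinks quadratically with the perturbation size. The sketch below reuses the same hypothetical linear-in-$\mathbf{x}$ choice of $\widetilde{A}$ and $\widetilde{B}$ (an assumption made only so the derivative tensors are the constants `P` and `Q`; for this choice $\mathbf{f}$ is exactly quadratic, so halving the step quarters the error exactly).

```python
import numpy as np

l, m = 3, 2
rng = np.random.default_rng(1)
P = rng.standard_normal((l, l, l))   # hypothetical constant tensors, as before
Q = rng.standard_normal((l, m, l))

A_tilde = lambda x: np.einsum('ijk,k->ij', P, x)
B_tilde = lambda x: np.einsum('ijk,k->ij', Q, x)
f = lambda x, u: A_tilde(x) @ x + B_tilde(x) @ u

ax = rng.standard_normal(l)          # linearisation point a_x
au = rng.standard_normal(m)          # linearisation point a_u

# Jacobians at (a_x, a_u), matching the derivation above:
Jx = A_tilde(ax) + np.einsum('ijk,j->ik', P, ax) + np.einsum('ijk,j->ik', Q, au)
Ju = B_tilde(ax)

def f_lin(x, u):
    """First-order Taylor model of f around (a_x, a_u)."""
    return f(ax, au) + Jx @ (x - ax) + Ju @ (u - au)

# Error at two perturbation scales differing by a factor of 2
dx, du = rng.standard_normal(l), rng.standard_normal(m)
e1 = np.linalg.norm(f(ax + 0.1 * dx, au + 0.1 * du) - f_lin(ax + 0.1 * dx, au + 0.1 * du))
e2 = np.linalg.norm(f(ax + 0.05 * dx, au + 0.05 * du) - f_lin(ax + 0.05 * dx, au + 0.05 * du))
print(e1 / e2)  # ≈ 4: the remainder is second order in the perturbation
```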
Now come my actual questions:
- I am highly concerned that the term $\frac{\partial \widetilde{B}(\mathbf{x}(t)) } { \partial \mathbf{x}(t) }\mathbf{u}(t)$ is not well defined as written: $\frac{\partial \widetilde{B}(\mathbf{x}(t)) } { \partial \mathbf{x}(t) } \in \mathbb{R}^{l \times m \times l}$ while $\mathbf{u}(t) \in \mathbb{R}^{m \times 1}$, so an ordinary matrix-vector product does not apply. If I cannot simply multiply the derivative by $\mathbf{u}(t)$, how should this product-rule term be written?
- When I expand the brackets, what is the result of $\Bigg( \widetilde{A}(\mathbf{a}_{x}^{(i)}) + \frac{\partial\widetilde{A}(\mathbf{x}(t))}{\partial \mathbf{x}(t)}\bigg|_{\mathbf{x}(t)=\mathbf{a}_{x}^{(i)}} \mathbf{a}_{x}^{(i)} + \frac{\partial\widetilde{B}(\mathbf{x}(t))}{\partial \mathbf{x}(t)}\bigg|_{\mathbf{x}(t)=\mathbf{a}_{x}^{(i)}} \mathbf{a}_{u}^{(i)} \Bigg) \bigg( \mathbf{x}(t) - \mathbf{a}_{x}^{(i)} \bigg)$?
If $A$ takes one vector and returns a linear map on $\mathbb{R}^\ell$, then you can write in coordinates $x=x^{j}$, $y=y^{k}$ and $A=A^{i}_{jk}$, where $i,j,k\in\{1,\ldots,\ell\}$, and then write $A^{i}_{jk}x^jy^k$ (with summation over repeated indices). This is abstract index notation: $A^{i}_{jk}x^jy^k$ is the $i$-th coordinate of the vector obtained by applying $A$ to the two vectors $x$ and $y$.
If you want to keep your vectorial notation, then for the first term you can write $A\textbf{x}\textbf{y}$, or $\left(A\textbf{x}\right)\textbf{y}$, or also $\left(A\textbf{x}\right)\left(\textbf{y}\right)$, because $A\textbf{x}$ is a linear map on $\mathbb{R}^\ell$. Be careful with the notation $\textbf{y}^T\textbf{z}$, which denotes the inner product on $\mathbb{R}^\ell$ of $\textbf{y}$ with $\textbf{z}$ (both vectors!).
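The index expression $A^{i}_{jk}x^jy^k$ maps directly onto `np.einsum`, which makes it easy to check numerically that the vectorial notations above agree with the index notation. A small illustrative sketch, with `A`, `x`, `y` as random placeholders:

```python
import numpy as np

l = 3
rng = np.random.default_rng(2)
A = rng.standard_normal((l, l, l))   # plays the role of A^i_{jk}
x = rng.standard_normal(l)
y = rng.standard_normal(l)

# (A x) is an l x l matrix (a linear map); applying it to y gives a vector
# whose i-th entry is A^i_{jk} x^j y^k.
Ax = np.einsum('ijk,j->ik', A, x)    # contract over j
Axy = Ax @ y
assert np.allclose(Axy, np.einsum('ijk,j,k->i', A, x, y))

# The quadratic-in-x term works the same way:
Axx = np.einsum('ijk,j,k->i', A, x, x)
assert np.allclose(Axx, Ax @ x)

# By contrast, y^T (A x) x is a SCALAR: the inner product of y with (A x) x.
s = y @ Ax @ x
assert np.isclose(s, np.einsum('ijk,j,k,i->', A, x, x, y))
```

So $(A\mathbf{x})\mathbf{y}$ and $(A\mathbf{x})\mathbf{x}$ are vectors, while $\mathbf{y}^T(A\mathbf{x})\mathbf{x}$ is a scalar; which one you want depends on what the surrounding equation expects.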