Relationship between Definition of Adjoint of a Linear Operator and the Transpose of Matrix


I am having trouble understanding the relationship between the definition of the adjoint of a linear operator $\mathcal{A}$ $($given a non-degenerate bilinear form $\langle \ , \ \rangle)$ and the matrices representing $\mathcal{A}$ and $\langle \ , \ \rangle.$

Let $\mathcal{A} : \mathbb{R}^N \to \mathbb{R}^N, \ \langle \ , \ \rangle : \mathbb{R}^N \times \mathbb{R}^N \to \mathbb{R}$ be represented by matrices $A$ and $B$, both given by a fixed basis for $\mathbb{R}^N$. The definition of the adjoint of $\mathcal{A}$ is the unique linear operator $\mathcal{A}^*$ such that $\langle \mathcal{A}u, v \rangle = \langle u, \mathcal{A}^* v \rangle$ for every $u,v \in \mathbb{R}^N$.

On the other hand, I know that $A^* = A^T$ (the transpose matrix), and that we can represent everything as matrix multiplication, i.e.,

$$\langle u,v \rangle = u^TBv $$

for every $u,v \in \mathbb{R}^N$. Shouldn't this tell us that we have

$$u^T A^T B v = (Au)^TBv = \langle \mathcal{A}u,v \rangle = \langle u, \mathcal{A^*}v \rangle = u^T BA^T v, $$

for every $u,v \in \mathbb{R}^N$? This seems to imply that $A^TB = BA^T$ for every matrix $A \in M_N(\mathbb{R})$, which seems like nonsense. Am I missing something or assuming something wrong somewhere?
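For what it's worth, a quick numerical check (a NumPy sketch with illustrative names) confirms that $A^TB$ and $BA^T$ really do differ for a generic $A$ and a generic symmetric invertible $B$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3
A = rng.normal(size=(N, N))
B = rng.normal(size=(N, N))
B = B + B.T + 2 * N * np.eye(N)   # symmetric and invertible: a valid Gram matrix

# generic A and B do not satisfy A^T B = B A^T
print(np.allclose(A.T @ B, B @ A.T))   # almost surely False
```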


There are 3 answers below.

Best answer

I think something crucial here is whether an inner product pre-exists on our space, separately from the bilinear form. How can you speak about choosing an 'orthonormal basis' if there is no notion of an inner product?

In this answer, we will therefore consider $\langle \cdot ,\cdot \rangle$ to be an inner product on some $N$-dimensional space $V$ and $\mathcal{B}: V\times V\longrightarrow \mathbb{R}$ to be some non-degenerate bilinear form on $V$. Similarly, we will let $\mathcal{A}:V\longrightarrow V$ be a linear map on $V$.

Choose some basis $\beta=\{\mu_1,\dots,\mu_N\}$ of $V$ and let $B \in \mathbb{R}^{N\times N}$ be given by $B_{ij}=\mathcal{B}(\mu_i,\mu_j)$. Then for any $u,v\in V$ it holds that $\mathcal{B}(u,v)=\mathbf{u}^TB\,\mathbf{v}$, where $\mathbf{u},\mathbf{v}$ are the coordinate representations of $u,v$ in the basis $\beta$. We will henceforth refer to them as simply $u,v$.

We can similarly obtain a matrix representation $A$ of $\mathcal{A}$ in the basis $\beta$. Assume for the moment that $\mathcal{A}^*$ is defined with respect to $\mathcal{B}$ $($rather than $\langle\cdot,\cdot\rangle)$, i.e., that $\mathcal{B}(\mathcal{A}u,v)=\mathcal{B}(u,\mathcal{A}^*v)$ for all $u,v\in V$; non-degeneracy of $\mathcal{B}$ makes $\mathcal{A}^*$ unique. We hence get that

$$u^T A^T B v = (Au)^TBv = \mathcal{B}(\mathcal{A}u,v) = \mathcal{B}(u, \mathcal{A^*}v) = u^T BA^* v, \tag{1}$$

where $A^*$ is the matrix representation of $\mathcal{A}^*$ in the basis $\beta$. This implies that $u^T\left(A^TB-BA^*\right)v=0$ for all $u$ and all $v$, which yields

$$A^TB=BA^*.\tag{2}$$
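As a sanity check of $(2)$, here is a small NumPy sketch (names illustrative): we build a symmetric invertible $B$, solve $(2)$ for $A^*$, and confirm the defining identity of the adjoint numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
A = rng.normal(size=(N, N))            # matrix of the operator in the basis beta
B = rng.normal(size=(N, N))
B = B + B.T + 2 * N * np.eye(N)        # symmetric, invertible: a non-degenerate form

# Adjoint with respect to the form B, solved from (2): A^T B = B A*  =>  A* = B^{-1} A^T B
A_star = np.linalg.solve(B, A.T @ B)

u, v = rng.normal(size=N), rng.normal(size=N)
lhs = (A @ u) @ B @ v                  # the form applied to (Au, v)
rhs = u @ B @ (A_star @ v)             # the form applied to (u, A* v)
assert np.isclose(lhs, rhs)

# and A* generally differs from the plain transpose
print(np.allclose(A_star, A.T))
```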


We can try to find $A^*$ concretely as follows. Non-degeneracy of $\mathcal{B}$ means that $B$ is invertible, so $(2)$ gives

$$A^*=B^{-1}A^TB.\tag{3}$$

Since $\mu_i$ is the $i$-th standard unit vector in its own coordinates, $B\mu_i$ is simply the $i$-th column of $B$, and we can read $(3)$ off column by column:

The $i$-th column of $A^*$ is the vector obtained by multiplying $B^{-1}A^T$ by the $i$-th column of $B$.

It follows that $A^*=A^T$ if and only if $B^{-1}A^TB=A^T$, that is, if and only if

$$A^TB=BA^T,$$

i.e., if and only if $A^T$ commutes with the Gram matrix $B$.


A trivial consequence of our last observations is that when $B=I$ -- that is, when $\mathcal{B}$ is our pre-existing inner product --, then $A^*=A^T$. Observe that in this case, $(2)$ does hold, as it must.
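From $(2)$ alone, $A^*=A^T$ whenever $B$ commutes with $A^T$, with $B=I$ as the simplest case. A small NumPy check of both (the helper name and the particular non-identity $B$ are illustrative choices, not canonical):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
A = rng.normal(size=(N, N))

def adjoint(A, B):
    # matrix of the B-adjoint, solved from (2): A^T B = B A*
    return np.linalg.solve(B, A.T @ B)

# when B = I, the adjoint is exactly the transpose
assert np.allclose(adjoint(A, np.eye(N)), A.T)

# B need not be the identity: any B commuting with A^T works,
# e.g. a symmetric S together with B = S^2 + I (a polynomial in S = S^T)
M = rng.normal(size=(N, N))
S = M + M.T                      # symmetric operator
B = S @ S + np.eye(N)            # symmetric, positive definite, commutes with S
assert np.allclose(adjoint(S, B), S.T)
```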

Another answer

As mentioned in the comments: with respect to any basis $\alpha_1,\dots,\alpha_n$, the Gram matrix of your bilinear form (so your $B$ in this case) looks like $$ \begin{pmatrix} \langle\alpha_1,\alpha_1\rangle & \langle\alpha_1,\alpha_2\rangle & \dots & \langle\alpha_1, \alpha_n\rangle \\ \langle\alpha_2,\alpha_1\rangle & \langle\alpha_2,\alpha_2\rangle & \dots & \langle\alpha_2, \alpha_n\rangle \\ \vdots & \vdots & & \vdots \\ \langle\alpha_n,\alpha_1\rangle & \langle\alpha_n,\alpha_2\rangle & \dots & \langle\alpha_n, \alpha_n\rangle \end{pmatrix} $$ So if you assume by definition that the basis is orthonormal with respect to your bilinear form, then $B$ is just the identity matrix and therefore commutes with everything. This makes sense, of course: if you want to represent your scalar product as matrix multiplication, your vectors have to be written in coordinates with respect to this basis, so the orthonormal basis itself is represented by the standard unit vectors.
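To illustrate numerically (a NumPy sketch with illustrative names): the Gram matrix of an orthonormal basis, with respect to the standard inner product, is the identity, while a generic basis gives a non-identity Gram matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

# columns of Q form an orthonormal basis of R^n w.r.t. the standard inner product
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
gram = Q.T @ Q                           # G_ij = <alpha_i, alpha_j>
assert np.allclose(gram, np.eye(n))

# a generic (non-orthonormal) basis has a non-identity Gram matrix
P = rng.normal(size=(n, n))
print(np.allclose(P.T @ P, np.eye(n)))   # almost surely False
```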

Another answer

In general, for the matrix of the adjoint operator to be equal to the transpose of the operator’s matrix, these matrices need to be expressed relative to dual bases. For your question, what this means is that for the basis $\mathscr B=(v_1,\dots,v_N)$ we must have $\langle v_i,v_j\rangle=\delta_{ij}$, that is, it must be an orthonormal basis with respect to $\langle\cdot,\cdot\rangle$.
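This can be checked by a change of basis (a NumPy sketch; the helper name is illustrative): in an orthonormal basis the matrix of the adjoint equals the transpose of the operator's matrix, while in a generic basis it does not.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 3
A = rng.normal(size=(N, N))   # matrix of the operator in the standard basis
# w.r.t. the standard inner product, the adjoint has matrix A.T in standard coordinates

def in_basis(M, P):
    # matrix of the operator M (standard coords) relative to the basis
    # given by the columns of P, i.e. P^{-1} M P
    return np.linalg.solve(P, M @ P)

# orthonormal basis (orthogonal P): matrix of adjoint == transpose of matrix of A
Q, _ = np.linalg.qr(rng.normal(size=(N, N)))
assert np.allclose(in_basis(A.T, Q), in_basis(A, Q).T)

# generic basis: the equality fails
P = rng.normal(size=(N, N))
print(np.allclose(in_basis(A.T, P), in_basis(A, P).T))  # almost surely False
```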