Why do we need an orthonormal basis to represent the adjoint of the operator?

For any finite-dimensional inner product space, we can obtain an orthonormal basis via the Gram-Schmidt process.

But why is it necessary to use an orthonormal basis when describing the adjoint of an operator?

Presumably it helps with the computation. Why do we define it that way?

There are 3 best solutions below

2
On

The adjoint of an operator depends on the inner product you use; it is not a purely linear-algebraic concept. It is defined by $(Ax,y)=(x,A^{\star}y)$ for all $x,y$. If you represent a linear operator $A$ on a finite-dimensional inner product space $X$ with respect to an orthonormal basis $\mathscr{B}=\{ e_1,e_2,\cdots,e_N \}$, then the matrix of the adjoint operator $A^{\star}$ is the conjugate transpose of the matrix representing $A$. That is, $$ [A^{\star}]_{\mathscr{B}} = \left(\overline{[A]_{\mathscr{B}}}\right)^{T}=([A]_{\mathscr{B}})^{\star} $$ If you don't use an orthonormal basis, then the matrix of the adjoint $A^{\star}$ is not so easily expressed in terms of $[A]_{\mathscr{B}}$.
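A small numerical sketch of this point, under assumptions that are not in the answer itself: the space is $\mathbb{C}^2$ with the standard inner product (linear in the first slot), and the operator, the two bases, and the helper name `matrix_in_basis` are made up for illustration. It compares the matrix of $A^{\star}$ with the conjugate transpose of the matrix of $A$, once in a non-orthonormal basis and once in an orthonormal one.

```python
import numpy as np

def matrix_in_basis(M_std, P):
    """Matrix of an operator w.r.t. the basis whose vectors are the columns of P.

    M_std is the operator's matrix in the standard basis of C^2.
    """
    return np.linalg.solve(P, M_std @ P)  # same as inv(P) @ M_std @ P

# Hypothetical operator, given in the standard (orthonormal) basis.
A_std = np.array([[1, 2], [3j, 4]], dtype=complex)
# For the standard inner product, the adjoint is the conjugate transpose.
A_adj_std = A_std.conj().T

# A basis that is NOT orthonormal (columns are the basis vectors).
P_skew = np.array([[1, 1], [0, 1]], dtype=complex)
# An orthonormal basis (columns are orthogonal unit vectors).
P_orth = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def conjugate_transpose_rule_holds(P):
    """Check whether [A*]_B equals the conjugate transpose of [A]_B."""
    A_B = matrix_in_basis(A_std, P)
    A_adj_B = matrix_in_basis(A_adj_std, P)
    return bool(np.allclose(A_adj_B, A_B.conj().T))

holds_in_skew_basis = conjugate_transpose_rule_holds(P_skew)
holds_in_orthonormal_basis = conjugate_transpose_rule_holds(P_orth)
print(holds_in_skew_basis, holds_in_orthonormal_basis)
```

For these particular choices the rule fails in the skewed basis and holds in the orthonormal one, which is the behaviour the answer describes.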

0
On

I think you're asking where the need for an orthonormal basis (ONB) enters the proof that $(A^*)_{i,j} = \overline {A_{j,i}}$ when $A=\varphi_{E}^{E}$ is the matrix of $\varphi$ in an ONB $E$.

$\langle\varphi^*(e_j),e_{i}\rangle = \langle e_j,\varphi(e_i)\rangle$ by the definition of the adjoint, and $\langle e_j,\varphi(e_i)\rangle = \overline{\langle\varphi(e_i),e_j\rangle}$ by conjugate symmetry.

Hence we have $\langle\varphi^*(e_j),e_{i}\rangle=\overline{\langle\varphi(e_i),e_j\rangle}$.

We want to identify $\langle\varphi^*(e_j),e_{i}\rangle$ with the $(i,j)$-th entry of the adjoint matrix $A^*$, and $\overline{\langle\varphi(e_i),e_j\rangle}$ with the conjugate of the $(j,i)$-th entry of the original matrix $A$. Both identifications are valid when the matrices are taken with respect to an orthonormal basis.
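
To spell out the step that uses orthonormality, here is a short sketch. It assumes inner products that are linear in the first slot, and writes $A_{k,i}$ for the entries of the matrix of $\varphi$ in the basis $e_1,\ldots,e_N$, so that $\varphi(e_i)=\sum_{k} A_{k,i}\,e_k$. Then $$\langle\varphi(e_i),e_j\rangle=\sum_{k=1}^{N} A_{k,i}\,\langle e_k,e_j\rangle .$$ If the basis is orthonormal, $\langle e_k,e_j\rangle=\delta_{k,j}$ and the sum collapses to $A_{j,i}$. The same computation applied to $\varphi^*$ gives $\langle\varphi^*(e_j),e_i\rangle=(A^*)_{i,j}$, and combining the two with the identity above yields $(A^*)_{i,j}=\overline{A_{j,i}}$. Without orthonormality the factors $\langle e_k,e_j\rangle$ do not drop out, so the matrix entries can no longer be read off from single inner products.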

That is where I think the necessity comes from.

0
On

We are free to define what is meant by adjoint of an operator and adjoint of a matrix without any mention of a basis, orthonormal or otherwise. Indeed, we usually don't mention bases in either definition. Taking $\mathbb{F}$ to be either $\mathbb{R}$ or $\mathbb{C}$, the definitions may be stated as:

If $V$ and $W$ are finite-dimensional inner product spaces over $\mathbb{F}$, and $T:V\to W$ is linear, then the adjoint operator $T^{*}:W\to V$ is the unique operator with the property that$$\left<Tv,w\right>=\left<v,T^{*}w\right>$$ for all $v\in V$ and for all $w\in W$.

If $\mathbf{A}$ is a matrix with entries in $\mathbb{F}$, then the adjoint of $\mathbf{A}$ is$$\mathbf{A}^{*}=\overline{\mathbf{A}^{\top}}\mbox{.}$$

But when we define two meanings for the same word, we'd like the two meanings to be somehow related. In the case of the word adjoint, if the matrix $\mathbf{A}$ represents the operator $T$ with respect to bases $\alpha$ and $\beta$ of $V$ and $W$, respectively, then we'd like the adjoint of $\mathbf{A}$ to coincide with the matrix of the adjoint of $T$ with respect to $\beta$ and $\alpha$. That is, we want$$\left(\left[T\right]_{\alpha}^{\beta}\right)^{*}=\left[T^{*}\right]_{\beta}^{\alpha}\mbox{.}$$

The last equation is NOT true in general, but it is true when both $\alpha$ and $\beta$ are orthonormal. So that's where orthonormality becomes “necessary” in a sense. This is a result, however, not a definition. And even with the same definitions above, we can still write $\left[T^{*}\right]_{\beta}^{\alpha}$ in terms of $\left[T\right]_{\alpha}^{\beta}$ without assuming $\alpha$ and $\beta$ are orthonormal.

Letting $\alpha=\left\{ \alpha_{1},\ldots,\alpha_{m}\right\}$ and $\beta=\left\{\beta_{1},\ldots,\beta_{n}\right\}$, the formula in general is$$\left[T^{*}\right]_{\beta}^{\alpha}=\mathbf{C}^{-1}\left(\left[T\right]_{\alpha}^{\beta}\right)^{*}\mathbf{B}$$ where$$\mathbf{C}=\left(\begin{array}{cccc} \left<\alpha_{1},\alpha_{1}\right> & \left<\alpha_{2},\alpha_{1}\right> & \cdots & \left<\alpha_{m},\alpha_{1}\right>\\ \left<\alpha_{1},\alpha_{2}\right> & \left<\alpha_{2},\alpha_{2}\right> & \cdots & \left<\alpha_{m},\alpha_{2}\right>\\ \vdots & & & \vdots\\ \left<\alpha_{1},\alpha_{m}\right> & \left<\alpha_{2},\alpha_{m}\right> & \cdots & \left<\alpha_{m},\alpha_{m}\right> \end{array}\right)$$ and$$\mathbf{B}=\left(\begin{array}{cccc} \left<\beta_{1},\beta_{1}\right> & \left<\beta_{2},\beta_{1}\right> & \cdots & \left<\beta_{n},\beta_{1}\right>\\ \left<\beta_{1},\beta_{2}\right> & \left<\beta_{2},\beta_{2}\right> & \cdots & \left<\beta_{n},\beta_{2}\right>\\ \vdots & & & \vdots\\ \left<\beta_{1},\beta_{n}\right> & \left<\beta_{2},\beta_{n}\right> & \cdots & \left<\beta_{n},\beta_{n}\right> \end{array}\right)\mbox{.}$$

When orthonormal bases of $V$ and $W$ are not readily available, the formula above for $\left[T^{*}\right]_{\beta}^{\alpha}$ is usually more computationally efficient than applying Gram-Schmidt and change of basis matrices. However, the formula above assumes that inner products are linear in the first slot. If one prefers the definition of inner product which requires linearity in the second slot, then replace $\mathbf{C}$ and $\mathbf{B}$ above by their transposes.
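
The general formula can be checked numerically. The following is a minimal sketch under assumptions that are not part of the answer: $V=\mathbb{C}^2$ and $W=\mathbb{C}^3$ carry the standard inner product $\left<x,y\right>=\sum_i x_i\overline{y_i}$ (linear in the first slot), and the operator `T_std`, the bases `alpha` and `beta`, and the helpers `inner` and `gram` are hypothetical names chosen for illustration.

```python
import numpy as np

def inner(x, y):
    """Standard inner product on C^k, linear in the first slot."""
    return np.sum(x * np.conj(y))

def gram(basis):
    """Gram matrix with entry (r, c) equal to <b_c, b_r>, as in C and B above.

    The basis vectors are the columns of `basis`.
    """
    k = basis.shape[1]
    return np.array([[inner(basis[:, c], basis[:, r]) for c in range(k)]
                     for r in range(k)])

# Hypothetical operator T: C^2 -> C^3, given in the standard bases.
T_std = np.array([[1, 2j], [0, 1], [3, -1]], dtype=complex)

# Non-orthonormal bases (columns are the basis vectors).
alpha = np.array([[1, 1], [0, 1]], dtype=complex)                      # basis of V
beta = np.array([[1, 0, 1j], [1, 1, 0], [0, 1, 1]], dtype=complex)     # basis of W

# [T]_alpha^beta: express T(alpha_j) in beta coordinates.
T_ab = np.linalg.solve(beta, T_std @ alpha)

# [T*]_beta^alpha computed directly: in the standard bases T* is the
# conjugate transpose, then change coordinates.
Tadj_direct = np.linalg.solve(alpha, T_std.conj().T @ beta)

# [T*]_beta^alpha from the formula C^{-1} ([T]_alpha^beta)^* B.
C = gram(alpha)
B = gram(beta)
Tadj_formula = np.linalg.solve(C, T_ab.conj().T @ B)

formula_agrees = bool(np.allclose(Tadj_direct, Tadj_formula))
naive_agrees = bool(np.allclose(Tadj_direct, T_ab.conj().T))
print(formula_agrees, naive_agrees)
```

With these choices the Gram-matrix formula reproduces the directly computed matrix of $T^{*}$, while the plain conjugate transpose of $\left[T\right]_{\alpha}^{\beta}$ does not. Using `np.linalg.solve` rather than forming $\mathbf{C}^{-1}$ explicitly is a design choice of this sketch, not something the answer prescribes.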