Why did mathematicians choose the inner product to be linear in the first argument instead of the second?


From my limited experience with inner product spaces, it seems like the inner product being linear in the second argument would facilitate smoother notation. For instance, for $x \in H$, we could define $x^* \in H^*$ by $$x^* y = \langle x, y\rangle.$$ This would generalize the fact that $x^T y = \langle x, y\rangle$ on $\mathbb{R}^n$.
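As a quick sanity check of the identification above (a sketch of my own in NumPy, not part of the question):

```python
import numpy as np

# Minimal sketch (my own illustration): on R^n the standard inner
# product <x, y> coincides with the matrix product x^T y.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
assert np.isclose(x.T @ y, np.dot(x, y))  # both give 32.0

# The second-argument-linear convention on C^n would analogously read
# <x, y> = x^* y, i.e. conjugate-transpose on the left.  np.vdot
# conjugates its first argument, so it matches this convention.
z = np.array([1 + 1j, 2 - 1j])
w = np.array([3j, 1 - 2j])
assert np.isclose(z.conj().T @ w, np.vdot(z, w))
```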

Does linearity in the first argument make for smoother notation in some other aspect of Hilbert space theory?

Best Answer

I have taught linear algebra using both conventions and I agree with your conclusion. I found the "physicist" convention (anti-linear in the first argument, linear in the second) to have more advantages than disadvantages when working over $\mathbb{C}$ (or working simultaneously over $\mathbb{F}$ where $\mathbb{F} \in \left \{ \mathbb{R}, \mathbb{C} \right \}$). Those include:

  1. It is now standard that vectors are identified with column vectors while covectors are identified with row vectors. Thus, the standard inner product on $\mathbb{R}^n$ is written in terms of matrix product as $\vec{x}^T \cdot \vec{y}$ (and cannot be written as $\vec{x} \cdot \vec{y}^T$). By replacing $T$ with $*$, one gets a standard inner product $\vec{x}^{*} \cdot \vec{y}$ on $\mathbb{C}^n$ which generalizes the real case and is naturally anti-linear in the first variable. In order to describe the standard inner product using a linear-in-the-first-variable convention on column vectors, one must define $\left< \vec{x}, \vec{y} \right> = \vec{y}^{*} \cdot \vec{x}$ which is more awkward.
  2. The Riesz anti-isomorphism $V \to V^{*}$ is given by $v \mapsto \left< v, \cdot \right>$. This is consistent with the idea that "$v$ acts on some vector $w$ by $\left< v, w \right>$" and is even clearer with the bra-ket notation, in which a vector $v \in V$ defines a linear functional $\left< v \right|$ by $\left< v \right|(w) := \left< v \, | \, w \right>$. This imposes the requirement that the inner product is linear in the second variable.
  3. The expansion of a vector $v$ in an orthonormal basis $(e_1,\dots,e_n)$ is written as $\sum_{i=1}^n \left< e_i, v \right> e_i$, which is consistent with the dual space notation $\sum_{i=1}^n e^i(v) e_i$, where $e^i$ is the $i$-th element of the dual basis, which gives you the $i$-th coordinate of a vector.
  4. The matrix coefficients of a linear operator $T$ with respect to an orthonormal basis $e_1,\dots,e_n$ are given by $a_{ij} = \left< e_i, T(e_j) \right>$ (as opposed to $a_{ij} = \left< T(e_j), e_i \right>$, which is more awkward), while the matrix coefficients of $T^{*}$ are given by $\left< e_i, T^{*}(e_j) \right> = \overline{\left< e_j, T(e_i) \right>} = \overline{a_{ji}}$.
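Points 1 and 4 above can be checked numerically; here is a sketch of my own in NumPy (the names `inner`, `T`, etc. are my choices, not from the answer). `np.vdot` conjugates its first argument, so it realizes exactly the physicist convention $\left< x, y \right> = \vec{x}^{*} \cdot \vec{y}$:

```python
import numpy as np

# <a, b> = conj(a)^T b: anti-linear in the first slot, linear in the second.
inner = np.vdot

x = np.array([1 + 2j, 3 - 1j])
y = np.array([2 - 1j, 1j])
c = 2 + 3j
# Point 1: anti-linearity in the first variable, linearity in the second.
assert np.isclose(inner(c * x, y), np.conj(c) * inner(x, y))
assert np.isclose(inner(x, c * y), c * inner(x, y))

# Point 4: with the standard (orthonormal) basis e_1, ..., e_n,
# a_ij = <e_i, T(e_j)> recovers exactly the (i, j) entry of T's matrix.
T = np.array([[1 + 1j, 2], [0, 3 - 2j]])
e = np.eye(2)
A = np.array([[inner(e[:, i], T @ e[:, j]) for j in range(2)]
              for i in range(2)])
assert np.allclose(A, T)
```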

The only mildly annoying thing I noticed with the "physicist" convention is that the defining property of the adjoint operator is naturally written as $\left< T^{*}v, w \right> = \left< v, Tw \right>$, while I was used to the form $\left< Tv, w \right> = \left< v, T^{*}w \right>$. Both forms are equivalent, but if one wants to use the Riesz anti-isomorphism to justify the existence of $T^{*}$, the form $\left< T^{*}v, w \right> = \left< v, Tw \right>$ is more natural, and it takes some time to get used to.
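For concreteness, the defining property $\left< T^{*}v, w \right> = \left< v, Tw \right>$ can be verified numerically; a sketch of my own, where $T^{*}$ is the conjugate transpose of the matrix of $T$:

```python
import numpy as np

# Hedged numerical check (my own example): with <., .> linear in the
# second variable and T* the conjugate transpose of T, the identity
# <T* v, w> = <v, T w> holds for all v, w.
rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

inner = np.vdot  # conjugates first argument: <a, b> = conj(a)^T b
assert np.isclose(inner(T.conj().T @ v, w), inner(v, T @ w))
```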