Refining my knowledge of the imaginary number


So I am about halfway through complex analysis (using Churchill and Brown's book) right now. I began thinking some more about the nature and behavior of $i$ and ran into some confusion. I have seen the definition of $i$ in two different forms: $i = \sqrt{-1}$ and $i^2 = -1$. Now I know that these two statements are not equivalent, so I am confused as to which is the 'correct' definition. I see quite frequently that the first form is a common mistake, but then again Wolfram Math World says otherwise. So my questions are:

  1. What is the 'correct' definition of $i$ and why? Or are both definitions correct and you can view the first one as a principal branch?

  2. It seems that if we are treating $i$ as the number with the property $i^2 = -1$, it is implied that we are treating $i$ as a concept and not necessarily as a "quantity"?

  3. If we are indeed treating $i$ as a concept rather than a "quantity", how would things such as $i^i$ and other equations/expressions involving $i$ be viewed? How would such an equation have value if we treat $i$ like a concept?

I've checked around on the various imaginary number posts on this site, so please don't mark this as a duplicate. My questions are different than those that have already been asked.

8

There are 8 best solutions below

4
On
  1. As you probably know, there are two solutions to $x^2+1=0$. We arbitrarily call one of them $i$ and the other $-i$, but this choice could have been made the other way (which is why complex conjugation is an automorphism of $\mathbb C$). So, both definitions are essentially correct, but of course, the first one is slightly less correct.

  2. What is the difference between a "concept" and a "quantity"? This question seems more philosophical than mathematical...

  3. Well, $i^i$ is usually defined as $e^{i\log i}$, where $e^{a+bi}=e^a(\cos b+i\sin b)$ and $\log(re^{i\theta})=\ln r+i(\theta+2n\pi)$, where $n$ is any integer (the logarithm is a multivalued function since the exponential function is not injective).
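
To make point 3 concrete, here is a minimal Python sketch that evaluates $e^{i\log i}$ on a few branches. The helper name `i_to_the_i` and the branch index `n` are introduced here for illustration only. Since $\log i = i(\pi/2 + 2n\pi)$, every branch should come out real.

```python
import cmath
import math

def i_to_the_i(n: int) -> complex:
    """Value of i**i on the branch where log(i) = i*(pi/2 + 2*n*pi)."""
    log_i = complex(0.0, math.pi / 2 + 2 * n * math.pi)
    return cmath.exp(1j * log_i)

# Print a few branches; n = 0 is the principal branch.
for n in (-1, 0, 1):
    print(n, i_to_the_i(n))
```

Python's own `1j ** 1j` uses the principal branch, so it should agree with `i_to_the_i(0)`.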

5
On

The definition of complex number is given on page 1 of Churchill and Brown's book:

Complex numbers can be defined as ordered pairs $(x, y)$ of real numbers…

The definition of $i$ is given on page 2:

…let $i$ denote the pure imaginary number $(0,1)$…

So to answer your question, $i$ is not defined by the equation $i=\sqrt{-1}$, nor is it defined by the equation $i^2=-1$. Instead, it is defined as a particular ordered pair of real numbers, $i=(0,1)$. Then, given the definition of complex multiplication, one proves that $i^2=-1$.
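
As a rough illustration of that last step, the ordered-pair arithmetic can be sketched in a few lines of Python. The names `Pair`, `add` and `mul` are my own, and the product rule used is the standard one, $(x_1,y_1)(x_2,y_2) = (x_1x_2 - y_1y_2,\; y_1x_2 + x_1y_2)$.

```python
from typing import Tuple

Pair = Tuple[float, float]  # (x, y) stands for the complex number x + iy

def add(z: Pair, w: Pair) -> Pair:
    """Componentwise sum of two ordered pairs."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z: Pair, w: Pair) -> Pair:
    """Product of two ordered pairs under the complex multiplication rule."""
    x1, y1 = z
    x2, y2 = w
    return (x1 * x2 - y1 * y2, y1 * x2 + x1 * y2)

i = (0.0, 1.0)
print(mul(i, i))  # the pair that is identified with the real number -1
```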

3
On

it is easy to become the captives of our preconceptions. if you have honed a very efficient machine $\sqrt{}$ for extracting the square roots of positive real numbers you will rightly rub your head in puzzlement if asked to operate it on $-1$. but rather than focus on the genesis of $i$, celebrate the vistas it opens up, with its quantum leap through the sign barrier which constrains the arithmetic of real numbers. that one magical property $i^2=-1$ has far-reaching consequences.

take, for example, how $i$ interacts with another very basic mathematical reality - the circular functions and their remarkable combinatorial properties. it can be shown on the basis of geometry that: $$ \def\c{\cos}\def\s{\sin} \c(x+y)=\c(x)\c(y)-\s(x)\s(y) $$ and $$ \s(x+y)=\s(x)\c(y)+\s(y)\c(x) $$ using these identities it is easy to give an inductive proof that for any integer $n \ge 0$ we have: $$ (\c(x) + i \s(x))^n = \c(nx) + i \s(nx) $$ this is easily extended to rational values of $n$ and Lo! suddenly you have the infinite family of roots of unity at your disposal. want a number whose fifth power is $1$? well, you could always have $1$! but now there are also $e^{\frac{2 \pi i}{5}},e^{\frac{4 \pi i}{5}},e^{\frac{6 \pi i}{5}}$ and $e^{\frac{8 \pi i}{5}}$. of course in order to win the right to describe them as exponentials you have to plug $x=i\theta$ into the Maclaurin expansion for $e^x$ to see how this gives rise to the identity: $$ e^{i\theta}=\cos\theta +i\sin\theta $$ in short, learn to work with $i$, enjoy it! you can of course explore various dusty definitions, but $i$ is such a special number that you cannot really assimilate it to any previous mode of mathematical thinking. look at the transformations it has wrought in physics over the last 150 years.
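
A quick numerical check of the two claims above (a sketch only, not a proof): the power identity for one sample angle, and the five fifth roots of unity. The helper `cis` and the sample values `x = 0.7`, `n = 5` are arbitrary choices made here.

```python
import cmath
import math

def cis(x: float) -> complex:
    """cos(x) + i sin(x), built directly from the circular functions."""
    return complex(math.cos(x), math.sin(x))

# De Moivre's identity for one sample angle and exponent.
x, n = 0.7, 5
lhs = cis(x) ** n
rhs = cis(n * x)
print(abs(lhs - rhs))  # should be at rounding-error level

# The five fifth roots of unity, written as exponentials.
roots = [cmath.exp(2j * math.pi * k / 5) for k in range(5)]
for w in roots:
    print(w, abs(w ** 5 - 1))
```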

0
On

I would like to add a bit to Chris Culter's excellent answer. It is possible that there is more than one possible correct definition for $i$. The definition Chris Culter quoted from your book is one such. One might ask if the definitions you quoted are also correct. The answer is that neither one is.

The expression “$\sqrt x$” is defined to mean the unique non-negative real number $y$ such that $y^2 = x$. This is a good definition when $x$ is a non-negative real number, but it is completely meaningless for other values of $x$, because for $x = -1$, say, there is no non-negative real number $y$ such that $y^2 = -1$. So an attempt to define $i$ by saying $i=\sqrt{-1}$ is an immediate failure; there is no such thing. The $\sqrt{}$ operator simply does not do that.

Going the other way, one can't define $i$ by simply saying that $i^2=-1$; that is not a definition. It describes the properties that we want $i$ to have, but it does not describe any object that actually has those properties, and there may not actually be one. It's very easy to assemble a list of properties that is not possessed by any object. For example, “the largest invisible purple water buffalo in Dubuque, Iowa” is a description of that sort; you can describe its properties, but there is no such thing. Or again, a recent question here asked for the volume of a polyhedron whose faces are five equilateral triangles. But there is no such polyhedron: every polyhedron with only triangular faces has an even number of them.

We have an idea of a number $i$ with $i^2=-1$, but in order to show that this idea is coherent, we must actually construct something that behaves that way. One way to do this is the construction in your book:

  1. We take the set of pairs of real numbers $\langle a,b\rangle$
  2. Define addition and multiplication operations on them
  3. Show that these operations have the properties one usually expects, such as $a\cdot(b+c) = (a\cdot b ) + (a\cdot c)$.
  4. Show that the resulting structure contains a proper substructure, namely the set of elements of the form $\langle a, 0\rangle$, that behaves just like the real numbers, so we can identify this substructure as the real numbers, and the element $\langle -1, 0\rangle$ is effectively the same as the real number $-1$
  5. Show that this structure contains an element, namely $i=\langle 0, 1\rangle$ (or $i=\langle 0,-1\rangle$ if you prefer), which has the property that $i\cdot i = -1$.

This is not the only way to define $i$; there are many structures we could invent that would behave the way we want the complex numbers to behave. In advanced algebra courses, one uses the same strategy as in the previous paragraph, but one constructs the complex numbers as classes of polynomials instead of as ordered pairs:

  1. Divide the polynomials into certain classes.
  2. Define addition and multiplication of these classes
  3. Show that these operations have the usual properties
  4. Show that some of the classes behave the same way that the real numbers behave, and in particular that there is a class $[-1]$ that corresponds to the real number $-1$
  5. Show that there is a certain class $[x]$ which has the property that $[x]\cdot[x] = [-1]$.
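
The polynomial steps above can be sketched in Python, assuming the classes are taken modulo $x^2+1$ (the list itself does not spell out which classes are meant). Each class is represented by its remainder $a+bx$, stored as the pair `(a, b)`; the function names are hypothetical.

```python
from typing import List, Tuple

def reduce_mod_x2_plus_1(coeffs: List[float]) -> Tuple[float, float]:
    """Remainder a + b*x of a polynomial modulo x**2 + 1.

    coeffs[k] is the coefficient of x**k. Because x**2 is congruent
    to -1, the powers of x cycle through 1, x, -1, -x.
    """
    a = b = 0.0
    for k, c in enumerate(coeffs):
        sign = -1.0 if (k // 2) % 2 else 1.0
        if k % 2 == 0:
            a += sign * c
        else:
            b += sign * c
    return (a, b)

def mul_classes(p: List[float], q: List[float]) -> Tuple[float, float]:
    """Multiply two polynomials, then reduce the product modulo x**2 + 1."""
    prod = [0.0] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            prod[i + j] += pc * qc
    return reduce_mod_x2_plus_1(prod)

x = [0.0, 1.0]            # the polynomial x
print(mul_classes(x, x))  # the class [x]*[x], to compare with [-1]
```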

So there is more than one possible definition, but neither one of the suggestions you gave is enough to do it. Vladimirm's answer elsewhere in this thread shows yet another way to define $i$, again following the same basic strategy.

3
On

You could think of complex numbers as special linear transformations, of the form $\begin{bmatrix} a & -b \\ b & a\end{bmatrix}$ where this matrix represents the complex number $z = a + ib$, a transformation between points in $\mathbb{R}^2$. This matrix represents a rotation combined with a uniform scaling: you first rotate your point and then scale it. This can perhaps be better observed from the exponential representation: if $z = re^{i\theta}$ then the corresponding transformation matrix would be

$$ \begin{bmatrix} a & -b \\ b & a\end{bmatrix} = \begin{bmatrix} r & 0 \\ 0 & r\end{bmatrix}\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta)\end{bmatrix} = \begin{bmatrix} r\cos(\theta) & -r\sin(\theta) \\ r\sin(\theta) & r\cos(\theta)\end{bmatrix} $$

You multiply two complex numbers by multiplying their corresponding matrices, and multiplication of matrices of this form is commutative. Then $0$, $1$ and $i$ are defined as

$$ 0 = \begin{bmatrix} 0 & 0 \\ 0 & 0\end{bmatrix} \\ 1 = \begin{bmatrix} 1 & 0 \\ 0 & 1\end{bmatrix} \\ i = \begin{bmatrix} 0 & -1 \\1 & 0\end{bmatrix} $$

As you can see when complex numbers are defined this way there is nothing mystical about the number $i$. It's just another transformation, a $90^{\circ}$ counterclockwise rotation.
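
A small Python sketch of this matrix picture, using plain nested lists so that it needs no libraries. The helper names `as_matrix` and `matmul` are illustrative only.

```python
from typing import List

Matrix = List[List[float]]

def as_matrix(a: float, b: float) -> Matrix:
    """The 2x2 matrix that represents the complex number a + ib."""
    return [[a, -b], [b, a]]

def matmul(m: Matrix, n: Matrix) -> Matrix:
    """Ordinary product of two 2x2 matrices."""
    return [[sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

I = as_matrix(0.0, 1.0)   # the 90 degree counterclockwise rotation
print(matmul(I, I))       # to compare with the matrix for -1

z, w = as_matrix(1.0, 2.0), as_matrix(3.0, -1.0)
print(matmul(z, w) == matmul(w, z))  # these particular matrices commute
print(matmul(z, w))                  # to compare with (1+2i)(3-i) = 5+5i
```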

EDIT

Also this way of thinking about complex numbers intuitively explains why the Cauchy-Riemann equations must hold for differentiable functions.

A complex function $f(z) = u(x,y) + iv(x,y)$ can be viewed as a simple $\mathbb{R}^2 \to \mathbb{R}^2$ transform (from the "xy" coordinate system to "uv"). So the best linear approximation of the function $f$ near the point $z = \begin{bmatrix} x \\ y \end{bmatrix}$ (if the function is differentiable at $z$) is given by the Jacobian matrix $\begin{bmatrix} u_x & u_y \\ v_x & v_y \end{bmatrix}$, which generalizes the derivative to more than one dimension: $$f(z + \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}) \approx f(z) + \begin{bmatrix} u_x & u_y \\ v_x & v_y \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}$$

But because $f$ is a complex function, differentiability at $z$ means

$$f( z + \Delta z ) = f(z) + f'(z)\Delta z +o(\Delta z)$$

Where $\Delta z = \Delta x + i\Delta y$. This means that the function is linear near $z$ ($\Delta z \to 0$), and therefore

$$f( z + \Delta z ) \approx f(z) + f'(z)\Delta z$$

where $f'(z)$ is a complex number which, when represented by a linear transformation, is actually the Jacobian of this function. But because all complex numbers are special linear transformations of the form $\begin{bmatrix} a & -b \\ b & a\end{bmatrix}$, the Cauchy-Riemann equations follow: $u_x = v_y, v_x = -u_y$.
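
This can be checked numerically for one example. The sketch below assumes $f = u + iv$ with Jacobian rows $(u_x, u_y)$ and $(v_x, v_y)$, and uses $f(z) = z^2$ at the arbitrary point $z_0 = 1 + 2i$; the function names and step size `h` are my own choices.

```python
def f(z: complex) -> complex:
    """Sample differentiable function, with f'(z) = 2z."""
    return z * z

def jacobian(func, z: complex, h: float = 1e-6):
    """Central-difference Jacobian [[u_x, u_y], [v_x, v_y]] of func = u + iv."""
    dx = (func(z + h) - func(z - h)) / (2 * h)            # u_x + i v_x
    dy = (func(z + 1j * h) - func(z - 1j * h)) / (2 * h)  # u_y + i v_y
    return [[dx.real, dy.real], [dx.imag, dy.imag]]

z0 = 1.0 + 2.0j
J = jacobian(f, z0)
fprime = 2 * z0

print(J)
# The matrix [[a, -b], [b, a]] that represents f'(z0) = a + ib:
print([[fprime.real, -fprime.imag], [fprime.imag, fprime.real]])
```

The two printed matrices should agree up to rounding, which is the Cauchy-Riemann structure in numerical form.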

0
On

The remark "set $i^2=−1$ and assume all other algebraic properties work as usual" (quoted by @JackM in the comments) is indeed vague, but it's a good intuition to hold on to (the other answers here are more precise than what I'm about to say, but in their detail miss what I think is the key insight when teaching complex numbers).

You know about the real numbers: you can add, subtract, multiply and divide; you can therefore do squares and take the square roots of some numbers; and all this makes sense, or hangs together, in the sense that there are properties such as "if $a = 1/b$, then $a \cdot b = 1$", and so on.

You know about matrices: you can add, subtract and multiply these, and while there are properties such as "if $A = -B$ then $A+B=0$" for addition and subtraction, there's no way of defining division by a matrix in any way which respects the same sort of arithmetical rules that we have for Reals.

Now imagine there's a thing called $i$. We don't care (for the moment) what it 'is' nor how to represent it, but we decide that it has the property that $i \cdot i= -1$ (so it's clearly not a Real number). Can we do anything with this number?

The answer is yes, we can, and the elementary introduction to complex numbers consists of demonstrating that we can define addition, subtraction, multiplication and division of complex numbers in a way that results in these operations having the same properties that we find in the arithmetic of the Real numbers.

This is a very Big Deal, not least because it is (historically) one of the first suggestions that those rules of arithmetic are not specific to the Real numbers, but a possible object of study themselves (and further down this road lies group theory, and the study of rings, and modules, and all that jazz).

Just by the way: One of my undergraduate epiphanies when studying complex analysis – which is the attempt to do calculus with complex numbers rather than just reals – is the point when I realised what the difference was between complex analysis and (2D) vector analysis: only in complex analysis can you define the derivative using plain old $\lim_{{\mathrm d}x\to 0} [f(x+{\mathrm d}x)-f(x)]/{\mathrm d}x$, where ${\mathrm d}x$ is a complex number, because only in the complex plane can you divide by ${\mathrm d}x$ in this way; differentiation has to be defined in a more roundabout way on the 2D plane, precisely because there's no operation of 'dividing by a vector'. It's that fact (well, that plus the Cauchy-Riemann condition) that gives the complex plane the huge (amazing) amount of structure that complex analysis reveals it has.
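
To illustrate that last remark, here is a small Python sketch that evaluates the difference quotient with a complex step taken in two different directions. The sample point, step size and the two test functions ($z^2$ and complex conjugation) are my own choices.

```python
def quotient(func, z: complex, dz: complex) -> complex:
    """Difference quotient [f(z + dz) - f(z)] / dz with a complex step dz."""
    return (func(z + dz) - func(z)) / dz

def square(z: complex) -> complex:
    return z * z

def conj(z: complex) -> complex:
    return z.conjugate()

z0 = 1.0 + 1.0j
h = 1e-6

for name, func in (("z**2", square), ("conj(z)", conj)):
    along_real = quotient(func, z0, h)       # step along the real axis
    along_imag = quotient(func, z0, 1j * h)  # step along the imaginary axis
    print(name, along_real, along_imag)
```

For $z^2$ the two quotients nearly agree, while for conjugation they come out close to $1$ and $-1$, so the limit does not exist even though the division itself is perfectly legal.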

2
On

Complex numbers can be constructed in different ways. One such way (the most common) is to consider ordered pairs of real numbers and define operations on them. In this construction you expand $\Bbb R$ by embedding it in $\Bbb R^2$ (a real number $r$ becomes the pair $(r, 0)$), you call $i$ the element $(0,1)$ and, by defining appropriate operations on this set of pairs, it happens that $i^2=(-1,0)\equiv -1$.

Another construction (less common, but more explanatory of the "nature" of complex numbers and the imaginary number $i$) is to consider $\Bbb C$ as the splitting field of $x^2+1$ over $\Bbb R$. You take the set of real polynomials $\Bbb R[x]$ and consider $$\left.\Bbb R[x]\right/(x^2+1),$$ that is, the quotient of the domain $\Bbb R[x]$ by the ideal generated by the polynomial $x^2+1$. Now, our $\Bbb C$ is this set, equipped with its operations (induced from the operations on $\Bbb R[x]$). Here we call $i$ the element $(x^2+1)+x$, that is, the coset of $x$, because it is a root of $x^2+1$. In this construction (equivalent up to isomorphism to the other) you can directly see the defining property of the element $i$, the fact that $i^2=-1$. In fact, in this construction, you take the field $\Bbb R$ and expand it by adding the roots of the polynomial $x^2+1$. Saying that $\Bbb C$ is the splitting field of $x^2+1$ over $\Bbb R$ means that $\Bbb C$ is the smallest extension of $\Bbb R$ that contains the roots of $x^2 + 1$. So $\Bbb C= \Bbb R(i)$. This is why all elements of $\Bbb C$ can be written as $a+ib$ for real $a$ and $b$.
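
A short worked version of that last claim, as a sketch (the symbols $p$, $q$, $a$, $b$ are introduced here for illustration). Dividing any real polynomial $p(x)$ by $x^2+1$ leaves a remainder of degree at most one: $$p(x) = q(x)\,(x^2+1) + (a+bx), \qquad a,b\in\Bbb R,$$ so the class of $p$ in the quotient is the class of $a+bx$, which is $a+ib$. Taking $p(x)=x^2$ gives $$x^2 = 1\cdot(x^2+1) + (-1),$$ which is exactly the statement $i^2=-1$.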

0
On

The "nature" of a thing is what you can use it for. In your complex analysis class, you will learn how complex numbers help you solve algebraic equations, simplify trigonometry, understand power series, do computations in integral and differential calculus, do geometry, and other things. In other courses, you may learn how to use them to do Fourier analysis, compute with quantum mechanics, motivate ideas in algebraic geometry, or any number of things.

All of those things are the "nature" of the complex numbers.

It may be interesting that the very origins of the subject were "Huh, if I pretend I can take square roots of negative numbers and push symbols around as if they were actually numbers, I can find the correct solutions to cubic and quartic equations." This, incidentally, is why they were originally called "imaginary" numbers. (and, unfortunately, the term has stuck even though we've known better for centuries)

I vaguely recall that, except for finance, the use of negative numbers in calculations was also novel at this time, and their use in solving cubic equations was a big part of helping those go mainstream too. (Someone correct me if this is wrong)

The point of a definition is for exposition -- to give you enough initial bits of information so that you can start using the tools to see what they can do and how they fit together. The fine details about what is actually a definition and how to set up the foundations of a subject are not important at all to using and understanding complex analysis.

However, understanding such things is useful as an example of how to go about constructing your own mathematical objects when you need to do so in the future. It is also possibly useful as a test case if you're interested in studying formal logic.