Linearity - terminology

I know that, to qualify as linear, a function must satisfy:

1) $f(u+v)=f(u)+f(v)$

2) $f(\lambda u)=\lambda f(u)$

and a matrix multiplication $Ax=b$ is considered a linear operation because $b$ is a linear combination of the columns of $A$ (any linear transform = matrix).

However, when you think of a system of linear equations (here let's say n vars and n equations) such as this one:

$a_{11}x_{11}+a_{12}x_{12}+\dots+a_{1n}x_{1n}=y_1$

$\vdots$

$a_{n1}x_{n1}+a_{n2}x_{n2}+\dots+a_{nn}x_{nn}=y_n$

If we consider the 1D case (only 2 vars), I prefer, to make my question clearer, to write the system in terms of $x$'s and $\beta$'s, where this time the $\beta$'s are the unknown parameters of the model, i.e., the slope and y-intercept in this specific case. The system can then be written in matrix form as $X\beta=y$:

$x_{11}\beta_{11}+x_{12}\beta_{12}=y_{1}$

$x_{21}\beta_{21}+x_{22}\beta_{22}=y_{2}$

where $x_{i1}=1$, which means that we can rewrite this as:

$\beta_{11}+x_{12}\beta_{12}=y_{1}$

$\beta_{21}+x_{22}\beta_{22}=y_{2}$

and therefore we have 2 equations of the form $y=mx+b$, which, if $b$ is non-zero, is normally considered affine.

So, finally, my question is about terminology: we talk about systems of linear equations, and indeed matrix multiplication is linear, yet what we have here are affine functions (lines with potentially non-zero y-intercepts). If a kind and very knowledgeable person could clarify this for me, I would be very grateful, as I care a lot about such details.

There are 3 best solutions below

1
On BEST ANSWER

Your definition of a linear function is correct, provided you assume the right definitions of "+" and scalar multiplication, which you left undefined. Maybe it is beneficial to spell them out:

Let $X, Y$ be two vector spaces over the same field $\mathbb{K}$. A function $f: X \to Y$ is called linear if and only if

  • for every $x_1, x_2 \in X$ we have: $f(x_1 + x_2) = f(x_1) \oplus f(x_2)$
    where "+" is the addition from the vector space $X$ and $\oplus$ the one from $Y$.

  • for every $\lambda \in \mathbb{K}$ and $x \in X$ we have: $f(\lambda \cdot x) = \lambda \odot f(x)$
    where $\cdot$ is the scalar multiplication from the vector space $X$ and $\odot$ the one from $Y$.
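
As a rough illustration of the two conditions, here is a minimal Python sketch for the special case $X = Y = \mathbb{R}^2$ with the usual componentwise operations. The matrix `A`, the offset `shift`, and the helper names are all made up for this example. Passing the check on one sample is only evidence of linearity, not a proof, whereas a failure is a genuine counterexample.

```python
def mat_vec(A, x):
    """Multiply a matrix A (given as a list of rows) by a vector x (a list)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def add(u, v):
    """Componentwise vector addition."""
    return [ui + vi for ui, vi in zip(u, v)]

def scale(lam, u):
    """Scalar multiplication of a vector."""
    return [lam * ui for ui in u]

def is_linear_on(f, u, v, lam):
    """Check additivity and homogeneity of f on one sample (u, v, lam)."""
    additive = f(add(u, v)) == add(f(u), f(v))
    homogeneous = f(scale(lam, u)) == scale(lam, f(u))
    return additive and homogeneous

A = [[2, -1], [0, 3]]   # hypothetical matrix
shift = [1, 1]          # hypothetical non-zero offset

def linear_map(x):
    return mat_vec(A, x)              # x -> A x

def affine_map(x):
    return add(mat_vec(A, x), shift)  # x -> A x + shift

u, v, lam = [1, 2], [3, -4], 5
print(is_linear_on(linear_map, u, v, lam))  # True
print(is_linear_on(affine_map, u, v, lam))  # False
```

The affine map fails because the offset is added once on the left-hand side of the additivity condition but twice on the right-hand side.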

A system of linear equations can be defined as

  • having two vector spaces $X, Y$ (e.g. $X = Y = \mathbb{R}^2$)
  • a linear function $f: X \to Y$ between them (e.g. represented by a matrix if $X, Y$ are finite-dimensional vector spaces; in your post you called this matrix $X$)
  • and a "target" vector $y \in Y$ (e.g. your $y = (y_1, y_2)^T$)
  • and the problem: Seek $x \in X$ such that $f(x) = y$.

The word "linear" in "system of linear equations" just refers to $f$ being linear. It does not necessarily mean that each individual equation represents a linear function: as you correctly saw, they are not necessarily linear, but only affine. Note also that the terminology "system of equations" only makes sense if $X, Y$ are finite-dimensional vector spaces; see also the last paragraph of this post.

Let's take your example:

$\beta_{11}+x_{12}\beta_{12}=y_{1}$
$\beta_{21}+x_{22}\beta_{22}=y_{2}$

That's equivalent to

$$\begin{bmatrix}1 & x_{12} & 0 & 0\\0 & 0 & 1 & x_{22}\end{bmatrix}\begin{bmatrix}\beta_{11} \\ \beta_{12} \\ \beta_{21} \\ \beta_{22}\end{bmatrix} = \begin{bmatrix}y_1 \\ y_2\end{bmatrix}$$

Maybe you can see the resemblance to our definition above: as you seem to know, the matrix can be identified with a linear function $f$ with $(\beta_{11}, \beta_{12}, \beta_{21}, \beta_{22})^T$ as its input argument, i.e. something from the "input" space $X$.
Hence we choose the input space $X = \mathbb{R}^4$. The output space is determined by $(y_1, y_2)^T$, thus we choose $Y = \mathbb{R}^2$. You could also argue from the matrix: 4 is its number of columns and 2 its number of rows.
Now it's clear that this system matches our definition above with a function $f: \mathbb{R}^4 \to \mathbb{R}^2$. I trust you can write down the definition of $f$ yourself.
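
If it helps, the map $f$ can be sketched in Python as follows. The values chosen for $x_{12}$, $x_{22}$ and for the $\beta$ vectors are hypothetical and only serve to illustrate the point: $f$ is linear in the unknowns $\beta$, while a single row, read as a function of the data value $x$ for fixed $\beta$, is only affine.

```python
def f(beta, x12, x22):
    """Apply the 2x4 matrix [[1, x12, 0, 0], [0, 0, 1, x22]] to beta in R^4."""
    M = [[1, x12, 0, 0],
         [0, 0, 1, x22]]
    return [sum(m * b for m, b in zip(row, beta)) for row in M]

x12, x22 = 2, 5      # hypothetical data values

b = [1, 3, -2, 4]    # hypothetical (beta_11, beta_12, beta_21, beta_22)
c = [0, 1, 7, -1]    # a second hypothetical parameter vector

y_b = f(b, x12, x22)                                     # [7, 18]
y_c = f(c, x12, x22)                                     # [2, 2]
y_sum = f([bi + ci for bi, ci in zip(b, c)], x12, x22)   # [9, 20]

# Additivity and homogeneity in beta hold on this sample.
print(y_sum == [p + q for p, q in zip(y_b, y_c)])            # True
print(f([3 * bi for bi in b], x12, x22) == [3 * p for p in y_b])  # True

def row(x):
    """First equation read as a function of x for fixed beta: beta_11 + x * beta_12."""
    return b[0] + x * b[1]

# As a function of x the row is affine, not linear: additivity fails.
print(row(1) + row(2) == row(1 + 2))  # False
```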


Beware that the following you said in your post is not true:

(any linear transform = matrix) [sic!]

If the linear transformation (function) is between finite-dimensional vector spaces only and you choose fixed bases, then yes, you can express every such linear function as a matrix. That's the so-called representation matrix (matrix representation) with respect to the chosen bases.

If either $X$ or $Y$ is an infinite-dimensional vector space, i.e. bases are now infinite (totally ordered) sets, then representation matrices are usually not defined. Importantly, the definition of "system of linear equations" given above still makes sense, though, since we nowhere mentioned matrices to begin with.

2
On

The OP seems to understand the meaning of linear and affine as used in Linear Algebra well.

A line is the graph of an affine function/transformation, so the related equations are called "linear", as in, related to a line (and before a study of Linear Algebra, the affine function might itself confusingly be referred to as a "linear function").
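
To see concretely why such a function is affine rather than linear, take $g(x) = mx + b$ (the name $g$ is chosen here only for illustration) and test the two conditions from the question:

$$g(u+v) = m(u+v) + b, \qquad g(u) + g(v) = m(u+v) + 2b,$$

$$g(\lambda u) = \lambda m u + b, \qquad \lambda\, g(u) = \lambda m u + \lambda b.$$

The two sides of the first line agree only if $b = 0$, and the two sides of the second line agree for every $\lambda$ only if $b = 0$. So the graph is a line for any $b$, but $g$ is linear in the Linear Algebra sense only when the line passes through the origin.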

2
On

"a matrix multiplication $Ax=b$ is considered linear operation because b is a linear combination of the columns of A (any linear transform = matrix)." :

Actually $Ax=b$ is not an "operation" at all, it's an equation.

If $A$ is a matrix and you define $f(x)=Ax$ then $f$ is linear. It's not "considered linear" because it's a linear combination of whatever; $f$ is linear, because it satisfies the definition: $f(x+y)=f(x)+f(y)$ and $f(\lambda x)=\lambda f(x)$.
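
For completeness, here is a short componentwise verification, writing $a_{ij}$ for the entries of $A$ (notation assumed here, not taken from the question):

$$\big(A(x+y)\big)_i = \sum_j a_{ij}(x_j + y_j) = \sum_j a_{ij}x_j + \sum_j a_{ij}y_j = (Ax)_i + (Ay)_i,$$

$$\big(A(\lambda x)\big)_i = \sum_j a_{ij}(\lambda x_j) = \lambda \sum_j a_{ij}x_j = \lambda\,(Ax)_i.$$

Both identities hold for every row index $i$, which is exactly $f(x+y)=f(x)+f(y)$ and $f(\lambda x)=\lambda f(x)$.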