Motivation of Linear Functions


I understand that a linear function preserves basic vector space properties, that for Euclidean spaces it preserves parallel lines, and that systems of linear equations can be solved by viewing them as a linear function. At least to me, this does not give grounds to define linear functions, because these problems could be solved without such notions. What is the motivation for defining linear functions?


There are 2 best solutions below


I think what you are really asking is why linear functions are special enough to deserve study separate from other vector functions.

Here is a high-level explanation. In science and engineering, a physical phenomenon is captured by laws of physics, and these laws are described with mathematical equations. To solve them, the problem is discretized (meaning that rather than looking at a continuous set of points, you look at a discrete set of points, using calculus to control the approximation). At that point, you end up with some function $ f : \mathbb{R}^n \rightarrow \mathbb{R}^m $ (of course, it could be over some other field). Frequently, one is then interested in solving the inverse problem: it is observed that $ f(x) = y $, where $ y $ is given and $ x $ is desired. This is equivalent to finding the roots of $ f(x) - y = 0 $.

From functions in 1D you remember that finding the roots of an arbitrary function is a difficult problem, but finding the root of a linear equation is simple. To solve (some) nonlinear equations you learned to use Newton's method, which locally linearizes the problem. In higher dimensions, you can do the same: you linearize $ f(x) $ locally as $ L(x) $ and then solve $ L(x) = y $ as part of a Newton iteration. Bingo: linear functions are at the root of many solution methods, and therefore they are worth studying in depth.
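To make the idea concrete, here is a minimal sketch of that Newton iteration in $\mathbb{R}^2$. The toy function $f$, the starting point, and the tolerances are my own illustrative choices, not from the answer itself; the point is only that each step of the *nonlinear* root-find reduces to solving a *linear* system.

```python
import numpy as np

# Toy nonlinear problem (illustrative choice): solve f(x) = y for
#   f(x) = (x0^2 + x1^2, x0 - x1).
def f(x):
    return np.array([x[0]**2 + x[1]**2, x[0] - x[1]])

def jacobian(x):
    # The Jacobian J(x) is the local linearization of f near x.
    return np.array([[2 * x[0], 2 * x[1]],
                     [1.0,      -1.0]])

def newton(y, x0, tol=1e-10, max_iter=50):
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        residual = y - f(x)
        if np.linalg.norm(residual) < tol:
            break
        # Each Newton step is a linear solve: J(x) dx = y - f(x).
        dx = np.linalg.solve(jacobian(x), residual)
        x = x + dx
    return x

# Solve f(x) = (2, 0); the iterates converge to x = (1, 1),
# since f(1, 1) = (1 + 1, 1 - 1) = (2, 0).
x_star = newton(y=np.array([2.0, 0.0]), x0=[1.0, 0.5])
```

Every practical variant of this scheme (damped Newton, quasi-Newton, etc.) keeps the same skeleton: the hard nonlinear question is answered by a sequence of easy linear ones.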

(Obviously, this is a very high level, somewhat naive description, but perhaps it will get you to continue to study linear functions.)


Linear functions arise in many, many applications because many things are inherently linear in nature. Additionally, because linear functions are so simple, it is common to model even complicated functions as linear over some range. Below are some examples:

  • Ohm's Law states that current ($I$), voltage ($V$), and resistance ($R$) are related by the equation $$V = IR$$ which is evidently linear in $R$ for a given $I$, and vice versa.
  • Consider the vector $p\in \mathbf{R}^n$ where $p_i$ is the price per unit of item $i$, and $q\in \mathbf{R}^n$ where $q_i$ is the quantity purchased of item $i$. Then the total value of your inventory is $v = \sum_{i=1}^n p_iq_i$, which is evidently a linear function of $q$ for a given set of prices, or linear in $p$ for a given inventory.
  • Consider a vector $x\in\mathbf{R}^n$ where $x_i$ is the sound intensity at time $i$. Throwing out every other sample (downsampling) halves the memory the signal takes (while degrading sound quality), giving a vector $y\in\mathbf{R}^{n/2}$. This is a linear function of the input $x$.
  • Color can be represented by a vector $c\in \mathbf{R}^4$ whose components are red, green, blue, and alpha (which controls transparency). Applying linear transformations to an image alters the color components for retouching/editing a photo.
  • A matrix $S\in \mathbf{R}^{m\times n}$ holds students' scores, where each row is a student and each column is a test. Multiplying the matrix on the right by $x = 1/n\cdot \mathbf{1}$ gives the average test score of each student. Multiplying instead by a vector $p\in\mathbf{R}^n$ with $\sum_{i=1}^n p_i = 1$ and $p_i\ge 0$ gives a weighted average of each student's scores. In this case the vector of grades $g=Sp$ is clearly a linear function of the weights.
  • It's common to fit polynomial curves to data with equations of the form $y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_n x^n$. This model is linear in $\theta$ and is commonly used in machine learning and data science.
  • A time series $x\in \mathbf{R}^T$ can be smoothed by generating a new output $y\in\mathbf{R}^{T-1}$ whose samples are averages of adjacent samples, $y_i = (x_i + x_{i+1})/2$. This can be described by a linear function $y = Ax$.
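Several of the maps above are small enough to write out explicitly as matrices. Here is a minimal sketch (the signal length and values are my own toy choices) showing downsampling and adjacent-sample smoothing as matrix-vector products $y = Ax$, plus a numerical check of linearity:

```python
import numpy as np

n = 8
x = np.arange(n, dtype=float)  # a toy "signal" of length n

# Downsampling: keep every other sample.  D is (n/2) x n, with a
# single 1 per row picking out one even-indexed entry of x.
D = np.zeros((n // 2, n))
for i in range(n // 2):
    D[i, 2 * i] = 1.0
downsampled = D @ x            # same as x[::2]

# Smoothing: y_i = (x_i + x_{i+1}) / 2.  A is (n-1) x n with two
# entries of 1/2 per row.
A = np.zeros((n - 1, n))
for i in range(n - 1):
    A[i, i] = 0.5
    A[i, i + 1] = 0.5
smoothed = A @ x               # same as (x[:-1] + x[1:]) / 2

# Linearity check: A(a*x + b*z) == a*(A x) + b*(A z) for any a, b, z.
z = np.random.default_rng(0).standard_normal(n)
lhs = A @ (3 * x - 2 * z)
rhs = 3 * (A @ x) - 2 * (A @ z)
assert np.allclose(lhs, rhs)
```

Writing the operation as an explicit matrix is what lets all the general machinery (solving, composing, inverting, eigen-analysis) apply uniformly to downsampling, smoothing, grading, and the rest.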

This is a tiny fraction of the applications of linear functions; they arise naturally in Google's PageRank, principal component analysis, Kirchhoff's voltage law, Newton's laws, and so on. There are many others in fields I probably know nothing about. For more interesting examples, have a look at Stephen Boyd's book on applied linear algebra (sections 1.4 and 6.4 have some of the examples I already mentioned).