Recently, while learning about regularization methods in machine learning, I came across orthogonal polynomials (specifically Legendre polynomials). I looked them up on the Internet and found formulas, but no intuition behind them. Can someone explain in simple terms what these orthogonal polynomials are and when we should use them?
2026-03-26 05:54:39
Intuition of orthogonal polynomials?
5.5k Views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
There is 1 best solution below.
Okay, here is a primer on the stuff you need to know to see why we like Orthogonal Polynomials.
Vectors
In physics, vectors are usually things that live in three-dimensional space. In computer science, they tend to be lists of numbers. In mathematics, a vector is a thing that lives in a vector space. A vector space is designed so that all the simple things you want to do to vectors work: you can add them, there is a zero and there are additive inverses, and you can scale them by multiplying by numbers, which live in the field of scalars (which we can take to be $\mathbb{R}$ for the purposes of this discussion, so we are talking about real vector spaces). All the usual properties extend from the vector space $\mathbb{R}^3$ over the field $\mathbb{R}$ that physics uses; what does not extend at this stage is the notion of a product.
A vector space has a positive integer called the dimension associated to it, which is the smallest number of fixed vectors $v_i$ you need to generate the whole thing by using linear combinations $\sum_i \alpha_i v_i$, where $\alpha_i$ are scalars (e.g. $\mathbb{R}^n$ has dimension $n$). This number is the most important quantity associated to a vector space, because one can prove that over the same field, if two vector spaces have the same dimension, they are isomorphic. A set of $v_i$ of minimal size that generates the whole space in this way is called a basis.
The sensible sort of map on vector spaces is one that keeps the addition and the scalar multiplication structure intact: $$ L\left(\sum_i \alpha_i u_i\right) = \sum_i \alpha_i L(u_i) $$ for any set of vectors $\{u_i\}$. These are called linear maps.
The definition of vector space covers the spaces we expect, but includes many objects that we would not initially expect to be vectors:
- $\mathbb{R}^n$ itself;
- $m \times n$ real matrices;
- polynomials with real coefficients;
- continuous functions on an interval, $C[a,b]$;
- infinite sequences of real numbers.
The last three of these are infinite-dimensional, so the previous discussion does not apply completely, but we'll need infinite-dimensional spaces. Worth remembering is that your ordinary basis is defined to generate the space using finite linear combinations, so an infinite-dimensional space can have a very weird (and very big) basis in this sense. (Polynomials aren't so bad because they are finite linear combinations already, so $\{1,x,x^2,\dotsc\}$ is a perfectly reputable basis.)
Inner products and Orthogonality
The easiest product on $\mathbb{R}^3$ to generalise is the dot product $u \cdot v = \sum_{i=1}^3 u_i v_i$. Over a real vector space $V$, an inner product is a function $\langle \cdot,\cdot\rangle : V \times V \to \mathbb{R}$ that is
- symmetric: $\langle u,v\rangle = \langle v,u\rangle$;
- linear in each argument: $\langle \alpha u + \beta v, w\rangle = \alpha\langle u,w\rangle + \beta\langle v,w\rangle$;
- positive-definite: $\langle u,u\rangle \geq 0$, with equality only when $u=0$.
The ordinary dot product is one of these. The example we shall need to worry about is $$ \langle f,g \rangle = \int_a^b f(x) g(x) w(x) \, dx, $$ where $f,g \in C[a,b]$ and $w$ is a positive function.
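As a concrete sketch (the representation and function name are my own choices, not from any library): take $w(x)=1$ on $[-1,1]$ and store a polynomial as its list of coefficients. The inner product can then be computed exactly, using $\int_{-1}^{1} x^k\,dx = 2/(k+1)$ for even $k$ and $0$ for odd $k$; it shows, for example, that $x$ and $x^2$ are orthogonal on $[-1,1]$:

```python
from fractions import Fraction

def inner(f, g):
    """<f,g> = integral of f(x) g(x) over [-1, 1], for polynomials
    given as coefficient lists [a0, a1, ...] (weight w(x) = 1)."""
    # Multiply the two polynomials (convolution of coefficients).
    prod = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += Fraction(a) * Fraction(b)
    # Integrate term by term: x^k integrates to 2/(k+1) for even k, 0 for odd k.
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

x  = [0, 1]        # f(x) = x
x2 = [0, 0, 1]     # g(x) = x^2
print(inner(x, x2))   # 0: x and x^2 are orthogonal on [-1, 1]
print(inner(x, x))    # 2/3
```

Using `Fraction` keeps the arithmetic exact, so orthogonality shows up as an exact zero rather than a tiny floating-point number.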
A vector space equipped with one of these is called an inner-product space.
The inner product essentially allows us to generalise the idea of angles, so we borrow the terminology from geometry: two vectors $u,v$ are said to be orthogonal if $$ \langle u,v\rangle = 0. $$ A vector is normalised if $\langle u,u\rangle=1$. A set of vectors where each vector is normalised and each pair of vectors is orthogonal is called orthonormal.
In a finite-dimensional inner product space, if I give you any old basis, there is an algorithm (called Gram–Schmidt) that will produce an orthonormal basis.
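In $\mathbb{R}^n$ with the ordinary dot product, the algorithm is only a few lines. A minimal sketch (the function name is mine):

```python
import math

def gram_schmidt(vectors):
    """Gram-Schmidt: turn a basis of R^n into an orthonormal basis
    with respect to the ordinary dot product."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    ortho = []
    for v in vectors:
        # Subtract the projections onto the vectors already built...
        for e in ortho:
            c = dot(v, e)
            v = [a - c * b for a, b in zip(v, e)]
        # ...then normalise what is left.
        norm = math.sqrt(dot(v, v))
        ortho.append([a / norm for a in v])
    return ortho

basis = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
# Each vector now has unit length, and distinct vectors are orthogonal.
```

The same loop works verbatim in any inner-product space: only the `dot` function changes.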
One final thing: given a vector $u$ and an orthonormal basis $\{v_i\}$, there is a unique way to express $u$ as a linear combination $u=\sum_i \alpha_i v_i$: taking inner products of both sides with $v_j$ gives $$ \langle v_j,u \rangle =\sum_i \alpha_i \langle v_i,v_j \rangle = \alpha_j, $$ using linearity of the inner product and orthonormality of the $v_i$. So $$ u = \sum_i \langle v_i,u \rangle v_i. $$
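To see the formula $u = \sum_i \langle v_i,u\rangle v_i$ in action in $\mathbb{R}^2$, here is a small sketch (the numbers are just an illustration): rotate the standard basis using a 3-4-5 triangle, so the basis vectors stay exactly orthonormal, then recover a vector from its coefficients.

```python
import math

# An orthonormal basis of R^2: the standard basis rotated by the
# angle with cos = 0.6, sin = 0.8 (a 3-4-5 triangle, so exactly unit length).
v1, v2 = [0.6, 0.8], [-0.8, 0.6]
dot = lambda u, v: sum(a * b for a, b in zip(u, v))

u = [1.0, 2.0]
alphas = [dot(v1, u), dot(v2, u)]     # alpha_j = <v_j, u>
rebuilt = [alphas[0] * a + alphas[1] * b for a, b in zip(v1, v2)]
# rebuilt recovers u = [1.0, 2.0] (up to floating-point rounding)
```

No linear system needs to be solved: each coefficient is a single inner product, which is exactly why orthonormal bases are so convenient.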
This is where we have to take a short deviation into infinite-dimensional spaces and analysis, where things become rather more complicated.
Hilbert space
A Hilbert space is a complete inner-product space. Complete technically means that Cauchy sequences converge, but you can think of it as a space where every sequence that looks as though it ought to converge actually converges to something still in the space. Given an inner-product space, we can make a Hilbert space by adding the appropriate limits in. Here, convergence means that $v_n \to v$ if $\langle v_n-v,v_n-v \rangle \to 0$ (this is called either convergence in mean square or strong convergence, but it doesn't really matter since it's the only sort we consider).
Any finite-dimensional inner-product space is a Hilbert space (no converging needs to happen), but this is not true in infinite dimensions. $C[a,b]$ is not a Hilbert space (one can construct continuous functions that converge to a function with a jump in it, for example), but we can turn it into a Hilbert space using the inner product given above; this new space is called $L^2[a,b]$, the space of square-integrable functions on $[a,b]$. Note that convergence is not pointwise: functions in $L^2[a,b]$ "look the same" if they differ on a finite set of points (more is true, but we shan't need the fully general statement).
An orthonormal basis in a Hilbert space $H$ is a bit different: we mean an orthonormal set of vectors such that any element $u$ of $H$ can be approximated as closely as we like by a finite linear combination of elements of the basis (we say that these finite linear combinations are dense in $H$). In the case of $L^2[0,2\pi]$ with $w=1$, the classic example is Fourier series: any element of $L^2[0,2\pi]$ can be written as $\sum_{n=-\infty}^{\infty} a_n e^{inx}$, so the functions $e^{inx}$ (suitably normalised) form an orthonormal basis of $L^2[0,2\pi]$.
Orthogonal Polynomials
With the preliminaries finally out of the way, we can move on to what orthogonal polynomials are. Given $L^2[a,b]$ with a weight $w$ such that polynomials have finite integral (e.g. $w=1$ on $[-1,1]$, or $w=e^{-x}$ on $[0,\infty)$), polynomials form a dense subspace of $L^2[a,b]$ (this is a consequence of something called the Stone–Weierstrass theorem, but never mind about that). Hence, one might want to approximate functions in $L^2[a,b]$ by polynomials. To do so, one requires an orthonormal set of polynomials, and this is where orthogonal polynomials come in.
One can show that, for each degree $n$, there is a polynomial $P_n$ of degree $n$ orthogonal to every polynomial of lower degree, and that $P_n$ is unique up to an overall scaling. How do we find it? Apply Gram–Schmidt to the sequence $1,x,x^2,\dotsc$! For example, a simple computation of this kind gives the first few Legendre polynomials, $1,x,(3x^2-1)/2,\dotsc$. For various reasons, the scaling of orthogonal polynomials is not usually chosen so that $\langle P_n,P_n \rangle = 1$. The Legendre polynomials, for example, are normally scaled so that $P_n(1)=1$.
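This Gram–Schmidt computation can be carried out exactly with rational arithmetic. Here is a sketch (the helper names are mine): polynomials are coefficient lists, the inner product uses $\int_{-1}^{1} x^k\,dx = 2/(k+1)$ for even $k$, and instead of normalising each polynomial we rescale so that $P_k(1)=1$, matching the Legendre convention.

```python
from fractions import Fraction

def inner(f, g):
    """<f,g> = integral of f(x) g(x) over [-1, 1], coefficient lists."""
    prod = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += Fraction(a) * Fraction(b)
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

def legendre(n):
    """First n+1 Legendre polynomials: Gram-Schmidt on 1, x, x^2, ...,
    rescaled so that P_k(1) = 1 (the sum of the coefficients is 1)."""
    ps = []
    for k in range(n + 1):
        p = [Fraction(0)] * k + [Fraction(1)]      # the monomial x^k
        for q in ps:                               # orthogonalise against P_0..P_{k-1}
            c = inner(p, q) / inner(q, q)
            q_padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - c * b for a, b in zip(p, q_padded)]
        s = sum(p)                                 # p evaluated at x = 1
        ps.append([a / s for a in p])
    return ps

P0, P1, P2 = legendre(2)
# P2 == [-1/2, 0, 3/2], i.e. (3x^2 - 1)/2, as quoted above.
```

Exact fractions matter here: with floating point, the "orthogonal" polynomials drift away from true orthogonality as the degree grows.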
Gram–Schmidt is rather inefficient when you want a lot of terms. More commonly, orthogonal polynomials are defined by generating functions of one sort or another, which is often where the funny scalings come from.
Given a function in $L^2[a,b]$, we can expand it in terms of orthogonal polynomials in exactly the same way as one employs a Fourier series: it gives an expansion of the function in terms of simpler functions that are orthogonal to one another.
The usefulness of this expansion over the Fourier expansion often lies in the polynomials being solutions to certain differential equations. For example, Legendre polynomials appear entirely naturally when solving Laplace's equation in spherical coordinates, and one can use them to expand an axisymmetric solution to Laplace's equation in what is called the multipole expansion. This ability to expand functions in terms of solutions of a differential equation is a specific case of a more general phenomenon, which goes under the general title of Sturm–Liouville theory. Orthogonal polynomials are a particularly nice example of this.
Perhaps the best way to think of expansion in terms of orthogonal polynomials is the following. Write $f(x) - \sum_{i=0}^n \alpha_i P_i(x) = R(x)$, where $R(x)$ is the error in the approximation. Then the $\alpha_i$ are chosen so that, for a given $n$, $ \int_a^b (R(x))^2 w(x) \, dx $ is minimised: $\sum_{i=0}^n \alpha_i P_i$ is the best approximation of $f$ (in this sense) by a polynomial of degree at most $n$. The orthogonality makes it easy to derive what the coefficients must be, and they correspond precisely to the $\alpha_i = \langle f,P_i \rangle / \langle P_i,P_i\rangle$ that we obtain by analogy with both the Fourier series idea and the calculation in finite-dimensional space.
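As an exact check of this (a sketch with my own helper names, in the Legendre setting $w=1$ on $[-1,1]$): expanding $f(x)=x^3$ gives $x^3 = \tfrac{3}{5}P_1(x) + \tfrac{2}{5}P_3(x)$, so truncating at $n=1$ says that the best mean-square approximation of $x^3$ by a polynomial of degree at most $1$ is $\tfrac{3}{5}x$.

```python
from fractions import Fraction

def inner(f, g):
    """<f,g> = integral of f(x) g(x) over [-1, 1], coefficient lists."""
    prod = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += Fraction(a) * Fraction(b)
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

# The first four Legendre polynomials as coefficient lists, P_n(1) = 1:
# 1, x, (3x^2 - 1)/2, (5x^3 - 3x)/2.
P = [[1],
     [0, 1],
     [Fraction(-1, 2), 0, Fraction(3, 2)],
     [0, Fraction(-3, 2), 0, Fraction(5, 2)]]

f = [0, 0, 0, 1]                                  # f(x) = x^3
alphas = [inner(f, p) / inner(p, p) for p in P]   # alpha_i = <f,P_i>/<P_i,P_i>
# alphas == [0, 3/5, 0, 2/5]: keeping only the degree <= 1 terms leaves
# (3/5) x, the best linear approximation of x^3 in the mean-square sense.
```

Note how each coefficient is computed independently; adding the $P_4$ term later would not change the coefficients already found, unlike fitting in the monomial basis.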
One may also understand $\sum_{i=0}^n \alpha_i P_i$ as the projection of $f$ onto the finite-dimensional space spanned by $P_0,P_1,\dotsc,P_n$; we note the analogy with orthogonal projection of a point onto a plane in three dimensions, which also gives the closest point in the plane to the original point.