What exactly is meant by a contrast function in this context? How does it differ from the objective function? Why is it that several contrast functions can work here?
What is the purpose of the contrast function in independent component analysis, specifically FastICA?
976 views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail).
I would say that in this context, a contrast function is a particular kind of objective function: one that measures the difference between the distribution of an independent component (IC) and a Gaussian distribution. So many contrast functions work because the only requirements are that the function has a local extremum exactly where the estimated component is maximally non-Gaussian and, if super- and sub-Gaussian ICs have to be distinguished, that its sign differs between the two cases.
But let's back this up with some theory:
As you probably know, the basic idea of ICA is to recover source variables that are assumed to be statistically independent from known observations, which are modelled as linear combinations of those unknown sources.
Mathematically, the observations can be written as a vector $\mathbf{x}=(x_1, x_2, \dots, x_n)^T$ and the independent source variables $s_i$ as a vector $\mathbf{s}$. Further, $\mathbf{A}$ is a full-rank matrix that contains the mixing information. Then: \begin{equation} \mathbf{x} = \mathbf{A s} \end{equation}
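As a minimal numerical sketch of this mixing model (the two sources, the matrix $\mathbf{A}$ and the sample size are made-up illustrative choices, not part of any ICA algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent, non-Gaussian sources (ICA requires non-Gaussianity).
n = 10000
s = np.vstack([rng.uniform(-1, 1, n),    # sub-Gaussian source
               rng.laplace(0, 1, n)])    # super-Gaussian source

# A hypothetical full-rank mixing matrix A; in a real problem A is unknown.
A = np.array([[1.0, 0.5],
              [0.3, 2.0]])

x = A @ s   # observations: each row of x is a linear mixture of the sources

print(x.shape)  # (2, 10000)
```

ICA sees only `x` and has to undo `A` without ever observing `s`.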
Please note that both $\mathbf{A}$ and $\mathbf{s}$ have to be estimated from the known observations $\mathbf{x}$ alone. This problem can be solved with a deflationary (one component at a time) or a symmetric (all components at once) approach, but both have in common that a measure (objective function) is needed that rates the independence of the estimated sources.
Exact measures are mutual information and negentropy, and the article Minimization of Mutual Information makes clear why non-Gaussianity is an equally suitable proxy: among all distributions with a given variance, the Gaussian has maximum entropy.
This means that the independence of the sources can be measured by the difference (contrast) between each variable's distribution and a Gaussian. Negentropy itself is hard to compute exactly, since it requires the full density functions, so it is reasonable to approximate it with a higher-order moment such as the kurtosis.
Keep in mind that negentropy is non-negative for all distributions and zero only for a Gaussian, while the (excess) kurtosis is zero for Gaussian distributions, negative for sub-Gaussian ones and positive for super-Gaussian ones. Therefore, the square of the kurtosis is a suitable approximation (under the constraint of zero mean and unit variance):
\begin{equation} \operatorname{kurt}^2(\mathbf{w}^T \mathbf{x}) = \left(E\{(\mathbf{w}^T \mathbf{x})^4\} - 3\right)^2 \end{equation}
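To see why squared excess kurtosis works as a contrast, the following sketch (the sample size and the three example distributions are arbitrary illustrative choices) compares it for a Gaussian, a super-Gaussian (Laplace) and a sub-Gaussian (uniform) signal:

```python
import numpy as np

rng = np.random.default_rng(1)

def kurt(y):
    """Excess kurtosis E{y^4} - 3 after standardizing y to zero mean, unit variance."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0

gauss   = rng.normal(size=100_000)        # excess kurtosis ~ 0
laplace = rng.laplace(size=100_000)       # super-Gaussian: excess kurtosis ~ +3
uniform = rng.uniform(-1, 1, 100_000)     # sub-Gaussian: excess kurtosis ~ -1.2

# Squaring removes the sign, so both non-Gaussian cases score high
# while the Gaussian scores near zero.
print(kurt(gauss)**2, kurt(laplace)**2, kurt(uniform)**2)
```

The squared value is near zero only for the Gaussian, which is exactly the "zero contrast" behaviour negentropy has.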
Indeed, Hyvärinen showed in one of his papers that this objective function reaches a local maximum exactly when the linear combination equals one of the ICs: $\mathbf{w}^T \mathbf{x} = \pm s_i$.
Knowing these requirements, Hyvärinen looked at arbitrary functions and proved that the extrema of practically any well-behaved, non-quadratic even function $G$ coincide with the independent components, so $G$ can be used as the non-linearity in the contrast function (here $v$ is a standardized Gaussian variable): \begin{equation} J_G(\mathbf{w}) = [E_x\{G(\mathbf{w}^T \mathbf{x})\} - E_v\{G(v)\}]^2 \end{equation}
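A quick numerical sketch of this contrast (the sample sizes and the Monte-Carlo estimate of $E_v\{G(v)\}$ are illustrative simplifications, not part of FastICA itself, which uses tabulated Gaussian expectations):

```python
import numpy as np

rng = np.random.default_rng(2)

def J_G(y, G):
    """Contrast J_G = (E{G(y)} - E{G(v)})^2 for a unit-variance signal y,
    with E_v{G(v)} estimated here by Monte Carlo over a standard Gaussian v."""
    v = rng.normal(size=200_000)
    return (np.mean(G(y)) - np.mean(G(v)))**2

G = lambda u: np.log(np.cosh(u))   # the log cosh non-linearity with a_1 = 1

gauss   = rng.normal(size=100_000)                 # unit variance, Gaussian
laplace = rng.laplace(size=100_000) / np.sqrt(2)   # unit variance, super-Gaussian

print(J_G(gauss, G))    # near zero: a Gaussian has zero contrast
print(J_G(laplace, G))  # clearly larger: non-Gaussian
```

Because the Gaussian term is subtracted off and the result squared, $J_G$ behaves like the squared-kurtosis contrast above, but with a non-linearity that is far less sensitive to outliers than $u^4$.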
Given this whole family of possible contrast functions, choosing a good one is another matter, but simply put it has to be robust to outliers and have a low asymptotic variance. Practice has shown that $G(u) = \log \cosh(a_1 u)$ with $1 \le a_1 \le 2$ is good for most use cases, $G(u) = -\exp(-a_2 u^2 / 2)$ works well for super-Gaussian ICs, and $G(u) = u^4$ may be used for sub-Gaussian ICs when very few outliers are present (because $u^4$ is very sensitive to large $u$).
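Putting the pieces together, here is a sketch of the one-unit FastICA fixed-point iteration with the $\log\cosh$ non-linearity, whose derivative is $g = \tanh$ (the sources, the mixing matrix, the seed and the iteration count are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two unit-variance non-Gaussian sources, mixed by a hypothetical matrix A.
n = 20_000
s = np.vstack([rng.uniform(-np.sqrt(3), np.sqrt(3), n),  # sub-Gaussian
               rng.laplace(0, 1/np.sqrt(2), n)])         # super-Gaussian
A = np.array([[1.0, 0.6],
              [0.4, 1.2]])
x = A @ s

# Whiten: zero mean, identity covariance (standard FastICA preprocessing).
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(x))
z = E @ np.diag(d**-0.5) @ E.T @ x

# One-unit fixed point: w <- E{z g(w^T z)} - E{g'(w^T z)} w, then renormalize,
# with g = tanh and g' = 1 - tanh^2.
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(100):
    y = w @ z
    w = (z * np.tanh(y)).mean(axis=1) - (1 - np.tanh(y)**2).mean() * w
    w /= np.linalg.norm(w)

# The estimate w^T z should match one source up to sign and scale.
y = w @ z
corr = [abs(np.corrcoef(y, s[i])[0, 1]) for i in range(2)]
print(max(corr))  # close to 1
```

The absolute value in the correlation check reflects the $\mathbf{w}^T \mathbf{x} = \pm s_i$ sign ambiguity mentioned above.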
And back to the definitions: Hyvärinen refers to the non-linear function $G$ as the contrast function, and sometimes he uses the same name for the whole objective function.
For me, it is reasonable to call:
$G$ the non-linearity / mapping function, and
$J_G$ the objective function, which can also be called a contrast function because it measures the contrast to a Gaussian distribution.