Notation of inverse functions vs powers of minus one in probit function definition

The question arose after reading this Wikipedia article.

$\Phi^{-1}(p)$ stands for the probit, because it is the inverse of the cumulative distribution function of the standard normal distribution, which is traditionally written $\Phi(z)$.

If I understood it right, it's not the power of minus one, but an inverse function. Thus, if you have a value of $z$, you can obtain the probability with $\Phi(z)$, but if you have the probability first, then you can obtain $z$ with $\Phi^{-1}(p)$. It has nothing to do with the power function; it is just a weird notation (to me).

Then I read this: $\operatorname{probit}(p)=\sqrt2\operatorname{erf}^{-1}(2p-1)$

And now I don't know whether this $\operatorname{erf}^{-1}(2p-1)$ notation means $1/\operatorname{erf}(2p-1)$, or whether there is an inverse $\operatorname{erf}$ function. I haven't seen an inverse $\operatorname{erf}$ function before, and I don't see much sense in defining $\operatorname{probit}$ in terms of it. But my computations show that $1/\operatorname{erf}(2p-1)$ does not produce the right values.

Questions are:

Is there a general way to distinguish "inverse function" from "power of minus one" notation?

What is the right definition of probit?

Best answer

Understanding the Notation

In high school algebra classes, the notation $x^{-1}$ is typically used to denote the reciprocal of $x$. That is, if $x$ is a nonzero real number, then we define $$ x^{-1} = \frac{1}{x}. $$ There are advantages to this notation. For example, it fits right in with the notation for exponentiation: we want $x^{m}x^{n} = x^{m+n}$ for any positive $x$ and any two integers $m$ and $n$. But then $$ 1 = x^{0} = x^{1+(-1)} = x^{1} x^{-1} = x x^{-1} \iff x^{-1} = \frac{1}{x}. $$ This notation is fine for beginners, but sometimes causes confusion for students when they start taking classes where they have to work with functions.

In these slightly more advanced classes, if $f$ is a one-to-one function, then it can be inverted. The notation $f^{-1}$ is used to denote the inverse function. This inverse function is defined by the property that $$ f^{-1}( f( x ) ) = x \qquad\text{and}\qquad f( f^{-1}( y ) ) = y $$ for all $x$ in the domain of $f$, and all $y$ in the range of $f$.

Initially, these two ways of using the notation seem quite distinct. However, they are connected by a common thread. In both cases, there is a structure which allows us to "put things together" in a nice way.

  • In the case of real numbers, $x^{-1}$ denotes the inverse of the real number $x$ with respect to the operation of multiplication. As noted above, we have $$ x \cdot x^{-1} = 1 = x^{-1} \cdot x. $$ Here, $1$ is the multiplicative identity; it has the property that $$ 1 \cdot x = x = x \cdot 1 $$ for any real number $x$.
  • In the case of functions, $f^{-1}$ denotes the inverse of an injective function $f$ with respect to the operation of composition. We have $$ f \circ f^{-1} = \operatorname{id} = f^{-1} \circ f.$$ Here $\operatorname{id}$ is the identity function, i.e. the function which acts according to the formula $\operatorname{id}(x) = x$. (Note that I am abusing notation slightly by not distinguishing between the identity on the domain and the codomain.) Observe that $$ \operatorname{id} \circ f = f = f\circ \operatorname{id} $$ for any function $f$ ($f$ doesn't even have to be invertible for this to make sense).

In short, the notation $\operatorname{thing}^{-1}$ can be used to denote the reciprocal or multiplicative inverse (if thing is a number), or the compositional inverse (if thing is a function). This looks like two different uses of the same notation, but they are related.

If you are confused by the use of the same notation to mean two seemingly different things, remember that mathematicians often overload notation: we only have a finite number of symbols, and a huge number of ideas that we might want to express. Use context to determine what the notation means. If $f$ is a function, then $f^{-1}$ is read as the functional inverse; the reciprocal of a function value would normally be written $1/f(x)$ or $(f(x))^{-1}$ instead.

To answer the first question: aside from context, there is no general way of distinguishing "inverse function" from "a number to the power of $-1$".
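To make the two readings concrete, here is a minimal Python sketch. It assumes the exponential function as the example $f$ (my choice for illustration, not something from the question), so that the compositional inverse is the natural logarithm. The names `f`, `f_inverse` and `f_reciprocal` are hypothetical.

```python
import math

def f(x):
    # Example one-to-one function (an assumption for illustration): exp.
    return math.exp(x)

def f_inverse(y):
    # Compositional inverse: undoes f, so f_inverse(f(x)) gives back x.
    return math.log(y)

def f_reciprocal(x):
    # Multiplicative inverse of the value f(x): f(x) * f_reciprocal(x) is 1.
    return 1.0 / f(x)

x = 2.0
print(f_inverse(f(x)))          # recovers x, up to rounding
print(f(x) * f_reciprocal(x))   # multiplies to 1, up to rounding
print(f_inverse(x), f_reciprocal(x))  # two different numbers for the same input
```

The first line of output illustrates $f^{-1} \circ f = \operatorname{id}$, the second illustrates $x \cdot x^{-1} = 1$, and the third shows that the two readings give different values at the same argument.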

What is the Correct Definition of the probit Function?

You state that $$ \operatorname{probit}(p)=\sqrt2\operatorname{erf}^{-1}(2p-1). $$ In this case, $\operatorname{erf}$ is a function (the "error function"). In case it matters, it is defined by the integral $$ \operatorname{erf}(x) := \frac{2}{\sqrt{\pi}} \int_{0}^{x} \mathrm{e}^{-t^2}\,\mathrm{d}t. $$ Because $\operatorname{erf}$ is a function, the most reasonable interpretation of $\operatorname{erf}^{-1}$ is that this is the inverse of this function. Indeed, if you continue to read the Wikipedia article linked in the original question, you will find a section on computation of the probit function, in which it is stated:

The normal distribution CDF and its inverse are not available in closed form, and computation requires careful use of numerical procedures. However, the functions are widely available in software for statistics and probability modeling, and in spreadsheets. In Microsoft Excel, for example, the probit function is available as norm.s.inv(p). In computing environments where numerical implementations of the inverse error function are available, the probit function may be obtained as $$\operatorname {probit} (p)={\sqrt {2}}\,\operatorname {erf} ^{-1}(2p-1).$$ An example is MATLAB, where an 'erfinv' function is available. The language Mathematica implements 'InverseErf'.

This confirms the above assertion: $\operatorname{erf}^{-1}$ is, indeed, the inverse error function. This implies that the probit function is defined in terms of this inverse error function, and not in terms of the reciprocal of the error function.
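It may also help to see why the inverse error function shows up at all. The standard normal CDF can be written as $\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}\left(z/\sqrt{2}\right)\right]$, so setting $p = \Phi(z)$ and solving for $z$ gives $\operatorname{erf}(z/\sqrt{2}) = 2p - 1$, i.e. $z = \sqrt{2}\,\operatorname{erf}^{-1}(2p-1)$. The following Python sketch checks this numerically using only the standard library. The helper `erf_inverse` is hypothetical: it inverts `math.erf` by plain bisection, which is an assumption made to keep the example self-contained and not how numerical libraries implement it. The reference value comes from `statistics.NormalDist().inv_cdf` (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def erf_inverse(y, tol=1e-12):
    """Hypothetical helper: invert math.erf on (-1, 1) by bisection."""
    if not -1.0 < y < 1.0:
        raise ValueError("erf_inverse is only defined here for -1 < y < 1")
    lo, hi = -6.0, 6.0  # erf(-6) and erf(6) are -1 and 1 to double precision
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if math.erf(mid) < y:
            lo = mid   # erf is increasing, so the solution lies to the right
        else:
            hi = mid   # otherwise it lies to the left (or at mid)
    return (lo + hi) / 2.0

def probit(p):
    # Inverse-function reading: sqrt(2) * erf^{-1}(2p - 1).
    return math.sqrt(2.0) * erf_inverse(2.0 * p - 1.0)

def reciprocal_reading(p):
    # Reciprocal reading: sqrt(2) / erf(2p - 1); undefined at p = 0.5.
    return math.sqrt(2.0) / math.erf(2.0 * p - 1.0)

p = 0.975
print(probit(p))                  # approximately 1.96
print(NormalDist().inv_cdf(p))    # reference value from the standard library
print(reciprocal_reading(p))      # a different number
```

The first two printed values agree to many decimal places, while the reciprocal reading does not, which matches the observation in the question. Note also that the reciprocal reading divides by zero at $p = 0.5$, where the probit is simply $0$.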