Finding the Intersection of Normal CDF and y=x and Plot the Relationship Between $\sigma$ and Intersection Conditional on Different $\mu$

520 Views Asked by At

I'm trying to investigate the intersections of the Normal CDF and the $y=x$ line for $\mu \in [0, 1]$ and $ \sigma \in (0, 1] $ when $ x \in [0, 1] $ and plot the corresponding relationship between $\sigma$ and the corresponding $y$ values of the point of intersection conditional on different values of $\mu$'s (for example, for $ \mu = 0, 0.1, 0.2, \ldots , 1$).

First, one need a method to find the intersection of the Normal CDF and the y=x line. For example, for a specific combination of $ \mu = 0.1 $ and $ \sigma = 0.2 $, I try to use the following R code to find the intersection:

epsilon = 0.00000001 x = 0 while (pnorm(x, mean = 0.1, sd = 0.2) - x > epsilon) { x = pnorm(x, mean = 0.1, sd = 0.2) } y = x

Since the objective of this method is to iteratively compute an approximation of the $y$ value of the intersection, epsilon specifies the precision of the approximation. I wonder if anyone may have a better method for this.

After we establish a method to calculate the $y$ value of the point of intersection, the next objective is to repeatedly find $y$ values of intersection points corresponding to different values of $\sigma$ in $(0, 1]$, conditional on $ \mu = 0, 0.1, 0.2, \ldots, 1 $. The desired plotting result should be 11 curves each representing the relationship between values of $\sigma$ and values of $y$ of corresponding intersection points, conditional on the value of $\mu$.

Say I want to evaluate the corresponding $y$ value for each of the $101$ values of $\sigma $ equally distanced in $(0, 1]$, and then connecting the dots with a line plot. Once the method for finding the intersection point is established, I am thinking about creating two matrices of size $101×11$, one for storing the values of $\sigma$ and the other for the values of corresponding $y$, for the 11 possible values of $\mu = 0, 0.1, 0.2, ..., 1$.

To achieve the above objective, one can use multiple loops to generate the desired values before plotting. However, R is slower with loops, so I wonder if someone may offer a better solution for this problem.

So, to reiterate, the problems are: first, are there better methods in finding values of intersection points. Second, how to plot the desire result as described efficiently using R (or other software such as Matlab or Mathematica). Suggestion for other possible visualizations would also be appreciated.

I'd appreciate any help toward solving this problem. Thank you.

1

There are 1 best solutions below

3
On BEST ANSWER

Let's first address the question of finding solutions to the equation $$\Pr[X \le x] = x, \quad 0 \le x \le 1,$$ for $X \sim \operatorname{NormalDistribution}(\mu,\sigma^2)$, or equivalently, $$\Phi\left(\frac{x - \mu}{\sigma}\right) = x$$ where $\Phi$ is the CDF of the standard normal distribution. This of course has no closed form solution for $x$, but with relatively little trouble we can implement a crude Newton-Raphson recursion relation $$x_{n+1} = x_n - \frac{\Phi((x_n - \mu)/\sigma) - x_n}{f_X(x_n) - 1} = \frac{x_n f_X(x_n) - \Phi((x_n - \mu)/\sigma)}{f_X(x_n) - 1} \quad n = 1, 2, \ldots,$$ where $f_X$ is the probability density of $X$. Then the matter of finding roots corresponds to the issue of making appropriate initial guesses $x_0$. We note that the number of such roots is determined by $\mu$ and $\sigma$, but there is always at least one root and at most three. Thus, I would recommend initial guesses $x_0 \in \{\epsilon, 1-\epsilon, 1/2\}$. In R, this can be programmed using the built-in functions pnorm for $\Phi$ and dnorm for $f_X$. There is, however, the matter of the fractal boundary between the basins of attraction, so such initial choices are not guaranteed to find all roots under all $\mu$, $\sigma$ choices.

However, the second part of your question seems to suggest you want a level curve of $(x,\sigma)$ for which $\Pr[X \le x] = x$ is true for a given $\mu$. In this case, it is not necessary to compute the above iteration, because instead of solving for $x$, we choose $x$ and $\mu$ and compute the $\sigma$ that satisfies the desired relationship. Explicitly, $$\sigma = \frac{x - \mu}{\Phi^{-1}(x)},$$ where $\Phi^{-1}$ is the quantile function of the standard normal distribution. This is implemented in R as qnorm. A plot in Mathematica for $\mu = i/10$, $i = 0, 1, \ldots, 10$, with $x$ on the horizontal axis and $\sigma$ on the vertical, is depicted below:

enter image description here

An expanded set of curves for $\mu \in [-3,3]$ is shown as follows: enter image description here