Deriving an extension to the solution for the Non Euclidean Multifacility Location Problem when solution angles are greater than pi/2

65 Views Asked by At

I've been attempting to apply A Globally Convergent Algorithm for Facility Location on a Sphere by G.-L. Xue to a problem with facilities located in all quadrants on a sphere, but am running into problems with the minimizer function [0].

The author defines $x$ from the 3 dimensional sphere $S = \{ x | x \in \mathbb{R}^3, \lVert x\rVert = 1\}$. Existing facilities are defined at $m$ locations in the set $S$ at $a_1,a_2,...,a_m$. There are weights $c_j,j=1,2,...,m$ and all $c_j \gt 0$ for $j=1,2,...,m$.

The author defines the evaluation function as the great circle distance function between $x$ and all facilities (the goal is to minimize this):

$$f\left(x\right)=\sum_{j=1}^{m}{c_j\cos^{-1} \left(a_j^Tx\right)}$$

For historical reasons, my particular application has defined it using $arctan$ instead:

$$f\left(x\right)=\sum_{j=1}^{m}{c_jg\left(a_j, x \right)}$$

where:

$$g(a,x) = \begin{cases} \tan^{-1} \left(\frac{\lVert a \times x \rVert}{a \cdot x}\right) , & \text{if $a \cdot x \gt 0$} \\ \frac{\pi}{2}, & \text{if $a \cdot x = 0$} \\ \pi + \tan^{-1} \left(\frac{\lVert a \times x \rVert}{a \cdot x}\right) , & \text{otherwise} \\ \end{cases}$$

Assuming I've applied $arctan$'s ranges and domains correctly, I don't believe the author uses any special properties of the domain or range of $arccos$ in later derivations, so I believe this paper's derivation still applies. I wanted to mention this for complete transparency.

The author makes adjustments to ensure the evaluation function $f\left( x\right)$ to have its domain defined over $\{ x | x \in \mathbb{R}^3, x \ne 0\}$. This is noted as "convenient" because later they make projections from $S$ to the plane tangent to $S$ at a point $x$, in order to change the problem to one of a Euclidean Multifacility Location.

The paper defines two minimizer functions due to non-differentiability at the points of existing facilities (a non-smooth answer) or any point not at $a_j$ for $j=1,2,...,m$ (smooth). The functions that produce vectors that minimize $f(x)$ for further iterations are:

$$d=-\sum_{j=1,j\ne t}^{m}{c_j\frac{a_t-a_j^{a_t}}{\lVert a_t-a_j^{a_t} \rVert}}$$

for non-smooth, and for smooth:

$$d=-\sum_{j=1}^{m}{c_j\frac{x^k-a_j^{x^k}}{\lVert x^k-a_j^{x^k} \rVert}}$$

I believe conceptually that $d$ is supposed to be a new vector that results in iteratively converging towards the solution. However, I have noticed strange behaviors in $d$ when angles are $\ge\frac{\pi}{2}$ for...

  • ...the nonsmooth case, the angle between $a_j$ and $a_t$
  • ...the smooth case, the angle between $a_j$ and $x^k$

So digging through the derivation, at equation 9 I found the projection defined as:

$$a_j^x=\frac{a_j}{\left(\dfrac{x}{\lVert x \rVert}\right)^T a_j}$$

To be one involving transforming the non-Euclidean problem to one being Euclidean.

My sole question is: is this function $d$ and projection $a_j^x$ indeed well-formed for a set of facilities located at various quadrants of a sphere?

The rest is my perspective based on me, a non-mathematician, trying to learn about and solve this problem.

My hunch is that, conceptually, something is going wrong when projecting facilities that lie at angles $\ge \frac{\pi}{2}$ to the plane that lies tangent to the sphere $S$ at $x$. I am not sure if a local minima is being reached, or the projection is incorrect and resulting in a violation in the convergence property proven elsewhere, or if I have implemented it incorrectly such that it would not be caught as being incorrect when verifying using the paper's sole provided example.

I'm not sure how best to proceed to derive a solution that can handle such angles. I realize multiple global minimized solutions could exist; my only requirement would be to find any single globally minimum solution.

What led me to identify this particular equation was that I could verify my implementation against the paper's Data Set 1 and resulting calculations of $f(x)$ and $\nabla f(x)$. However, I noticed that particular dataset has $a_j$ defined such that $a_{jx} > 0, a_{jy} > 0, a_{jz} > 0$ for all $j=1,2,...,m$.

In lieu of the approach outlined in the paper, I have thought about approximating the solution using:

  • a more brute-force Monte-Carlo approximation
  • iteratively via a Newton's-method-like approach (but in 2 dimensions along the non-Euclidean surface)
  • some a combination of the above two

But these are more computationally expensive, and less exact a result, and perhaps subject to the problem of finding only a local minima.

Thank you for your time. I'm not a mathematician, so I hope I've formatted this question in a way that is understandable. I am happy to make further clarifications as needed.

[0] Xue, G.-L. (1994). A globally convergent algorithm for facility location on a sphere. Computers & Mathematics with Applications, 27(6), 37–50. doi: 10.1016/0898-1221(94)90109-0

1

There are 1 best solutions below

0
On BEST ANSWER

I've answered my own question.

There is probably a more optimal solution, but I found one that verifies against the paper's own dataset, works for orthogonal facilities $a_j$, and works for facilities all over the sphere excepting a single case.

I treated the paper's definition of $a_j^x$ like a Gnomonic projection, where the "far" side is not a part of the domain of the projection for $a_j^x$. I call the "far" side the hemisphere centered on the anti-pole of the point whose plane tangent to the sphere we are projecting to. While it would probably suffice to continue using the projection defined as $a_j^x$ in the paper, and apply a special $\pi$ rotation about the $\hat x$ vector for projections whose points are originating on the "far" side, such a piecewise answer didn't seem excitingly math-enough when investigating this problem.

Instead, I dove into the answer for Stereographic projection when the “North/South Pole” is not given by (0,…,±1)? to try to rigorously map the entire sphere's domain onto the plane that lies tangent to the sphere at an arbitrary point.

I wound up entirely replacing the definition $a_j^x$ in the paper, using that answer's definition of $\varphi$, with:

$$ p = -x \\ a_p = p + \frac{a_j-p}{1- \left(a_j \cdot p \right)} \\ a_j^x = \hat{a_p} g(a_j, x)+x $$

Where $a_j$ is the facility, $x$ is the point on the sphere $S$ whose plane tangent to $S$ is the one we want to project onto (for the minimizer), and $g(a_j, x)$ is the great circle distance function I defined in my original question. A couple notes:

  1. The answer providing $\varphi$, which I use to define $a_p$, uses the anti-pole of $x$. This is because the projection that $a_p$ does puts the hemisphere containing $p$ on the "outside" of the projection. I would like $x$ to be projected on the inside of the plane, to the same point $x$, so I must use $p=-x$ instead to ensure this.
  2. The stereographic projection results in distance distortions that I believe affects the convexity of the minimizer and thus the convergence property. To preserve that, I first observe the projected $a_j$ point, $a_p$, is in the correct direction in Euclidean space. So I take its unit direction and scale by the great circle distance. The final addition of $x$ translates $\hat{a_p} g(a_j, x)$, which is the projection plane but containing the origin, to the desired parallel plane containing $x$.

Thus, the Non-Euclidean Multifacility Location Problem correctly mapped to a Euclidean space with correct distances. This is probably overkill, but it's been a fun and enlightening journey.

The single case that fails, which is left for specific applications, is how to handle projecting a facility $a_j$ if it equals $-x$, which has an undefined Stereographic projection (the denominator $1- \left(a_j \cdot p \right)=1- \left(-x \cdot -x \right)=1 - 1 = 0$).