I need some advice on how to update a matrix of triplets inside a likelihood function.
DEFINITIONS
$ \beta =$ scalar scaling parameter
$ \alpha = $ scalar shape parameter
$ X = $ $N \times 3$ matrix of triplets, each a point in 3D space (so the $i^{th}$ row of $X$ is $x_i , y_i , z_i$)
$ d_{ij} = $ distance between $ (x_i , y_i , z_i) $ and $ (x_j , y_j , z_j) $, $j \ge i$
$r_{ij} = (N \times 1)$ column vector with $i^{th}$ entry equal to 1, $j^{th}$ entry equal to $-1$, and 0 everywhere else, $j \ge i$ (it must be a column for the quadratic form $r_{ij}^{'}XX^{'}r_{ij}$ below to be a scalar)
$c_{ij} =$ constants, $j \ge i$
AIM
My aim is to maximize the following function with respect to {$\alpha, \beta, X$}:
$$ L(d,\alpha,\beta | c) = \alpha \sum_{ij} c_{ij} \log{d_{ij}} + \log{\beta}\sum_{ij} c_{ij} - \beta \sum_{ij} d_{ij}^{\alpha} $$
A BIT OF ALGEBRA
Using the equation $ d_{ij} = \sqrt{ (x_i - x_j)^2 + ( y_i - y_j )^2 + (z_i - z_j)^2 } $, I can rewrite the equation in terms of the X matrix.
There are $N(N-1)/2$ observations (the $c_{ij}$'s) and $N(N-1)/2 + 2$ parameters (the $d_{ij}$'s, $\alpha$, $\beta$). Converting to triplets therefore reduces the dimension of the parameter space to $3N+2$, which is a large saving since $N$ is usually in the hundreds of thousands, if not millions.
$d_{ij}^2 = (x_i - x_j)^2 + ( y_i - y_j )^2 + (z_i - z_j)^2 = Tr(X^{'}r_{ij}r_{ij}^{'}X) = Tr(r_{ij}^{'}XX^{'}r_{ij}) = r_{ij}^{'}XX^{'}r_{ij} $
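This identity is easy to sanity-check numerically. A minimal NumPy sketch (all names here are mine, chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
X = rng.normal(size=(N, 3))    # N random points in 3D, one per row

i, j = 1, 3
r = np.zeros(N)
r[i], r[j] = 1.0, -1.0         # the r_ij vector from the definitions

d2_direct = np.sum((X[i] - X[j]) ** 2)   # (x_i-x_j)^2 + (y_i-y_j)^2 + (z_i-z_j)^2
d2_quad = r @ X @ X.T @ r                # r_ij' X X' r_ij
assert np.isclose(d2_direct, d2_quad)
```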
So, my equation becomes
$$ L(X,\alpha,\beta | c) = \frac{\alpha}{2} \sum_{ij} c_{ij} \log{(r_{ij}^{'}XX^{'}r_{ij})} + \log{\beta}\sum_{ij} c_{ij} - \beta \sum_{ij} (r_{ij}^{'}XX^{'}r_{ij})^{\frac{\alpha}{2}} $$
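In code one never needs to form the $r_{ij}$'s explicitly, since $r_{ij}^{'}XX^{'}r_{ij}$ is just the squared pairwise distance, which broadcasting computes for all pairs at once. A sketch (the function name and the convention that the $c_{ij}$'s sit in the upper triangle of an $N \times N$ array `C` are my own):

```python
import numpy as np

def log_likelihood(X, alpha, beta, C):
    """L(X, alpha, beta | c); C holds the c_ij's with j > i in its upper triangle."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)          # all squared pairwise distances
    iu = np.triu_indices(len(X), k=1)        # pairs with j > i
    d2, c = d2[iu], C[iu]
    return (0.5 * alpha * np.sum(c * np.log(d2))   # (alpha/2) sum c log(r'XX'r)
            + np.log(beta) * np.sum(c)             # log(beta) sum c
            - beta * np.sum(d2 ** (0.5 * alpha)))  # -beta sum (r'XX'r)^(alpha/2)
```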
HOW I'VE APPROACHED IT
I know that there are infinitely many translated and rotated $X$'s giving the same distances, but we can add constraints to $X$, such as
$x_1>0$, $y_1=z_1=0$ (i.e. the first point is always on the positive x-axis)
$y_2 > 0$, $z_2=0$ (second point is on the $x$-$y$ plane, and the cross product of the first and second points is positive)
$\sum x_i = \sum y_i = \sum z_i = 0$
$d_{ij} \le |j-i|$ $\forall i<j$
These constraints also increase our degrees of freedom for estimating the parameters.
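Enforcing the first three (identifiability) constraints on any candidate $X$ is mechanical: center, then rotate into the frame spanned by the first two points. A sketch, with a name of my own choosing; it does not handle the distance bound $d_{ij} \le |j-i|$, and it assumes the first two points are non-degenerate after centering:

```python
import numpy as np

def to_canonical_frame(X):
    """Translate/rotate X so that: column sums are zero, point 1 lies on the
    positive x-axis, and point 2 lies in the upper half of the x-y plane.
    Pairwise distances are unchanged (rigid motion only)."""
    Xc = X - X.mean(axis=0)              # sum x_i = sum y_i = sum z_i = 0
    e1 = Xc[0] / np.linalg.norm(Xc[0])   # new x-axis through point 1
    v = Xc[1] - (Xc[1] @ e1) * e1        # Gram-Schmidt step on point 2
    e2 = v / np.linalg.norm(v)           # new y-axis
    e3 = np.cross(e1, e2)                # right-handed new z-axis
    return Xc @ np.column_stack([e1, e2, e3])
```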
I thought of iteratively cycling through {$\beta,\alpha,X$} and maximizing w.r.t each.
Taking the derivative w.r.t. $\beta$ and setting to zero gives
$$ \beta^{(t+1)} = \frac{\sum_{ij}c_{ij}}{\sum_{ij}(r_{ij}^{'}XX^{'}r_{ij})^{ \alpha^{(t)}/2} } $$
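This update is cheap to compute; for example (a sketch, again avoiding the explicit $r_{ij}$'s and using an upper-triangular `C`):

```python
import numpy as np

def beta_update(X, alpha, C):
    """Closed-form maximizer of L in beta, holding alpha and X fixed."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)               # d_ij^2 = r' X X' r
    iu = np.triu_indices(len(X), k=1)             # pairs with j > i
    return np.sum(C[iu]) / np.sum(d2[iu] ** (0.5 * alpha))
```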
Taking the derivative w.r.t. $\alpha$ gives:
$$ \frac{1}{2}\sum_{ij} c_{ij}\log{(r_{ij}^{'}XX^{'}r_{ij})} - \frac{\beta}{2} \sum_{ij} (r_{ij}^{'}XX^{'}r_{ij})^{\alpha/2}\log{(r_{ij}^{'}XX^{'}r_{ij})} $$
I could either set this score, call it $g(\alpha)$, to zero and find a solution for $\alpha$ numerically, or use it in a Newton-Raphson update for $\alpha$:
$$\alpha^{(t+1)} = \alpha^{(t)} - \frac{ g(\alpha^{(t)}) }{ g'(\alpha^{(t)}) }$$
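Writing $s_{ij} = \log(r_{ij}^{'}XX^{'}r_{ij})$, both the score $g(\alpha)$ and its derivative are simple sums, so a Newton step on the score costs one pass over the pairs. A sketch (function name mine; I pass the $d_{ij}^2$'s and $c_{ij}$'s as flat arrays over the pairs $j > i$):

```python
import numpy as np

def alpha_newton_step(alpha, beta, d2, c):
    """One Newton step alpha <- alpha - g/g' on the score g = dL/d(alpha)."""
    s = np.log(d2)                 # s_ij = log(r' X X' r)
    w = d2 ** (0.5 * alpha)        # (r' X X' r)^(alpha/2)
    g = 0.5 * np.sum(c * s) - 0.5 * beta * np.sum(w * s)   # dL/d(alpha)
    gprime = -0.25 * beta * np.sum(w * s ** 2)             # d^2 L/d(alpha)^2
    return alpha - g / gprime
```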
WHERE I HIT A DEAD END
I would next like to find an iterative update to the entire X matrix and then repeat the above process.
Using the matrix-calculus identity $\frac{d}{dX'}a'XX'b = X'(ab'+ba')$, so that $\frac{d}{dX'}r'XX'r = 2X'rr'$ (feel free to check my work), I get:
$$\frac{d}{dX'}L(X,\alpha,\beta | c) = \alpha \sum_{ij} \frac{ c_{ij} }{ r_{ij}^{'}XX^{'}r_{ij} } (X^{'}r_{ij}r_{ij}^{'}) - \alpha\beta \sum_{ij} (r_{ij}^{'}XX'r_{ij})^{ \frac{\alpha-2}{2} }(X^{'}r_{ij}r_{ij}^{'})$$
$$=\alpha X' \left[\sum_{ij} \frac{ r_{ij}r_{ij}^{'} }{ r_{ij}^{'}XX^{'}r_{ij} } \left[c_{ij} - \beta (r_{ij}^{'}XX'r_{ij})^{ \frac{\alpha}{2} }\right] \right]$$
The above expression is a $3 \times N$ matrix.
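One computational observation: with $w_{ij}$ denoting the bracketed weights $[c_{ij} - \beta (r_{ij}^{'}XX'r_{ij})^{\alpha/2}]/(r_{ij}^{'}XX^{'}r_{ij})$, each $r_{ij}r_{ij}^{'}$ contributes $+1$ at $(i,i)$ and $(j,j)$ and $-1$ at $(i,j)$ and $(j,i)$, so $\sum_{ij} w_{ij} r_{ij}r_{ij}^{'}$ is the graph Laplacian of the weight matrix $W=(w_{ij})$. The gradient can therefore be computed without forming any $r_{ij}$. A sketch (name mine; it returns the transpose $dL/dX$ as $N \times 3$, takes a symmetric `C`, and is worth checking against finite differences):

```python
import numpy as np

def grad_X(X, alpha, beta, C):
    """dL/dX as an (N, 3) array, i.e. the transpose of the 3 x N expression.

    C is a symmetric N x N array of the c_ij's.  The weighted sum of the
    rank-one matrices r_ij r_ij' is the graph Laplacian of W below.
    """
    diff = X[:, None, :] - X[None, :, :]
    q = np.sum(diff ** 2, axis=-1)              # q_ij = r_ij' X X' r_ij
    np.fill_diagonal(q, 1.0)                    # dummy value, avoids 0/0
    W = (C - beta * q ** (0.5 * alpha)) / q     # w_ij
    np.fill_diagonal(W, 0.0)
    lap = np.diag(W.sum(axis=1)) - W            # sum_ij w_ij r_ij r_ij'
    return alpha * lap @ X
```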
This is non-linear in $X$, so I cannot set it to zero and solve for $X$ directly.
Nor can I use Newton-Raphson to update $X^{(t+1)}$ as in the scalar case: the gradient is a matrix, so "dividing $L$ by $L'$" makes no sense, and a proper Newton step would require forming and inverting the $3N \times 3N$ Hessian.
I'm not sure how else I would solve it numerically.
My colleagues have suggested that "interior point methods" may be useful. If you would, please explain to me how interior point methods would work in this context, or please suggest some other reasonable approach.
Thanks!