I'm working on point process analysis and I'm attempting to fit an Inverse Gaussian distribution to a series of inter-event intervals.
My goal is to find the global minimum of the following weighted negative log-likelihood
$$ \mathcal{L}(u_{t-n:t}|k,\mathbf{\eta}_t,\theta_t) = -\sum_{i=1}^{n}\eta_i\log p({u}_i|k,\theta_t) $$
with respect to the scaling factor $k$ and the AR coefficients $\theta_t$.
Here the $\eta_i$ are fixed weighting coefficients, the $u_i$ are the target inter-event intervals, and $\mu(H_{u_{i-1}},\theta_t)$ is our estimate of the mean (first moment) of the IG distribution, computed with an AR model:
$$ p(u_i|k,\theta_t) = \left[\frac{k}{2\pi\cdot{u_i}^3}\right]^{1/2}e^{-\frac{1}{2}\frac{k\left(u_i-\mu(H_{u_{i-1}},\theta_t)\right)^2}{\mu(H_{u_{i-1}},\theta_t)^2\cdot u_i}} $$
$$ \mu(H_{u_{i}},\theta_t)=\theta_0+\sum_{j=1}^{p}\theta_ju_{i-j+1} $$
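For concreteness, the objective above can be sketched in plain Python as follows. This is only an illustration under a few assumptions that are not part of the formulas: the function names are hypothetical, `eta` is taken to have the same length as `u` (its first $p$ entries are unused because those intervals have no full AR history), and a non-positive $\mu_i$ or $k$ is treated as infeasible by returning infinity.

```python
import math

def ar_mean(u, theta, i):
    """mu_i = theta_0 + sum_{j=1..p} theta_j * u[i-j], using only past intervals."""
    return theta[0] + sum(theta[j] * u[i - j] for j in range(1, len(theta)))

def ig_log_pdf(x, mu, k):
    """Log of the inverse Gaussian density with mean mu and scaling factor k."""
    return (0.5 * math.log(k / (2.0 * math.pi * x ** 3))
            - k * (x - mu) ** 2 / (2.0 * mu ** 2 * x))

def weighted_nll(u, eta, k, theta):
    """Weighted negative log-likelihood L(k, theta) over the targets u[p:]."""
    if k <= 0.0:
        return math.inf  # assumption: infeasible k is penalised with +inf
    p = len(theta) - 1   # AR order
    total = 0.0
    for i in range(p, len(u)):
        mu = ar_mean(u, theta, i)
        if mu <= 0.0:
            return math.inf  # assumption: the IG mean must stay positive
        total -= eta[i] * ig_log_pdf(u[i], mu, k)
    return total

# Made-up example: three intervals, AR order p = 1
print(weighted_nll([0.9, 1.1, 1.0], [1.0, 1.0, 1.0], k=5.0, theta=[0.5, 0.5]))
```

Returning infinity outside the feasible region is just one simple way to keep a generic optimiser away from $\mu_i \le 0$; a constrained solver or a reparameterisation would be alternatives.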
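Looking at the Hessian (writing $\mu_i$ for $\mu(H_{u_{i-1}},\theta_t)$), the second derivatives can take both signs. In particular, every summand in the $\theta$ block carries the factor $3u_i - 2\mu_i$, which is negative whenever $\mu_i > \tfrac{3}{2}u_i$, so a diagonal entry can be negative and the function is not convex in general: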
$$ \frac{\partial^2}{\partial k^2}\mathcal{L}(u_{t-n:t}|k,\mathbf{\eta}_t,\theta_t) = \sum_{i=1}^{n} \frac{\eta_i}{2}k^{-2} $$
$$ \frac{\partial^2}{\partial k\,\partial\theta_{j}}\mathcal{L}(u_{t-n:t}|k,\mathbf{\eta}_t,\theta_t) = \begin{cases} -\sum_{i=1}^{n} \eta_i \cdot \frac{ \left(u_i-\mu_i\right)u_{i-j} }{\mu_i^3} & j\neq0 \\ -\sum_{i=1}^{n} \eta_i \cdot \frac{ u_i-\mu_i }{\mu_i^3} & j=0 \end{cases} $$
$$ \frac{\partial^2}{\partial \theta_j\,\partial \theta_q}\mathcal{L}(u_{t-n:t}|k,\mathbf{\eta}_t,\theta_t) = \begin{cases} \sum_{i=1}^{n}\eta_i k\,u_{i-j}\,u_{i-q}\cdot \frac{ 3 u_i - 2 \mu_i }{\mu_i^4} & j\neq0,\ q\neq0 \\ \sum_{i=1}^{n}\eta_i k\,u_{i-q}\cdot \frac{ 3 u_i - 2 \mu_i }{\mu_i^4} & j=0,\ q\neq0 \\ \sum_{i=1}^{n}\eta_i k\,u_{i-j}\cdot \frac{ 3 u_i - 2 \mu_i }{\mu_i^4} & j\neq0,\ q=0 \\ \sum_{i=1}^{n}\eta_i k\cdot \frac{ 3 u_i - 2 \mu_i }{\mu_i^4} & j=0,\ q=0 \end{cases} $$
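The sign change in the $\theta$ block can be checked numerically. The sketch below restricts to the simplest case $p=0$ (so $\mu_i=\theta_0$ for every $i$) and compares the closed-form $\partial^2\mathcal{L}/\partial\theta_0^2$ with a central finite difference. The data, the weights, the value of $k$ and the function names are all made up for illustration.

```python
import math

def nll(u, eta, k, theta0):
    """Weighted IG negative log-likelihood for p = 0, i.e. mu_i = theta0."""
    return -sum(
        e * (0.5 * math.log(k / (2 * math.pi * x ** 3))
             - k * (x - theta0) ** 2 / (2 * theta0 ** 2 * x))
        for x, e in zip(u, eta)
    )

def d2_theta0_analytic(u, eta, k, theta0):
    """Closed form: sum_i eta_i * k * (3 u_i - 2 mu) / mu^4 with mu = theta0."""
    return sum(e * k * (3 * x - 2 * theta0) / theta0 ** 4 for x, e in zip(u, eta))

def d2_theta0_numeric(u, eta, k, theta0, h=1e-4):
    """Central finite-difference estimate of the same second derivative."""
    return (nll(u, eta, k, theta0 + h)
            - 2 * nll(u, eta, k, theta0)
            + nll(u, eta, k, theta0 - h)) / h ** 2

u = [0.8, 1.0, 1.2]    # made-up inter-event intervals
eta = [1.0, 1.0, 1.0]  # made-up weights
k = 5.0                # made-up scaling factor

for theta0 in (1.0, 2.0):
    a = d2_theta0_analytic(u, eta, k, theta0)
    n = d2_theta0_numeric(u, eta, k, theta0)
    print(f"theta0={theta0}: analytic={a:.4f}, numeric={n:.4f}")
```

With these numbers the curvature along $\theta_0$ is positive at $\theta_0=1$ and negative at $\theta_0=2$ (where $\mu_i > \tfrac{3}{2}u_i$ for every interval), which is enough to rule out convexity in $\theta$.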
What would then be a good optimisation algorithm for finding the global optimum? Would it be OK to use gradient descent, given a decent starting point?