How can I calculate the probability of $X$ with a Generalized Hyperbolic Distribution?


I would like to know how to calculate the probability of $X$ when I have fitted a Generalized Hyperbolic Distribution to my data set.

My knowledge extends only to basic t-tests and z-tests. I am developing something in R and have followed the correct steps, but I don't quite understand the mathematics behind testing a value ($X$) once I know the correct distribution.

Could someone explain how I can do this please?

If it's an arduous explanation just point me to some relevant material.

Thanks, William


There are 3 best solutions below

0
On

You can simply fit your data using one of the fit functions in the ghyp package, demonstrated here using random data:

    a_hyp_model <- fit.ghypuv(1/(1+abs(rnorm(100,0,1))))

And then you can use this to generate random observations following your "fitted" distribution (and plot it with hist):

    hist(rghyp(500, a_hyp_model))

For the other standard distribution functions, see ?rghyp.
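As a rough sketch of what those other functions look like in use, the snippet below calls the density, distribution, quantile and random-number functions on a single object. To keep it self-contained it uses a `ghyp()` object with the constructor's default parameters as a stand-in for a fitted model; that stand-in and the variable names are assumptions for illustration, not part of the answer above.

```r
library(ghyp)

set.seed(1)

# Stand-in for a fitted model: a univariate GH object built with the
# constructor's default parameters (illustrative assumption only).
model <- ghyp()

dens0 <- dghyp(0, model)    # density f(0)
cdf0  <- pghyp(0, model)    # P(X <= 0), obtained by numerical integration
med   <- qghyp(0.5, model)  # median, i.e. the 50% quantile
draws <- rghyp(500, model)  # 500 random draws

print(c(density = dens0, cdf = cdf0, median = med))
```

An object returned by `fit.ghypuv` can be passed in place of `model` in the same way.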

0
On

Thanks for that! I've had a little play and it seems the same as what I produced in the modelling process (if I've misinterpreted, please point it out). In the code below you'll see that I am comparing the distributions and performing a likelihood ratio test at the end. The answer for the hyp fit comes back as cannot accept null, which leads to my conclusion that a hyp would be a good fit (correct me if I'm wrong). Now that I know this, I want to know the probability (P-value) of a Spread > 0.0001 occurring. I'm confused as to how I would compute this, or whether there is a function that performs the required task.

I'll provide some more detail:

    library(ghyp)
    library(timeSeries)

    # Converting to Time Series
    E <- timeSeries(A[,"Spread"])

    # Fitting
    ef <- density(E)
    ghdfit <- fit.ghypuv(E, symmetric = FALSE, control = list(maxit = 1000))
    hypfit <- fit.hypuv(E, symmetric = FALSE, control = list(maxit = 1000))
    nigfit <- fit.NIGuv(E, symmetric = FALSE, control = list(maxit = 1000))

    # Density
    ghddens <- dghyp(ef$x, ghdfit)
    hypdens <- dghyp(ef$x, hypfit)
    nigdens <- dghyp(ef$x, nigfit)
    nordens <- dnorm(ef$x, mean = mean(E), sd = sd(c(E[,1])))
    col.def <- c("black","red","green","orange")
    plot(ef, xlab = " Spread ", ylab = expression(f(x)), ylim = c(0,50), main = 'CABLE - 3 Day Comparison across 28 Years')
    lines(ef$x, ghddens, col = "red")
    lines(ef$x, hypdens, col = "blue")
    lines(ef$x, nigdens, col = "green")
    lines(ef$x, nordens, col = "orange")
    # Legend colours are listed explicitly so each of the five labels matches its line
    legend("topleft", legend = c("Empirical","GHD","HYP","NIG","NORM"), col = c("black","red","blue","green","orange"), lty = 1)

[Figure: 3 Day Cable Test]

    # QQ Plot
    qqghyp(ghdfit, line = TRUE, ghyp.col = "red", plot.legend = FALSE, gaussian = FALSE, main = " ", cex = 0.8)
    qqghyp(hypfit, add = TRUE, ghyp.pch = 2, ghyp.col = "green", gaussian = FALSE, line = FALSE, cex = 0.8)
    qqghyp(nigfit,add = TRUE, ghyp.pch = 3, ghyp.col = "orange", gaussian = FALSE, line = FALSE, cex = 0.8)
    legend("topleft", legend = c("GHD","HYP","NIG"), col = col.def[-c(1,5)], pch = 1:3)

[Figure: QQ of 3 Day]

    # Diagnostic
    options(scipen=999)
    AIC <- stepAIC.ghyp(E, dist = c("ghyp","hyp","NIG"), symmetric = FALSE, control = list(maxit=1000))
    LRghdnig <- lik.ratio.test(ghdfit,nigfit)
    LRghdhyp <- lik.ratio.test(ghdfit,hypfit)

    LRghdhyp
    $statistic
             L
    0.07005745

    $p.value
    [1] 0.0211198

    $df
    [1] 1

    $H0
    [1] FALSE
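As a sanity check on how to read that output, the sketch below recomputes the p-value from the printed statistic using base R only. It assumes that `L` is a likelihood ratio and that $-2\log L$ is compared against a chi-squared distribution with `df` degrees of freedom; that is my reading of how `lik.ratio.test` works, not something stated in this thread.

```r
# Values copied from the printed lik.ratio.test output above.
L  <- 0.07005745   # likelihood ratio statistic
df <- 1            # degrees of freedom

# Assumption: -2 * log(L) is asymptotically chi-squared with `df` degrees of freedom.
chisq_stat <- -2 * log(L)
p_value <- pchisq(chisq_stat, df = df, lower.tail = FALSE)

print(c(chisq = chisq_stat, p.value = p_value))
```

The result should come out close to the reported `$p.value`. If the null hypothesis is that the restricted model (here hyp) is adequate, which is how I understand the ghyp documentation, then a p-value below 0.05 together with `$H0` being `FALSE` would point towards the full GHD rather than hyp, so the conclusion drawn above may be worth double-checking.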

So, I know the correct distribution and how to fit it. How do I go about determining the probability that a Spread '> 0.0001' occurs?

0
On

The answer is posted here and involves the pghyp function, @fg nu.
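For anyone landing here, a minimal sketch of the idea: the probability of exceeding a threshold $x$ is $P(X > x) = 1 - F(x)$, where $F$ is the fitted distribution function, and `pghyp` evaluates $F$ for a ghyp object. The data below are simulated from a default `hyp()` object as a hypothetical stand-in for the Spread series, so the numbers mean nothing beyond illustrating the calls; only the threshold 0.0001 comes from the question.

```r
library(ghyp)

set.seed(42)

# Hypothetical stand-in for the Spread series: draws from a default
# hyperbolic distribution (not the real data from the question).
spread <- rghyp(2000, hyp())

# Fit a hyperbolic distribution, as in the question.
hypfit <- fit.hypuv(spread, symmetric = FALSE, silent = TRUE)

threshold <- 0.0001

p_lower <- pghyp(threshold, hypfit)                      # P(X <= threshold)
p_upper <- pghyp(threshold, hypfit, lower.tail = FALSE)  # P(X >  threshold)

# Rough cross-check against the empirical exceedance frequency.
p_empirical <- mean(spread > threshold)

print(c(lower = p_lower, upper = p_upper, empirical = p_empirical))
```

With the real data, `hypfit` (or `ghdfit`) from the fitting step would be used directly, and `1 - pghyp(threshold, hypfit)` gives the same upper-tail probability.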