Distance between histogram and density in R


I have some financial data, along with its histogram and a normal density whose parameters were estimated from the data. I now want to determine the "distance" between the histogram and that density.

I can't get this to work using `integrate` in R. Do you have any ideas?

I have searched for similar problems but could not find an answer. I appreciate your help.

Here is my code for that part, with `Rt` being the data.

On BEST ANSWER

Here's a script that may help:

set.seed(0)
data <- rnorm(100)

hist_f <- function(x) {
  # Depends on global variable `data`.
  
  histogram <- hist(data, plot = FALSE)
  
  # Return zero outside the support of histogram.
  if (x <= min(histogram$breaks) || x > max(histogram$breaks)) {
    return(0)
  }
  # Index of break point just to the left of x.
  break_index <- max(which(histogram$breaks < x))
  return(histogram$density[break_index])
}

# Vectorize function
hist_vf <- Vectorize(hist_f)

# Define the normal density with mean and sd estimated from the data.
estimated_density_f <- function(x) {
  # Depends on global variable `data`.
  return(dnorm(x, mean(data), sd(data)))
}

# Plot to roughly verify correctness.
grid <- seq(-4, 4, length.out = 200)
hist(data, freq = FALSE)
points(grid, hist_vf(grid))
points(grid, estimated_density_f(grid))

# Define absolute difference
abs_diff <- function(x) {
  # Depends on global functions `hist_vf` and `estimated_density_f`.
  return(abs(hist_vf(x) - estimated_density_f(x)))
}

# Apply `integrate`.
integrate(abs_diff, -Inf, Inf)


> integrate(abs_diff, -Inf, Inf)
0.2199536 with absolute error < 0.00011
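
The quantity computed above is the L1 distance between the histogram (a step function) and the fitted normal density. Because the step function jumps at every break point, a single call to `integrate` over the whole real line has to cope with those discontinuities. A possible alternative, sketched here under the same assumptions as the script above (simulated `data`, default `hist` breaks), is to integrate bin by bin and add the normal tail mass outside the histogram's support. The names `bin_l1`, `tails`, and `l1_distance` are made up for this illustration.

```r
set.seed(0)
data <- rnorm(100)

# Compute the histogram once instead of on every function call.
h <- hist(data, plot = FALSE)
mu <- mean(data)
sigma <- sd(data)

# Inside each bin the histogram height is constant, so the integrand
# |height - dnorm| is continuous there and easy for integrate() to handle.
bin_l1 <- vapply(seq_along(h$density), function(i) {
  integrate(
    function(x) abs(h$density[i] - dnorm(x, mu, sigma)),
    lower = h$breaks[i],
    upper = h$breaks[i + 1]
  )$value
}, numeric(1))

# Outside the range of the breaks the histogram is zero, so the
# contribution is just the probability mass of the two normal tails.
tails <- pnorm(min(h$breaks), mu, sigma) +
  pnorm(max(h$breaks), mu, sigma, lower.tail = FALSE)

l1_distance <- sum(bin_l1) + tails
l1_distance
```

The result should be close to the value reported by the single `integrate` call. Since both the histogram and the normal density integrate to 1, the L1 distance always lies between 0 and 2, which gives a quick sanity check.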