What is an ogive? Use of an ogive


What is an ogive? The term came up in my mathematics book, but there is no explanation of it.


There are 2 best solutions below


https://en.wikipedia.org/wiki/Ogive_(statistics)

In statistics, an ogive is a free-hand graph showing the curve of a cumulative distribution function.[1] The points plotted are the upper class limit and the corresponding cumulative frequency.[2] For the normal distribution, the resulting curve resembles one side of an Arabesque or ogival arch. The term can also be used to refer to the empirical cumulative distribution function.


If you have a sufficiently large random sample from a (continuous) population, it is often useful to try to estimate the population's PDF (density function) or its CDF. Before the computer age, PDFs were often approximated by histograms and CDFs by ogives.

Typically, both kinds of plots are based on grouped or binned data: frequency counts of individual intervals (or bins) for histograms and cumulative frequency counts for ogives.
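As a minimal sketch of how the two kinds of counts relate, the R code below bins a small data set and turns the bin counts into ogive coordinates. The data values and bin endpoints are hypothetical, chosen only for illustration; they are not the data used in the figures of this answer.

```r
# Hypothetical small data set, chosen only for illustration
x <- c(52, 61, 67, 70, 74, 78, 83, 85, 91, 98)
breaks <- seq(50, 100, by = 10)     # bin endpoints: 50, 60, ..., 100

# Frequency count in each bin (what a histogram displays).
# By default hist() uses right-closed bins, so 70 falls in (60, 70].
counts <- hist(x, breaks = breaks, plot = FALSE)$counts

# Cumulative relative frequency at each bin endpoint (what an ogive displays)
cum_rel <- c(0, cumsum(counts)) / length(x)

# The ogive joins the points (endpoint, cumulative relative frequency)
plot(breaks, cum_rel, type = "b", xlab = "x",
     ylab = "Cumulative relative frequency", main = "Ogive")
```

Here `counts` holds the histogram heights (as frequencies) and `cum_rel` holds the heights of the ogive at the bin endpoints, starting at 0 and ending at 1.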

Nowadays even for very large datasets, computers make it possible to make plots that more carefully take into account the individual observations. So kernel density estimators (KDEs) sometimes replace histograms and empirical cumulative distribution functions (ECDFs) usually replace ogives.

[An ECDF is a step function that uses sorted data, jumping by $1/n$ at each value, and by $k/n$ at a particular value if data are rounded so that $k$ observations are tied at that value. In some fields of application, the term 'ogive' is still used instead of 'ECDF', even when data are not sorted into intervals. A KDE adds together small smooth curves (kernels), one centered at each observation, to make a smooth curve that approximates the density function.]
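The jump sizes can be checked directly with R's `ecdf` function. The five data values below are hypothetical, chosen so that one value is tied ($k = 2$, $n = 5$); this is only a small illustration, not part of the figures.

```r
# Hypothetical rounded data with n = 5; the value 3 occurs k = 2 times
y <- c(1, 3, 3, 4, 7)
Fn <- ecdf(y)        # ecdf() returns a step function

Fn(1)                # 0.2: one observation is <= 1, a jump of 1/n
Fn(2.9)              # 0.2: no jump between data values
Fn(3)                # 0.6: jump of k/n = 2/5 at the tied value
Fn(7)                # 1.0: all observations are <= 7
knots(Fn)            # distinct data values where the jumps occur
```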

Below are the KDE and ECDF of a random sample of size $n = 5000$ from the distribution $\mathsf{Norm}(\mu = 100,\, \sigma=15).$ For reference, the exact density function and CDF are plotted (dotted red). An ogive using the same bins as the histogram would be a broken line very closely approximating the ECDF. In an actual application, the exact PDF and CDF would not be known.

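[Figure: left panel, histogram with KDE (blue) and exact density (dotted red); right panel, ECDF (blue) and exact CDF (dotted red); n = 5000]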

Note: The figure was made using R statistical software. The code is provided below.

# Simulate n = 5000 observations from NORM(100, 15)
set.seed(1218);  n = 5000;  mu = 100;  sg = 15
x = rnorm(n, mu, sg)
par(mfrow=c(1,2))  # enables 2 panels per figure
  # Left panel: histogram (density scale), KDE (blue), exact density (dotted red)
  HDRH = "Histogram, KDE, and Density of Sample from NORM(100,15)"
  hist(x, prob=TRUE, col="skyblue2", ylim=c(0,.03), main = HDRH)
    lines(density(x), lwd=2, col="blue")
    curve(dnorm(x, mu, sg), add=TRUE, lwd=2, col="red", lty="dotted")
  # Right panel: ECDF (blue), exact CDF (dotted red)
  HDRC = "ECDF and CDF of sample from NORM(100,15)"
  plot(ecdf(x), col="blue", main = HDRC)
    curve(pnorm(x, mu, sg), add=TRUE, lwd=2, col="red", lty="dotted")
par(mfrow=c(1,1))  # restores 1 panel per figure

In order to show more detail, at the loss of some precision of estimation, we show the corresponding figure for the first 1000 of the 5000 normal values sampled above. Some information is lost in binning, so histograms do not have the same accuracy as ECDFs.

[Figure: the same two panels for the first 1000 observations]

Finally, we show the relatively poor estimates from only the first $50$ observations. Here 'rugs' of tick marks below the horizontal axes of the histogram and the ECDF show exact values of the $50$ observations. Also, the ogive (9-segment broken cyan line), based on the histogram bins, is superimposed on the ECDF plot.

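[Figure: the same two panels for the first 50 observations, with rugs of tick marks and the ogive (cyan) superimposed on the ECDF]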

Coordinates for the ogive are shown in the table below:

Endpt    x     y
    0   50  0.00
    1   60  0.02
    2   70  0.04
    3   80  0.10
    4   90  0.28
    5  100  0.48
    6  110  0.80
    7  120  0.90
    8  130  0.98
    9  140  1.00
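As a rough sketch of how these coordinates can be used, the R code below copies the table values, recovers the bin counts of the histogram from them, draws the ogive, and reads off an interpolated cumulative proportion. The variable names and the choice of $x = 105$ as the point to interpolate are assumptions made for illustration.

```r
# Ogive coordinates copied from the table above (n = 50 observations)
endpt <- seq(50, 140, by = 10)
cum   <- c(0.00, 0.02, 0.04, 0.10, 0.28, 0.48, 0.80, 0.90, 0.98, 1.00)

# Differences of cumulative relative frequencies give the bin proportions;
# multiplying by n = 50 recovers the histogram counts
counts <- round(diff(cum) * 50)
counts

# Draw the ogive as a broken line through the tabulated points
plot(endpt, cum, type = "l", col = "cyan", lwd = 2,
     xlab = "x", ylab = "Cumulative relative frequency")

# Linear interpolation along the ogive: an estimate of the proportion
# of observations at or below 105 (105 is an arbitrary illustrative point)
approx(endpt, cum, xout = 105)$y
```

Because the ogive is a broken line, values between bin endpoints come from linear interpolation, which is one way in which it differs from the step-function ECDF.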