I am confused by two methods of calculating the first quartile.
Let me explain by an example:
$A = [1,2,3,4,5]$
The method that I know:
To find the first quartile of the list above, I first find the median, $3$, which is the 3rd element.
Now I split the list in the following fashion:
$(1,2)$
$3$
$(4,5)$
Now we take the list $(1,2)$ and find its median, which is $(1+2)/2 = 1.5$.
So according to my calculation the first quartile $Q_1 = 1.5$.
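The calculation above can be sketched in a few lines of Python. This is only an illustration of the "median of the lower half, median excluded" rule; the function name `first_quartile_excluding_median` is made up for this example, and it assumes the list has at least two elements.

```python
from statistics import median

def first_quartile_excluding_median(values):
    """Q1 as the median of the lower half; for odd n the overall median is left out."""
    ordered = sorted(values)
    half = len(ordered) // 2      # floor(n / 2) points lie strictly below the median
    return median(ordered[:half])

A = [1, 2, 3, 4, 5]
print(first_quartile_excluding_median(A))  # 1.5
```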
The nearest rank method (found here):
$ n = \lceil\frac{P}{100} \times N\rceil$
So for the above list, the $25$th percentile, i.e. the first quartile $Q_1$, will be at position
$\lceil \frac{25}{100} \times {5} \rceil = 2$
which is the 2nd position, so $Q_1 = 2$.
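As a rough sketch, the nearest rank formula translates to Python as follows. The function name `nearest_rank_percentile` is just a label for this example, and it assumes $0 < P \le 100$ and a non-empty list.

```python
import math

def nearest_rank_percentile(values, p):
    """Return the value at ordinal rank ceil(p / 100 * N) of the sorted data."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)   # 1-based rank n from the formula
    return ordered[rank - 1]                   # convert to a 0-based index

A = [1, 2, 3, 4, 5]
print(nearest_rank_percentile(A, 25))  # 2
```

Note that this method always returns an element of the list and never interpolates between two values.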
So which is correct, $1.5$ or $2$? Or is choosing either of them fine?
If my original calculation is correct, why do the two methods give different results?
This is different from this question, since my question is about understanding which method is more correct.
According to this page and many other pages, the first quartile is defined as the middle number of the lower half of the data set.
However, the meaning of 'lower half' is ambiguous. When there is an odd number of data points, halving the count does not give an integer. Therefore, one method excludes the median (so the lower half has $\lfloor{\frac{n}{2}}\rfloor$ numbers), and another method includes the median (so the lower half has $\lceil{\frac{n}{2}}\rceil$ numbers).
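To make the two readings of 'lower half' concrete, here is a small Python sketch. The function `first_quartile` and its `include_median` flag are invented for this illustration and are not taken from any library.

```python
from statistics import median

def first_quartile(values, include_median):
    """Median of the lower half, with the overall median optionally included."""
    ordered = sorted(values)
    n = len(ordered)
    # ceil(n / 2) points if the median is included, floor(n / 2) if it is excluded
    size = (n + 1) // 2 if include_median else n // 2
    return median(ordered[:size])

A = [1, 2, 3, 4, 5]
print(first_quartile(A, include_median=False))  # 1.5, lower half is (1, 2)
print(first_quartile(A, include_median=True))   # 2, lower half is (1, 2, 3)
```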
From what I have seen, the first method is by far the most common, so what you are doing is not wrong. It is the one usually taught in the classroom, where the treatment is more theoretical and the data sets are kept small to make computation easy.
Method 2 seems to be more useful in applied mathematics, such as when you have a large data set. There, the $25$th percentile and the $26$th percentile are distinct values, so calculating the exact $25$th percentile would give a different answer.
And of course, when there is an even number of data points, there is no ambiguity about the lower half, so both ways of splitting give the same result.