Say I have many discrete measurements taken of the radius of a circle $[R_0,R_N]$. I recently read a standard$^1$ in which the "best center" of the circle could be calculated as a displacement from the measurement axis as follows:
$x = \frac{2}{N}\sum_{i=1}^{N}R_i\cos(\theta_i)$
$y = \frac{2}{N}\sum_{i=1}^{N}R_i\sin(\theta_i)$
Where the $i^{th}$ radius' measurement proceeds at angle $\theta_i$, counterclockwise from the x-axis. The measurement axis to the "best center" is then represented by the vector ${\bf R_{BC}} = x\hat{\imath} + y\hat{\jmath}$.
My question is where do the factors of 2 come from? In the reduced case of two measurement points, where $R_1 = 0\hat{\imath} + 1\hat{\jmath}$, and $R_2 = 0\hat{\imath} - 2\hat{\jmath}$, gives ${\bf R_{BC}} = 0\hat{\imath} - 1\hat{\jmath}$. This is a non-intuitive result for me. I would think it should be ${\bf R_{BC}} = 0\hat{\imath} - 0.5\hat{\jmath}$, the point equidistant from the two measured radii (which is the result that would be obtained without the factor of 2). Has anyone seen something similar (with the factor of 2), or do you think the standard is wrong (see excerpt below for the relevant portion of the standard)?
$^1$The standard: Hydroelectric Turbine-Generator Units Guide for Erection Tolerances and Shaft System Alignment, Part I: Definitions, Section 3.2.9, www.ceati.com
UPDATE
I have since programmed a simulation to empirically investigate this topic further and I am now even more confused. Take the circular shape plotted below:
If the x-axis is taken to be along the 90 degree mark and the y-axis is taken to be along the 0 degree mark, the equations for the x and y components of the best center are slightly modified to be:
$x = \frac{2}{N}\sum_{i=1}^{N}R_i\sin(\theta_i)$
$y = \frac{2}{N}\sum_{i=1}^{N}R_i\cos(\theta_i)$
By symmetry, we know that the x components should cancel, and the best center should be somewhere on the positive y-axis. Moreover, we note that by definition, the best center should minimize circularity (i.e. the difference between the maximum and minimum radius, as referenced from the best center, should be a minimized when this location is truly "best"). The following plot shows the best center at 0.25, where circularity is indeed minimized.
This best center of $y=0.25$ makes intuitive sense, as this is the location that bisects the vertical deformation in the circular shape. The following plot shows the best center of this circular shape, for various number of measurement samples taken at equally spaced angles around the shape. The different lines show different scaling for the $y$ componant of the best center (the $x$ component is 0). The best match is when $K=2.\overline{27}$, which was found empirically. Furthermore, this $K$ value doesn't appear to change for different deformations of the circle. Can anyone tell me why the best scaling for best center is $K=2.\overline{27}$?
UPDATE 2
I developed a less uniformly deformed circular shape for my simulation. Specifically, I created a circular shape with a Gaussian bulge along a portion of its circumference with the following equations:
$-\pi\leq\theta_i<\pi$ over 1000 equal discrete steps.
$R_i(\theta_i)=R_{base}+\frac{A}{\sigma\sqrt{2\pi}}e^{-\theta_i^2/2\sigma^2}$
I then calculated circularity with respect to the best center as determined by varying the scaling $K$. This produces the following plots, for $R_{base}=1.0$, $\sigma=\frac{\pi}{6}$, and $A=0.65644$.
This yielded $K=2.74$, and shows that the optimal $K$ value is dependent on the circular deformity. I then decided to try a more realistic scenario.
Specifically, this whole problem was prompted from a real world application of measuring the circularity of large generator rotors. A typical generator is 30 feet in diameter with a 1 inch air gap around its rotor. The following plots represent the empirical analysis for a 30 foot diameter rotor with a 500 mil (i.e. 500 thousandths of an inch, or half the air gap) deformity. In this case, the parameters of the Gaussian are $R_{base}=180000$ mils, $\sigma=\frac{\pi}{8}$, and $A=492.2$.
These last plots make sense, as the deformity is 500 mils, and the circularity reduces to the circularity with respect to the origin at $K=0$, which should yield a circularity of 500 mils. This is indeed the case. So, it is clear the $K$ depends on how non-uniform the circle is. Can anyone explain this dependence?
UPDATE 3
Added relevant excerpt from standard as image above.










It would appear that the results you're quoting for the circularity only hold under some approximations.
One of the definitions of circularity is via fitting the data to a circle, by optimizing the following error function, which measures deviations from a perfect circle, with respect to it's parameters:
$$E=\sum_{i=1}^{N}(\sqrt{(x_i-a)^2+(y_i-b)^2}-R)^2$$
Indeed, this loss function is zero if all observations fall exactly on a circle of some center $a,b$ and some radius $R$. Minimizing this loss function with respect to $a,b,R$ will give us estimators for the best values of the position of the center and the radius of the circle.
This problem however is non-linear, and thus analytical expressions for the estimators cannot be derived. If we make the assumption that the center of the circle doesn't differ much from the origin $(0,0)$ we can successively approximate
$$\sqrt{(x_i-a)^2+(y_i-b)^2}\approx\sqrt{x_i^2+y_i^2-2(ax_i+by_i)}=r_i\sqrt{1-2\frac{a\cos\theta_i+b\sin\theta_i}{r_i}}\\\approx r_i-a\cos\theta_i-b\sin\theta_i$$
where $r_i^2=x_i^2+y_i^2$.
Applying that to the loss function we recover a linear least squares estimation problem:
$$E=\sum_{i=1}^N(r_i-R-a\cos\theta_i-b\sin\theta_i)^2$$
which we can minimize and obtain the equations
$$\frac{\partial E}{\partial a}=2\sum_i\cos\theta_i(r_i-R-a\cos\theta_i-b\sin\theta_i)=0\\\frac{\partial E}{\partial b}=2\sum_i\sin\theta_i(r_i-R-a\cos\theta_i-b\sin\theta_i)=0\\\frac{\partial E}{\partial R}=2\sum_i(r_i-R-a\cos\theta_i-b\sin\theta_i)=0\\$$
which constitutes a system of linear equations that can be written in the form:
$$\begin{pmatrix}\sum\cos^2\theta_i&\sum\cos\theta_i\sin\theta_i&\sum\cos\theta_i\\\sum\cos\theta_i\sin\theta_i&\sum\sin^2\theta_i&\sum\sin\theta_i\\\sum\cos\theta_i&\sum\sin\theta_i&\sum 1\end{pmatrix}\begin{pmatrix}a\\b\\R\end{pmatrix}=\begin{pmatrix}\sum r_i\cos\theta_i\\\sum r_i\sin\theta_{i}\\\sum r_i\end{pmatrix}$$
This is easy enough to solve explicitly but, if we would like a really simple formula, then we can also assume that the angles at which the measurements are taken are equally spaced (which is a reasonable assumption to make for engineering purposes). In this case we can show that for $\theta_i=\theta_0+\frac{2\pi (i-1)}{N}$ the following identities hold:
$$\sum_i\cos\theta_i=\sum_i\sin\theta_i=\sum_i \cos\theta_i\sin\theta_i=0\\\sum_i{\cos^2\theta_i}=\sum\sin^2\theta_i=\frac{N}{2}$$
and the linear system becomes diagonal and thus
$$a=\frac{2}{N}\sum_ir_i\cos\theta_i~,~b=\frac{2}{N}\sum_ir_i\sin\theta_i~,~r=\frac{1}{N}\sum_i r_i$$
Given this result, there are a few points to be addressed here:
1)This explains why in this confusing thread of simulations you've been finding constantly contradicting results. The factor of $2$ appearing in front of the weighted average for the center of mass is not describing an exact solution, but a solution to an approximated version of the minimization problem. As one moves the true center away from (0,0) more and more, one should find that the textbook estimators provided above are less and less accurate.
2)It is easy to see that the estimator derived above does NOT minimize the minimum circumscribed-maximum inscribed radius recipe, but the loss function shown above. We shouldn't be mixing the two, and each time we estimate roundness we should be referring to the method used to estimate it. (see for example here and here) In many of the above examples, it is not clear which method is used. Not all methods give the same answers, but they do give similar answers for many regular examples.
3)The reason for the constant being $K=2$ and not $K=1$ for example, is explained above. One could obtain a constant of $K=1$ , for example if the following loss function is considered, with angularly equidistributed measurements:
$$E^*=\sum_i(a+(R-R_i)\cos\theta_i)^2+(b+(R-R_i)\sin\theta_i)^2$$
Personally, I don't think that there is any practical value (there always is mathematical value in looking into a non-linear problem, however) in extracting a best fitting value of $K$ for centers that are significantly far away from the origin, since it does not accurately represent corrections to the approximate solutions quoted and it varies a lot with the individual set of observations obtained.