You have a collection of 2d points that you want to fit to a circle. Form the sum of the squares of the distances from the points to a generic circle. The variables are the $x,y$ coordinates of the center, and the radius. Set the gradient to 0, then one equation gives the radius as the average distance from the center to the points. For a given point $C$, let $r(C)$ denote the average distance from $C$ to the points. Using the other two equations, we find the center is a fixed point of the mapping $T(C)$, defined in the following way:
From each point, travel towards $C$ a distance $r(C)$, and then average all of these shifted points to get $T(C)$.
The mapping is not well defined at the points to be fit, since the direction from a point towards $C$ is undefined when $C$ coincides with it. For some arrangements of points and some starting point $C_0$, the fixed point iteration $C_{n+1}=T(C_n)$ can land on one of the points to be fit. With this in mind, I'm trying to find a set on which $T$ is a contraction. Intuitively, we could take disks centered at each point to be fit, and grow them uniformly until they all intersect in a nonempty set, maybe let the disks grow a little more, and assuming the intersection doesn't contain any of the points to be fit, this should serve as a good candidate set to prove $T$ is a contraction. Indeed, it is easy to see that the shifted points will remain in their respective disks about each point to be fit. It is not at all clear that $T(C)$, the average of the shifted points, will lie in every disk, but it has to be close! Note that the intersection of disks is a compact and convex set, and $T$ is continuous, so I have the Brouwer fixed point theorem in mind.
Any ideas on how to choose the set? Is the intersection I described a good candidate? I just can't figure out a proof. I have tested the problem fairly thoroughly on a computer, and the fixed point iteration seems to converge quite reliably when $C_0$ is chosen as the average of the points.
In general, I do not expect uniqueness, for suppose there are only 1 or 2 distinct points, then there are infinitely many circles fitting the points exactly. Also note that if the points are distributed on a small arc of the circle, the average of the points is quite far from the center, yet convergence is observed, albeit quite slowly.
Edit: I was asked for formulas, so I give $T(C) = \frac{1}{n}\sum_{i=1}^n \left(P_i + r(C)\frac{C-P_i}{|C-P_i|}\right)$ where $r(C) = \frac{1}{n}\sum_{i=1}^n |C-P_i|$. We want to find $C$ such that $T(C)=C$. I just need a set that $T$ maps into itself.
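For concreteness, here is a minimal Python sketch of the iteration $C_{n+1}=T(C_n)$ using the formulas above, started from the average of the points. The function names (`r_of`, `T`, `fit_circle`) and the stopping rule (`tol`, `max_iter`) are illustrative assumptions, not part of the problem statement.

```python
import math

def r_of(C, points):
    """r(C): average distance from C to the points."""
    return sum(math.hypot(C[0] - px, C[1] - py) for px, py in points) / len(points)

def T(C, points):
    """Shift each P_i towards C by r(C), then average the shifted points."""
    n = len(points)
    r = r_of(C, points)
    tx = ty = 0.0
    for px, py in points:
        d = math.hypot(C[0] - px, C[1] - py)
        if d == 0.0:
            # T is not defined when C coincides with a point to be fit
            raise ValueError("C coincides with a data point")
        tx += px + r * (C[0] - px) / d
        ty += py + r * (C[1] - py) / d
    return (tx / n, ty / n)

def fit_circle(points, tol=1e-12, max_iter=10_000):
    """Iterate C <- T(C) from the centroid; tol and max_iter are assumed choices."""
    n = len(points)
    C = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    for _ in range(max_iter):
        C_new = T(C, points)
        step = math.hypot(C_new[0] - C[0], C_new[1] - C[1])
        C = C_new
        if step < tol:
            break
    return C, r_of(C, points)
```

Calling `fit_circle(points)` on a list of `(x, y)` tuples returns the estimated center and radius. The sketch makes no claim about convergence in general; it only reproduces the experiment described in the question.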
Probably not an answer to the question but too long for a comment.
Supposing that you have $n$ data points $(x_i,y_i)$ and you want the "best" circle of radius $r$ centered at $(a,b)$, you need to minimize $$SSQ(a,b,r)=\sum_{i=1}^n \left(r-\sqrt{(x_i-a)^2+(y_i-b)^2} \right)^2$$ which is a highly nonlinear problem. Any minimization method would do it, provided good estimates.
To get such estimates, in a preliminary step, you could write $$f_i=(x_i-a)^2+(y_i-b)^2-r^2$$ and now consider all the $\frac {n(n-1)}2$ equations $g_{ij}=0$, where $$g_{ij}=f_j-f_i=2a(x_i-x_j)+2b(y_i-y_j)+\left((x_j^2-x_i^2)+(y_j^2-y_i^2)\right)$$ A linear regression (or matrix calculation) based on these $\frac {n(n-1)}2$ data points will provide $a$ and $b$. Using them would give as an estimate $$r=\frac 1n \sum_{i=1}^n \sqrt{(x_i-a)^2+(y_i-b)^2}\tag 1$$ Now, you have all elements for starting the optimization. You could even use the Newton-Raphson method to solve $$\frac{\partial SSQ}{\partial a}=\frac{\partial SSQ}{\partial b}=\frac{\partial SSQ}{\partial r}=0$$ which reduces to $$\frac{\partial SSQ}{\partial a}=2\sum_{i=1}^n \frac{(x_i-a) \left(r-\sqrt{(x_i-a)^2+(y_i-b)^2}\right)}{\sqrt{(x_i-a)^2+(y_i-b)^2}}=0\tag 2$$ $$\frac{\partial SSQ}{\partial b}=2\sum_{i=1}^n \frac{(y_i-b) \left(r-\sqrt{(x_i-a)^2+(y_i-b)^2}\right)}{\sqrt{(x_i-a)^2+(y_i-b)^2}}=0\tag 3$$ $$\frac{\partial SSQ}{\partial r}=2\sum_{i=1}^n \left(r-\sqrt{(x_i-a)^2+(y_i-b)^2}\right)=0\tag 4$$ The solution of $(4)$ is already given by $(1)$; so, you are left with two nonlinear equations in $(a,b)$.
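The preliminary linear step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name `initial_estimate` is hypothetical, the pairwise equations are solved in the least-squares sense through the $2\times 2$ normal equations, and the exact-zero determinant check is a crude guard rather than a robust collinearity test.

```python
import math
from itertools import combinations

def initial_estimate(points):
    """Least-squares solution of 2a(x_i-x_j) + 2b(y_i-y_j) = (x_i^2+y_i^2) - (x_j^2+y_j^2)
    over all pairs i < j, followed by r from equation (1)."""
    suu = suv = svv = suw = svw = 0.0
    for (xi, yi), (xj, yj) in combinations(points, 2):
        u = 2.0 * (xi - xj)                           # coefficient of a
        v = 2.0 * (yi - yj)                           # coefficient of b
        w = (xi * xi + yi * yi) - (xj * xj + yj * yj)  # right-hand side
        suu += u * u
        suv += u * v
        svv += v * v
        suw += u * w
        svw += v * w
    det = suu * svv - suv * suv
    if det == 0.0:
        # e.g. collinear points or fewer than three distinct points
        raise ValueError("normal equations are singular")
    # Cramer's rule on the 2x2 normal equations
    a = (suw * svv - suv * svw) / det
    b = (suu * svw - suv * suw) / det
    r = sum(math.hypot(x - a, y - b) for x, y in points) / len(points)
    return a, b, r
```

For points lying exactly on a circle the pairwise equations are consistent, so the returned `(a, b, r)` is the circle itself up to rounding; for noisy data it is only a starting guess for the nonlinear minimization.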
If you are as lazy as I am, use numerical derivatives.
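As one possible reading of the "lazy" route, the sketch below applies Newton-Raphson to the two sums appearing in $(2)$ and $(3)$, with $r$ eliminated through $(1)$ and the $2\times 2$ Jacobian approximated by central differences. The function names, the step `h`, the tolerance, and the iteration cap are all assumptions chosen for illustration, and a reasonable starting guess `(a, b)` is taken for granted.

```python
import math

def gradient_sums(a, b, points):
    """The sums in (2) and (3), with r replaced by the mean distance from (1)."""
    d = [math.hypot(x - a, y - b) for x, y in points]  # must all be nonzero
    r = sum(d) / len(d)
    ga = sum((x - a) * (r - di) / di for (x, _), di in zip(points, d))
    gb = sum((y - b) * (r - di) / di for (_, y), di in zip(points, d))
    return ga, gb

def newton_center(points, a, b, h=1e-6, tol=1e-10, max_iter=50):
    """Newton-Raphson on (ga, gb) = (0, 0) with a finite-difference Jacobian."""
    for _ in range(max_iter):
        ga, gb = gradient_sums(a, b, points)
        pa = gradient_sums(a + h, b, points)
        ma = gradient_sums(a - h, b, points)
        pb = gradient_sums(a, b + h, points)
        mb = gradient_sums(a, b - h, points)
        # central-difference approximation of the Jacobian entries
        j11 = (pa[0] - ma[0]) / (2 * h)
        j12 = (pb[0] - mb[0]) / (2 * h)
        j21 = (pa[1] - ma[1]) / (2 * h)
        j22 = (pb[1] - mb[1]) / (2 * h)
        det = j11 * j22 - j12 * j21
        if det == 0.0:
            raise ValueError("singular Jacobian")
        # solve J * (da, db) = -(ga, gb) by Cramer's rule
        da = (-ga * j22 + gb * j12) / det
        db = (-gb * j11 + ga * j21) / det
        a += da
        b += db
        if math.hypot(da, db) < tol:
            break
    r = sum(math.hypot(x - a, y - b) for x, y in points) / len(points)
    return a, b, r
```

Like any Newton iteration, this is only expected to behave well near a solution, which is why a good preliminary estimate matters.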