I have a set of points $(x_i,y_i),\ i=1,\ldots,n$, and want to find the quadratic function $f(x) = ax^2 + bx + c$ (that is, the coefficients $a,b,c$) that has the shortest distances to these points.
So the task is to find the $a, b, c$ that minimize the function
$P(a,b,c)=\sum_{i=1}^n d^2_i$,
where $d_i^2=(\hat x_i -x_i)^2 + (\hat y_i -y_i)^2$ is the squared distance from the point $(x_i,y_i)$ to the parabola $f(x)$,
and $(\hat x_i, \hat y_i)$ is the point on the parabola $f(x)$ that is closest to the point $(x_i,y_i)$.
For OLS (ordinary least squares) I could derive formulas for $a,b,c$ that depend only on $x_i, y_i$: I had to find the minimum of the function $P(a,b,c)$, so it was enough to take the derivatives with respect to $a$, $b$ and $c$ and solve the resulting system of three equations in three unknowns.
For TLS (total least squares) regression, which I think is the same as orthogonal regression (see Deming regression), the function $P(a,b,c)$ also depends on the unknown $\hat x_i$, the $x$ coordinate of the point on the parabola nearest to the point $(x_i,y_i)$.
Unfortunately, on the internet I could only find TLS regression formulas for the line $f(x)=ax+b$, not for the parabola.
At the moment I am stuck on getting formulas for $a,b,c$ that depend only on $(x_i,y_i)$.
What are the formulas (like $a=\sum(x^2_i)+ \sum(y^2_i)$ etc.), and how can I find them?
To make clearer what I want, here are the formulas for $a$, $b$ and $c$ using OLS regression (they can definitely be simplified, but I had too little time; all $\sum$ are $\sum_{i=1}^n$):
$$ \bbox[5px,border:2px solid darkblue]
{
\mathbf b = \frac{n \sum (x^2y) - \sum x^2 \sum y + (\sum x^2)^2(M-L) - n(M-L)\sum x^4}{n\sum x^3} \Big/ \left(1 - \frac{E(\sum x^2)^2 + \sum x^2 \sum x}{n\sum x^3} + \frac{E\sum x^4}{\sum x^3}\right)
}$$
$$ \bbox[5px,border:2px solid darkblue]
{
\mathbf a = b*E + M - L
}$$
$$ \bbox[5px,border:2px solid darkblue]
{
\mathbf c = \frac{\sum y - a \sum x^2 - b \sum x}{n}
}$$
$E = \frac{(\sum x)^2 -n \sum x^2}{n\sum x^3 - \sum x^2 \sum x}$
$M = \frac{n\sum (xy)}{n\sum x^3 - \sum x^2 \sum x}$
$L = \frac{\sum x \sum y}{n\sum x^3 - \sum x^2 \sum x}$
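
For reference, the OLS step described above (setting the partial derivatives of the vertical-residual sum to zero) amounts to a $3\times3$ linear system in $a,b,c$. A minimal Python sketch, assuming plain Cramer's rule and the hypothetical function name `fit_parabola_ols`, which can be used to cross-check any closed-form expressions numerically:

```python
def fit_parabola_ols(xs, ys):
    """Coefficients (a, b, c) of y = a*x^2 + b*x + c minimizing vertical residuals.

    Solves the three normal equations by Cramer's rule (illustrative only;
    a library least-squares routine is preferable for ill-conditioned data).
    """
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    # Normal equations: dP/da = 0, dP/db = 0, dP/dc = 0.
    A = [[sx4, sx3, sx2],
         [sx3, sx2, sx],
         [sx2, sx, n]]
    rhs = [sx2y, sxy, sy]

    d = det3(A)
    if d == 0:
        raise ValueError("normal equations are singular (need >= 3 distinct x)")

    solution = []
    for k in range(3):
        Ak = [row[:] for row in A]      # copy A, replace column k by rhs
        for i in range(3):
            Ak[i][k] = rhs[i]
        solution.append(det3(Ak) / d)
    return tuple(solution)
```

For example, `fit_parabola_ols([1, 2, 3, 4, 5], [1, 2, 4, 8, 13])` returns the OLS coefficients for five sample points, which can then serve as a starting guess for the orthogonal fit.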

$\color{brown}{\textbf{Calculations of the distance.}}$
The squared distance from a point $\;(X,Y)\;$ of the given set $(X_i,Y_i)$ to the parabola $\;y = a(x-s)^2+v = \pm z^2+v\;$ with parameters $\,(a,s,v)\,$ is
$$d^2 = \min\limits_{x\in\mathbb R} \delta(x),$$
where $\;x\;$ is the abscissa of an arbitrary point on the parabola, the sign $\pm$ is the sign of $a$, and
$$z = \sqrt{|a|}\,(x-s),\quad Z =\sqrt{|a|}\,(X-s),\tag1$$
$$\delta(x) = f(z) = (z-Z)^2+(z^2\pm(v-Y))^2,\tag2$$
$$\dfrac12f'(z) = z-Z+2z(z^2\pm(v-Y)).\tag3$$
Denote the abscissa of the optimal point on the parabola by $\;\hat x\;$ and let $\;\hat z= \sqrt{|a|}\,(\hat x-s);\;$ then
$$\begin{cases} 2\hat z^3+(1\pm2(v-Y))\hat z=Z\\[4pt] d^2= (\hat z-Z)^2+(\hat z^2\pm(v-Y))^2. \end{cases}\tag4$$
Since $d^2\le f(Z)$, it follows from $(2)$ that
$$d^2\in\big[0,(Z^2\pm(v-Y))^2\big].\tag5$$
Multiplying the first equation of $(4)$ by $\hat z$ gives $2\hat z^4 = Z\hat z-(1\pm2(v-Y))\hat z^2$, so
$$2d^2 = 2(\hat z- Z)^2 +Z\hat z+2(v-Y)^2-(1\mp2(v-Y))\hat z^2,$$
$$2d^2 = (1\pm2(v-Y))\hat z^2 -3 Z\hat z+2Z^2+2(v-Y)^2,$$
$$d^2 = \dfrac p2\left(\hat z - \dfrac{3Z}{2p}\right)^2+\dfrac{8p-9}{8p}\,Z^2+(v-Y)^2,\tag{6.1}$$
where
$$p = 1\pm2(v-Y)\ne 0.\tag{6.2}$$
Formulas $(5)-(6)$ can simplify the calculations.
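
As a quick numerical check of $(4)$ (an illustrative case, not taken from the data sets below): take the upper sign, $v-Y=0$ and $Z=3$. Then
$$2\hat z^3+\hat z=3\ \Rightarrow\ \hat z=1,\qquad d^2=(1-3)^2+(1^2+0)^2=5,$$
and the root is unique because $2z^3+z$ is strictly increasing.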
Let $$r=\sqrt{\dfrac{|2\pm 4(v-Y)|}{3}}.\tag7$$
If $\;\mathbf{1\pm2(v-Y) \ge 0},\;$ then $$4\hat z^3+3r^2\hat z = 2Z,$$ $$\dfrac{2Z}{r^3} = 4\left(\dfrac{\hat z}r\right)^3 + 3\dfrac{\hat z}r =\sinh\left(3\operatorname{arcsinh}\dfrac{\hat z}r\right),$$ $$\hat z = r\sinh\left(\dfrac13\operatorname{arcsinh}\dfrac{2Z}{r^3}\right).\tag{8.1}$$
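
Formula $(8.1)$ is easy to check numerically. The following is a minimal Python sketch under stated assumptions: the names `z_hat_81` and `d2_81`, the argument `w` standing for $v-Y$, and the argument `sign` standing for the choice of $\pm$ are all introduced here for illustration. It covers only the strict case $1\pm2(v-Y)>0$, since $r=0$ would make $2Z/r^3$ undefined.

```python
import math

def z_hat_81(Z, w, sign=1):
    """Root of 2*z**3 + (1 + sign*2*w)*z = Z via formulas (7) and (8.1).

    w stands for v - Y; sign is +1 or -1 (the choice of the sign in the text).
    Assumes 1 + sign*2*w > 0, so the cubic has exactly one real root.
    """
    q = 1 + sign * 2 * w
    if q <= 0:
        raise ValueError("formula (8.1) needs 1 +/- 2(v - Y) > 0")
    r = math.sqrt(2 * q / 3)                  # (7): r^2 = |2 +/- 4(v - Y)| / 3
    return r * math.sinh(math.asinh(2 * Z / r ** 3) / 3)

def d2_81(Z, w, sign=1):
    """Squared distance from the second line of (4), using z_hat from (8.1)."""
    z = z_hat_81(Z, w, sign)
    return (z - Z) ** 2 + (z * z + sign * w) ** 2
```

Substituting the returned value back into $2\hat z^3+(1\pm2(v-Y))\hat z-Z$ should give a residual at rounding-error level, which is a convenient sanity check before using the formula inside an optimizer.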
If $\;\mathbf{1\pm2(v-Y) \le 0},\;$ then
$$4\hat z^3-3r^2\hat z = 2Z,$$
$$\dfrac{2Z}{r^3} = 4\left(\dfrac{\hat z}r\right)^3 - 3\dfrac{\hat z}r =-\sin\left(3\arcsin\dfrac{\hat z}r\right) =\cosh\left(3\operatorname{arccosh}\dfrac{\hat z}r\right),$$
$$\hat z = \begin{cases} -r\sin\left(\dfrac13\arcsin\dfrac{2Z}{r^3}\right),\quad\text{if}\quad 2|Z|\le r^3\\[4pt] r\cosh\left(\dfrac13\operatorname{arccosh}\dfrac{2Z}{r^3}\right),\quad\text{if}\quad 2|Z|\ge r^3 \end{cases}\tag{8.2}$$
On the other hand, the cubic equations
$$\hat z^3\pm\dfrac34\,r^2\hat z -\dfrac Z2 = 0$$
have the discriminants
$$D = \dfrac{Z^2}{16}\pm\dfrac{r^6}{64},\tag9$$
wherein
$\color{brown}{\textbf{How does it work?}}$
If $$\binom{X_i}{Y_i}=\left\{\dbinom11,\dbinom22, \dbinom34, \dbinom48, \dbinom5{13}\right\},$$ $$y=\dfrac12x^2+v,$$
then for $\;Z_i=\sqrt2\,X_i\;$ formulas $(8.1)$ give a quite realistic plot of $\;\hat z_i(v) = \sqrt2 \hat X_i.\;$
If $$\binom{X_i}{Y_i}=\left\{\dbinom1{15},\dbinom2{14}, \dbinom3{12}, \dbinom48, \dbinom5{3}\right\},$$ $$y=-\dfrac12x^2+v,$$
then for $\;Z_i=\sqrt2\,X_i\;$ formulas $(10.2)$ over the complex numbers give quite understandable plots of $\;\hat z_i(v) = \sqrt2 \hat X_i\;$ for $\;k=0, -1, 1.\;$
To get the parabola parameters, it suffices to use the method of volumes. Every elementary volume should satisfy the following conditions:
Under these conditions, a single fixed expression for the sum of the squared distances is defined in each elementary volume of $\;(a,v,s).\;$
The formulas obtained allow one to minimize the sum of the squared distances in every elementary volume (gradient descent is recommended) and then to take the least of these minima.
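
The minimization step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the method of volumes itself: it works directly in the original coordinates with $y=ax^2+bx+c$, solves the stationarity cubic for each point with Cardano's and the trigonometric formulas, and runs a plain backtracking gradient descent with a finite-difference gradient from one starting point, so it finds a local minimum only. All function names are hypothetical.

```python
import math

def real_roots_depressed_cubic(p, q):
    """All real roots of t**3 + p*t + q = 0."""
    disc = (q / 2) ** 2 + (p / 3) ** 3
    if disc > 0:                                   # one real root (Cardano)
        s = math.sqrt(disc)
        cbrt = lambda x: math.copysign(abs(x) ** (1 / 3), x)
        return [cbrt(-q / 2 + s) + cbrt(-q / 2 - s)]
    if p == 0:                                     # then q == 0: triple root
        return [0.0]
    m = 2 * math.sqrt(-p / 3)                      # three real roots (p < 0)
    arg = max(-1.0, min(1.0, 3 * q / (p * m)))
    theta = math.acos(arg) / 3
    return [m * math.cos(theta - 2 * math.pi * k / 3) for k in range(3)]

def sq_dist(a, b, c, X, Y):
    """Squared orthogonal distance from (X, Y) to y = a*x**2 + b*x + c, a != 0."""
    s = -b / (2 * a)                               # vertex abscissa
    v = c - b * b / (4 * a)                        # vertex ordinate
    U = X - s
    # Stationary points of (u - U)**2 + (a*u**2 + v - Y)**2 with u = x - s:
    # 2*a**2*u**3 + (1 + 2*a*(v - Y))*u - U = 0
    p = (1 + 2 * a * (v - Y)) / (2 * a * a)
    q = -U / (2 * a * a)
    return min((u - U) ** 2 + (a * u * u + v - Y) ** 2
               for u in real_roots_depressed_cubic(p, q))

def total_sq_dist(params, pts):
    a, b, c = params
    return sum(sq_dist(a, b, c, X, Y) for X, Y in pts)

def descend(params, pts, iters=200, h=1e-6):
    """Backtracking gradient descent; accepts a step only if it lowers the sum."""
    params = list(params)
    f = total_sq_dist(params, pts)
    for _ in range(iters):
        grad = []
        for k in range(3):                         # central differences
            hi = params[:]
            lo = params[:]
            hi[k] += h
            lo[k] -= h
            grad.append((total_sq_dist(hi, pts) - total_sq_dist(lo, pts)) / (2 * h))
        step = 1.0
        improved = False
        while step > 1e-12:
            trial = [x - step * g for x, g in zip(params, grad)]
            if trial[0] != 0:                      # keep a genuine parabola
                f_trial = total_sq_dist(trial, pts)
                if f_trial < f:
                    params, f = trial, f_trial
                    improved = True
                    break
            step /= 2
        if not improved:
            break
    return tuple(params), f
```

A reasonable starting point is the OLS solution; running `descend` from several starting points and keeping the smallest result mimics, in a crude way, taking the least value over the elementary volumes.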